What are you giving away?
Meta data is very useful on a number of levels. But, as Geoff Meads points out, if it falls into the wrong hands, it can tell whoever has intercepted it a lot about the sender…
Back when I ran marketing for an AV manufacturer, I had the good fortune to get to know a lot of the journalists in our industry. Many remain good friends, and many more remain Facebook contacts.
One of these posted a rather fascinating question yesterday: ‘Why can’t I make a true digital copy?’
His question was prompted by a post from a well-known mastering engineer in the USA. The upshot was that when the engineer opened a file in a music program and then saved it again, without making any changes to the content of the file, the new file had a different file size.
Although the question was originally about music, my friend tried the same exercise in Photoshop and got the same result. Even with no edits having been made, the file size of the newly saved file was different! So, what’s going on here?
Weirdly, the answer lies in, or at least can be explained using, the recent lawsuit between Johnny Depp and Amber Heard.
Yes, really! It’s all down to the meta data.
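To make the idea concrete, here is a small sketch in Python. It is not what Photoshop or any music program actually does on saving; it simply builds the same one-pixel PNG image twice, once bare and once with a ‘Software’ text note of the kind an editor might add. The editor name ‘Hypothetical Editor 1.0’ and the helper names are my own inventions for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, then a CRC over type + data."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png(software: str = "") -> bytes:
    """Return a 1x1 greyscale PNG, optionally carrying a 'Software' text chunk."""
    # IHDR: width, height, bit depth, colour type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b"\x00\x80")  # filter byte + one mid-grey pixel
    parts = [PNG_SIGNATURE, chunk(b"IHDR", ihdr)]
    if software:
        # Metadata only: keyword, a null separator, then the text (Latin-1)
        parts.append(chunk(b"tEXt", b"Software\x00" + software.encode("latin-1")))
    parts += [chunk(b"IDAT", pixels), chunk(b"IEND", b"")]
    return b"".join(parts)


def image_data(png: bytes) -> bytes:
    """Collect only the IDAT (pixel) payload, skipping every other chunk."""
    pos, out = len(PNG_SIGNATURE), b""
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        if kind == b"IDAT":
            out += png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + 4 CRC, plus the data itself
    return out


original = tiny_png()
resaved = tiny_png(software="Hypothetical Editor 1.0")  # made-up editor name

print("File sizes:", len(original), "vs", len(resaved))
print("Same pixel data:", image_data(original) == image_data(resaved))
```

The two files differ in size, yet the picture inside them is byte-for-byte identical. Only the descriptive data wrapped around it has changed.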
Not making judgements
Let me start by saying I’m not going to discuss the legal case, or its result per se. However, one element of the proceedings related to several images, supplied by Ms Heard, that allegedly showed injuries she suffered at Mr Depp’s hand.
Mr Depp’s lawyers called in a digital forensics expert, Bryan Neumeister, as an expert witness to examine and report on the images. Among other remarks, he was able to show that the images submitted were not the originals from the camera (an iPhone).
In other words, they may have been edited and so could not be taken as an accurate record.
How did he prove this? He showed the EXIF meta data stored within the image. This data revealed the source of the submitted image was an Apple photo editing program called ‘Photos’ and not the iPhone the photos were originally taken on.
What is meta data?
Now I know the term ‘meta data’ will not be new to the readers of the magazine. If you’re reading this, you’ve probably come across it before. Maybe you have heard it in reference to music files that contain detail on the artist, track name, track duration and even cover artwork.
But do you know just how much meta data there is in today’s network traffic? To misquote ‘The Hitchhiker’s Guide to the Galaxy’ for a moment: “It’s big. You just won’t believe how vastly, hugely, mind-bogglingly big it is”.
Taking our music file as an example, we’ve already mentioned artist and track names, track duration and cover art. But you’ll also find other information embedded in the file and the volume of it is growing over time.
Most MP3 files, as a minimum, use the ID3 standard for meta data. The original version, ID3v1, allows a maximum of 30 characters per text field. In addition to the artist and track name, you’ll also find the name of the album the track comes from and, from version 1.1 onwards, the track’s number on that album.
Also, there is the genre the artist considers the music to be in (e.g. ‘Blues’) and the year in which the track was released.
The later ID3v2 format replaces this system with variable-length fields, so text is no longer cut off at a fixed number of characters and the track number can be written in the form ‘track 1 of 10’. It also includes a host of new fields such as cover art, composer, conductor, lyricist, recording location and even a star-based rating system.
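For the curious, the original fixed-width tag (ID3v1) is simple enough to read by hand. The sketch below assumes the usual ID3v1 layout: a 128-byte block at the very end of the file that begins with the letters ‘TAG’. The function names, the track details and the stand-in ‘audio’ bytes are all made up for illustration; no real MP3 is involved.

```python
import struct

# ID3v1 layout: 'TAG', title(30), artist(30), album(30), year(4),
# comment(30), genre(1 byte) = 128 bytes in total
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def build_id3v1(title, artist, album, year, comment="", genre=0):
    """Pack the fields into a 128-byte tag; long text is silently truncated."""
    def enc(text):
        return text.encode("latin-1")
    return ID3V1.pack(b"TAG", enc(title), enc(artist), enc(album),
                      enc(year), enc(comment), genre)


def read_id3v1(data):
    """Return the tag fields as a dict, or None if no ID3v1 tag is present."""
    if len(data) < ID3V1.size or data[-ID3V1.size:][:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = ID3V1.unpack(data[-ID3V1.size:])

    def text(raw):
        # Fields are padded with null bytes (sometimes spaces)
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {"title": text(title), "artist": text(artist), "album": text(album),
            "year": text(year), "comment": text(comment), "genre": genre}


# Stand-in for audio frames: these bytes are NOT real audio
fake_audio = b"\xff\xfb" + bytes(1000)

tag = build_id3v1(
    title="A Deliberately Long Track Title That Overflows",
    artist="Example Artist",
    album="Example Album",
    year="2022",
    genre=0,  # genre number 0 is 'Blues' in the ID3v1 genre list
)

info = read_id3v1(fake_audio + tag)
print(info)
print("Title length after tagging:", len(info["title"]))
print("Untagged file:", read_id3v1(fake_audio))
```

Note how the long title is cut off at 30 characters, which is exactly the limitation the later format set out to remove.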
Is all this stuff useful?
It’s easy to see how useful this information is for applications like music streaming. Displaying a subset of this information has become standard for devices streaming music from remote sources over networks. Since the meta data is already embedded in the file being streamed, there are no additional network requests needed to get the data.
In fact, we’re now in a world where this data is the expected norm. People have become so accustomed to it that they get frustrated when they use the WAV file format for music (WAV is uncompressed and sounds better to most ears than compressed files like MP3s) and discover that cover art is not natively supported. Cover art is expected, yet absent from a native WAV file.
It’s not just audio though…
It will be no surprise to hear that video files have similar arrangements to carry and deliver meta data. However, music and movies are just the tip of the iceberg for meta data. Let’s look at another system we use every day – email.
Straight away there are a few data fields that we know must be present in an email message. Let’s start with the sender’s address, destination address and subject line.
These are the default fields we create when composing an email so they must be included. The message body and any attached files must be there too, right? Yes, but there’s more…
Email is normally formatted using the Multipurpose Internet Mail Extensions or ‘MIME’ standard.
MIME extends the basic message format carried by the Simple Mail Transfer Protocol or ‘SMTP’ system and allows for all sorts of additional data to be sent over what plain text messages provide for. The mail servers that handle a message add header fields of their own too, one of these items being the result of checking a DNS record for the domain sending the email. This helps spam filters detect if a domain is being used to send email but doesn’t have authority to do so. Useful stuff!
Other email meta data within the protocol includes a reply-to address (this can be different from the sender’s address), the time the email was sent and a message ID so that subsequent replies can be grouped together in your email program.
The meta data can also carry the character encoding, details of the sending program (e.g. Outlook or Gmail), the sender’s IP address and a host of other things. In fact, it’s not uncommon for the size of the meta data being sent to be far larger than the size of the message itself!
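A quick way to see this for yourself is to look at the raw source of a message. The sketch below uses Python’s standard email module on a made-up message. Every address, server name and the ‘Hypothetical Mail Client’ program in it are invented for illustration, and real messages usually carry many more header lines than this.

```python
from email import message_from_string

# An invented message: all names, addresses and IDs here are examples only
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [203.0.113.7])
\tby mx.example.com with ESMTPS id abc123;
\tTue, 07 Jun 2022 09:15:02 +0000
Authentication-Results: mx.example.com; spf=pass smtp.mailfrom=example.org
From: alice@example.org
Reply-To: accounts@example.org
To: bob@example.com
Subject: Lunch?
Date: Tue, 07 Jun 2022 10:15:00 +0100
Message-ID: <1234.abcd@example.org>
X-Mailer: Hypothetical Mail Client 3.2
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Fancy lunch on Friday?
"""

msg = message_from_string(RAW_MESSAGE)

# List every header field the message carries
for name, value in msg.items():
    print(f"{name}: {' '.join(value.split())}")

# Headers and body are separated by the first blank line
header_text, _, body_text = RAW_MESSAGE.partition("\n\n")
print("Header characters:", len(header_text))
print("Body characters:", len(body_text))
print("Character encoding:", msg.get_content_charset())
```

Even in this tiny example, the header lines outweigh the one-line message many times over, and they name the sending program, a server address and the time zone of the sender.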
The takeaway here is twofold. Firstly, even simple email messages can place quite a bit of data on our networks. Secondly, anyone intercepting your email messages can learn quite a bit about you, your location and your equipment without even decrypting the message itself!