Understanding The Life Of Video Metadata From Production To Publishing & Why It’s So Important

When we refer to “metadata” with regard to video, we usually mean all the textual information that accompanies a video, helps search engines better understand the content, and helps users find the video on the internet. In the YouTube world, for example, that means that when you publish your video, you should take advantage of the appropriate fields for keywords, tags, titles, descriptions, and even annotations and captions. All of that “metadata” helps search engines serve your video as a relevant result when a user types in specific words.

However, in the true sense of the word, metadata goes far beyond on-page textual information when it comes to video.  As Wikipedia puts it,

Metadata is particularly useful in video, where information about its contents (such as transcripts of conversations and text descriptions of its scenes) are not directly understandable by a computer, but where efficient search is desirable.

While metadata is often associated with on-page textual information (titles, descriptions, etc.), it can and should be leveraged to provide much deeper information. Throughout the process of creating and publishing a video, there are many opportunities to provide video metadata.

For example, direct temporal metadata, or time-based metadata, is information that is tied to the timeline within a video. Closed captions are a good example of one form of temporal metadata. However, timed text can also include information about when and what music is being played, where the scene takes place, when there is laughter, who is speaking, etc. There is a ton of information, or metadata, that can help convey the content of a video.
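As a rough illustration of time-based metadata, the sketch below stores a few timed cues and writes them out in WebVTT, a common timed-text format for web video. The `Cue` class, its `kind` labels, and the bracketed prefix in the cue text are assumptions made for this example, not part of any standard or tool mentioned here.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    kind: str     # hypothetical label: "speech", "music", "location", ...
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp (hh:mm:ss.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Serialize cues, sorted by start time, as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for cue in sorted(cues, key=lambda c: c.start):
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(f"[{cue.kind}] {cue.text}")
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


cues = [
    Cue(5.0, 8.0, "music", "Opening theme"),
    Cue(0.0, 2.5, "speech", "Welcome to the show"),
]
print(to_webvtt(cues))
```

The same list of cues could just as easily be kept in a spreadsheet or a database; the point is that each note carries a start and end time, not just a description.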

Metadata Provides Much Needed Context for Search

While voice recognition software has come a long way over the years, it is still difficult for a computer to work out the appropriate context for a video, much less find what a user needs when they type in certain keywords. A computer isn’t going to scan The Godfather video file and tell you, “The rise of Michael Corleone to power in a New York crime family,” unless someone tells the computer that’s what it is. And it’s especially not going to be able to tell you specifics, nuances, or context without help.

That’s why from the very beginning, every take of video should have metadata associated with it to make it easier to process, provide context, and manage archives.

Metadata Is Not Only Great For Internet Search, But All Search

Whether you’re an editor looking for a particular take or an archivist trying to find a specific video or piece of footage, the process of finding exactly what you need should be simplified by having detailed descriptions and time stamps. A title might give you a general overview, but without more detail a user is going to have to watch the video, however long it is, to see if it’s exactly what they need. That’s inefficient.
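To make the point concrete, here is a minimal sketch of searching time-stamped descriptions instead of watching footage. The clip names, times, and descriptions are invented for illustration, and the matching is a simple substring check, not how any particular archive system works.

```python
def find_moments(index, query):
    """Return (clip, seconds, description) entries whose description contains every query word."""
    words = query.lower().split()
    return [
        (clip, seconds, description)
        for clip, seconds, description in index
        if all(word in description.lower() for word in words)
    ]


# Hypothetical archive notes: (clip name, seconds into the clip, description)
index = [
    ("reel_07", 12.0, "Hulk running through street"),
    ("reel_07", 95.5, "Hulk screaming at camera"),
    ("reel_09", 3.0, "Crowd screaming"),
]

for clip, seconds, description in find_moments(index, "hulk screaming"):
    print(clip, seconds, description)
```

The more specific the descriptions, the fewer false matches a search like this returns.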

Again, I find Albert Brooks’ Modern Romance to be a pretty good indicator of what I mean.  It’s the exact same clip I used for this article about foley artists, but it also shows why writing good descriptions helps when you are looking for something specific:

The person responsible for finding “Hulk Running” went with whatever he could find in the archives, but as Albert Brooks points out, “That should be ‘Hulk Screaming.’  That’s the effect!”  Describing sight and sound requires detail.  It means being able to find what you want without having to go through tons of footage.  “Like finding a needle in a haystack” is apt here.

Where can you use metadata?  In almost every phase of the production.


Cameras record pertinent technical information about the take: aperture, frame rate, shutter speed, etc. The clapperboard you use before each take also carries information about the scene, the take number, and the camera angle. It is also a good idea to highlight parts of the script and keep track of everything that accurately describes the take: all the technical information, all the clapperboard info, whether a line was flubbed or an airplane flew overhead and destroyed the sound, and at what time each of these highlights occurred.

All of this is so that when the video gets passed on to the editor, they can easily find the footage.  You can even rate the take to give the editor a better idea of which ones are best.
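One way to picture such a take log is as a simple record per take. The sketch below is only an assumption about how the notes might be organized: the field names, the 0 to 5 rating scale, and the `best_takes` helper are hypothetical and not tied to any camera or editing product.

```python
from dataclasses import dataclass, field


@dataclass
class Take:
    scene: str
    take_number: int
    camera_angle: str
    frame_rate: float
    aperture: str
    shutter_speed: str
    rating: int = 0  # hypothetical scale: 0 (unusable) to 5 (best)
    notes: list[tuple[float, str]] = field(default_factory=list)  # (seconds, note)


def best_takes(takes: list[Take], scene: str, min_rating: int = 4) -> list[Take]:
    """Return the takes of a scene rated at or above min_rating, best first."""
    matches = [t for t in takes if t.scene == scene and t.rating >= min_rating]
    return sorted(matches, key=lambda t: t.rating, reverse=True)


takes = [
    Take("12A", 1, "wide", 24.0, "f/2.8", "1/48", rating=4),
    Take("12A", 2, "wide", 24.0, "f/2.8", "1/48", rating=1,
         notes=[(14.2, "airplane overhead, sound unusable")]),
    Take("12A", 3, "close-up", 24.0, "f/2.0", "1/48", rating=5),
]

for take in best_takes(takes, "12A"):
    print(take.scene, take.take_number, take.camera_angle, take.rating)
```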


This is where descriptions become easier to manage, as many of the top editing programs allow you to give context to each piece of footage. Hopefully, by this time, the editor has a good idea of which takes are good and which might not be usable. Each piece of footage dumped into an editing program can be given a title and keywords. For instance, Final Cut Pro X will sort clips with the same keywords into the same bins. How accurately your editing software sorts the clips into their various bins and sub-bins depends on how detailed you want to get with the metadata for each clip.
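The general idea of keyword-based sorting can be sketched in a few lines. This illustrates the concept only and makes no claim about how Final Cut Pro X implements it; the clip names and keywords are made up, and lowercasing the keywords is an assumption so that "Interview" and "interview" land in the same bin.

```python
from collections import defaultdict


def sort_into_bins(clips: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each keyword to the alphabetically sorted clip names tagged with it."""
    bins: dict[str, list[str]] = defaultdict(list)
    for name, keywords in clips.items():
        for keyword in keywords:
            bins[keyword.lower()].append(name)
    return {keyword: sorted(names) for keyword, names in sorted(bins.items())}


clips = {
    "clip_02": {"Interview", "close-up"},
    "clip_01": {"interview"},
    "clip_03": {"b-roll"},
}

for keyword, names in sort_into_bins(clips).items():
    print(keyword, names)
```

A clip tagged with several keywords shows up in several bins, which is why consistent, detailed keywords matter more than the sorting itself.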

Editing software like Final Cut Pro X also has facial recognition and shot detection technology that helps in describing the video. This type of metadata is good, but it still doesn’t tell you all you need to know about a shot, although it does get us slightly closer to having a computer derive metadata from video all on its own, without text.

For the most part, the descriptions and keywords on these clips should be as close to the original descriptions as possible, so that there is no confusion between production and post-production. This is no time for semantics. Everything should be defined: if you think it’s a close-up and the editor thinks it’s an extreme close-up and names it something else, then it might be hard to communicate what is needed. So the editor might want to describe something as the creators described it, but possibly add their own notes to better understand how to piece the footage together.

Completed Video: Publishing

When the video is complete, it’s important for it to have a mix of general and specific keywords that describe it. Having a full transcript with specific times set for highlights will help the people publishing your video understand what it contains, and will give a search engine a better idea of what it’s looking at. So if your video is about your “Fun Experience At TED,” it should carry specific metadata that narrows down how your experience is different from everyone else publishing a video about their fun experience at TED, and that includes some general and specific terms in the title itself: “Mark Robertson, ReelSEO Founder, Fun Experience At TED” is way better than the general, tells-us-nothing example above.

We know that YouTube provides all sorts of metadata fields for you to describe what your video is about. But what if you are publishing a non-YouTube video to your own site? Well, HTML allows you to place metadata into the description of your video and page. It’s important for the people who maintain your website to know what these keywords are so that they can enter them into the page properly.
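One common way to attach such metadata to a page, though not the only one and not something prescribed above, is a schema.org `VideoObject` description embedded as JSON-LD. The sketch below builds that snippet; the `video_jsonld` function, the sample values, and the example.com URL are all hypothetical.

```python
import json


def video_jsonld(name, description, upload_date, content_url, keywords):
    """Build a <script> element holding schema.org VideoObject metadata as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "uploadDate": upload_date,  # ISO 8601 date string
        "contentUrl": content_url,
        "keywords": ", ".join(keywords),
    }
    # Escape "<" so text in the metadata cannot close the script element early
    payload = json.dumps(data, indent=2).replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(video_jsonld(
    name="Mark Robertson, ReelSEO Founder, Fun Experience At TED",
    description="Placeholder description of the video.",
    upload_date="2021-01-01",                       # placeholder date
    content_url="https://example.com/videos/ted.mp4",  # placeholder URL
    keywords=["TED", "ReelSEO"],
))
```

Whoever maintains the site would paste the resulting element into the page that hosts the video, alongside the visible title and description.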

Metadata for Archiving

Transferring any valuable video into the digital space, and getting that video seen, will require some work. Hopefully someone gave you a whole bunch of notes that describe each video, but if you’re transferring from the VHS days, chances are you won’t have such a luxury. You might have a label that gives a general idea of what’s on the tape, but not much more, if that. But once each video is watched and given a proper description, that hard work can pay off nicely. It also shows the importance of describing the video as accurately as possible while shooting and editing, so that the collection of videos doesn’t pile up to the point where you can’t find anything anymore.

Once the video is published and/or archived, it becomes a lot easier to find: whether it’s stored on a hard drive or on a website, being able to use text-based search will make your video that much more valuable.



©2021 Tubular Insights & Tubular Labs, Inc.