File Metadata Parsing Library

I built this library as a framework for extracting and caching useful metadata (including icons and thumbnails) from common file types. The API is designed to be extensible, so new file type parsers can be plugged in easily, either within the library itself (I'll be adding support for more file types over time) or from external assemblies (i.e. your code) if you don't mind a bit of D.I.Y.

Download: Metadata.1036 (Revision 1036, 17 May 2009)
Compressed (zipped) file, 733.2 KB

How does it work?

Given a path to a file you want to extract metadata from, the library first checks to see if there is a previously-created metadata cache (a "sidecar" *.dat file in the same directory). If it finds one, a quick check is done to see if it's up-to-date - if so, the cached data is loaded into memory and you're good to go.

If no cache file exists, or it is out of date, the file is parsed and all supported metadata is extracted and cached. Icons (or thumbnails, where relevant) are also generated at various sizes and stored in the cache. Parsing can take some time, depending on the type and size of the file.

The API also provides the ability to bypass the cache and re-parse every time - obviously this can be both a processor and memory hog, so use with caution.
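
To make the "is the cache up to date?" step a little more concrete, here is a minimal sketch of one way such a check could work. It is not the library's actual code: the `CacheCheck` class, the "<file>.dat" sidecar naming and the timestamp comparison are all assumptions made purely for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of Mark.Metadata.
public static class CacheCheck
{
    // Assumption: the sidecar lives next to the source file and is
    // named "<file>.dat". The library's real naming scheme may differ.
    public static string GetSidecarPath(string filePath)
    {
        return filePath + ".dat";
    }

    // Assumption: the cache counts as fresh if it exists and was written
    // no earlier than the source file was last modified.
    public static bool IsCacheFresh(string filePath)
    {
        string sidecar = GetSidecarPath(filePath);
        if (!File.Exists(filePath) || !File.Exists(sidecar))
        {
            return false;
        }
        return File.GetLastWriteTimeUtc(sidecar) >= File.GetLastWriteTimeUtc(filePath);
    }
}
```

A caller would parse the file only when `CacheCheck.IsCacheFresh(path)` returns false. A timestamp comparison is cheap but can be fooled by copied or touched files; a size or checksum comparison would be a stricter alternative.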

What file types are supported?

Good question. I should probably do some clever assembly-reflection here so this list is always up-to-date, but I can't be bothered. So, here's a possibly out-of-date list:

Documents

  • Microsoft Office 2007/Office Open XML (docx, xlsx, pptx etc)
  • Portable Document Format (pdf)
  • XML Paper Specification (xps)
  • Rich Text Format (rtf)
  • Plain text (txt)

Audio

  • MP3
  • Windows Media Audio (wma)

Video

  • Flash Video (flv)
  • Windows Media Video (wmv)

Note: thumbnailing of video files is only available if an FFmpeg executable is present on the machine the code is running on.
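
If you want to check up front whether video thumbnailing is likely to work, something along these lines could help. How the library itself locates FFmpeg isn't covered here, so this sketch simply assumes the binary is reachable via the PATH environment variable; `FfmpegLocator` is a hypothetical helper, not part of the library.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of Mark.Metadata.
public static class FfmpegLocator
{
    // Returns the full path of the first candidate file name found in any
    // directory of the given search path, or null if nothing matches.
    public static string Find(string searchPath, string[] candidateNames)
    {
        if (string.IsNullOrEmpty(searchPath))
        {
            return null;
        }
        foreach (string rawDir in searchPath.Split(Path.PathSeparator))
        {
            string dir = rawDir.Trim();
            if (dir.Length == 0)
            {
                continue;
            }
            foreach (string name in candidateNames)
            {
                try
                {
                    string candidate = Path.Combine(dir, name);
                    if (File.Exists(candidate))
                    {
                        return candidate;
                    }
                }
                catch (ArgumentException)
                {
                    // Malformed PATH entry (e.g. stray quotes); skip it.
                }
            }
        }
        return null;
    }

    // Assumption: FFmpeg, if installed, can be found via PATH.
    public static string FindOnPath()
    {
        return Find(Environment.GetEnvironmentVariable("PATH"),
                    new string[] { "ffmpeg.exe", "ffmpeg" });
    }
}
```

If `FfmpegLocator.FindOnPath()` returns null, it is probably safest to expect no video thumbnails and fall back to a generic icon.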

Other

  • Various common image formats (jpeg, png, gif, bmp, tif)
  • Flash (swf)
  • ZIP archives
  • Anything you want to add yourself...
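
For the curious, the "clever assembly-reflection" mentioned above could look roughly like this. It's a generic sketch: `TypeLister` is a hypothetical helper, and it assumes every parser is a concrete class deriving from one common base type in the same assembly, which may not match how the library is really organised.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper for illustration only; not part of Mark.Metadata.
public static class TypeLister
{
    // Returns the sorted names of all concrete (non-abstract) classes in
    // the given assembly that derive from baseType.
    public static List<string> ListSubclasses(Type baseType, Assembly assembly)
    {
        List<string> names = new List<string>();
        foreach (Type type in assembly.GetTypes())
        {
            if (type.IsClass && !type.IsAbstract && type.IsSubclassOf(baseType))
            {
                names.Add(type.Name);
            }
        }
        names.Sort(StringComparer.Ordinal);
        return names;
    }
}
```

Assuming the metadata classes all derive from FileMetadata (a guess based on the casts in the sample code further down), the call would be `TypeLister.ListSubclasses(typeof(FileMetadata), typeof(FileMetadata).Assembly)`.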

Does it work under Mono?

Not sure. Should do.

Show me some code!

using System;
using Mark.Metadata;

public class MyApp
{
    public static void Main (string[] args)
    {
        // A PDF document
        PdfMetadata pdfMetadata = (PdfMetadata) FileMetadata.Load(@"c:\path\to\a\doc.pdf");
        Console.WriteLine("Title: " + pdfMetadata.Title);
        Console.WriteLine("Creator: " + pdfMetadata.Creator);
        Console.WriteLine("File size: " + pdfMetadata.FriendlyFileSize);
    
        // An MP3 file
        AudioMetadata songMetadata = (AudioMetadata) FileMetadata.Load(@"c:\path\to\an\audiofile.mp3");
        Console.WriteLine("Artist: " + songMetadata.Artist);
        Console.WriteLine("Duration: " + songMetadata.Duration);
        if (songMetadata.HasArtwork)
        {
            using (System.Drawing.Bitmap artwork = songMetadata.GetArtwork())
            {
                // Do something crazy with the artwork...
            }
        }
    }
}

Standing on the shoulders, etc

The library builds on a couple of open-source libraries: the ubiquitous #ZipLib compression/decompression library and, for parsing data from audio/video files, the excellent TagLib# (itself a port of the C++ TagLib library).

Note: thanks to the TagLib# library, it would be pretty straightforward to add support for other audio/video file types. Subclass AudioMetadata or VideoMetadata, and away you go...
