On tagging, especially for photos

Several years ago, after my mother died, I started a big project to get all the family photos digitized and online. Since then I’ve done the photos for our immediate family, but I still have more to do, especially on my wife’s side.

Because I wanted these photos to be self-contained and not require extra files for metadata such as tags, I devised a naming scheme that puts the basic information about the photo in the file name.

Let me make up an example for discussion:

20091123-101Main-RSS-Gideon-00001.jpg

The first field is the date. The day, or both the day and the month, may be omitted. I even allow values like “1940s” for old photos where I’m not sure of the date except for the decade.

The next field stands for the location and is a shortcut code for a street address. I know what it means because I keep a file that maps tag abbreviations to full tag names. This is the only extra file for the whole photo library.

The next is also a shortcut tag, my initials, which expand to my name “Bob Sutor”.

The next field is longer, and happens to be the name of one of my cats. The final field is ignored for tagging and is simply a counter that distinguishes multiple photos that would otherwise have the same name.
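The field layout described above can be sketched in a few lines of Python. This is only an illustration, not the PHP code behind the album: the names `PhotoName`, `DATE_RE`, and `parse_photo_name` are made up for this example, and the accepted date forms (year, year and month, full date, or a decade such as “1940s”) are an assumption based on the description.

```python
import re
from typing import List, NamedTuple


class PhotoName(NamedTuple):
    date: str        # e.g. "20091123", "200911", "2009", or "1940s"
    tags: List[str]  # shortcut codes or literal tags, in file-name order
    sequence: str    # trailing counter, ignored for tagging


# Assumed date forms: YYYY, YYYYMM, YYYYMMDD, or a decade such as "1940s".
DATE_RE = re.compile(r"^(\d{4}(\d{2}(\d{2})?)?|\d{3}0s)$")


def parse_photo_name(filename: str) -> PhotoName:
    """Split a photo file name into its date, tag fields, and counter."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension, if any
    fields = stem.split("-")
    if len(fields) < 2:
        raise ValueError(f"expected at least a date and a counter: {filename!r}")
    date, *tags, sequence = fields
    if not DATE_RE.match(date):
        raise ValueError(f"unrecognized date field: {date!r}")
    return PhotoName(date, tags, sequence)


print(parse_photo_name("20091123-101Main-RSS-Gideon-00001.jpg"))
```

For the example name, the date is “20091123”, the tag fields are “101Main”, “RSS”, and “Gideon”, and the counter “00001” is kept separately so it never becomes a tag.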

I wrote an online photo album in PHP that processes the photos and the tags, and builds databases of both. Here is how the above file name might be processed:

  • Expand “20091123” to “2000s,” “2009,” “November.”
  • Look up “101Main” and expand into the tags “101 Main Street,” “San Francisco,” “California.”
  • Look up “RSS” and expand into “Bob Sutor,” “Person”.
  • Look up “Gideon” and expand into “Gideon,” “Cats.”

Any shortcut tags in the name are expanded, and there is only one iteration. Multiple people can be included and the tags are completely arbitrary. They could just be flowers, for example. The whole key to this is the file of mappings.
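As a rough sketch of the expansion steps listed above, here is how they might look in Python. The `TAG_MAP` dictionary stands in for the file of mappings and only holds the entries from the example; the function names, and the choice to keep an unknown code as a literal tag, are assumptions made for illustration.

```python
from typing import Dict, List

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical contents of the mapping file, taken from the example above.
TAG_MAP: Dict[str, List[str]] = {
    "101Main": ["101 Main Street", "San Francisco", "California"],
    "RSS": ["Bob Sutor", "Person"],
    "Gideon": ["Gideon", "Cats"],
}


def expand_date(date: str) -> List[str]:
    """Turn a date field into decade, year, and (if present) month tags."""
    if date.endswith("s"):            # decade-only value such as "1940s"
        return [date]
    year = date[:4]
    tags = [year[:3] + "0s", year]
    if len(date) >= 6:                # a month is present
        tags.append(MONTHS[int(date[4:6]) - 1])
    return tags


def expand_tags(date: str, shortcuts: List[str],
                tag_map: Dict[str, List[str]]) -> List[str]:
    """Expand the date and each shortcut once; results are not re-expanded."""
    tags = expand_date(date)
    for code in shortcuts:
        # Assumption: a code with no mapping is kept as a literal tag.
        tags.extend(tag_map.get(code, [code]))
    return tags


print(expand_tags("20091123", ["101Main", "RSS", "Gideon"], TAG_MAP))
```

Because each code is looked up exactly once, a mapping cannot loop back on itself, which is one plausible reason to stop at a single iteration.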

The photo software I wrote displays the tags for each photo and then allows me to check some boxes to get all photos that include those tags. For example, I can get all photos that contain at least my two children and one of the cats and were taken in any July. This gives very nice views into the collection, which now holds about 3500 images. While I have not yet modified my software to put keywords directly into the image file, that is a likely feature I will add soon.

I’ve yet to find any online photo service that provides similar capabilities. Moreover, while software like Picasa does do some tagging, it has a long way to go to have the boolean calculus necessary to nicely view big albums.

I want to be able to check some boxes and see each photo that matches, for example, “Bob is present; Judith is not; it is not June, July, or August; exactly one of Katie or William is in it.”

Given all the tags and computed tags, this is not that difficult to do, though the result list might end up being empty. So while I like some of the features of photo software like face identification and online syncing, for me they are not quite good enough to replace what I hacked together a few years ago. I suspect they will add these features eventually, but it is taking a very long time.
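To show why the checkbox query quoted above is not hard once every photo has a set of expanded tags, here is a minimal Python sketch. Nothing here comes from the actual album software: the helper names (`has`, `not_`, `all_of`, `any_of`, `exactly_one`, `search`), the sample file names, and the tag spellings for the people are all hypothetical.

```python
from typing import Callable, Dict, List, Set

Predicate = Callable[[Set[str]], bool]


def has(tag: str) -> Predicate:
    return lambda tags: tag in tags


def not_(p: Predicate) -> Predicate:
    return lambda tags: not p(tags)


def all_of(*ps: Predicate) -> Predicate:
    return lambda tags: all(p(tags) for p in ps)


def any_of(*ps: Predicate) -> Predicate:
    return lambda tags: any(p(tags) for p in ps)


def exactly_one(*ps: Predicate) -> Predicate:
    return lambda tags: sum(p(tags) for p in ps) == 1


def search(photos: Dict[str, Set[str]], query: Predicate) -> List[str]:
    """Return the sorted names of all photos whose tag set satisfies the query."""
    return sorted(name for name, tags in photos.items() if query(tags))


# Hypothetical sample data: file name -> expanded tag set.
PHOTOS: Dict[str, Set[str]] = {
    "a.jpg": {"Bob Sutor", "Katie", "November"},
    "b.jpg": {"Bob Sutor", "Katie", "William", "March"},
    "c.jpg": {"Bob Sutor", "William", "July"},
    "d.jpg": {"Bob Sutor", "Judith", "Katie", "May"},
    "e.jpg": {"Bob Sutor", "William", "December"},
}

# Bob is present; Judith is not; not June, July, or August;
# exactly one of Katie or William is in it.
query = all_of(
    has("Bob Sutor"),
    not_(has("Judith")),
    not_(any_of(has("June"), has("July"), has("August"))),
    exactly_one(has("Katie"), has("William")),
)

print(search(PHOTOS, query))  # ['a.jpg', 'e.jpg']
```

Each checkbox state maps to one small predicate, so the user interface only has to combine them. An empty result list is a valid answer rather than an error.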

A final note: Google Picasa is great at face recognition for people, but not cats. Going that next step for more general recognition and suggested tagging would also be worthwhile.


5 Comments

  1. Bob:

    Rather than trying to encompass all of the “tags” regarding the contents of an image within a filename, why not use standardized embedded photo metadata? There are many applications that can be used to write textual information to JPEG files. There are even a number of others that can write this highly structured information to TIFF, PSD and other image files. Even Picasa (which you mention above) can be used to write to the Caption and Keyword fields using the IPTC metadata standard, at least for JPEG files.

    There are already a number of PHP web gallery software options, many of them open source LAMP (Linux, Apache, MySQL, PHP) applications, that can extract this IPTC information and make it visible on a website. Some can even use it for searching a database of images. Coppermine and Menalto Gallery are two popular options that you might want to check out.

    More information about the various photo metadata standards, including detailed tutorials on how to embed metadata, is available from the http://www.photometadata.org/ website.

    Hope that helps.

    David

  2. Thanks, David, I’ll take a look.

  3. @sutor Most people don’t seem to rename their photo files. I like to batch rename the files based on EXIF data using Flexible Renamer (on Windows). With the date stamp in place, I then go back and hand rename each file with location and subject. (I’ve been trying to move to multi-platform software, so if you find one, please bookmark it, and I’ll see that!)

    On my photoblog, I’ve been geocoding the EXIF data in the downsampled images with Geosetter for Windows. I’m not yet sure what search engines might do with latitude and longitude in EXIF information, so this is an experiment.

    Finally, within the past few weeks, I’ve added Zenphoto to the suite of tools on my domain. Zenphoto is not as popular as (Menalto) Gallery, but I like that automated view resizing, caching, slide shows and feeds are supported just by sending files via FTP to the right directories. (Menalto Gallery requires storing the images into MySQL, which adds an extra import step, and creates issues with the 2MB PHP upload constraint).

    I have mentally written up a blog post about various ways to separate out photo archiving, publishing/sharing and printing … so that may hit my professional blog over the next few weeks.

  4. David, thanks, I’ll take a look at ZenPhoto. I always rename the images because I just can’t imagine trying to make sense of gigabytes of files all with names like IMG7568.jpg.

  5. I’ve been trying to perfect my photo handling for some time now. I finally settled on using date-time for filenames because sometimes my family has been out with more than one camera and, as long as the time is correct in the cameras, this puts the photos in chronological order. As far as processing goes, I ended up using a couple of dumb scripts. The first one downloads the photos to a folder, renames them, then auto-rotates them. The second one prompts for a half-dozen IPTC fields and then sets them for the whole folder. I use digiKam to add or customize specific IPTC fields on individual photos. It will upload to my Flickr account (backup and sharing), retaining the metadata. It’s been a fairly clean arrangement.