Request for comments on a JPEG metadata Perl module

Discussion in 'Perl Misc' started by Stefano Bettelli, Jun 26, 2004.

  1. Hi,

    I recently became interested in designing a Perl library for
    reading and modifying JPEG image metadata (Exif info, IPTC info,
    comments, thumbnails and so on). This kind of additional data
    stored in the image itself is very useful for organising digital
    photo collections. For various reasons, the existing Perl
    libraries and programs do not fully satisfy me, so I decided to
    enter the arena and write a Perl module (this is also a good way
    to learn the language better ...).

    I would like to ask you for some suggestions
    on how to design this module:

    1) Do you think that submitting this module to CPAN is worthwhile,
    or do you think that what is available is already sufficient?
    2) What is the best name for the module?
    I am currently using Image::MetaInfo::JPEG.
    3) How can I decide what is the minimum Perl version
    for running the module?
    4) Do you have any ideas on how it could be extended? Are there
    interesting features I have not thought of?
    Do you have any suggestions on code style?

    Any other suggestion is of course welcome. For the time being
    (let's say for the next two weeks) you can download the module
    at the following address:

    http://82.229.136.165/IMJ/

    Below I list the main features of my module. Its purpose is to
    read, modify and rewrite meta-data segments in JPEG files, which
    can contain comments, thumbnails, Exif information (photographic
    parameters), IPTC information (editorial parameters) and similar
    data.

    Each JPEG file is made of consecutive segments (data blocks
    prefixed by a 2-byte segment code and a 2-byte segment length),
    except for the actual picture data (the so-called entropy-coded
    segments, which are raw data). Most of these
    segments specify parameters for decoding the picture data into a
    bitmap; some of them, however, namely the COMment and APPlication
    segments, contain meta-data, i.e., information about how the photo
    was shot (usually added by a digital camera) and additional notes
    from the photographer. These additional pieces of information are
    especially valuable for picture databases, since the meta-data
    can be saved together with the picture without resorting to
    additional database structures.
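
    The segment layout described above can be sketched in a few lines
    of core Perl. This is only an illustration, not code taken from the
    module: the function name list_segments and the hand-made test
    stream are invented here. The sketch assumes that every marker
    before the picture data carries a length field (true for COM, APPn
    and the usual table segments) and that no fill bytes precede a
    marker. It stops at the SOS marker, after which the entropy-coded
    data begins.

```perl
use strict;
use warnings;

# Walk the marker segments of a JPEG held in memory and return a list of
# [marker_code, payload] pairs. Stops at SOS (0xDA), where the
# entropy-coded picture data begins.
sub list_segments {
    my ($data) = @_;
    die "not a JPEG (missing SOI marker)\n"
        unless substr($data, 0, 2) eq "\xFF\xD8";
    my @segments;
    my $pos = 2;                                  # skip the SOI marker
    while ($pos + 4 <= length($data)) {
        my ($ff, $code, $len) = unpack 'C C n', substr($data, $pos, 4);
        die "bad marker at offset $pos\n" unless $ff == 0xFF;
        die "bad segment length at offset $pos\n"
            if $len < 2 or $pos + 2 + $len > length($data);
        # The big-endian length counts its own 2 bytes but not the marker.
        push @segments, [ $code, substr($data, $pos + 4, $len - 2) ];
        last if $code == 0xDA;                    # SOS: raw data follows
        $pos += 2 + $len;
    }
    return @segments;
}

# A tiny hand-made stream (not a real picture): SOI, COM, SOS header, EOI.
my $jpeg = "\xFF\xD8"
         . "\xFF\xFE" . pack('n', 2 + 5) . "hello"
         . "\xFF\xDA" . pack('n', 2 + 1) . "\x00"
         . "\xFF\xD9";

for my $seg (list_segments($jpeg)) {
    printf "marker 0xFF%02X, %d payload bytes\n",
        $seg->[0], length($seg->[1]);
}
```

    A real parser would also have to cope with restart markers and
    with the picture data itself, which this sketch deliberately skips.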

    This module works by breaking a JPEG file into individual segments.
    Each file is associated with an Image::MetaInfo::JPEG structure
    object, which contains one Image::MetaInfo::JPEG::Segment object
    for each segment. Segments with a known format are then parsed,
    and their content can be accessed in a structured way for display.
    Some of them can even be modified and then rewritten to disk.

    The current state is the following:

    Segment  Possible content           Status
    -------------------------------------------------------------------
    COM      User comments              parse/read/write
    APP0     JFIF data (+ thumbnail)    parse/read
    APP1     Exif or XMP data           parse
    APP2     FPXR data or ICC profiles  parse
    APP3     additional EXIF-like data  parse
    APP4     HPSC                       nothing
    APP12    PreExif ASCII meta         parse[devel.]
    APP13    IPTC and PhotoShop data    parse/read/(write IPTC [devel.])
    APP14    Adobe tags                 parse
    "Parse" means that the segment content is decoded and stored
    in low-level records. "Read" means that these data are available
    in a more organised way at a higher level. The package contains
    a fairly detailed perldoc page, which you can read for further
    info. This is its index:

    1) STRUCTURE OF JPEG PICTURES
    2) MANAGING A JPEG STRUCTURE OBJECT
    3) MANAGING A JPEG SEGMENT OBJECT
    4) MANAGING A JPEG RECORD OBJECT
    5) COMMENTS ("COM" segments)
    6) JFIF data ("APP0" segments)
    7) IPTC DATA (from "APP13" segments)
    8) CURRENT STATUS
    -) Known Problems
    -) References
    -) OTHER PACKAGES (the "competitors")

    I plan to add read/write support for Exif data in a few weeks.
    The module already contains a test suite with 67 tests.

    Thank you in advance for any suggestions,
    best regards,
    Stefano Bettelli
    Stefano Bettelli, Jun 26, 2004
    #1

  2. Stefano Bettelli <> wrote in message news:<>...
    > Hi,
    >
    > I got recently interested in the possibility of designing a Perl
    > library for reading and modifying JPEG image metadata (with Exif
    > info, IPTC info, comments, thumbnails and so on). This kind of
    > additional data stored in the image itself is very useful for
    > organising digital photo collections. For various reasons, the
    > existing Perl libraries and programs do not fully satisfy me,
    > so I decided to enter the arena and write a Perl module (this is
    > also a good way to learn the language better ...).


    I have been wanting to create a catalog of my photos for quite some
    time. I have thousands of photos that I have taken over the past five
    years, and it looks like your module could help me quite a bit.

    > 4) Do you have any idea on how it could be extended? Whether
    > there are interesting functionalities I did not think about?


    I guess that I should read the information regarding the JPEG format
    to get the answer, but maybe you know this: would it be possible to
    add segments of information to the file that were of my own design? I
    would like to add some flags to each photo that would show that I have
    completed cataloging it.

    I used your module to parse a file from my camera (Casio QV-2000UX).
    Here is part of the info that was returned:

    ********** APP1 --> IFD0 ********** (11 records)
    [ Make]<0x010f> = [ ASCII] "CASIO\00"
    [ Model]<0x0110> = [ ASCII] "QV-2000UX\00"
    [ Orientation]<0x0112> = [ SHORT] 1
    [ XResolution]<0x011a> = [ RATIONAL] 72/1
    [ YResolution]<0x011b> = [ RATIONAL] 72/1
    [ ResolutionUnit]<0x0128> = [ SHORT] 2
    [ Software]<0x0131> = [ ASCII] "99.09.07.11.08\00
    \00\00\00\00\00\00\00\00"
    [ DateTime]<0x0132> = [ ASCII] "2001:03:07 13:53:27\00"
    [ YCbCrPositioning]<0x0213> = [ SHORT] 1
    [ ExifOffset]<0x8769> = [ LONG] 210
    [ SubIFD]<......> = [REFERENCE] --> 19692dc
    ********** APP1 --> IFD0 --> SubIFD ********** (21 records)
    [ ExposureTime]<0x829a> = [ RATIONAL] 10000/653167
    [ FNumber]<0x829d> = [ RATIONAL] 20/10
    [ ExposureProgram]<0x8822> = [ SHORT] 2
    [ ExifVersion]<0x9000> = [ UNDEF] 30 32 31 30
    [ DateTimeOriginal]<0x9003> = [ ASCII] "2001:03:07 13:53:27\00"
    [ DateTimeDigitized]<0x9004> = [ ASCII] "2001:03:07 13:53:27\00"
    [ ComponentsConfiguration]<0x9101> = [ UNDEF] 01 02 03 00
    [ CompressedBitsPerPixel]<0x9102> = [ RATIONAL] 2048000/480000
    [ ExposureBiasValue]<0x9204> = [SRATIONAL] 0/3
    [ MaxApertureValue]<0x9205> = [ RATIONAL] 20/10
    [ MeteringMode]<0x9207> = [ SHORT] 5
    [ Flash]<0x9209> = [ SHORT] 1
    [ FocalLength]<0x920a> = [ RATIONAL] 126865/10000
    [ MakerNote]<0x927c> = [ UNDEF] 00 14 00 01 00 03 00 00 ... (238 more values)
    [ FlashPixVersion]<0xa000> = [ UNDEF] 30 31 30 30
    [ ColorSpace]<0xa001> = [ SHORT] 1
    [ PixelXDimension]<0xa002> = [ LONG] 800
    [ PixelYDimension]<0xa003> = [ LONG] 600
    [ InteroperabilityOffset]<0xa005> = [ LONG] 790
    [ FileSource]<0xa300> = [ UNDEF] 03
    [ Interop]<......> = [REFERENCE] --> 196a728

    This is just what I need: the date & time of the photo, etc. I can use
    this info to stick a record in a database that holds basic info & the
    filesystem location of the photo. I would like to be able to set some
    kind of flag in the file, then, so that when I did a subsequent sweep
    of the disk for image files, I could easily skip photos that had
    already been processed.
    GreenLight, Jul 1, 2004
    #2

  3. GreenLight wrote:
    > Stefano Bettelli <> wrote in message news:<>...


    > I guess that I should read the information regarding the JPEG format
    > to get the answer, but maybe you know this: would it be possible to
    > add segments of information to the file that were of my own design? I
    > would like to add some flags to each photo that would show that I have
    > completed cataloging it.


    It depends. Obviously the standard is designed such that any application
    can skip those tags that it doesn't know. However, you cannot guarantee
    that all software is written properly.

    > I used your module to parse a file from my camera (Casio QV-2000UX).
    > Here is part of the info that was returned:
    >

    [ ... ]
    >
    > This is just what I need: the date & time of the photo, etc. I can use
    > this info to stick a record in a database that holds basic info & the
    > filesystem location of the photo. I would like to be able to set some
    > kind of flag in the file, then, so that when I did a subsequent sweep
    > of the disk for image files, I could easily skip photos that had
    > already been processed.


    I, too, did some work on this subject.
    I have a Kodak DC240 which stores the images in Exif format.

    I store all my photos in the "Exif" directory.
    There are also "Photo", "Info" and "Thumb" directories.
    When I scan the photos, I walk the Exif directory (File::Find). Then,
    if no entry exists in Photo, I extract the large image; if no entry
    exists in Info, I extract the information; if no entry exists in
    Thumb, I extract the thumbnail.
    Then I have .alb files which describe which photos belong together, and
    I create HTML pages with the thumbnails that link to the large images.
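
    A rough sketch of the scanning step described above, using only core
    modules. Several things here are assumptions made for illustration:
    the function name pending_work is made up, derived files are assumed
    to carry the same file name as the original in "Exif", and the actual
    extraction of image, information and thumbnail is left out entirely.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);
use File::Spec;

# For every file below "$root/Exif", work out which of the derived
# directories (Photo, Info, Thumb) has no entry of the same name yet.
# Returns a hash: file name => array ref of the directories still missing.
sub pending_work {
    my ($root) = @_;
    my %todo;
    my $exif_dir = File::Spec->catdir($root, 'Exif');
    return %todo unless -d $exif_dir;       # nothing to scan
    find({ no_chdir => 1, wanted => sub {
        return unless -f $File::Find::name; # skip directories
        my $name = basename($File::Find::name);
        my @missing = grep {
            my $path = File::Spec->catfile($root, $_, $name);
            !-e $path;
        } qw(Photo Info Thumb);
        $todo{$name} = \@missing if @missing;
    } }, $exif_dir);
    return %todo;
}

# The root directory comes from the command line, defaulting to ".".
my $root = @ARGV ? $ARGV[0] : '.';
my %todo = pending_work($root);
print "$_: missing @{ $todo{$_} }\n" for sort keys %todo;
```

    Files that already have all three derived entries never show up in
    the result, so a repeated sweep only touches new photos.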

    --
    Josef Möllers (penguin keeper at FSC)
    If failure had no penalty success would not be a prize
    -- T. Pratchett
    Josef Moellers, Jul 1, 2004
    #3
  4. Stefano Bettelli <> writes:

    > I got recently interested in the possibility of designing a Perl
    > library for reading and modifying JPEG image metadata (with Exif
    > info, IPTC info, comments, thumbnails and so on). This kind of
    > additional data stored in the image itself is very useful for
    > organising digital photo collections. For various reasons, the
    > existing Perl libraries and programs do not fully satisfy me,
    > so I decided to enter the arena and write a Perl module (this is
    > also a good way to learn the language better ...).


    Could you name what existing Perl libraries you have looked at and why
    they don't satisfy you?

    I'm the author of Image::Info which seems to already do a lot of the
    same as you try to do. One difference is that I don't plan to make
    Image::Info able to update the meta info. I think that would
    complicate the module too much and I don't have that need personally.

    --
    Gisle Aas
    Gisle Aas, Jul 1, 2004
    #4
  5. (GreenLight) wrote in message news:<>...
    > Stefano Bettelli <> wrote in message news:<>...
    > > Hi,
    > >
    > > I got recently interested in the possibility of designing a Perl
    > > library for reading and modifying JPEG image metadata (with Exif
    > > info, IPTC info, comments, thumbnails and so on). This kind of
    > > additional data stored in the image itself is very useful for
    > > organising digital photo collections. For various reasons, the
    > > existing Perl libraries and programs do not fully satisfy me,
    > > so I decided to enter the arena and write a Perl module (this is
    > > also a good way to learn the language better ...).

    >
    > I have been wanting to create a catalog of my photos for quite some
    > time. I have thousands of photos that I have taken over the past five
    > years, and it looks like your module could help me quite a bit.
    >
    > > 4) Do you have any idea on how it could be extended? Whether
    > > there are interesting functionalities I did not think about?

    >
    > I guess that I should read the information regarding the JPEG format
    > to get the answer, but maybe you know this: would it be possible to
    > add segments of information to the file that were of my own design? I
    > would like to add some flags to each photo that would show that I have
    > completed cataloging it.


    That flag is already there:
    I would use the IPTC/edit status for this purpose.

    > This is just what I need: the date & time of the photo, etc. I can use
    > this info to stick a record in a database that holds basic info & the
    > filesystem location of the photo. I would like to be able to set some
    > kind of flag in the file, then, so that when I did a subsequent sweep
    > of the disk for image files, I could easily skip photos that had
    > already been processed.


    If you are looking for a GUI to manage your photos, you may want to
    have a look at Mapivi (http://mapivi.de.vu). It is free, runs on
    Windows and UNIX, and the next version will use Stefano Bettelli's
    new module Image::MetaInfo::JPEG.

    Bye,
    Martin
    Martin Herrmann, Jul 2, 2004
    #5
  6. Josef Moellers <> wrote in message news:<cc16et$oc5$-siemens.com>...
    > GreenLight wrote:
    > > Stefano Bettelli <stefano > wrote in message news:<pan>...
    >
    > > I guess that I should read the information regarding the JPEG format
    > > to get the answer, but maybe you know this: would it be possible to
    > > add segments of information to the file that were of my own design? I
    > > would like to add some flags to each photo that would show that I have
    > > completed cataloging it.

    >
    > It depends. Obviously the standard is designed such that any application
    > can skip those tags that it doesn't know. However, you cannot guarantee
    > that all software is written properly.


    I'm sure that this approach will cause nothing but trouble. There are
    so many picture applications which, for example, will only handle the
    first comment segment and throw away the rest ...

    As noted in the other post, I strongly recommend using "standard"
    segments for storing such information, like the IPTC info
    (http://www.iptc.org).

    > > I used your module to parse a file from my camera (Casio QV-2000UX).
    > > Here is part of the info that was returned:
    > >

    > [ ... ]
    > >
    > > This is just what I need: the date & time of the photo, etc. I can use
    > > this info to stick a record in a database that holds basic info & the
    > > filesystem location of the photo. I would like to be able to set some
    > > kind of flag in the file, then, so that when I did a subsequent sweep
    > > of the disk for image files, I could easily skip photos that had
    > > already been processed.

    >
    > I, too, did some work on this subject.
    > I have a Kodak DC240 which stores the images in EXIF format.
    >
    > I store all my photos in the "Exif" directory.
    > Also, there exist "Photo", "Info" and "Thumb" directories.
    > When I scan the photos, I scan the Exif directory (File::Find), then, if
    > no entry exists in Photo, I extract the large image, if no entry exists
    > in Info, I extract the information, if no entry exists in Thumb, I
    > extract the thumbnail.
    > Then I have .alb files which describe what photos belong together and I


    What are .alb files?

    > create html pages with the thumbnails that have links to the large images.


    I'm not exactly sure that I understand everything you wrote, but it
    seems to me that most of this (including the HTML export and the
    handling of Exif info and thumbnails) can be done with Mapivi
    (http://mapivi.de.vu).

    Bye,
    Martin
    Martin Herrmann, Jul 2, 2004
    #6
  7. Martin Herrmann wrote:
    > Josef Moellers <> wrote in message news:<cc16et$oc5$-siemens.com>...
    >
    >>GreenLight wrote:
    >>
    >>>Stefano Bettelli <stefano > wrote in message news:<pan>...
    >>
    >>
    >>>I guess that I should read the information regarding the JPEG format
    >>>to get the answer, but maybe you know this: would it be possible to
    >>>add segments of information to the file that were of my own design? I
    >>>would like to add some flags to each photo that would show that I have
    >>>completed cataloging it.

    >>
    >>It depends. Obviously the standard is designed such that any application
    >>can skip those tags that it doesn't know. However, you cannot guarantee
    >>that all software is written properly.

    >
    >
    > I'm sure that this approach will cause nothing but trouble. There are
    > so many picture applications which e.g. will only handle the first
    > comment segment and throw away the rest ...


    If they throw away the rest, that would be fine, but some applications
    will report: "Unknown tag XXXX found in blabla.jpg, terminating".

    >>Then I have .alb files which describe what photos belong together and I

    >
    >
    > What are .alb files?


    Photo _alb_ums, a personal text file format that describes which
    pictures make up an album and should be included in the web pages
    generated (they specify background, title, subtitle and picture ranges).

    > I'm not exactly sure, that I understand everything you wrote, but it
    > seems to me, that most of this (including the html export, handling of
    > EXIF infos and thumbnails) can be done with Mapivi
    > (http://mapivi.de.vu).


    Although I often prefer software that I have written myself (it does
    _exactly_ what I want/need), I'll have a look.

    Thanks.

    --
    Josef Möllers (penguin keeper at FSC)
    If failure had no penalty success would not be a prize
    -- T. Pratchett
    Josef Moellers, Jul 2, 2004
    #7
  8. Hi Gisle,

    On Thu, 01 Jul 2004 09:18:44 -0700, Gisle Aas wrote:
    > Could you name what existing Perl libraries you have
    > looked at and why they don't satisfy you?


    the libraries and scripts which I looked at are listed in the
    perldoc manpage of the module, together with a description and
    my comments: "ExifTool" and "Image::ExifTool" by Phil Harvey,
    "Image::IPTCInfo" by Josh Carter, "JPEG::JFIF" by Marcin
    Krzyzanowski, "Image::Exif" by Sergey Prozhogin and "exiftags"
    by Eric M. Johnston, "Image::Info" and "Image::TIFF" by you,
    "exif" by Martin Krzywinski and "exifdump.py" by Thierry Bousch,
    "exifprobe" by Duane H. Hesser, "libexif" by Lutz Müller,
    "jpegrdf" by Norman Walsh and "OpenExif" by Eastman Kodak
    Company [some of these are not written in Perl].

    > I'm the author of Image::Info which seems to already do a lot
    > of the same as you try to do. One difference is that I don't
    > plan to make Image::Info able to update the meta info. I think
    > that would complicate the module too much and I don't have
    > that need personally.


    Actually, one of my goals is to be able to modify and rewrite
    to disk almost all information I parse from APP* segments.
    I read your library, but modifying it with this goal in mind
    is not an easy task (not easier than starting from scratch, at
    least). One other point is that the goal of your library is to
    read a set of common tags from various graphic formats, while
    I want to read/modify all tags from a specific graphic format
    (namely JPEG, and maybe TIFF in the future).

    What I could not see is how to integrate the "all" both on the
    "format axis" and on the "tag axis". Do you think that there is
    a possibility of integrating the two modules?

    Bye,
    Stefano
    Stefano Bettelli, Jul 2, 2004
    #8
  9. Hi,

    On Thu, 01 Jul 2004 06:34:25 -0700, GreenLight wrote:
    > I have been wanting to create a catalog of my photos
    > for quite some time.


    this is exactly my problem :). I believe that in order to
    manage a catalogue with little complication, one should be
    able to enter his comments/additional info directly into the
    image, and then use a perl script to generate dynamic web
    pages with the required fields.

    For the first task, I think that a winning combination is a
    specialised library capable of parsing/modifying the JPEG
    structure together with a GUI program allowing you to interact
    with your photos more easily. Maybe you could have a look at
    the following program:

    http://herrmanns-stern.de/software/mapivi/mapivi.shtml

    Bye,
    Stefano
    Stefano Bettelli, Jul 2, 2004
    #9
  10. On Sat, 26 Jun 2004 19:57:26 +0200, Stefano Bettelli wrote:
    > 2) What is the best name for the module?
    > I am currently using Image::MetaInfo::JPEG.


    Since I am getting paranoid about the correct name-space,
    what do you think about the following:

    Physics Meta-physics
    Language Meta-Language

    Meta is a prefix meaning (in current British English)
    "at a level above". In our case, MetaInfo would imply that
    there is info somewhere at a lower level. But this does
    not appear to be the case. In fact, Gisle Aas' module is
    named Image::Info, not Image::MetaInfo, and it obviously
    refers to the same level as we do. But Image::Info::JPEG is
    already used. So what about:

    Image::MetaInfo --> Image::MetaData ?
    Stefano Bettelli, Jul 5, 2004
    #10
  11. l v <> writes:

    > Stefano Bettelli wrote:
    > > Hi,
    > > On Thu, 01 Jul 2004 06:34:25 -0700, GreenLight wrote:
    > >
    > >>I have been wanting to create a catalog of my photos
    > >>for quite some time.

    > > this is exactly my problem :). I believe that in order to manage a catalogue
    > > with little complication, one should be
    > > able to enter his comments/additional info directly into the
    > > image, and then use a perl script to generate dynamic web
    > > pages with the required fields.
    > > For the first task, I think that a winning combination is a
    > > specialised library capable of parsing/modifying the JPEG
    > > structure together with a GUI program allowing you to interact
    > > with your photos more easily. Maybe you could have a look at
    > > the following program:
    > > http://herrmanns-stern.de/software/mapivi/mapivi.shtml
    > > Bye,
    > > Stefano
    > >

    >
    > The closest I could get was using the Image::IPTCInfo module's SetAttribute,
    > AddKeyword, and Keywords functions to build my catalog. Although I would much
    > rather save the information in EXIF vs IPTC. I then use jhead (
    > http://www.sentex.net/~mwandel/jhead ) to autorotate the image which
    > automatically updates the EXIF orientation flag for me. ImageMagick is also
    > used to create my thumbnails (convert) and a contact sheet (montage)


    If you are using jhead already, check out the -cl string option which allows
    you to update the comment field. I use that for storing my copyright string in
    all of my photos. I suspect there may be a way to set this under perl.
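
    One possible way to do something similar in plain Perl, without
    jhead, is to splice a COM segment into the file yourself. The sketch
    below is only an illustration under stated assumptions: the function
    name add_comment is made up, the comment is assumed to be a plain
    byte string (e.g. ASCII), and unlike jhead -cl it adds a new comment
    segment without touching any comment that is already there. It
    places the segment after the leading APPn segments so that JFIF or
    Exif data stay at the front of the file.

```perl
use strict;
use warnings;

# Return a copy of the JPEG byte string $data with a COM (0xFFFE) segment
# holding $text inserted after SOI and any leading APPn segments.
sub add_comment {
    my ($data, $text) = @_;
    die "not a JPEG (missing SOI marker)\n"
        unless substr($data, 0, 2) eq "\xFF\xD8";
    die "comment too long for one segment\n"
        if length($text) > 0xFFFF - 2;
    my $pos = 2;
    # Skip APP0..APP15 (0xFFE0..0xFFEF); each is marker + length + payload.
    while ($pos + 4 <= length($data)) {
        my ($ff, $code, $len) = unpack 'C C n', substr($data, $pos, 4);
        last unless $ff == 0xFF && $code >= 0xE0 && $code <= 0xEF;
        $pos += 2 + $len;
    }
    die "truncated APP segment\n" if $pos > length($data);
    # The 2-byte length field counts itself plus the comment bytes.
    my $com = "\xFF\xFE" . pack('n', 2 + length($text)) . $text;
    substr($data, $pos, 0, $com);    # 4-argument substr: pure insertion
    return $data;
}

# Hand-made stream for demonstration only (the APP0 payload is not a
# valid JFIF header): SOI, APP0, SOS header, EOI.
my $jpeg = "\xFF\xD8"
         . "\xFF\xE0" . pack('n', 2 + 4) . "JFIF"
         . "\xFF\xDA" . pack('n', 2 + 1) . "\x00"
         . "\xFF\xD9";
my $out = add_comment($jpeg, "(c) example");
printf "%d -> %d bytes\n", length($jpeg), length($out);
```

    For real files the data would be read and written with binmode set
    on the file handles, and it is safer to write to a new file than to
    overwrite the original.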

    --
    Michael Meissner
    email:
    http://www.the-meissners.org
    Michael Meissner, Jul 8, 2004
    #11
  12. Quoth Stefano Bettelli <>:
    > On Sat, 26 Jun 2004 19:57:26 +0200, Stefano Bettelli wrote:
    > > 2) What is the best name for the module?
    > > I am currently using Image::MetaInfo::JPEG.

    >
    > Since I am getting paranoid about the correct name-space,
    > what do you think about the following:
    >
    > Physics Meta-physics
    > Language Meta-Language
    >
    > Meta is a prefix meaning (in current British English)
    > "at a level above". In our case, MetaInfo would imply that
    > there are Infos somewhere at a lower level. But this does
    > not appear to be the case. In fact, Gisle Aas' module is
    > named Image::Info, not Image::MetaInfo, and it obviously
    > refers to the same level as we do. But Image::Info::JPEG is
    > already used. So what about:
    >
    > Image::MetaInfo --> Image::MetaData ?


    Yes, I would say that was good. 'Metainfo' is not an English expression;
    and as you say, 'info' implies that it is describing the data without
    the need for 'meta'.

    Ben

    --
    We do not stop playing because we grow old;
    we grow old because we stop playing.
    Ben Morrow, Jul 15, 2004
    #12
