Ilmari Heikkinen
Konrad Meyer said: Any chance you could wrap this up as a gem?
I already have a gemspec file, but gem screws up bin/chardet by
plastering it with #!/usr/bin/ruby boilerplate (it's a Python file),
and I don't know how to turn that off.
Another bug (Sorry):
$ mdh -p ~/music/Limp\ Bizkit\ -\ Rollin\'\ \(edited\).ogg
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `ogginfo '/home/konrad/music/Limp Bizkit - Rollin\'
(edited).ogg''
(Last line was broken up to email length.) You're already escaping single
quotes for the shell; opening and closing parens need escaping as well.
Argh, amateurish mistake on my part, thanks for catching that. Fixed,
if in a slightly over-engineered way (by creating a safely named link to the file).
Short of that, it's probably impossible to safely pass a filename like
"-f -i -l -e -z" to a shell command that doesn't support "--".
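For what it's worth, here is a sketch of another approach that may work besides the link trick. It assumes Ruby 1.9 or later (not the Ruby 1.8 this release targets) and a tool that accepts ordinary path arguments. The idea is to skip sh entirely by passing an argument array, and to prefix relative names that start with "-" with "./" so the tool cannot read them as options. The helper names option_safe and run_tool are made up for this example and are not part of metadata.

```ruby
require 'shellwords'

# Hypothetical helper: make a relative path starting with "-" unambiguous.
# "./-f" names the same file as "-f", but is not parsed as an option.
def option_safe(path)
  path.start_with?('-') ? File.join('.', path) : path
end

# Hypothetical helper: run `tool` on `path` without going through sh.
# With an argument array, quotes, spaces and parens reach the tool as-is.
def run_tool(tool, path)
  IO.popen([tool, option_safe(path)], &:read)
end

# If a command string is unavoidable, Shellwords.escape backslash-escapes
# shell metacharacters, parentheses included.
name = "Limp Bizkit - Rollin' (edited).ogg"
puts Shellwords.escape(name)
```

Something like run_tool('ogginfo', name) would then hand the filename to ogginfo untouched, provided ogginfo is installed.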
Also:
For mp3 id3v2 tags, the binary string "\xCB\x99\xC5\xA3" is being inserted
at the front of all the string fields.
[snip]
I *think* this is an id3v2 thing. Also, it happens in more than one file and
amaroK sees the tags "correctly", so I'm thinking it's on the metadata's
end. Thanks!
Right you are. Fixed. No idea what was causing it. Moved to
using id3lib for the tags (it extracts embedded album art as well!) and
mplayer for the rest of the metadata.
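One plausible explanation, offered as an assumption rather than anything confirmed in this thread: ID3v2 text frames may be stored as UTF-16 with a leading byte order mark (FF FE), and if those two bytes are misread as ISO-8859-2 and converted to UTF-8, the result is exactly "\xCB\x99\xC5\xA3". The sketch below (Ruby 1.9 or later) reproduces the bytes and shows one way to decode such a field; decode_utf16_with_bom is a hypothetical helper, not part of metadata or id3lib.

```ruby
# Misread the UTF-16 little-endian BOM as ISO-8859-2, then convert to UTF-8.
bom = "\xFF\xFE".b.force_encoding('ISO-8859-2')
garbled = bom.encode('UTF-8')
p garbled.bytes.map { |b| format('%02X', b) }  # → ["CB", "99", "C5", "A3"]

# Hypothetical helper: decode a UTF-16 field, using and dropping its BOM.
# Returns nil when no BOM is present, since the byte order is then unknown.
def decode_utf16_with_bom(bytes)
  data = bytes.b
  if data.start_with?("\xFF\xFE".b)
    data.byteslice(2..-1).force_encoding('UTF-16LE').encode('UTF-8')
  elsif data.start_with?("\xFE\xFF".b)
    data.byteslice(2..-1).force_encoding('UTF-16BE').encode('UTF-8')
  end
end

p decode_utf16_with_bom("\xFF\xFEA\x00")
```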
Here we go, 0.5:
tarball: http://dark.fhtr.org/repos/metadata/metadata-0.5.tar.gz
git: http://dark.fhtr.org/repos/metadata
Description
-----------
This package `Metadata' comes with a library called `metadata' and
a small program called `mdh'.
The library probes files for their metadata (e.g. jpeg dimensions
and camera make, mp3 artist, pdf word count) and returns the metadata
as a Hash.
Mdh can print out file metadata as YAML and package the metadata
with the file.
This package has many dependencies since there is no single universal
metadata header format that all files use. Blame resource forks, filename
extensions, bags of bytes and mimetypes.
Usage
-----
# print out metadata header
mdh -p myfile.jpg
# create myfile.jpg.mdh, which consists of metadata header + myfile.jpg
mdh myfile.jpg
# print out metadata header from mdh file
mdh -e -p myfile.jpg.mdh
# strip out metadata header from mdh file and save it to myfile.jpg
mdh -e myfile.jpg.mdh
# print out list of flags
mdh -h
irb> Metadata.extract('myfile.jpg')
irb> Metadata.extract_text('myfile.jpg')
irb> Pathname.new("myfile.jpg").metadata
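Since the library returns a plain Hash and mdh prints it as YAML, the round trip can be sketched without the library installed. The Hash below is only a stand-in for what Metadata.extract might return; its keys and values are invented for illustration and do not reflect the library's real key names.

```ruby
require 'yaml'

# Stand-in for the Hash that Metadata.extract('myfile.jpg') would return.
# Keys and values are invented for this example.
sample = {
  'width'  => 1024,
  'height' => 768,
  'camera' => 'Example Cam'
}

# Roughly what "mdh -p" is described as doing: print the Hash as YAML.
yaml = sample.to_yaml
puts yaml

# The YAML text loads back into an equal Hash.
restored = YAML.safe_load(yaml)
puts restored == sample
```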
List of supported formats
-------------------------
Audio:
Whatever you manage to make mplayer play.
Plus FLAC, m4a and wma handled specially.
Successfully tested with:
mp3, flac, ogg, wav
Should also work:
wma, m4a
Video:
Whatever you manage to make mplayer play.
Successfully tested with:
wmv, mov, divx, xvid, flv, ogm, mpg
Images:
Should handle pretty much anything (apart from XCF and ORF.)
Successfully tested with:
jpeg, png, gif, nef, dng, crw, pef, psd
Documents:
Successfully tested with:
pdf, ppt, odp, sxi, ps, ps.gz, html, txt
Should work:
- OpenOffice docs work to some degree (personally, I'm using unoconv to
convert OO docs to temp PDFs for the text & dimensions extraction, so
those bits of data are missing.)
- MS Office docs work to some degree (ppt at least; doc and xls should work too,
with dimensions missing because of the temp-PDF conversion mentioned above.)
Others:
Whatever extract spits out on the five or six bits of metadata I'm using
from it. Archive contents at least.
Requirements
------------
* Ruby 1.8
* Tons of metadata extraction programs and libs,
list of gems:
flacinfo-rb
wmainfo-rb
MP4info
id3lib-ruby
list of debian packages:
dcraw
libimlib2-ruby
extract
libimage-exiftool-perl
poppler-utils
mplayer
html2text
imagemagick
unhtml
pstotext
antiword
catdoc
shared-mime-info
* You do want to install the latest versions of dcraw and
shared-mime-info to be able to handle camera raw images.
http://cybercom.net/~dcoffin/dcraw/
http://freedesktop.org/wiki/Software/shared-mime-info
* Python + chardet library
http://chardet.feedparser.org/
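With this many external programs, a quick check of what is missing can save some guessing. The following is a minimal sketch, not part of the package. It assumes that the command names match the Debian package names listed above, which holds only for some of them, so the TOOLS list is an assumption to adjust. The helpers which and missing_tools are hypothetical names.

```ruby
# Return the full path of `cmd` if it is an executable file in one of
# `dirs`, or nil if it is not found.
def which(cmd, dirs = ENV.fetch('PATH', '').split(File::PATH_SEPARATOR))
  dirs.reject(&:empty?).each do |dir|
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Assumption: these packages install a command with the same name.
TOOLS = %w[dcraw extract mplayer html2text unhtml pstotext antiword catdoc]

# List the tools that cannot be found in `dirs`.
def missing_tools(tools = TOOLS,
                  dirs = ENV.fetch('PATH', '').split(File::PATH_SEPARATOR))
  tools.reject { |tool| which(tool, dirs) }
end

missing = missing_tools
puts missing.empty? ? 'All listed tools found.' : "Missing: #{missing.join(', ')}"
```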
Install
-------
Decompress the archive and enter its top directory.
Then type:
($ su)
# ruby setup.rb
These simple steps install the program under the default
location for Ruby libraries. You can also install the files into
a directory of your choice by passing options to setup.rb.
Try "ruby setup.rb --help".
License