EXTRACT(1) General Commands Manual EXTRACT(1)
NAME
extract - determine meta-information about a file
SYNOPSIS
extract [ -bghLnvV ] [ -H hash-algorithm ] [ -i ] [ -l library ] [ -p type ] [ -x type ] file ...
DESCRIPTION
This manual page documents version 0.6.0 of the extract command.
extract tests each file specified in the argument list in an attempt to infer meta-information from it. Each file is passed through the
meta-data extraction libraries (plugins) provided by libextractor.
libextractor classifies meta-information (also referred to as keywords) into types. A list of all types can be obtained with the -L option.
OPTIONS
-b Display the output in BibTeX format.
-g Use grep-friendly output (all keywords on a single line for each file). Use the verbose option to print the filename first,
followed by the keywords. Use the verbose option twice to also display the keyword types. This option will not print keyword types
or non-textual metadata.
-h Print a brief summary of the options.
-i Run plugins in-process (for debugging). By default, each plugin is run in its own process.
-l libraries
Use the specified libraries to extract keywords. The general format of libraries is [[-]LIBRARYNAME[:[-]LIBRARYNAME]*], where
LIBRARYNAME is a libextractor-compatible library, typically of the form jpeg. A minus sign before the library name indicates that
this library should be removed from the existing list. To run only a few selected plugins, use -l in combination with -n.
-L Print a list of all known keyword types.
-n Do not use the default set of extractors (typically all standard extractors, currently mp3, ogg, jpg, gif, png, tiff, real, html,
pdf and mime-types); use only the extractors specified with the -l option.
-p type
Print only the keywords matching the specified type. By default, all keywords that are found and not removed as duplicates are
printed.
-v Print the version number and exit.
-V Be verbose. This option can be specified multiple times to increase verbosity further.
-x type
Exclude keywords of the specified type from the output. By default, all keywords that are found and not removed as duplicates are
printed.
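The colon-separated list accepted by -l can be assembled from individual names in a script. The following is a minimal sketch, not part of extract: build_library_list is a hypothetical helper, and the names png and jpeg are placeholders (the exact spelling a given installation expects may differ). The resulting command is only printed, not run.

```shell
# Hypothetical helper (not part of extract): join library names into the
# colon-separated form described for -l, e.g. "png:jpeg".
# A name with a leading "-" is passed through unchanged, so it keeps its
# meaning of "remove this library from the existing list".
build_library_list() {
    list=""
    for name in "$@"; do
        if [ -z "$list" ]; then
            list="$name"
        else
            list="$list:$name"
        fi
    done
    printf '%s\n' "$list"
}

# Placeholder names; combine with -n to run only the selected plugins.
libs=$(build_library_list png jpeg)
echo "extract -n -l $libs test/test.png"
```

Printing the command first makes it easy to check the list before running extract with it.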
SEE ALSO
libextractor(3) - description of the libextractor library
EXAMPLES
$ extract test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
mimetype - image/jpeg
$ extract -V -x comment test/test.jpg
Keywords for file test/test.jpg:
mimetype - image/jpeg
$ extract -p comment test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
$ extract -nV -l png.so -p comment test/test.jpg test/test.png
Keywords for file test/test.jpg:
Keywords for file test/test.png:
comment - Testing keyword extraction
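The default output shown above can be post-processed in a script. The following is a minimal sketch that assumes the "type - value" line layout seen in these examples; keyword_values is a hypothetical helper, not part of extract. Unlike -p, it prints only the value, without the type prefix. The sample text is copied from the first example so that the sketch runs without extract installed.

```shell
# Hypothetical helper (not part of extract): read extract's default output
# on stdin and print the value of every keyword whose type equals $1.
# Assumes each keyword line has the form "type - value".
keyword_values() {
    awk -v want="$1" '
        {
            sep = index($0, " - ")      # first " - " separates type from value
            if (sep == 0) next          # skip lines such as "Keywords for file ...:"
            if (substr($0, 1, sep - 1) == want)
                print substr($0, sep + 3)
        }'
}

# Sample output taken from the first example above.
sample='comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
mimetype - image/jpeg'

printf '%s\n' "$sample" | keyword_values mimetype
```

With extract installed, the same helper could be fed directly, for example: extract test/test.jpg | keyword_values mimetype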
LEGAL NOTICE
libextractor and the extract tool are released under the GPL. libextractor is a GNU package.
BUGS
A couple of file formats (on the order of 10^3) are not recognized...
AUTHORS
extract was originally written by Christian Grothoff <christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>. Use
<libextractor@gnu.org> to contact the current maintainer(s).
AVAILABILITY
You can obtain the original author's latest version from http://www.gnu.org/software/libextractor/
libextractor 0.6.0 Dec 20, 2009 EXTRACT(1)