tv_extractinfo_en(1p) debian man page | unix.com

Debian Man Pages Latest Topics

Man Page: tv_extractinfo_en

Operating Environment: debian

Section: 1p

TV_EXTRACTINFO_EN(1p) User Contributed Perl Documentation TV_EXTRACTINFO_EN(1p)

NAME
tv_extractinfo_en - read English-language listings and extract info from programme descriptions.

SYNOPSIS
tv_extractinfo_en [--help] [--output FILE] [FILE...]

DESCRIPTION
Read XMLTV data and attempt to extract information from English-language programme descriptions, putting it into machine-readable form.
For example the human-readable text '(repeat)' in a programme description might be replaced by the XML element <previously-shown>.

--output FILE write to FILE rather than standard output

This tool also attempts to split multipart programmes into their constituents, by looking for a description that seems to contain lots of
times and titles. But this depends on the description following one particular style and is useful only for some listings sources
(Ananova).

If some text is marked with the 'lang' attribute as being some language other than English ('en'), it is ignored.

SEE ALSO
xmltv(5).

AUTHOR
Ed Avis, ed@membled.com

BUGS
Trying to parse human-readable text is always error-prone, more so with the simple regexp-based approach used here. But because TV listing
descriptions usually conform to one of a few set styles, tv_extractinfo_en does reasonably well. It is fairly conservative, trying to
avoid false positives (extracting 'information' which isn't really there) even though this means some false negatives (failing to extract
information and leaving it in the human-readable text).

However, the leftover bits of text after extracting information may not form a meaningful English sentence, or the punctuation may be
wrong.

On the two listings sources currently supported by the XMLTV package, this program does a reasonably good job. But it has not been tested
with every source of anglophone TV listings.

perl v5.14.2 2011-05-07 TV_EXTRACTINFO_EN(1p)

Related Man Pages
tv_cat(1p) - debian
tv_grab_il(1p) - debian
tv_grab_za(1p) - debian
tv_grep(1p) - debian
tv_split(1p) - debian

Similar Topics in the Unix Linux Community
how to run socket programme and file management concurrently
Unable to compile the c programme in unix
c program to extract text between two delimiters from some text file
extract fields from text file using delimiter!!
script to separate bilingual text file