tv_extractinfo_en(1p) [debian man page]

TV_EXTRACTINFO_EN(1p)					User Contributed Perl Documentation				     TV_EXTRACTINFO_EN(1p)

NAME
       tv_extractinfo_en - read English-language listings and extract info from programme descriptions.

SYNOPSIS
       tv_extractinfo_en [--help] [--output FILE] [FILE...]

DESCRIPTION
       Read XMLTV data and attempt to extract information from English-language programme descriptions, putting it into machine-readable form. For example the human-readable text '(repeat)' in a programme description might be replaced by the XML element <previously-shown>.

       --output FILE
           write to FILE rather than standard output

       This tool also attempts to split multipart programmes into their constituents, by looking for a description that seems to contain lots of times and titles. But this depends on the description following one particular style and is useful only for some listings sources (Ananova).

       If some text is marked with the 'lang' attribute as being some language other than English ('en'), it is ignored.

SEE ALSO
       xmltv(5).

AUTHOR
       Ed Avis

BUGS
       Trying to parse human-readable text is always error-prone, more so with the simple regexp-based approach used here. But because TV listing descriptions usually conform to one of a few set styles, tv_extractinfo_en does reasonably well. It is fairly conservative, trying to avoid false positives (extracting 'information' which isn't really there) even though this means some false negatives (failing to extract information and leaving it in the human-readable text).

       However, the leftover bits of text after extracting information may not form a meaningful English sentence, or the punctuation may be wrong.

       On the two listings sources currently supported by the XMLTV package, this program does a reasonably good job. But it has not been tested with every source of anglophone TV listings.

perl v5.14.2							    2011-05-07					     TV_EXTRACTINFO_EN(1p)
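
EXAMPLE (ILLUSTRATIVE SKETCH)
       The '(repeat)' case mentioned in the DESCRIPTION of tv_extractinfo_en can be sketched in a few lines of Perl. This is only a minimal illustration of the conservative, regexp-based idea, not the tool's actual code: the helper name extract_repeat and its return convention are assumptions made for this example, and it works on a plain string rather than on real XMLTV data.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT taken from tv_extractinfo_en itself.
# Looks for a literal '(repeat)' marker in a programme description.
# Returns the description with the marker removed, plus a flag saying
# whether a <previously-shown> element would be warranted.
sub extract_repeat {
    my ($desc) = @_;
    my $previously_shown = 0;

    # Match only the exact parenthesised word (case-insensitively), so
    # that ordinary uses of the word 'repeat' are left untouched.
    if ($desc =~ s/\s*\(repeat\)//i) {
        $previously_shown = 1;

        # Tidy up whitespace left behind by the removal.
        $desc =~ s/\s{2,}/ /g;
        $desc =~ s/^\s+|\s+$//g;
    }
    return ($desc, $previously_shown);
}

my ($text, $flag) = extract_repeat('Classic sitcom. (repeat)');
print "desc: $text\n";
print "previously-shown: ", ($flag ? 'yes' : 'no'), "\n";
```

       Matching only the exact marker trades recall for precision, in the spirit of the BUGS section: a description such as 'A repeat offender is caught.' is left unchanged.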

Summarize(3pm)						User Contributed Perl Documentation					    Summarize(3pm)

NAME
       XMLTV::Summarize - Perl extension to summarize XMLTV data

SYNOPSIS
	   # First get some data from the XMLTV module, eg:
	   use XMLTV;
	   my $data = XMLTV::parsefile('tv_sorted.xml');
	   my ($encoding, $credits, $ch, $progs) = @$data;

	   # Now turn the sorted programmes into a printable summary.
	   use XMLTV::Summarize qw(summarize);
	   foreach (summarize($ch, $progs)) {
	       if (not ref) {
		   print "\nDay: $_\n\n";
	       }
	       else {
		   my ($start, $stop, $title, $sub_title, $channel) = @$_;
		   print "programme starts at $start, ";
		   print "stops at $stop, " if defined $stop;
		   print "has title $title ";
		   print "and episode title $sub_title" if defined $sub_title;
		   print ", on channel $channel.\n";
	       }
	   }

DESCRIPTION
       This module processes programme and channel data from the XMLTV module to help produce a human-readable summary or TV guide. It takes care of choosing the correct language (based on the LANG environment variable) and of looking up the name of channels from their id.

       There is one public routine, "summarize()". This takes (references to) a channels hash and a programmes list, the same format as those returned by the XMLTV module. It returns a list of 'summary' elements where each element is a list of five items: start time, stop time, title, 'sub-title', and channel name. The stop time and sub-title may be undef.

       The times are formatted as hh:mm, with a timezone appended when the timezone changes in the middle of listings. For the titles and channel name, the shortest string that is in an acceptable language is chosen.

       The list of acceptable languages normally contains just one element, taken from LANG, but you can set it manually as @XMLTV::Summarize::PREF_LANGS if wished.

AUTHOR
       XMLTV(1).

perl v5.14.2							    2004-01-03						    Summarize(3pm)
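
EXAMPLE (ILLUSTRATIVE SKETCH)
       The rule stated in the DESCRIPTION of XMLTV::Summarize, that the shortest string in an acceptable language is chosen, can be illustrated without the module. The sketch below is an assumption-laden stand-in, not the module's code: the helper name shortest_acceptable is hypothetical, the input is assumed to be a list of [ text, lang ] pairs, and returning undef when nothing matches is a choice made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the selection rule; this is NOT
# XMLTV::Summarize's own code.
#   $pairs: array ref of [ $text, $lang ] pairs (assumed layout)
#   $langs: array ref of acceptable language codes
sub shortest_acceptable {
    my ($pairs, $langs) = @_;
    my %ok = map { $_ => 1 } @$langs;

    my $best;
    foreach my $pair (@$pairs) {
        my ($text, $lang) = @$pair;

        # Skip strings with no language or an unacceptable one.
        next unless defined $lang && $ok{$lang};

        # Keep the shortest acceptable string seen so far.
        $best = $text if !defined $best || length($text) < length($best);
    }

    # undef when nothing matched (an assumption of this sketch).
    return $best;
}

my @titles = (
    [ "The Nine O'Clock News", 'en' ],
    [ 'News',                  'en' ],
    [ 'Nachrichten',           'de' ],
);
print shortest_acceptable(\@titles, ['en']), "\n";
```

       In the module itself the list of acceptable languages would come from LANG or from @XMLTV::Summarize::PREF_LANGS; here it is passed in explicitly to keep the example self-contained.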