debian man page for tv_extractinfo_ar

Query: tv_extractinfo_ar

OS: debian

Section: 1p

Format: Original Unix Latex Style Formatted with HTML and a Horizontal Scroll Bar

TV_EXTRACTINFO_AR(1p)					User Contributed Perl Documentation				     TV_EXTRACTINFO_AR(1p)

NAME
tv_extractinfo_ar - read Spanish (Argentinean) language listings and extract info from programme descriptions.
SYNOPSIS
tv_extractinfo_ar [--help] [--output FILE] [FILE...]
DESCRIPTION
Read XMLTV data and attempt to extract information from Spanish-language programme descriptions, putting it into machine-readable form. For example the human-readable text '(Repeticion)' in a programme description might be replaced by the XML element <previously-shown>. --output FILE write to FILE rather than standard output This tool also attempts to split multipart programmes into their constituents, by looking for a description that seems to contain lots of times and titles. But this depends on the description following one particular style and is useful only for some listings sources (Ananova). If some text is marked with the 'lang' attribute as being some language other than Spanish ('es'), it is ignored.
SEE ALSO
xmltv(5).
AUTHOR
Mariano Cosentino, Mok@marianok.com.ar
BUGS
Trying to parse human-readable text is always error-prone, more so with the simple regexp-based approach used here. But because TV listing descriptions usually conform to one of a few set styles, tv_extractinfo_en does reasonably well. It is fairly conservative, trying to avoid false positives (extracting 'information' which isn't really there) even though this means some false negatives (failing to extract information and leaving it in the human-readable text). However, the leftover bits of text after extracting information may not form a meaningful Spanish sentence, or the punctuation may be wrong. On the two listings sources currently supported by the XMLTV package, this program does a reasonably good job. But it has not been tested with every source of anglophone TV listings. This Spanish Version is heavily customized for the XML results from tv_grab_ar (developed by Christian A. Rodriguez and postriorily updated by Mariano S. Cosentino). This file should probably be called tv_extractinfo_es, but I have not tested it with any other spanish grabbers, so I don't want to be presumptious. perl v5.14.2 2010-06-01 TV_EXTRACTINFO_AR(1p)
Related Man Pages
tv_grab_ch_search(1p) - debian
tv_grab_il(1p) - debian
tv_grep(1p) - debian
xmltv(3pm) - debian
xmltv::summarize(3pm) - debian
Similar Topics in the Unix Linux Community
need to extract warnings and cautions
extract text from a file
extract a field from a long sentence!
How to extract 3rd,4th and 5th element
Spanish Speaking Forum Members?