apertium-lextor(1) [debian man page]

apertium-lextor(1)														apertium-lextor(1)

NAME
       apertium-lextor - This application is part of ( apertium )

       This tool is part of the apertium machine translation architecture:
       http://apertium.org.

SYNOPSIS
       apertium-lextor --trainwrd stopwords words n left right corpus model
                       [ --weightexp w ] [ --debug ]

       apertium-lextor --trainlch stopwords lexchoices n left right corpus
                       wordmodel dic bildic model [ --weightexp w ] [ --debug ]

       apertium-lextor --lextor model dic left right [ --debug ]
                       [ --weightexp w ]

DESCRIPTION
       apertium-lextor is the application responsible for training and usage
       of the lexical selector module.

OPTIONS
       --trainwrd | -t
              Train a word co-occurrence model. It needs the following
              required parameters:

              stopwords   file containing a list of stop words. Stop words
                          are ignored.
              words       file containing a list of words. For each word a
                          co-occurrence model is built.
              n           number of words per co-occurrence model (for each
                          model, the n most frequent words).
              left        left-side context to take into account (number of
                          words).
              right       right-side context to take into account (number of
                          words).
              corpus      file containing the training corpus.
              model       output file in which the co-occurrence models are
                          saved.

       --trainlch | -r
              Train lexical-choice co-occurrence models using a
              target-language co-occurrence model and a bilingual dictionary.
              It needs the following required parameters:

              stopwords   file containing a list of stop words. Stop words
                          are ignored.
              lexchoices  file containing a list of lexical choices. For each
                          lexical choice a co-occurrence model is built.
              n           number of words per co-occurrence model (for each
                          model, the n most frequent words).
              left        left-side context to take into account (number of
                          words).
              right       right-side context to take into account (number of
                          words).
              corpus      file containing the training corpus.
              wordmodel   target-language word co-occurrence model
                          (previously trained by means of the --trainwrd
                          option).
              dic         the lexical-selection dictionary (binary format).
              bildic      the bilingual dictionary (binary format).
              model       output file in which the co-occurrence models are
                          saved.

       --lextor | -l
              Perform the lexical selection on the input stream. It needs the
              following required parameters:

              model       file containing the model to be used for the
                          lexical selection.
              dic         lexical-selection dictionary (binary format).
              left        left-side context to take into account (number of
                          words).
              right       right-side context to take into account (number of
                          words).

       --weightexp w
              Specify a weight value to change the influence of surrounding
              words while training or performing the lexical selection. The
              parameter w must be a positive value.

       --debug | -d
              Show debug information while working.

       --help | -h
              Shows this help.

       --version | -v
              Shows license information.

SEE ALSO
       apertium-gen-lextorbil(1), apertium-preprocess-corpus-lextor(1),
       apertium-gen-stopwords-lextor(1), apertium-gen-wlist-lextor(1),
       apertium-gen-wlist-lextor-translation(1), apertium-lextor-eval(1),
       apertium-lextor-mono(1).

BUGS
       Lots of...lurking in the dark and waiting for you!

AUTHOR
       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All
       rights reserved.

2006-12-12                                                  apertium-lextor(1)
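
EXAMPLES
The three modes documented for apertium-lextor are naturally chained: train a word co-occurrence model, train the lexical-choice model from it, then run the selector. The following is a minimal dry-run sketch that only assembles and prints the command lines, following the argument order given in the SYNOPSIS. Every file name (stopwords-wrd.txt, word.model, lextor.bin, and so on) and the values n=50, left=3, right=3 are hypothetical; whether the two training steps can share a stop-word list or corpus depends on your setup and is not stated on this page.

```shell
#!/bin/sh
# Dry-run sketch: prints apertium-lextor command lines in the argument order
# given in the SYNOPSIS. All file names and numeric values are hypothetical.

N=50        # words kept per co-occurrence model (hypothetical value)
LEFT=3      # left-side context, in words (hypothetical value)
RIGHT=3     # right-side context, in words (hypothetical value)

# Step 1: word co-occurrence model.
# Order: stopwords words n left right corpus model
trainwrd_cmd() {
    echo "apertium-lextor --trainwrd stopwords-wrd.txt words.txt $N $LEFT $RIGHT corpus-wrd.txt word.model"
}

# Step 2: lexical-choice model, reusing the word model from step 1.
# Order: stopwords lexchoices n left right corpus wordmodel dic bildic model
trainlch_cmd() {
    echo "apertium-lextor --trainlch stopwords-lch.txt lexchoices.txt $N $LEFT $RIGHT corpus-lch.txt word.model lextor.bin bil.bin lch.model"
}

# Step 3: lexical selection on the input stream.
# Order: model dic left right
lextor_cmd() {
    echo "apertium-lextor --lextor lch.model lextor.bin $LEFT $RIGHT"
}

trainwrd_cmd
trainlch_cmd
lextor_cmd
```

To run the steps for real, execute the printed lines with your own files. Since --lextor works on the input stream, the text to disambiguate would be supplied on standard input; that the result appears on standard output is an assumption here, not something this page confirms.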

apertium(1)															       apertium(1)

NAME
       apertium - This application is part of ( apertium )

       This tool is part of the apertium machine translation architecture:
       http://apertium.sf.net.

SYNOPSIS
       apertium [-d datadir] [-f format] [-u] [-a] {language-pair}
                [infile [outfile]]

DESCRIPTION
       apertium is the application that most people will use, as it
       simplifies the use of the apertium/lt-toolbox tools for machine
       translation. It eases the use of lt-toolbox (which contains all the
       lexical processing modules and tools) and apertium (which contains the
       rest of the engine) by providing a single front-end to the end user.

       The modules behind the apertium machine translation architecture are,
       in order:

       o  de-formatter: Separates the text to be translated from the format
          information.

       o  morphological-analyser: Tokenizes the text into surface forms.

       o  part-of-speech tagger: Chooses one surface form among homographs.

       o  lexical transfer module: Reads each source-language lexical form
          and delivers a corresponding target-language lexical form.

       o  structural transfer module: Detects fixed-length patterns of
          lexical forms (chunks or phrases) needing special processing due to
          grammatical divergences between the two languages and performs the
          corresponding transformations.

       o  morphological generator: Delivers a target-language surface form
          for each target-language lexical form, by suitably inflecting it.

       o  post-generator: Performs orthographical operations such as
          contractions and apostrophations.

       o  re-formatter: Restores the format information encapsulated by the
          de-formatter into the translated text and removes the encapsulation
          sequences used to protect certain characters in the source text.

OPTIONS
       -d datadir
              The directory holding the linguistic data. By default it uses
              the expected installation path.

       language-pair
              The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

       -f format
              Specifies the format of the input and output files, which can
              have these values:

              o  txt (default value): Input and output files are in text
                 format.

              o  html: Input and output files are in "html" format. This
                 "html" is the one accepted by the vast majority of web
                 browsers.

              o  rtf: Input and output files are in "rtf" format. The
                 accepted "rtf" is the one generated by Microsoft WordPad (C)
                 and Microsoft Office (C) up to and including Office-97.

       -u     Disable marking of unknown words with the '*' character.

       -a     Enable marking of disambiguated words with the '=' character.

FILES
       These are the two files that can be used with this command:

       infile
              Input file (stdin by default).

       outfile
              Output file (stdout by default).

SEE ALSO
       lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).

BUGS
       Lots of...lurking in the dark and waiting for you!

AUTHOR
       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All
       rights reserved.

2006-03-08                                                         apertium(1)
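
EXAMPLES
The SYNOPSIS of apertium(1) combines an optional format flag, a language pair, and optional input and output files. The following is a minimal dry-run sketch that assembles such command lines and rejects formats not listed under OPTIONS. The helper name apertium_cmd and the file names page.html and out.html are hypothetical; the language pairs es-ca and ca-es are the ones given as examples on this page.

```shell
#!/bin/sh
# Dry-run sketch: prints apertium command lines following the SYNOPSIS
#   apertium [-d datadir] [-f format] [-u] [-a] {language-pair} [infile [outfile]]
# The helper name and all file names are hypothetical.

# Usage: apertium_cmd FORMAT PAIR [INFILE [OUTFILE]]
apertium_cmd() {
    format=$1
    pair=$2
    shift 2
    case $format in
        txt|html|rtf) ;;   # the three formats listed under OPTIONS
        *) echo "unsupported format: $format" >&2; return 1 ;;
    esac
    # infile and outfile are optional (stdin and stdout by default);
    # sed trims the trailing space left when neither is given.
    echo "apertium -f $format $pair $*" | sed 's/ *$//'
}

apertium_cmd txt es-ca                       # stdin to stdout, plain text
apertium_cmd html ca-es page.html out.html   # file to file, HTML
```

To translate for real, run the printed command with the linguistic data for the chosen pair installed; adding -d datadir points apertium at a non-default data directory, and -u suppresses the '*' marks on unknown words.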