apertium(1) [debian man page]

apertium(1)															       apertium(1)

NAME

       apertium - This application is part of ( apertium )

       This tool is part of the apertium machine translation architecture: http://apertium.sf.net.

SYNOPSIS

       apertium [-d datadir] [-f format] [-u] [-a] {language-pair} [infile [outfile]]

DESCRIPTION

       apertium  is  the  application that most people will be using as it simplifies the use of apertium/lt-toolbox tools for machine translation
       purposes.

       This tool eases the use of lt-toolbox (which contains all the lexical processing modules and tools) and apertium (which contains
       the rest of the engine) by providing a single front-end to the end user.

       The different modules behind the apertium machine translation architecture are in order:
	      o de-formatter: Separates the text to be translated from the format information.

	      o morphological-analyser: Tokenizes the text into surface forms.

	      o part-of-speech tagger: Chooses one lexical form for each ambiguous surface form (homograph).

	      o lexical transfer module: Reads each source-language lexical form and delivers a corresponding target-language lexical form.

	      o  structural  transfer module: Detects fixed-length patterns of lexical forms (chunks or phrases) needing special processing due to
	      grammatical divergences between the two languages and performs the corresponding transformations.

	      o morphological generator: Delivers a target-language surface form for each target-language lexical form, by suitably inflecting it.

	      o post-generator: Performs orthographical operations such as contractions and apostrophations.

	      o re-formatter: Restores the format information encapsulated by the de-formatter into the translated text and removes the
	      encapsulation sequences used to protect certain characters in the source text.
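
       The ordering of the modules above can be sketched as a chain in which each stage consumes the output of the previous one.  This is
       only an illustration in Python: the stage bodies are hypothetical placeholders that record their own name, and the names
       make_stage and run_pipeline are not part of apertium.

```python
from functools import reduce

# Stage names in the order listed above. The stage bodies below are
# hypothetical placeholders: each one only records that it ran and
# passes the text on unchanged.
STAGE_ORDER = [
    "de-formatter",
    "morphological-analyser",
    "part-of-speech tagger",
    "lexical transfer module",
    "structural transfer module",
    "morphological generator",
    "post-generator",
    "re-formatter",
]


def make_stage(name):
    """Return a placeholder stage that appends its name to the trace."""
    def stage(state):
        text, trace = state
        return text, trace + [name]
    return stage


def run_pipeline(text, stages):
    """Feed the output of each stage into the next one, left to right."""
    return reduce(lambda state, stage: stage(state), stages, (text, []))


stages = [make_stage(name) for name in STAGE_ORDER]
text, trace = run_pipeline("Hola", stages)
print(" | ".join(trace))
```

       The de-formatter runs first and the re-formatter last, so format information is set aside before any linguistic processing and
       restored only after the target text has been generated.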

OPTIONS

       -d datadir The directory holding the linguistic data.  By default it will use the expected installation path.

       language-pair The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

       -f format Specifies the format of the input and output files which can have these values:
	      o txt (default value) Input and output files are in text format.

	      o html Input and output files are in "html" format. This "html" is the one accepted by the vast majority of web browsers.

	      o  rtf  Input  and  output files are in "rtf" format. The accepted "rtf" is the one generated by Microsoft WordPad (C) and Microsoft
	      Office (C) up to and including Office-97.

       -u Disable marking of unknown words with the '*' character.

       -a Enable marking of disambiguated words with the '=' character.
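
       Since unknown words are marked with '*' unless -u is given, a caller can collect them from the translated text.  The following
       Python sketch assumes that the mark is a '*' placed immediately before the unknown word; the sample sentence is invented for
       illustration and is not real apertium output.

```python
import re

# Assumption: an unknown word appears in the output as '*' directly
# followed by the word itself (for example "*foobar").
UNKNOWN = re.compile(r"\*(\w+)")


def unknown_words(translated):
    """Return the words that carry the unknown-word mark, in order."""
    return UNKNOWN.findall(translated)


def strip_marks(translated):
    """Remove the '*' marks but keep the words themselves."""
    return UNKNOWN.sub(r"\1", translated)


sample = "the *foobar house"  # invented sample, not real output
print(unknown_words(sample))
print(strip_marks(sample))
```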

FILES

       These are the two files that can be used with this command:

       infile Input file (stdin by default).

       outfile Output file (stdout by default).
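
       As an illustration of the SYNOPSIS, the following Python sketch assembles an apertium command line and, optionally, runs it
       through standard input and output.  The helper names build_apertium_command and translate are hypothetical; only the options
       documented above are used, and translate works only if apertium and the data for the requested language pair are installed.

```python
import subprocess

# Formats documented under the -f option.
FORMATS = ("txt", "html", "rtf")


def build_apertium_command(pair, fmt=None, datadir=None, mark_unknown=True,
                           mark_disambiguated=False, infile=None, outfile=None):
    """Build an argument list that follows the SYNOPSIS order."""
    if fmt is not None and fmt not in FORMATS:
        raise ValueError("unsupported format: %s" % fmt)
    if outfile is not None and infile is None:
        raise ValueError("outfile can only be given together with infile")
    cmd = ["apertium"]
    if datadir is not None:
        cmd += ["-d", datadir]
    if fmt is not None:
        cmd += ["-f", fmt]
    if not mark_unknown:
        cmd.append("-u")          # disable '*' marks on unknown words
    if mark_disambiguated:
        cmd.append("-a")          # enable '=' marks on disambiguated words
    cmd.append(pair)              # for instance "es-ca"
    if infile is not None:
        cmd.append(infile)
    if outfile is not None:
        cmd.append(outfile)
    return cmd


def translate(text, pair, **options):
    """Send text on stdin and return what apertium writes to stdout."""
    cmd = build_apertium_command(pair, **options)
    result = subprocess.run(cmd, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout


print(" ".join(build_apertium_command("es-ca", fmt="html", infile="in.html")))
```

       When no infile is given, the command reads stdin and writes stdout, which is what translate relies on.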

SEE ALSO

       lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.

								    2006-03-08							       apertium(1)

apertium-tagger(1)														apertium-tagger(1)

NAME

       apertium-tagger - This application is part of  ( apertium )

       This tool is part of the apertium open-source machine translation architecture: http://www.apertium.org.

SYNOPSIS

       apertium-tagger --train|-t {n} DIC CRP TSX PROB [--debug|-d]

       apertium-tagger --supervised|-s {n} DIC CRP TSX PROB HTAG UNTAG [--debug|-d]

       apertium-tagger --retrain|-r {n} CRP PROB [--debug|-d]

       apertium-tagger --tagger|-g [--first|-f] PROB [--debug|-d] [INPUT [OUTPUT]]

DESCRIPTION

       apertium-tagger	is  the  application  responsible  for	the  apertium  part-of-speech tagger training or tagging, depending on the calling
       options.  This command only reads from the standard input if the option --tagger or -g is used.

OPTIONS

       -t {n}, --train {n}
	      Initializes parameters through Kupiec's method (unsupervised), then performs n iterations of the Baum-Welch training algorithm
	      (unsupervised).

       -s {n}, --supervised {n}
	      Initializes parameters against a hand-tagged text (supervised) through the maximum likelihood estimate method, then performs n
	      iterations of the Baum-Welch training algorithm (unsupervised).

       -r {n}, --retrain {n}
	      Retrains the model with n additional Baum-Welch iterations (unsupervised).

       -g, --tagger
	      Tags input text by means of the Viterbi algorithm.

       -p, --show-superficial
	      Prints the superficial (surface) form of the word alongside the lexical form in the output stream.

       -f, --first
	      Used in conjunction with -g (--tagger); makes the tagger output all lexical forms of each word, with the chosen one in first
	      place (after the lemma).

       -d, --debug
	      Print error (if any) or debug messages while operating.

       -m, --mark
	      Mark disambiguated words.

       -h, --help
	      Display a help message.
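
       The -g option tags text with the Viterbi algorithm.  As a rough illustration of that algorithm only, the Python sketch below
       decodes the most probable tag sequence for a toy hidden Markov model.  The tags, words and probabilities are invented, and the
       code is not apertium-tagger's implementation or data format.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    if not observations:
        return []
    # best[t][s] = (probability of the best path ending in s at step t,
    #               previous state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s].get(first, 0.0), None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximises the path probability.
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p) for p in states)
            column[s] = (prob * emit_p[s].get(obs, 0.0), prev)
        best.append(column)
    # Trace back from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Invented toy model: two tags and two ambiguous words.
states = ["noun", "verb"]
start_p = {"noun": 0.6, "verb": 0.4}
trans_p = {
    "noun": {"noun": 0.3, "verb": 0.7},
    "verb": {"noun": 0.8, "verb": 0.2},
}
emit_p = {
    "noun": {"time": 0.6, "flies": 0.4},
    "verb": {"time": 0.3, "flies": 0.7},
}

print(viterbi(["time", "flies"], states, start_p, trans_p, emit_p))
# prints ['noun', 'verb']
```

       In this toy model both words are ambiguous between the two tags; the decoder resolves the ambiguity by weighing how likely each
       tag is to follow the previous one against how likely it is to produce the word.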

FILES

       These are the kinds of files used with each option:

       DIC Fully expanded dictionary file

       CRP Training text corpus file

       TSX Tagger specification file, in XML format

       PROB Tagger data file, built in the training and used while tagging

       HTAG Hand-tagged text corpus

       UNTAG Untagged text corpus: the morphological analysis of the HTAG corpus, used jointly with it by the -s option

       INPUT Input file, stdin by default

       OUTPUT Output file, stdout by default
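
       To show how these files map onto the SYNOPSIS, the following Python sketch builds the argument lists for unsupervised training
       and for tagging.  The helper names and the file names (es.dic, es.crp, es.tsx, es.prob) are hypothetical; the argument order
       follows the SYNOPSIS.

```python
def train_command(iterations, dic, crp, tsx, prob, debug=False):
    """Arguments for: apertium-tagger --train {n} DIC CRP TSX PROB [--debug]"""
    cmd = ["apertium-tagger", "--train", str(iterations), dic, crp, tsx, prob]
    if debug:
        cmd.append("--debug")
    return cmd


def tagger_command(prob, first=False, debug=False, infile=None, outfile=None):
    """Arguments for: apertium-tagger --tagger [--first] PROB [--debug] [INPUT [OUTPUT]]"""
    if outfile is not None and infile is None:
        raise ValueError("OUTPUT can only be given together with INPUT")
    cmd = ["apertium-tagger", "--tagger"]
    if first:
        cmd.append("--first")
    cmd.append(prob)
    if debug:
        cmd.append("--debug")
    if infile is not None:
        cmd.append(infile)
    if outfile is not None:
        cmd.append(outfile)
    return cmd


# Hypothetical file names, used only to show the argument order.
print(" ".join(train_command(8, "es.dic", "es.crp", "es.tsx", "es.prob")))
print(" ".join(tagger_command("es.prob", first=True)))
```

       The PROB file written by the training command is the same file that the tagging command reads.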

SEE ALSO

       lt-proc(1), lt-comp(1), lt-expand(1), apertium-translator(1), apertium(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       Copyright  (c) 2005, 2006 Universitat d'Alacant / Universidad de Alicante.  This is free software.  You may redistribute copies of it under
       the terms of the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.

								    2006-08-30							apertium-tagger(1)
