I have two directories called English and Hindi. Each directory contains the same number of files with the only difference being that in the case of the English Directory the tag is
.english
and in the Hindi one the tag is
.Hindi
The file may contain either a single text or more than one text... (7 Replies)
Hi folks!
I have a file which contains 1000 lines. On each line I have multiple occurrences (26 to be exact) of the pattern folder#/folder#.
Here # stands for the line number in the file:
some text here folder1/folder1 some text here folder1/folder1 some text here folder1/folder1 some text... (7 Replies)
I want to extract verbal forms from a large corpus of English. I have identified a certain number of patterns. Each pattern has the following structure
SPACE word_CATEGORY
where word refers to the verbal form and CATEGORY refers to the class of the verb
The categories are identified as per the... (4 Replies)
Hello,
I have a large file of syllables/strings in Urdu. Each word is on a separate line.
Example in English:
be
at
for
if
being
attract
I need to identify the frequency of each of these strings from a large corpus (which I cannot attach unfortunately because of size limitations) and... (7 Replies)
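One possible way to approach the frequency question above, as a minimal sketch only: it assumes the word list and the corpus are plain text and that a "string" means a whole whitespace-separated token (substring matching would need a different approach). The function name `count_frequencies` and the tiny sample corpus are made up for illustration.

```python
from collections import Counter


def count_frequencies(words, corpus_text):
    """Count how often each listed word occurs as a whitespace-separated token."""
    wanted = set(words)
    # Only tokens that appear in the word list are counted.
    counts = Counter(tok for tok in corpus_text.split() if tok in wanted)
    # Report every listed word, including those that never occur.
    return {w: counts[w] for w in words}


# Hypothetical data: the word list from the post and an invented one-line corpus.
words = ["be", "at", "for", "if", "being", "attract"]
corpus = "if you want to be at home for being at rest"
print(count_frequencies(words, corpus))
```

For a corpus too large to hold in memory, the same counting could be done line by line while reading the file.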
History: large open source PHP project, school management program. Comprises about 200 scripts. Had another developer for awhile, and he wanted a version in German, so he edited all the scripts and replaced text that would show up in the browser with variables (i.e. instead of "Click Here",... (7 Replies)
I need a C program to extract the text between two delimiters in a text file
and then store the pieces in different variables.
The text file looks like this:
abc.txt
=========
aaaaaa|11111111|sssssssssss|333333|ddddddddd|34343454564|asass
aaaaaa|11111111|sssssssssss|333333|ddddddddd|34343454564|asass... (7 Replies)
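The splitting logic asked about above can be sketched as follows. Note that this is Python rather than the C the poster asked for, and it is only meant to show the idea of cutting a line at each `|` and binding the pieces to separate variables. The helper name `split_fields` is hypothetical.

```python
def split_fields(line, delimiter="|"):
    """Split one line into the pieces of text found between delimiters."""
    return line.rstrip("\n").split(delimiter)


# Sample line copied from the post.
sample = "aaaaaa|11111111|sssssssssss|333333|ddddddddd|34343454564|asass"
fields = split_fields(sample)

# Bind the first two pieces to their own variables; keep the rest in a list.
first, second, *rest = fields
print(first, second, len(rest))
```

In C the same effect is usually obtained by walking the line and cutting at each delimiter, for example with `strtok`, but the details depend on how the fields are to be stored.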
apertium-tagger(1)                                          apertium-tagger(1)
NAME
apertium-tagger - This application is part of (apertium)
This tool is part of the apertium open-source machine translation architecture: http://www.apertium.org.
SYNOPSIS
apertium-tagger --train|-t {n} DIC CRP TSX PROB [--debug|-d]
apertium-tagger --supervised|-s {n} DIC CRP TSX PROB HTAG UNTAG [--debug|-d]
apertium-tagger --retrain|-r {n} CRP PROB [--debug|-d]
apertium-tagger --tagger|-g [--first|-f] PROB [--debug|-d] [INPUT [OUTPUT]]
DESCRIPTION
apertium-tagger trains the apertium part-of-speech tagger or tags text with it, depending on the options it is called
with. This command only reads from the standard input if the option --tagger or -g is used.
OPTIONS
-t {n}, --train {n}
Initializes parameters using Kupiec's method (unsupervised), then performs n iterations of the Baum-Welch training algorithm
(unsupervised).
-s {n}, --supervised {n}
Initializes parameters against a hand-tagged text (supervised) through the maximum likelihood estimate method, then performs n
iterations of the Baum-Welch training algorithm (unsupervised).
-r {n}, --retrain {n}
Retrains the model with n additional Baum-Welch iterations (unsupervised).
-g, --tagger
Tags input text by means of the Viterbi algorithm.
-p, --show-superficial
Prints the superficial form of the word alongside the lexical form in the output stream.
-f, --first
Used in conjunction with -g (--tagger); makes the tagger output all lexical forms of each word, with the chosen one in first
place (after the lemma).
-d, --debug
Print error (if any) or debug messages while operating.
-m, --mark
Mark disambiguated words.
-h, --help
Display a help message.
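As background for the -g option above, the following is a minimal sketch of Viterbi decoding on a toy hidden Markov model. It is not apertium-tagger's implementation: the tag set, the words and every probability below are invented for illustration, and the sketch assumes a non-empty observation sequence.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for a non-empty observation list."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximises the path probability into s.
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(obs, 0.0), p)
                for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy model (all numbers are made up): two tags, two words.
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"dogs": 0.6, "bark": 0.4}, "V": {"dogs": 0.1, "bark": 0.9}}

print(viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p))  # ['N', 'V']
```

In a real tagger the transition and emission probabilities would come from training (the PROB file described under FILES) rather than being written by hand.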
FILES
These are the kinds of files used with each option:
DIC Full expanded dictionary file
CRP Training text corpus file
TSX Tagger specification file, in XML format
PROB Tagger data file, built in the training and used while tagging
HTAG Hand-tagged text corpus
UNTAG Untagged text corpus: the morphological analysis of the HTAG corpus, used jointly with HTAG by the -s option
INPUT Input file, stdin by default
OUTPUT Output file, stdout by default
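To show how the file arguments above line up with the --tagger form in the SYNOPSIS, here is a small sketch that only builds the argument list and does not run anything. The helper name `tagger_command` and the file name `en.prob` are hypothetical; the option names and their order are taken from the SYNOPSIS.

```python
def tagger_command(prob, first=False, debug=False, input_path=None, output_path=None):
    """Build an argv list following the --tagger line of the SYNOPSIS."""
    # The SYNOPSIS nests OUTPUT inside INPUT: [INPUT [OUTPUT]].
    if output_path is not None and input_path is None:
        raise ValueError("OUTPUT can only be given together with INPUT")
    argv = ["apertium-tagger", "--tagger"]
    if first:
        argv.append("--first")
    argv.append(prob)
    if debug:
        argv.append("--debug")
    if input_path is not None:
        argv.append(input_path)
        if output_path is not None:
            argv.append(output_path)
    return argv


# "en.prob" is a placeholder name for a tagger data file.
print(tagger_command("en.prob", first=True))
```

With no INPUT given, the resulting command would read from stdin and write to stdout, as listed above; the list could then be handed to something like `subprocess.run`.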
SEE ALSO
lt-proc(1), lt-comp(1), lt-expand(1), apertium-translator(1), apertium(1).
BUGS
Lots of...lurking in the dark and waiting for you!
AUTHOR
Copyright (c) 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under
the terms of the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
2006-08-30 apertium-tagger(1)