apertium-preprocess-corpus-lextor(1) [debian man page]
apertium-preprocess-corpus-lextor(1)          apertium-preprocess-corpus-lextor(1)
NAME
apertium-preprocess-corpus-lextor - This application is part of (apertium)
This tool is part of the apertium machine translation architecture: http://apertium.org.
SYNOPSIS
apertium-preprocess-corpus-lextor data_dir translation_dir input_file output_file
DESCRIPTION
apertium-preprocess-corpus-lextor preprocesses the corpus used to train the lexical selector.
OPTIONS
This tool currently has no options.
FILES
These are the kinds of files and directories used with this tool:
data_dir         The path to the linguistic data to use.
translation_dir  The translation direction to use.
input_file       A file containing a large corpus in raw format.
output_file      The file that receives the preprocessed corpus.
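The four positional arguments from the SYNOPSIS can be wrapped in a small shell function, sketched below. This is only an illustration: the function name preprocess_corpus, the DRY_RUN switch, the ./data path, the es-ca direction and both file names are hypothetical placeholders, not values taken from this manual page.

```shell
# Sketch only: wraps the four positional arguments listed in the SYNOPSIS.
# preprocess_corpus and DRY_RUN are hypothetical helpers, not part of apertium.
preprocess_corpus() {
    if [ "$#" -ne 4 ]; then
        echo "usage: preprocess_corpus data_dir translation_dir input_file output_file" >&2
        return 2
    fi
    # With DRY_RUN=1 the command line is printed instead of being executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "apertium-preprocess-corpus-lextor $1 $2 $3 $4"
        return 0
    fi
    apertium-preprocess-corpus-lextor "$1" "$2" "$3" "$4"
}

# Placeholder values (assumptions); set DRY_RUN=0 to run the real tool.
DRY_RUN=1
preprocess_corpus ./data es-ca corpus.txt corpus.preprocessed.txt
```

Checking the argument count up front mirrors the fact that the tool takes no options, so all four arguments are required.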
SEE ALSO
apertium-gen-lextorbil(1), apertium-gen-lextormono(1), apertium-gen-lextor-eval(1), apertium-gen-stopwords-lextor(1), apertium-gen-wlist-lextor(1), apertium-gen-wlist-lextor-translation(1), apertium-lextor(1).
BUGS
Lots of...lurking in the dark and waiting for you!
AUTHOR
(c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.
2006-12-12 apertium-preprocess-corpus-lextor(1)
apertium-gen-wlist-lextor-translation(1)      apertium-gen-wlist-lextor-translation(1)
NAME
apertium-gen-wlist-lextor-translation - This application is part of (apertium)
This tool is part of the apertium machine translation architecture: http://apertium.org.
SYNOPSIS
apertium-gen-wlist-lextor-translation --mono|-m dic.bin --bil|-b bildic.bin --wlist|-w wlistfile
DESCRIPTION
apertium-gen-wlist-lextor-translation generates all the possible translations of polysemous words.
OPTIONS
--mono|-m dic.bin
    Specifies the monolingual lexical selection dictionary to use (see apertium-gen-lextormono).
--bil|-b bildic.bin
    Specifies the bilingual lexical selection dictionary to use (see apertium-gen-lextorbil).
--wlist|-w wlistfile
    Specifies the list of words to translate (see apertium-gen-wlist-lextor).
--help|-h
    Shows brief usage help.
--version|-v
    Shows the version string of this tool and its license.
FILES
This tool uses no files apart from the ones associated with each option.
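The three documented options can be combined as in the sketch below. The function name gen_wlist_translation, the DRY_RUN switch and the file name wlist.txt are hypothetical; dic.bin and bildic.bin are just the placeholder names used in the SYNOPSIS. This page does not say where the result is written, so the sketch does not redirect any output.

```shell
# Sketch only: passes the three documented long options to the tool.
# gen_wlist_translation and DRY_RUN are hypothetical helpers, not part of apertium.
gen_wlist_translation() {
    if [ "$#" -ne 3 ]; then
        echo "usage: gen_wlist_translation dic.bin bildic.bin wlistfile" >&2
        return 2
    fi
    # With DRY_RUN=1 the command line is printed instead of being executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "apertium-gen-wlist-lextor-translation --mono $1 --bil $2 --wlist $3"
        return 0
    fi
    apertium-gen-wlist-lextor-translation --mono "$1" --bil "$2" --wlist "$3"
}

# Placeholder file names (assumptions); set DRY_RUN=0 to run the real tool.
DRY_RUN=1
gen_wlist_translation dic.bin bildic.bin wlist.txt
```

The short forms -m, -b and -w listed under OPTIONS could be substituted for the long ones.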
SEE ALSO
apertium-gen-lextorbil(1), apertium-preprocess-corpus-lextor(1), apertium-gen-stopwords-lextor(1), apertium-gen-wlist-lextor(1), apertium-gen-lextormono(1), apertium-lextor-eval(1), apertium-lextor(1).
BUGS
Lots of...lurking in the dark and waiting for you!
AUTHOR
(c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.
2006-12-12 apertium-gen-wlist-lextor-translation(1)