Grepping verbal forms from a large corpus Post: 302954830

9 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Forms

Hi, I currently have a form containing three boxes of info to be filled in. I would like it so if the user presses F10 a list of company names is displayed, using the company names from a table I have. I would like this list to be in a popup window if it is possible. I am using Informix, sco-unix....

2. UNIX for Dummies Questions & Answers

Unix Forms

Hi, I'm new, so be gentle. I'm just starting out in programming and I want to try Unix to see what all the fuss is about. But right now I'm like a kid in a sweet shop, spoilt for choice. There's Red Hat, Fedora, Linux, Ubuntu, and that's just for starters. I've been told Ubuntu is a nice...

3. Shell Programming and Scripting

Linguistic project: extract co-occurrences from text corpus

Hello guys, I've got a big corpus (a huge text file in which words are separated by one or several spaces). I would like to know if there is a simple way, using awk for instance, to extract any co-occurrence appearing at least 3 times through the whole corpus for a given word. By co-occurrence,...

4. Shell Programming and Scripting

Grepping large list of files

Hi All, I need help finding the exact command to grep a large list of files, using either ls or find. However, I do not want to search in the subdirectories, as the number of subdirectories is not fixed. How do I achieve that? I want something like this: find ./ -name "MYFILE*.txt"...

5. Shell Programming and Scripting

Performance issue in Grepping large files

I have around 300 large files (*.rdf, *.fmb, *.pll, *.ctl, *.sh, *.sql, *.prog). Around 8000 keywords (stored in the file $keywordfile) need to be searched for inside those files. If a keyword is found in a file, I have to insert the filename, extension, category, keyword, occurrence...

6. Shell Programming and Scripting

Creating Frequency of words from a file by accessing a corpus

Hello, I have a large file of syllables /strings in Urdu. Each word is on a separate line. Example in English: be at for if being attract I need to identify the frequency of each of these strings from a large corpus (which I cannot attach unfortunately because of size limitations) and...

7. Homework & Coursework Questions

Dialog forms

1. The problem statement, all variables and given/known data: I need to create a dialog interface for an address book I created a while ago, but I don't know how to read info from forms. 2. Relevant commands, code, scripts, algorithms: #!/bin/bash knyga="adresu-knyga.txt" dialog...

8. Shell Programming and Scripting

Creating verbal structures from a dictionary and a template

My main aim here is to create a database of verbs from a language into Hindi. The output, if it works well, will be put up on a University site for researchers to use for Machine Translation. This is because one of the main weaknesses of MT is in the area of verbs. Sorry for the long post but the problem...

9. Shell Programming and Scripting

Alignment tool to join text files in 2 directories to create a parallel corpus

I have two directories called English and Hindi. Each directory contains the same number of files, the only difference being that in the English directory the tag is .english and in the Hindi one the tag is .Hindi. The file may contain either a single text or more than one text...
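Two of the threads listed above (4 and 5) ask about grepping many files without descending into subdirectories, and about searching for a long list of keywords. A minimal sketch of one common approach follows. The file names and demo data are hypothetical, and it assumes a find that supports -maxdepth (GNU and BSD find do; it is not in POSIX). It is not taken from those threads.

```shell
# Hypothetical demo data; every name below is illustrative only.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
printf 'alpha\nbeta\n' > "$workdir/MYFILE1.txt"
printf 'alpha\n' > "$workdir/sub/MYFILE2.txt"      # must NOT be searched
printf 'alpha\ngamma\n' > "$workdir/keywords.txt"  # one keyword per line

# -maxdepth 1 keeps find out of subdirectories.
# grep -F treats each keyword as a fixed string, -f reads them from a file,
# and -l prints only the names of files that contain at least one keyword.
matches=$(find "$workdir" -maxdepth 1 -type f -name 'MYFILE*.txt' \
    -exec grep -F -l -f "$workdir/keywords.txt" {} +)

echo "$matches"
```

Reading all keywords in a single grep -F -f pass avoids starting one grep process per keyword, which is usually the main cost when thousands of keywords are involved.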

LEARN ABOUT DEBIAN

apertium-preprocess-corpus-lextor

apertium-preprocess-corpus-lextor(1)									      apertium-preprocess-corpus-lextor(1)

NAME

       apertium-preprocess-corpus-lextor - This application is part of (apertium)

       This tool is part of the apertium machine translation architecture: http://apertium.org.

SYNOPSIS

       apertium-preprocess-corpus-lextor data_dir translation_dir input_file output_file

DESCRIPTION

       apertium-preprocess-corpus-lextor is the application responsible for preprocessing the training corpus for the lexical selector training.

OPTIONS

       This tool currently has no options.

FILES

       These are the kinds of files and directories used with this tool:

       data_dir          The path to the linguistic data to use.

       translation_dir   The translation direction to use.

       input_file        The file containing a large corpus in raw format.

       output_file       The file that receives the preprocessed corpus.
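
The synopsis above takes four positional arguments and no options. A small wrapper along the following lines could guard an invocation. The function name run_preprocess and all paths in the usage line are hypothetical placeholders, not part of apertium.

```shell
# Hypothetical helper: checks the argument count and that the tool is
# installed before calling it exactly as the SYNOPSIS describes.
run_preprocess() {
    if [ "$#" -ne 4 ]; then
        echo "usage: run_preprocess data_dir translation_dir input_file output_file" >&2
        return 2
    fi
    if ! command -v apertium-preprocess-corpus-lextor >/dev/null 2>&1; then
        echo "apertium-preprocess-corpus-lextor not found in PATH" >&2
        return 127
    fi
    # Positional order: data_dir translation_dir input_file output_file
    apertium-preprocess-corpus-lextor "$1" "$2" "$3" "$4"
}
```

An example call, with placeholder values for the data directory, direction, and file names, would be: run_preprocess ./apertium-es-ca es-ca corpus.txt corpus.preprocessed.txt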

SEE ALSO

       apertium-gen-lextorbil(1), apertium-gen-lextormono(1), apertium-gen-lextor-eval(1), apertium-gen-stopwords-lextor(1),
       apertium-gen-wlist-lextor(1), apertium-gen-wlist-lextor-translation(1), apertium-lextor(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.

								    2006-12-12				      apertium-preprocess-corpus-lextor(1)

9 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Forms

Discussion started by: Dan Rooney

2. UNIX for Dummies Questions & Answers

Unix Forms

Discussion started by: NightWatchman

3. Shell Programming and Scripting

Linguistic project: extract co-occurrences from text corpus

Discussion started by: bobylapointe

4. Shell Programming and Scripting

Grepping large list of files

Discussion started by: angshuman

5. Shell Programming and Scripting

Performance issue in Grepping large files

Discussion started by: millan

6. Shell Programming and Scripting

Creating Frequency of words from a file by accessing a corpus

Discussion started by: gimley

7. Homework & Coursework Questions

Dialog forms

Discussion started by: sasisken

8. Shell Programming and Scripting

Creating verbal structures from a dictionary and a template

Discussion started by: gimley

9. Shell Programming and Scripting

Alignment tool to join text files in 2 directories to create a parallel corpus

Discussion started by: gimley
