05-14-2011
Hello,
Am I still doing something wrong?
I ran the script with perl at the command line:
perl conc.pl corpus syllables
where corpus is the file containing the data in which the syllables have to be found, and
syllables is the file which contains the syllables.
I even tried reversing the order of the two arguments, but got no output at all.
Sorry for the hassle. I walked through the code and it should print out the syllables. Is the command line wrong?
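Before suspecting the argument order, it may help to rule out problems with the two input files themselves. The following is only a minimal sketch and not part of conc.pl: check_input is a hypothetical helper name, and the checks assume that a missing file, an empty file, or Windows-style CRLF line endings (which can leave a stray carriage return on each line and stop otherwise identical strings from matching) are possible causes of silent output.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the input files; check_input is an
# illustrative helper and is not part of conc.pl.
check_input() {
    f=$1
    # The file must exist and be a regular file.
    if [ ! -f "$f" ]; then
        echo "$f: missing"
        return 1
    fi
    # The file must contain at least one byte.
    if [ ! -s "$f" ]; then
        echo "$f: empty"
        return 1
    fi
    # A carriage return at the end of a line suggests CRLF line endings.
    if grep -q "$(printf '\r')\$" "$f"; then
        echo "$f: CRLF line endings"
        return 1
    fi
    echo "$f: ok"
}
```

With the function defined in the current shell, the script would only run if both files pass:
check_input corpus && check_input syllables && perl conc.pl corpus syllables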
Many thanks