Creating a syllable concordance: Post 302522292 by gimley on Saturday 14th of May 2011 07:37:00 AM
Hello,
Am I still doing something wrong?
I ran Perl at the command line:

perl conc.pl corpus syllables

where corpus is the data file in which the syllables have to be found, and syllables is the file that contains the syllables.
I even tried reversing the command-line order, but got no output at all.
Sorry for the hassle. I walked through the code and it should spew out the syllables. Is the command line wrong?
Many thanks
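
For anyone hitting the same "no output at all" symptom, here is a minimal shell sketch for checking the two input files independently of conc.pl, whose source is not shown in this post. It assumes one word per line in corpus and one syllable per line in syllables. The sample data and the scratch directory are hypothetical, and grep stands in for the script only as a rough sanity check, not as a replacement for it.

```shell
# Work in a scratch directory so real files are never overwritten.
dir=$(mktemp -d)

# Hypothetical sample data: one word per line, one syllable per line.
printf 'banana\nmango\npapaya\n' > "$dir/corpus"
printf 'na\nya\n' > "$dir/syllables"

# An empty or missing input file is one simple reason for getting no output.
for f in "$dir/corpus" "$dir/syllables"; do
    [ -s "$f" ] || echo "missing or empty: $f"
done

# Treat each line of syllables as a fixed string and print every corpus
# line that contains at least one of them.
matches=$(grep -F -f "$dir/syllables" "$dir/corpus")
printf '%s\n' "$matches"
```

If the same grep command prints words for the real files while conc.pl prints nothing, the problem is more likely inside the script than on the command line. If grep prints nothing either, the files themselves may be worth a look, for example for Windows line endings in the syllables file or a character encoding that differs between the two files.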
 
