09-24-2011
Syllable splitter in Perl
Hello,
I am a relative newbie and want to split English names into syllables. Does anyone know of a Perl script that does that? Since my main area is linguistics, I would be happy to add rules to it and post the Perl script back for other users. I tried the CPAN modules, but they don't really do what I want.
Any help would be gratefully acknowledged.
Many thanks
English(3pm) Perl Programmers Reference Guide English(3pm)
NAME
English - use nice English (or awk) names for ugly punctuation variables
SYNOPSIS
    use English;
    use English qw( -no_match_vars ) ;  # Avoids regex performance penalty
                                        # in perl 5.16 and earlier
    ...
    if ($ERRNO =~ /denied/) { ... }
DESCRIPTION
This module provides aliases for the built-in variables whose names no one seems to like to read. Variables with side-effects which get
triggered just by accessing them (like $0) will still be affected.
For those variables that have an awk version, both long and short English alternatives are provided. For example, the $/ variable can be
referred to as either $RS or $INPUT_RECORD_SEPARATOR if you are using the English module.
See perlvar for a complete list of these.
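
As a small illustration of the aliases described above, the sketch below reads records from an in-memory string by localizing $INPUT_RECORD_SEPARATOR (the long name for $/). The helper name read_records and the sample data are made up for this example; only the alias names come from the English module.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# Hypothetical helper: split a string into records on a caller-supplied
# separator by temporarily changing the input record separator.
sub read_records {
    my ($text, $separator) = @_;

    # $INPUT_RECORD_SEPARATOR is the long English alias for $/ ($RS is the
    # short, awk-style one). local restores the old value on scope exit.
    local $INPUT_RECORD_SEPARATOR = $separator;

    # Read from an in-memory filehandle; $OS_ERROR is the alias for $!.
    open my $fh, '<', \$text
        or die "cannot open in-memory file: $OS_ERROR";
    my @records = <$fh>;
    close $fh;

    chomp @records;   # chomp strips the current record separator
    return @records;
}

# Sample data, assumed for illustration only.
print "$_\n" for read_records('alpha|beta|gamma', '|');
```

The separator is localized inside the helper so that code elsewhere in the program keeps reading ordinary newline-terminated lines.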
PERFORMANCE
NOTE: The penalty described below was fixed in perl 5.20. Mentioning $PREMATCH, $MATCH, or $POSTMATCH no longer makes a speed difference. This section still applies if
your code is to run on perl 5.18 or earlier.
This module can provoke sizeable inefficiencies for regular expressions, due to unfortunate implementation details. If performance matters
in your application and you don't need $PREMATCH, $MATCH, or $POSTMATCH, load the module like this:
    use English qw( -no_match_vars ) ;
It is especially important to do this in modules to avoid penalizing all applications which use them.
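
If the text around a match is still needed while loading the module with -no_match_vars, one option is the /p modifier, which has been available since perl 5.10 and fills ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} for that single match. This is a minimal sketch, not something taken from the manual page above; the helper name split_around_denied and the sample message are assumptions for illustration.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );   # $PREMATCH, $MATCH, $POSTMATCH are not aliased

# Hypothetical helper: return the text before, inside and after the first
# occurrence of "denied", or an empty list when there is no match.
sub split_around_denied {
    my ($message) = @_;

    # /p asks perl to keep the pre-match, match and post-match text for
    # this one regex only, so $`, $& and $' are never mentioned.
    return unless $message =~ /denied/p;

    return (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

my ($before, $hit, $after) = split_around_denied('permission denied here');
print "[$before][$hit][$after]\n";   # prints "[permission ][denied][ here]"
```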
perl v5.18.2 2014-01-06 English(3pm)