Mapping syllables in English to syllables in Indic


 
# 1  
Old 04-27-2017

Hello,
I have a large file with the following structure:
Code:
Englishpseudo syllable[SPACE]Englishpseudo syllable=Indicsyllable[SPACE]Indicsyllable

An example will make this clear:
Code:
la l=ला ल
gi ta=गी ता
ka la va ti=कa ला वa ती
ma h to=मa ह तो
ra je sh=रा जे श
a sha=आ शा
ra me sh=रa मे श
san ja y=सं जa य
ku ma ri=कु मा री
su shi la=सु शी ला
u sha=उ षा
su re sh=सु रे श
ka m la=कa म ला
mu nni=मु न्नी

What I need is for each English syllable to be mapped to its Indic counterpart, one pair per line.
Code:
Case of san ja y=सं जa य
Expected output
san=सं
ja=जa
y=य

At present I am doing this through a macro in UltraEdit, but since the database is large (around 80,000 words), the macro takes a lot of time.
Can an AWK or Perl script speed up the process?
I work in a Windows environment.
Many thanks
# 2  
Old 04-27-2017
Try
Code:
awk -F= '
        {n = split ($1, T1, " ")        # English syllables, left of "="
         m = split ($2, T2, " ")        # Indic syllables, right of "="
         for (i=1; i<=n; i++) print T1[i], FS, T2[i]    # pair them by position
        }
' file
la = ला
l = ल
gi = गी
ta = ता
.
.
.
san = सं
ja = जa
y = य
.
.
.
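
The loop above pairs syllables purely by position, so a line whose two sides have different syllable counts would produce a pair with an empty or missing half. The following is a minimal sketch of one way to guard against that, not part of the posted solution: the function name `map_syllables` is hypothetical, and reporting mismatched lines on stderr and skipping them is an assumption about what is wanted. The second sample line is deliberately broken to show the warning.

```shell
# map_syllables: hypothetical helper name, not from the thread.
# Reads lines of the form "en1 en2=in1 in2" and prints one pair per line.
# Lines whose two sides differ in syllable count are reported on stderr
# and skipped (an assumption about the desired behaviour).
map_syllables() {
    awk -F= '
        {
            n = split($1, T1, " ")
            m = split($2, T2, " ")
            if (n != m) {
                printf "line %d: %d vs %d syllables: %s\n", NR, n, m, $0 > "/dev/stderr"
                next
            }
            for (i = 1; i <= n; i++) print T1[i] FS T2[i]
        }
    ' "$@"
}

# First line is from the original post; the second is a made-up bad line.
printf 'san ja y=सं जa य\nma h to=मa ह\n' | map_syllables
```

Redirecting stderr to a file (for example `map_syllables file 2> mismatches.txt`) would collect the lines that need manual attention.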

# 3  
Old 04-27-2017
Note that if you remove the commas in the print statement RudiC suggested:
Code:
print T1[i] FS T2[i]

you'll get rid of the unwanted <space>s around the equals sign in the output:
Code:
la=ला
l=ल
gi=गी
ta=ता
.
.
.
san=सं
ja=जa
y=य
.
.
.
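
Since the original poster works on Windows: `cmd.exe` does not treat single quotes as quoting characters, so the one-liner as typed may not run unchanged there. One way around this is to keep the awk program in its own file and run it with `awk -f`, which avoids shell quoting altogether. The sketch below assumes an awk port is installed; the file names `syllmap.awk` and `words.txt` are hypothetical. It uses a POSIX shell only to create the two files; on Windows they could be created in any editor, and only the final `awk -f` command would be typed at the prompt.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# The awk program, kept in a file (syllmap.awk is a hypothetical name).
cat > "$workdir/syllmap.awk" <<'EOF'
# Split both sides of "english=indic" on spaces and pair them by position.
BEGIN { FS = "=" }
{
    n = split($1, T1, " ")
    split($2, T2, " ")
    for (i = 1; i <= n; i++) print T1[i] FS T2[i]
}
EOF

# Two sample lines from the first post (words.txt is a hypothetical name).
printf 'la l=ला ल\ngi ta=गी ता\n' > "$workdir/words.txt"

# The only command needed once both files exist.
awk -f "$workdir/syllmap.awk" "$workdir/words.txt"
```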

# 4  
Old 04-27-2017
Many thanks. I am replying from my phone. May I test this in about five hours, when I am back at my computer?

---------- Post updated at 09:03 AM ---------- Previous update was at 03:04 AM ----------

Sorry for the delay in posting. The tool works just fine. Many thanks.