Shell Programming and Scripting: post 303029510 by gimley on Monday 28th of January 2019 08:28:43 AM
Find Syllable count mismatch

Hello,
I have written a syllable splitter for Pseudo English [conforming to the rules of Indic] and for Indic.
I have a large database with the following structure:
Code:
Syllables in Pseudo English delimited by |=Syllables in Devanagari delimited by |

The tool produces syllables in both scripts. An example is given below:
Code:
a|bba|l=अ|ब्ब|ल
a|bA|s=अ|बा|स
a|bbA|s=अ|ब्बा|स
A|ba|dA=आ|ब|दा
a|bde|sh=अ|ब्दे|श
a|b|dhe|sh=अ|ब|धे|श
a|bdu|l=अ|ब्दु|ल
a|bdu|lA=अ|ब्दु|ला
a|bdu|llA=अ|ब्दु|ल्ला
a|bdu|lla|h=अ|ब्दु|ल्ल|ह
a|bdu|llA|h=अ|ब्दु|ल्ला|ह
a|bdu|r=अ|ब्दु|र
A|bhA=आ|भा
a|bha|y=अ|भ|य
a|bhi=अ|भी
a|bhi|ji|t=अ|भि|जी|त

However, at times the software goofs up and the number of syllables on the two sides does not match, as in:
Code:
zu|ba|i|dA=ज़ु|बै|दा
zu|ba|i|r=ज़ु|बै|र

As can be seen, there is a mismatch: the English side has 4 syllables and the Devanagari side has only 3.
I work in a Windows environment, and what I need is a script in awk or Perl that will run through the file and identify mismatches like those in the example above.
Many thanks for your help and, since this is my first post of the year, a belated Happy New Year.
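For reference, the check described above can be sketched in awk roughly as follows. This is only a sketch under stated assumptions: each line contains a single "=", the file is in UTF-8 (or another ASCII-compatible encoding, not UTF-16), and the function name find_mismatches and the output layout are made up for illustration. It splits each line on "=", counts the "|"-delimited pieces on each side, and prints the lines where the two counts differ.

```shell
#!/bin/sh
# find_mismatches: reads "latin|syllables=devanagari|syllables" lines
# from the named file(s), or from standard input when none is given,
# and prints every line whose two sides have different syllable counts.
# (The function name and the output layout are illustrative choices.)
find_mismatches() {
    awk -F'=' '
        NF == 0 { next }                     # skip blank lines
        {
            n = split($1, left,  "|")        # syllables before "="
            m = split($2, right, "|")        # syllables after "="
            if (n != m)
                print NR ": " n " vs " m ": " $0
        }
    ' "$@"
}
```

Each reported line starts with its line number and the two counts, so the zu|ba|i|dA entry above would be flagged as "4 vs 3". In a Windows command prompt without a POSIX shell, the awk program between the single quotes could instead be saved to a file (say mismatch.awk, a hypothetical name) and run as awk -F= -f mismatch.awk followed by the data file name. Windows line endings do not affect the counts, since the trailing carriage return simply stays attached to the last syllable.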
 
