Dear all,
I am working on a large Sindhi lexicon, which I hope to complete by 2017 and release as open source. The database is in Arabic script, in two columns delimited by an equals sign.
Column 1 contains a word or words without the short vowels, plus some extraneous information stored in brackets.
Column 2 contains the word with its short vowels, which I list below along with their Unicode values.
Column 2 may also repeat the same word without any short vowels, as in line 5 of the database.
What I need is an awk or Perl script which will identify only those words whose edit distance is limited to the conditions outlined above, i.e.:
a. ignore all words/strings in brackets
b. identify words that are similar
c. identify words delimited by the three characters
and store such words in a separate file, delimited by an equals sign.
A small input sample is provided below.
The expected output is as under, cleaned by hand and hopefully meeting the conditions specified above.
Since the number of words compiled is around 80,000, a script using edit distance would help. I checked out awk and perl scripts for edit distance and tried to tweak them for this purpose, but they did not work out successfully.
Many thanks in advance from me and the community who, hopefully, will benefit from the database.
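The comparison described above can be sketched as follows. This is only a rough illustration in Python rather than awk or Perl, and it rests on several assumptions that are not in the post: the short vowels are taken to be the Arabic combining marks U+064B to U+0652, "brackets" is taken to mean round or square brackets, and the names `normalize` and `matching_pairs` are hypothetical. It uses an exact match after stripping brackets and short vowels instead of a true edit distance, and it does not handle condition (c), since the three delimiting characters are not specified here.

```python
import re

# Assumption: the "short vowels" are the Arabic combining marks U+064B..U+0652
# (fathatan through sukun); substitute the list from the original post.
SHORT_VOWELS = re.compile("[\u064B-\u0652]")
# Assumption: "brackets" means round or square brackets.
BRACKETED = re.compile(r"\([^)]*\)|\[[^\]]*\]")


def normalize(text):
    """Drop bracketed notes and short vowels, then collapse whitespace."""
    text = BRACKETED.sub(" ", text)
    text = SHORT_VOWELS.sub("", text)
    return " ".join(text.split())


def matching_pairs(lines):
    """Yield 'col1=col2' for lines whose columns agree once normalized."""
    for line in lines:
        left, sep, right = line.rstrip("\n").partition("=")
        if not sep:
            continue  # no equals sign: not a two-column entry
        plain = normalize(left)
        if plain and plain == normalize(right):
            yield f"{plain}={right.strip()}"
```

To use it, open the lexicon with `encoding="utf-8"`, pass the file object to `matching_pairs`, and write each yielded string to the output file followed by a newline. If near matches are also wanted, the equality test could be swapped for an edit-distance threshold.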
Hi,
I have a file like
$ cat abc
HDR XXX
content XXX
content YYY
content XXX
content YYY
content XXX
content YYY
TRL YYY
I want to replace the lines starting with HDR and TRL.
For this I have written the code below:
#!/usr/bin/perl -w
use strict;
open ( FH , "+< abc" ) || die "Can't... (1 Reply)
Hey guys,
I'm trying to learn a bit of awk/sed and I'm using different sites to learn it from, and I think I'm starting to get confused (doesn't take much!).
Anyway, say I have a CSV file which has something along the lines of the following in it: "test","127.0.0.1","startup... (6 Replies)
Is there a way to edit a file without opening two files?
The only method I know is to use one file for reading from and one file for writing to.
I cannot think of any other way. (4 Replies)
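One possible approach, sketched here in Python, is to open the file once in read/write mode, read it into memory, rewind, and write the edited lines back over the same handle. The function name `edit_in_place` and its `transform` parameter are hypothetical, and the sketch assumes the file is small enough to hold in memory.

```python
def edit_in_place(path, transform):
    """Rewrite the file at `path` with transform(line) applied to each line."""
    with open(path, "r+", encoding="utf-8") as fh:
        lines = fh.readlines()  # the whole file is held in memory
        fh.seek(0)              # rewind to the start of the file
        fh.writelines(transform(line) for line in lines)
        fh.truncate()           # drop leftover bytes if the file shrank
```

For example, `edit_in_place("abc", lambda line: line.replace("XXX", "ZZZ"))` would rewrite a file named `abc`. This is not atomic: if the process dies partway through the write, the file may be left damaged, which is the main reason the two-file method is often preferred.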
Hi there, I need some help please...
I have a text file named data.txt that contains the following information:
Mark Owen: 6999999888 6999999888 +302310999999 2310999999
Steve Blade Pit: +30691111222 2310888777 6999999888
John Rose: 2310777555 310544565 +302310999999
Mary Stuart:... (7 Replies)
Hello. I am taking a Perl class in college and we've briefly covered SQL and moved on. We have a term project and we can do whatever we want. My project will rely strongly on an SQL Database so I am trying to learn as much about Perl DBI as I can to get things up and going.
I am basically... (1 Reply)
Hi power user,
I have this type of data (distance list):
file1
A B 10
B C 20
C D 50
I want output like this:
# A B C D
A 0 10 30 80
B 10 0 20 70
C 30 20 0 50
D 80 70 50 0
which is a distance matrix.
I have tried... (0 Replies)
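A minimal Python sketch of one way to build such a matrix is given below. It assumes the points lie along a single chain, so each row's first label has already appeared (or is the very first label), and the distance between any two points is the difference of their cumulative positions. The function names `parse_rows` and `distance_matrix` are hypothetical.

```python
def parse_rows(text):
    """Turn lines like 'A B 10' into (from_label, to_label, distance) tuples."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            a, b, d = line.split()
            rows.append((a, b, int(d)))
    return rows


def distance_matrix(rows):
    """Return (labels, matrix) for points chained along one line."""
    position = {}
    for a, b, d in rows:
        if not position:
            position[a] = 0  # the first label is taken as the origin
        if a not in position:
            raise ValueError(f"{a!r} is not connected to an earlier point")
        position[b] = position[a] + d
    labels = list(position)
    matrix = [[abs(position[r] - position[c]) for c in labels] for r in labels]
    return labels, matrix
```

For the three rows of file1 this gives positions A=0, B=10, C=30, D=80, so the A row is 0 10 30 80. A distance list that branches or forms a general graph would need a shortest-path method instead.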
I have the following file:
It looks to be colon-separated (:).
For the first field I want only the file name without ".txt", and I also want to remove the "+" sign if the second field starts with one.
Input file:
Output file:
Appreciate your help (9 Replies)
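The two field edits described above might be sketched in Python as follows. The sample input is not shown here, so the sketch assumes plain colon-separated lines with the file name as the whole first field; `clean_line` is a hypothetical name.

```python
def clean_line(line, sep=":"):
    """Strip '.txt' from field 1 and a leading '+' from field 2."""
    fields = line.rstrip("\n").split(sep)
    if fields[0].endswith(".txt"):
        fields[0] = fields[0][: -len(".txt")]
    if len(fields) > 1 and fields[1].startswith("+"):
        fields[1] = fields[1][1:]  # remove only the first '+'
    return sep.join(fields)
```

Applying `clean_line` to each line of the input and printing the result leaves every other field untouched.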
Hi,
sample file looks like this..
<hp>
<name>
<detail>adsg</detail>
...
...
</name><ft>4264</ft>
</hp>
I need to edit the last-but-one line using a Perl script. I want the format to be:
<hp>
<name>
<detail>adsg</detail>
...
...
</name> (9 Replies)
Hi,
How can I edit a line in a file?
For example, a.txt contains:
start: 1 2 3 4
stop: a b c d
and I want to change "3" to "9"
and to add "5" after "4"
the result should be (a.txt):
start: 1 2 9 4 5
stop: a b c d
Thanks,
zed (5 Replies)
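A small Python sketch of the edit, taking the written instruction literally (replace the token "3" with "9" and put "5" after "4" on the line beginning with "start:"), so that the first line becomes `start: 1 2 9 4 5`. The function name `edit_start_line` is hypothetical, and the sketch assumes the tokens are whitespace-separated.

```python
def edit_start_line(line):
    """Apply the two token edits to the 'start:' line; pass others through."""
    if not line.startswith("start:"):
        return line
    label, *tokens = line.split()
    edited = []
    for token in tokens:
        edited.append("9" if token == "3" else token)
        if token == "4":
            edited.append("5")  # insert "5" right after "4"
    return " ".join([label, *edited]) + "\n"
```

Running each line of a.txt through `edit_start_line` and writing the results back gives the edited file; the "stop:" line is returned unchanged.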
Hi,
I need to put together a script to edit the DHCP service configuration file dhcpd.conf.
MAC addresses are defined in classes, e.g.:
class "HOST1" { match if substring (hardware, 1,18)=00:11:11:FF:FF:FF;}
class "HOST2" ...
class "HOST3" ...
...
followed by allow or deny statements:... (4 Replies)