10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
I have a large file (1.5 GB) and want to sort it.
I used the following awk script to do the job:
!x++
The script works but it is very slow and takes over an hour to do the job. I suspect this is because the file is not sorted.
Any solution to speed up the awk script or a Perl script would... (4 Replies)
Discussion started by: gimley
4 Replies
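The one-liner quoted in this excerpt appears to have lost its array subscript. A minimal sketch of the usual idiom follows, assuming the goal is to drop duplicate lines (the excerpt is truncated, so that is an inference). The function names `dedupe_keep_order` and `dedupe_sorted` are made up for illustration.

```shell
# Keep the first occurrence of every line and preserve the input order.
# awk holds every distinct line in memory, which can hurt on very large files.
dedupe_keep_order() {
  awk '!seen[$0]++' "$@"
}

# Alternative when the output order does not matter: sort and drop
# duplicates in one pass. LC_ALL=C compares raw bytes instead of applying
# locale collation rules.
dedupe_sorted() {
  LC_ALL=C sort -u "$@"
}
```

Both functions read the named files, or standard input if none are given, e.g. `dedupe_sorted big.txt > big.uniq`. Which one is faster on a 1.5 GB file depends on the available memory and on how many distinct lines there are, so it is worth timing both.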
2. Shell Programming and Scripting
Hello,
I have a script which removes duplicates in a database with a single delimiter
=
The script is given below:
# script to remove dupes from a row with structure word=word
BEGIN{FS="="}
{for(i=1;i<=NF;i++){a++;}for(i in a){b=b"="i}{sub("=","",b);$0=b;b="";delete a}}1
How do I modify... (6 Replies)
Discussion started by: gimley
6 Replies
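The script quoted in this excerpt rebuilds each row with `for (i in a)`, which visits array keys in an unspecified order. A minimal sketch that removes repeated fields from a `word=word` row while keeping the original field order could look like this. It is not the poster's script, and the name `dedupe_fields` is made up for illustration.

```shell
# Remove repeated fields from each "="-delimited row, keeping the first
# occurrence of every field in its original position.
dedupe_fields() {
  awk 'BEGIN { FS = OFS = "=" }
  {
    out = ""; kept = 0
    split("", seen)                    # portable way to empty the array
    for (i = 1; i <= NF; i++)
      if (!seen[$i]++)                 # true only the first time a field is seen
        out = (kept++ ? out OFS : "") $i
    print out
  }' "$@"
}
```

For example, `printf 'John=Johann=Jon=John\n' | dedupe_fields` prints `John=Johann=Jon`. Fields are compared exactly, so variants that differ only in case or surrounding spaces are treated as distinct.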
3. Shell Programming and Scripting
Dear all,
I have a large dictionary database which has the following structure
source word=target word
e.g.
book=livre
Since the database is very large, in spite of all the care taken the source word is sometimes repeated
e.g.
book=livre
book=tome
Since I want to... (7 Replies)
Discussion started by: gimley
7 Replies
4. Shell Programming and Scripting
Hello,
I have a database of name variants with the following structure:
variant=variant=variant
The number of variants can be as many as thirty to forty.
Since the database is quite large (at present around 60,000 lines), duplicate sets of variants creep in. Thus
John=Johann=Jon
and... (2 Replies)
Discussion started by: gimley
2 Replies
5. Shell Programming and Scripting
Hello,
I have a very large dictionary file in text format which contains a large number of sub-sections. Each sub-section starts with the following header:
#DATA
#VALID 1
and ends with a footer as shown below
#END
The data between the Header and the Footer consists of... (6 Replies)
Discussion started by: gimley
6 Replies
6. Shell Programming and Scripting
I am working on a homonym dictionary of names i.e. names which are clustered together according to their “sound-alike” pronunciation:
An example will make this clear:
Since the dictionary is manually constructed it often happens that inadvertently two sets of “homonyms” which should be grouped... (2 Replies)
Discussion started by: gimley
2 Replies
7. Shell Programming and Scripting
Hello,
I have a large database in which name homonyms are arranged in a row. Since the database is large and generated by hand, dupes very often creep in. I want to remove the dupes using either an awk or a Perl script.
An input is given below
The expected output is given below:
As can be... (2 Replies)
Discussion started by: gimley
2 Replies
8. Shell Programming and Scripting
Hi,
I have the following command in place
nawk -F, '!a++' file > file.uniq
It has been working perfectly as per requirements, removing duplicates by considering only the first 3 fields. Recently it has started giving the error below:
bash-3.2$ nawk -F, '!a++'... (17 Replies)
Discussion started by: makn
17 Replies
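The command quoted in this excerpt seems to have lost its array subscript. Going only by the description (duplicates decided on the first 3 comma-separated fields), a minimal sketch might be the following. The key `$1,$2,$3`, the use of plain `awk` instead of `nawk`, and the name `dedupe_by_first3` are all assumptions rather than the poster's actual command.

```shell
# Keep the first line for every distinct combination of fields 1-3
# in comma-separated input; later lines with the same key are dropped.
dedupe_by_first3() {
  awk -F, '!seen[$1,$2,$3]++' "$@"
}
```

Usage would be along the lines of `dedupe_by_first3 file > file.uniq`. The array grows with the number of distinct keys, so memory use rises with the size of the input.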
9. Shell Programming and Scripting
I am compiling a synonym dictionary which has the following structure
Headword=Synonym1,Synonym2 and so on, with each synonym separated by a comma.
As is usual in such cases, manual preparation of synonyms leads to repeated synonyms, which results in dupes as in the example below:... (3 Replies)
Discussion started by: gimley
3 Replies
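For the `Headword=Synonym1,Synonym2` layout described in this excerpt, one way to drop repeated synonyms within a line is sketched below. The poster's example is cut off, so the exact expected output is an assumption, and the name `dedupe_synonyms` is made up for illustration.

```shell
# For each "Headword=Syn1,Syn2,..." line, keep only the first occurrence
# of every synonym and leave the headword untouched.
dedupe_synonyms() {
  awk '{
    eq = index($0, "=")
    if (eq == 0) { print; next }       # no "=" on the line: pass it through
    head = substr($0, 1, eq - 1)
    n = split(substr($0, eq + 1), syn, ",")
    split("", seen)                    # empty the lookup array for this line
    out = ""; kept = 0
    for (i = 1; i <= n; i++)
      if (!seen[syn[i]]++)
        out = (kept++ ? out "," : "") syn[i]
    print head "=" out
  }' "$@"
}
```

Synonyms are compared exactly as written, so `glad` and ` glad` (with a leading space) would both be kept. Trimming whitespace before the comparison would be an easy extension.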
10. Shell Programming and Scripting
Hello,
I have two files. File1 or the master file contains two columns separated by a delimiter:
a=b
b=d
e=f
g=h
File 2, which is the file to be processed, has only a single column:
a
h
c
b
What I need is an awk script to identify unique names from file 2 which are not found in the... (6 Replies)
Discussion started by: gimley
6 Replies
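The request in this excerpt is cut off, so the following is only a sketch under two assumptions: that "not found" refers to the master file, and that a name counts as found if it appears in either column of the master. The name `names_not_in_master` is made up for illustration.

```shell
# $1 = master file with "left=right" pairs, $2 = file with one name per line.
# Prints each name from $2 once if it occurs in neither column of $1.
# Note: the NR == FNR test assumes the master file is not empty.
names_not_in_master() {
  awk -F= 'NR == FNR { known[$1]; known[$2]; next }
           !($0 in known) && !printed[$0]++' "$1" "$2"
}
```

With the sample data from the post, `names_not_in_master File1 File2` prints only `c`. If only the first column of the master should count, removing `known[$2];` changes the result to `h` and `c`.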