03-28-2012
Thanks ShamRock.
It works on the test file I posted.
However, when I tried it on my actual file of around 5.3 million rows, it stripped out 600K rows. That looks wrong, because when I load the file into my database, it complains about only 3 rows. So ideally the difference between the original file and the new file (created by redirecting the awk output) should be 3 rows. Those 3 rows are stripped out, but I am not sure why the other rows were stripped out as well. I checked a few of them, and they had no duplicates in the original file.
I might be missing something, which I am investigating now. But can you explain your "awk" script? And if I have to add one more field to the check, how do I do that in the awk script?
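For reference while investigating, here is a minimal sketch of the usual awk idiom for this kind of de-duplication. It is not ShamRock's script, which is not quoted in this post. The "|" delimiter, the key columns, and the function name dedupe_by_key are all assumptions made for illustration. The short form is awk -F'|' '!seen[$1,$2]++' file: seen[$1,$2]++ evaluates to 0 the first time a key appears, so the ! makes the pattern true and the row is printed, while any later row with the same key is skipped. Adding a field to the check means adding it to the key, for example seen[$1,$2,$3]. The longer version below does the same thing but also reports each dropped row along with the line it duplicates, which may help explain an unexpectedly large number of removed rows.

```shell
# Sketch only. The delimiter and key columns are assumptions, not taken
# from the script discussed in this thread.

# dedupe_by_key DELIM COLS FILE
#   DELIM  single-character field delimiter, e.g. "|"
#   COLS   space-separated column numbers that form the key, e.g. "1 2"
# Prints the first row seen for each key to stdout. Rows dropped as
# duplicates go to stderr, prefixed with their own line number and the
# line number of the row they duplicate.
dedupe_by_key() {
    awk -F"$1" -v cols="$2" '
        BEGIN { n = split(cols, col, " ") }
        {
            # Join the key columns with SUBSEP (the separator awk itself
            # uses for seen[$1,$2]) so "a","bc" stays distinct from "ab","c".
            key = ""
            for (i = 1; i <= n; i++)
                key = key SUBSEP $(col[i])

            if (key in first) {
                # Key already seen: report the row instead of printing it.
                printf "%d (dup of %d): %s\n", NR, first[key], $0 > "/dev/stderr"
                next
            }
            first[key] = NR   # remember where this key first appeared
            print
        }
    ' "$3"
}
```

Called as dedupe_by_key "|" "1 2" input.txt > deduped.txt 2> dropped.txt, it writes the kept rows and the dropped rows to separate files; passing "1 2 3" instead adds the third field to the check. A wrong delimiter or wrong column numbers would make unrelated rows share a key, which is one possible cause of too many rows being removed.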
IGAWK(1) Utility Commands IGAWK(1)
NAME
igawk - gawk with include files
SYNOPSIS
igawk [ all gawk options ] -f program-file [ -- ] file ...
igawk [ all gawk options ] [ -- ] program-text file ...
DESCRIPTION
Igawk is a simple shell script that adds the ability to have ``include files'' to gawk(1).
AWK programs for igawk are the same as for gawk, except that, in addition, you may have lines like
@include getopt.awk
in your program to include the file getopt.awk from either the current directory or one of the other directories in the search path.
OPTIONS
See gawk(1) for a full description of the AWK language and the options that gawk supports.
EXAMPLES
cat << EOF > test.awk
@include getopt.awk
BEGIN {
    while (getopt(ARGC, ARGV, "am:q") != -1)
        ...
}
EOF
igawk -f test.awk
SEE ALSO
gawk(1)
Effective AWK Programming, Edition 1.0, published by the Free Software Foundation, 1995.
AUTHOR
Arnold Robbins (arnold@skeeve.com).
Free Software Foundation Nov 3 1999 IGAWK(1)