Delete Duplicate line (not really) from the file
Post 302614755 by GosarJunk on Wednesday 28th of March 2012 07:53:39 PM
Thanks ShamRock.

It works on the test file I posted.

However, when I tried it on my actual file of around 5.3 million rows, it stripped out 600K rows, which looks wrong: when I load the original file into my database, it complains about only 3 rows. So ideally the difference between the original file and the new file (created by redirecting the awk output) should be 3. Those 3 rows are stripped out, but I am not sure why the other rows were removed. I spot-checked a few of them and found no duplicates for them in the original file.
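One way to narrow this down is to list every row a key-based dedup would drop, next to the earlier row it collided with. This is only a sketch under assumptions that are not from the thread: the input is comma-delimited, the key is fields 1 and 2, and `show_dropped` is a hypothetical helper name. Adjust `-F` and the key fields to match the real file.

```shell
# Hypothetical helper: print each row that a key-based dedup would drop,
# followed by the earlier row that already holds the same key.
# Assumptions (not from the thread): comma-delimited input, key = fields 1 and 2.
show_dropped() {
    awk -F',' '
        {
            key = $1 SUBSEP $2              # composite key from two fields
            if (key in first) {
                # This key was seen before, so a dedup would drop this row.
                printf "dropped line %d: %s\n", NR, $0
                printf "  kept line %d: %s\n", first_nr[key], first[key]
            } else {
                # First occurrence: remember the row and its line number.
                first[key] = $0
                first_nr[key] = NR
            }
        }
    ' "$1"
}
```

Running it as `show_dropped actual_file.txt | head` (file name is a placeholder) should show quickly whether the dropped rows really share a key with an earlier row. Note that it keeps the first row for every key in memory, which may be heavy on a file of several million rows.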

I might be missing something, which I am investigating now. But can you explain your awk script? And if I have to add one more field to the check, how do I do that in the awk script?
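ShamRock's script is not quoted in this post, so what follows is only the common awk de-duplication idiom, not necessarily what was suggested. It is a minimal sketch assuming comma-delimited input; `dedup_by_fields` is a hypothetical wrapper that takes the key columns as a list, so adding one more field to the check means adding one more number.

```shell
# Hypothetical wrapper around the usual "!seen[key]++" idiom.
# Assumptions (not from the thread): comma-delimited input; key columns are
# given as a comma-separated list such as "1,2" or "1,2,5".
dedup_by_fields() {
    # $1 = list of key field numbers, $2 = input file
    awk -F',' -v cols="$1" '
        BEGIN { n = split(cols, c, ",") }   # c[1..n] hold the key field numbers
        {
            key = ""
            for (i = 1; i <= n; i++) key = key SUBSEP $(c[i])
            # seen[key]++ evaluates to 0 the first time a key appears, so the
            # negation is true and the row is printed; every later row with
            # the same key evaluates to false and is skipped.
            if (!seen[key]++) print
        }
    ' "$2"
}

# Fixed-key one-liner equivalents:
#   awk -F',' '!seen[$1,$2]++'    input.txt   # key = fields 1 and 2
#   awk -F',' '!seen[$1,$2,$5]++' input.txt   # same check with field 5 added
```

One possible cause of far too many rows being removed is a field separator that does not match the file: if `-F` is wrong, the key fields come out empty or truncated and many unrelated rows end up sharing the same key.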
 
