Hi All,
This is not a class assignment. I would like to know how to write an awk script that lists all the duplicate names in a file; have a look below:
Sl No Name Dt of birth Location
1 aaa 1/01/1975 delhi
2 bbb 2/03/1977 mumbai
3 aaa 1/01/1976 mumbai
4 bbb 2/03/1975 chennai
5 aaa 1/01/1975 kolkatta
6 bbb 2/03/1977 bangalore
What I would like is: if the DOB is the same and the name is the same, then print all the details. I tried the command "uniq -D" in the awk script but could not succeed.
With thanks in advance for guidance!
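A minimal awk sketch for this (assuming the file is whitespace-separated as shown, with the name in field 2 and the DOB in field 3, in a file named data.txt): buffer each line under its name+DOB key, then print only the groups that occur more than once.

```shell
awk '
NR == 1 { print; next }                 # keep the header line
{
    key = $2 SUBSEP $3                  # Name + DOB as the lookup key
    count[key]++
    lines[key] = lines[key] $0 ORS      # remember every line for this key
}
END {
    for (k in count)
        if (count[k] > 1)
            printf "%s", lines[k]       # print each duplicated group
}' data.txt
```

Because lines are buffered per key, each duplicate group comes out together; the order of the groups depends on awk's internal hash order, so pipe the result through sort if a stable order matters.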
When I read the question, I had in mind a solution using arrays like that of vgersh99.
Then I tried to see whether it was easy to do without arrays, and that is the solution I posted.
vgersh99's solution is simpler and more readable.
I wanted to compare the performance of the two solutions on a large volume of data.
To do that, I adapted both solutions to count the number of duplicated files on my system.
I built a file containing the list of all the files (field 1: directory path; field 2: file name).
The resulting file contains approximately 64,000 duplicate files.
The solution with arrays:
The solution without arrays:
The -T option of the sort command was required because there wasn't sufficient space for work files on the current filesystem.
In fact, the sort alone takes more time to run than the complete solution with arrays.
Conclusion: the arrays win the contest.
Awk's arrays are your friends.
They are easy to use and powerful.
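For reference, a sketch of the array-based counting idea described above (the file name filelist.txt and its layout, directory in field 1 and file name in field 2, are assumptions based on the description):

```shell
# Count how many entries share a file name with at least one other entry.
awk '
{ count[$2]++ }                        # tally occurrences of each file name
END {
    dups = 0
    for (f in count)
        if (count[f] > 1)
            dups += count[f]           # every member of a duplicated group
    print dups
}' filelist.txt
```

A single pass and a single array is all it takes, which is why it beats the sort-based pipeline so clearly.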
I have a job that produces a file of barcodes, appending to it every time the job runs.
I want to check whether a barcode is already in the list and report it if it is. (3 Replies)
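One way to sketch this with grep (the file names barcodes.txt and new.txt are assumptions; one barcode per line): report the barcodes from the new batch that are already in the list, and append only the genuinely new ones.

```shell
# -x matches whole lines, -F treats barcodes as fixed strings,
# -f reads the existing list as the set of patterns to match.
grep -xFf barcodes.txt new.txt  > already_seen.txt || true
grep -vxFf barcodes.txt new.txt > fresh.txt        || true
cat fresh.txt >> barcodes.txt       # append only unseen barcodes
cat already_seen.txt                # report the duplicates
```

The `|| true` guards keep the script going when grep finds no matches (grep exits non-zero in that case).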
Dear folks,
I have a map file of around 54K lines, and some of the values in the second column occur more than once. I want to find those values and delete every occurrence of them. I looked over the usual duplicate-handling commands, but my case is not to keep one of the duplicate values. I want to remove all of the same... (4 Replies)
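A two-pass awk sketch for this "keep none of the duplicates" case (the file name map.txt is an assumption): the first pass counts each column-2 value, the second pass keeps only rows whose value occurred exactly once.

```shell
# The same file is named twice; NR == FNR is true only during the first pass.
awk 'NR == FNR { count[$2]++; next } count[$2] == 1' map.txt map.txt
```

Unlike `uniq`, this drops every row of a duplicated value, not just the repeats, and it does not require the file to be sorted.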
Hello,
I have a huge directory (with millions of files) and need to find duplicates based on BOTH file name and file size.
I know about fdupes, but it calculates MD5 checksums, which is very time-consuming; with millions of files it takes forever.
Can anyone please suggest a script or... (7 Replies)
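A sketch that avoids hashing file contents entirely, grouping only by (basename, size). It assumes GNU find, whose -printf supports %f (basename), %s (size in bytes), and %p (full path):

```shell
find . -type f -printf '%f\t%s\t%p\n' |
awk -F'\t' '
{
    key = $1 SUBSEP $2                 # name + size as the grouping key
    count[key]++
    paths[key] = paths[key] $3 ORS     # collect the paths for this key
}
END {
    for (k in count)
        if (count[k] > 1)
            printf "%s", paths[k]      # print each group of candidate dupes
}'
```

Note this finds candidates, not proven duplicates: two different files can share a name and size. It is cheap enough, though, that you could run a checksum only on the small candidate set afterwards.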
Gents,
I need to delete duplicate values and keep only unique values based on columns 2-27.
We should always keep the last record found...
I need to store one clean file and another file with the removed duplicate values.
Input:
S3033.0 7305.01 0 420123.8... (18 Replies)
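Since the sample data is truncated, here is a simplified sketch using just field 2 as the key (standing in for the real columns 2-27); the file names in.txt and clean.txt are assumptions. It relies on GNU tac: reversing the file turns the usual "first seen wins" awk idiom into "last seen wins".

```shell
# Keep only the LAST record for each key value.
tac in.txt | awk '!seen[$2]++' | tac > clean.txt
```

To adapt it to the real key, replace `$2` with a concatenation of the wanted fields (e.g. `$2 SUBSEP $3 SUBSEP ...` up to column 27).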
Gents,
I have a file like this.
1 1
1 2
2 3
2 4
2 5
3 6
3 7
4 8
5 9
I would like to get something like this:
1 1 2
2 3 4 5
3 6 7
Thanks in advance for your support :b: (8 Replies)
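A short awk sketch for this grouping (assuming the input is in a file named in.txt): append every second-column value to a string indexed by the first column, then print one line per key.

```shell
# sort -n restores numeric key order, since awk's for-in loop
# visits array keys in arbitrary order.
awk '{ vals[$1] = vals[$1] " " $2 }
     END { for (k in vals) print k vals[k] }' in.txt | sort -n
```

Each `vals[$1]` string starts with a leading space, so printing `k vals[k]` yields exactly "key val val ...".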
Hi All,
I have a file like:
ID|Indiv_ID
12345|10001
|10001
|10001
23456|10002
|10002
|10002
|10002
|10003
|10004
If Indiv_ID has duplicate values and the corresponding ID column is null, then copy the ID.
I need output like:
ID|Indiv_ID
12345|10001... (11 Replies)
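A single-pass awk sketch (assuming the data is in a file named data.txt and, as in the sample, the non-blank ID always appears before its blank duplicates): remember the ID seen for each Indiv_ID and back-fill it into blank rows.

```shell
awk -F'|' '
BEGIN { OFS = "|" }
NR == 1 { print; next }                  # pass the header through
$1 != "" { id[$2] = $1 }                 # remember the ID for this Indiv_ID
$1 == "" && ($2 in id) { $1 = id[$2] }   # back-fill a blank ID
{ print }' data.txt
```

Rows like |10003 and |10004, which never get a non-blank ID, are left unchanged. If blanks could precede the filled row, run two passes over the file (collect ids first, then print) instead.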
Hi,
In a file, I have to mark duplicate records as 'D' and the latest record as 'C'.
In the file below, I have to identify whether duplicate records exist based on Man_ID, Man_DT, and Ship_ID, and mark the record with the latest Ship_DT as "C" and the others as "D" (I have to create... (7 Replies)
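Since the sample data is truncated, here is a two-pass awk sketch under assumed conditions: $1=Man_ID, $2=Man_DT, $3=Ship_ID, $4=Ship_DT, with dates in a sortable form such as YYYYMMDD, and the data in data.txt. The first pass finds the latest Ship_DT per key; the second flags each row.

```shell
awk '
NR == FNR { k = $1 FS $2 FS $3           # first pass: track max Ship_DT
            if ($4 > max[k]) max[k] = $4
            next }
          { k = $1 FS $2 FS $3           # second pass: flag each record
            print $0, ($4 == max[k] ? "C" : "D") }
' data.txt data.txt
```

One caveat: if the latest Ship_DT occurs twice within a key, both rows get "C"; add a tie-breaker if only one "C" per key is allowed.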
EDIT: This is for Perl.
@data2 = grep(/$data/, @list_now);
This gives me @data2 as:
Printing data2 11 testzone1 running /zones/testzone1 ***-*****-****-*****-***** native shared
But I really can't access @data2 by its individual elements.
$data2 is the entire list, while $data,2,3...... (1 Reply)
I have a list which contains all the jar files shipped with the product I am involved with. In this list, some jar files appear again and again, but they are present in different folders.
My input file looks like this:
/path/1/to a.jar
/path/2/to a.jar
/path/1/to... (10 Replies)
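An awk sketch for this layout (assuming each line is "<path> <jarname>" as shown, in a file named jars.txt): group lines by the jar name in field 2 and print only the names that occur more than once, with all their paths.

```shell
awk '
{ count[$2]++; lines[$2] = lines[$2] $0 ORS }   # bucket lines per jar name
END {
    for (j in count)
        if (count[j] > 1)
            printf "%s", lines[j]               # print each duplicated jar
}' jars.txt
```

Each duplicated jar comes out as a contiguous block of its paths, which makes it easy to eyeball which folders ship the same jar.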