Duplicate records


 
# 1  
Old 08-19-2016
Duplicate records

Gents,

I have a file which contains duplicate records in column 1, but the values in column 2 are different.

Code:
3099753489 3
3099753489 5
3101954341 12
3101954341 14
3102153285 3
3102153285 5
3102153297 3
3102153297 5

I would like to get the following output:

Code:
3099753489 3 5
3101954341 12 14
3102153285 3 5
3102153297 3 5

I am trying with this code, but it does not work.
Code:
awk '{D[$1]} {key[$1,$2]++}
     END {
       for (i in key) {
         split(i, T, SUBSEP)
         print T[1], key[i], T[2]
       }
     }' file

Please, can you help me?
# 2  
Old 08-19-2016
Is your input ALWAYS two records per key? In sequence?
# 3  
Old 08-19-2016
Hello jiam912,

Considering that your Input_file always has 2 fields, the following may help.
I- If you are not worried about keeping the output in the same order as the Input_file:
Code:
awk '{A[$1]=A[$1]?A[$1] OFS $2:$2} END{for(i in A){print i OFS A[i]}}'  Input_file

Output will be as follows.
Code:
3099753489 3 5
3102153285 3 5
3101954341 12 14
3102153297 3 5
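
Spelled out as a multi-line script, the same logic reads roughly as follows (the file name collapse.awk is only an example):
Code:
# collapse.awk - gather every column-2 value under its column-1 key
{
    # append $2 to whatever is already stored for this key;
    # the ternary avoids a leading separator before the first value
    A[$1] = A[$1] ? A[$1] OFS $2 : $2
}
END {
    # "for (i in A)" visits the keys in no particular order
    for (i in A)
        print i OFS A[i]
}

Run it as awk -f collapse.awk Input_file.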

II- If you need the output in the same order as the Input_file, the following may help:
Code:
awk 'FNR==NR{A[$1]=A[$1]?A[$1] OFS $2:$2;next} ($1 in A){print $1 OFS A[$1];delete A[$1]}'  Input_file  Input_file

Output will be as follows.
Code:
3099753489 3 5
3101954341 12 14
3102153285 3 5
3102153297 3 5
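
If the data arrives on a pipe and cannot be read twice, a single-pass variant along the same lines can remember the order in which each key first appears (a sketch, only checked against the sample above):
Code:
awk '!($1 in A){ord[++n]=$1} {A[$1]=A[$1]?A[$1] OFS $2:$2} END{for(j=1;j<=n;j++) print ord[j] OFS A[ord[j]]}'  Input_file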

Thanks,
R. Singh
# 4  
Old 08-19-2016
Hi RudiC.

There can sometimes be more than 2 records in sequence.


Hi RavinderSingh13.

Thanks a lot
# 5  
Old 08-19-2016
Quote:
Originally Posted by jiam912
Hi RudiC.
There can sometimes be more than 2 records in sequence.
Hello jiam912,

In case you have more than 2 fields in your Input_file, then the following may help. Let's say the following is the Input_file.
Code:
cat Input_file
3099753489 3
3099753489 5
3101954341 12 21 31  34 56 78
3101954341 14
3102153285 3
3102153285 5
3102153297 3
3102153297 5

Then the following is the code for it.
Code:
awk 'FNR==NR{Q="";if(NF>2){for(i=2;i<=NF;i++){Q=Q?Q OFS $i:$i}} else {Q=$2};A[$1]=A[$1]?A[$1] OFS Q:Q;next} ($1 in A){print $1 OFS A[$1];delete A[$1]}'  Input_file  Input_file

Output will be as follows.
Code:
3099753489 3 5
3101954341 12 21 31 34 56 78 14
3102153285 3 5
3102153297 3 5

Thanks,
R. Singh
# 6  
Old 08-19-2016
Try also
Code:
awk 'NR == 1 || $1 != LAST {printf "%s%s", NR==1?"":RS, LAST = $1} {printf " %s", $2} END {print _} ' file
3099753489 3 5
3101954341 12 14
3102153285 3 5
3102153297 3 5
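
This relies on the duplicate keys being adjacent, as they are in the sample (sorted on column 1). Spelled out with comments, the idea is roughly (a commented sketch of the one-liner, not a separate solution):
Code:
NR == 1 || $1 != LAST {
    # the key changed: close the previous output line (no newline before
    # the very first one) and start a new line with the key; the assignment
    # LAST = $1 also supplies the value that printf prints
    printf "%s%s", (NR == 1 ? "" : RS), LAST = $1
}
{
    # append this record's value to the current output line
    printf " %s", $2
}
END {
    # print an unset variable just to emit the final newline
    print _
}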

# 7  
Old 08-19-2016
Dear R. Singh

Thanks a lot

And for this case?

Input:

Code:
3099753489 3
3099753489 5
3099753489 7
3101954341 12
3101954341 14
3102153285 3
3102153285 5
3102153297 3
3102153297 5
3102153297 8

Desired output:

Code:
3099753489 3 5 7
3101954341 12 14
3102153285 3 5
3102153297 3 5 8
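
For what it is worth, the two-pass one-liner from solution II in post #3 should already cover this case, since it simply keeps appending column 2 for every occurrence of a key:
Code:
awk 'FNR==NR{A[$1]=A[$1]?A[$1] OFS $2:$2;next} ($1 in A){print $1 OFS A[$1];delete A[$1]}'  Input_file  Input_file

Running it on the input above should reproduce the desired output shown.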
