concatenate all duplicate line in a file.


 
# 22  
Old 08-28-2008
TIMTOWTDI...
Code:
awk -F\| '{
   if (k[$1])
       k[$1] = sprintf("%s|%s|%s", k[$1], $2, $3)
   else
       k[$1] = sprintf("%s|%s|%s", $1, $2, $3)
} END {
   for (i in k)
       print k[i]
}' file

# 23  
Old 08-28-2008
I am very new to shell scripting, and I am facing a big problem parsing a flat file.

My input file format is as shown below:



794051400123|COM|21|0|BD|R|99.98
794051413727|COM|11|0|BD|R|28.99
794051415622|COM|22|0|BD|R|28.99
883929004676|COM|33|0|BD|R|28.99
794051400123|MOM|24|0|BD|R|99.98
794051413727|MOM|11|0|BD|R|28.99
794051415622|MOM|23|0|BD|R|28.99
883929004676|MOM|01|0|BD|R|28.99
794051400123|RNO|50|0|BD|R|99.98


Currently the file contains duplicate values in the first field.
What I want is for the first field to be unique on each line, with
the other fields carried along as well.


My desired output format is as shown below:

794051400123,BD,R,99.98,COM=21,MOM=24,RNO=50

I am using the following piece of code for this purpose:

awk -F\| '{
   if (k[$1])
       k[$1] = sprintf("%s,%s=%s", k[$1],$2,$3)
   else
       k[$1] = sprintf("%s,%s=%s", $1,$2,$3)
} END {
   for (i in k)
       print k[i]
}' input.txt > out.txt

exit 0

but it is not working for me. Please help me.
# 24  
Old 08-28-2008
The script only captures the first, second, and third fields. You need to copy all the fields you want in the output.

Code:
awk -F '|' '{ k[$1] = (k[$1] ? k[$1] : $1 "," $4 "," $5 "," $6 "," $7) "," $2 "=" $3 }
END { for (i in k) print k[i] }' file


Last edited by era; 08-28-2008 at 05:44 PM.. Reason: Include reworked version of script
# 25  
Old 08-28-2008
era's code works too, though it has one extra field ($4) in it.
Code:
awk -F\| '{
   if (k[$1])
       k[$1] = sprintf("%s,%s=%s",k[$1],$2,$3)
   else
       k[$1] = sprintf("%s,%s,%s,%s,%s=%s",$1,$5,$6,$7,$2,$3)
} END {
   for (i in k) print k[i]
}' file
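
For illustration, here is a small sketch that runs both variants on the three sample lines for key 794051400123. Piping the data in with printf instead of reading it from a file is an assumption made to keep the example self-contained, and the `sample` function name is made up. The extra field in the one-liner from post #24 is $4, the 0 column.

```shell
# Hypothetical helper: emit three sample lines that share one key.
sample() {
    printf '%s\n' \
      '794051400123|COM|21|0|BD|R|99.98' \
      '794051400123|MOM|24|0|BD|R|99.98' \
      '794051400123|RNO|50|0|BD|R|99.98'
}

# Post #24 variant: the prefix includes $4.
with_4=$(sample | awk -F '|' '{ k[$1] = (k[$1] ? k[$1] : $1 "," $4 "," $5 "," $6 "," $7) "," $2 "=" $3 }
END { for (i in k) print k[i] }')

# Post #25 variant: the prefix skips $4.
without_4=$(sample | awk -F '|' '{
   if (k[$1])
       k[$1] = sprintf("%s,%s=%s", k[$1], $2, $3)
   else
       k[$1] = sprintf("%s,%s,%s,%s,%s=%s", $1, $5, $6, $7, $2, $3)
} END { for (i in k) print k[i] }')

printf '%s\n%s\n' "$with_4" "$without_4"
# 794051400123,0,BD,R,99.98,COM=21,MOM=24,RNO=50
# 794051400123,BD,R,99.98,COM=21,MOM=24,RNO=50
```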

# 26  
Old 08-28-2008
I have changed my code:

awk -F\| '{
   if (k[$1])
       k[$1] = sprintf("%s,%s,%s,%s,%s=%s",k[$1],$5,$6,$7,$2,$3)
   else
       k[$1] = sprintf("%s,%s,%s,%s,%s=%s", $1,$5,$6,$7,$2,$3)
} END {
   for (i in k)
       print k[i]
}'

I got this output:

794051400123,BD,R,99.98^M,COM=21,BD,R,99.98^M,MOM=24

Also, some junk characters appear after 14.99.
# 27  
Old 08-28-2008
I told you before you should check for DOS carriage returns. Those are the reason the earlier script didn't work correctly.

Shamrock's code was correct for the k[$1] case; you added too many fields there. You should only be adding $2 and $3 from the duplicate lines.
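
A minimal sketch that combines both points, assuming the input is the pipe-delimited sample from post #23 with DOS (CRLF) line endings. The data is piped in with printf to keep the example self-contained, and the `order` array and `n` counter are illustrative additions, not something from the earlier posts. Each record has a trailing carriage return removed, the key,$5,$6,$7 prefix is built only the first time a key is seen, and every line then appends only its $2=$3 pair.

```shell
result=$(printf '%s\r\n' \
  '794051400123|COM|21|0|BD|R|99.98' \
  '794051413727|COM|11|0|BD|R|28.99' \
  '794051400123|MOM|24|0|BD|R|99.98' \
  '794051413727|MOM|11|0|BD|R|28.99' \
  '794051400123|RNO|50|0|BD|R|99.98' |
awk -F '|' '
{
    sub(/\r$/, "")                  # drop a trailing carriage return, if any
    if (!($1 in k)) {               # first time this key is seen
        order[++n] = $1             # remember first-seen order; for-in order is unspecified
        k[$1] = $1 "," $5 "," $6 "," $7
    }
    k[$1] = k[$1] "," $2 "=" $3     # every line contributes only NAME=VALUE
}
END {
    for (j = 1; j <= n; j++) print k[order[j]]
}')

printf '%s\n' "$result"
# 794051400123,BD,R,99.98,COM=21,MOM=24,RNO=50
# 794051413727,BD,R,28.99,COM=11,MOM=11
```

With a real file, replace the printf pipeline with the file name as the last argument to awk.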

Last edited by era; 08-28-2008 at 06:15 PM.. Reason: Actually it was only on the previous screen five hours ago
# 28  
Old 08-28-2008
Are you going back and forth between UNIX and DOS? The Ctrl-M (^M) characters in your output indicate DOS line endings (CRLF) in a file that is being processed on UNIX.
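
One way to check for them, sketched under the assumption that a POSIX shell with tr and wc is available. The `count_cr` helper name is made up for illustration. It counts carriage-return bytes on standard input, before and after stripping them with tr -d.

```shell
# Hypothetical helper: count carriage-return bytes on stdin; 0 means UNIX line endings.
count_cr() {
    tr -cd '\r' | wc -c | tr -d ' '
}

# Two CRLF-terminated lines, checked before and after stripping the CRs.
dos_count=$(printf 'a|b\r\nc|d\r\n' | count_cr)
unix_count=$(printf 'a|b\r\nc|d\r\n' | tr -d '\r' | count_cr)

echo "CRs before: $dos_count, after: $unix_count"
# CRs before: 2, after: 0
```

On a real file the same idea would be `count_cr < input.txt`, followed by `tr -d '\r' < input.txt > input.unix.txt` if the count is not zero (the output file name is only an example).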