Identify duplicates and update the last 2 digits to 0 for both the Orig and Dup


 
# 1  
Old 06-14-2012
Hi,

I need to identify duplicates in a file based on the first 6 characters of each row (it is a fixed-width file, 12 characters per row). Whenever a duplicate row is found, the last 2 characters of both the original and the duplicate row should be set to 00 if they differ. In other words, the last 2 digits of the original and the duplicate must match; if they do not, default both to 00, otherwise keep them as is.


I thought of using
Code:
sort -u -k 1.1,1.6 inputfile

and then manipulating the output, but I am stuck.

Here is the sample input and output:

Code:
input:
1251233Y1234
1221249N8821
1231116Y9945
1231113Y2123
1231109Y3212
1231123N1214
1231126N1214

output should be:
Code:
1251233Y1234
1221249N8821
1231116Y9900
1231113Y2100
1231109Y3212
1231123N1214
1231126N1214 (Since last 2 digits are same nothing changed)

Any help in achieving the above result using either awk/sed will be greatly appreciated.

Thanks,
Faraway

Last edited by Scrutinizer; 06-14-2012 at 06:59 PM.. Reason: code tags instead of quote tags
# 2  
Old 06-15-2012
Code:
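# Sort on the 6-character key so duplicates become adjacent (this reorders the file).
sort -k1.1,1.6 inputfile | awk '
{
  # Same key as the previous line but different last 2 characters:
  # zero the last 2 characters of both the previous and the current line.
  if (substr($0,1,6) == substr(x,1,6) &&
        substr($0,11,2) != substr(x,11,2)) {
    sub(/..$/, "00", x)
    sub(/..$/, "00")
  }
  if (NR > 1) print x   # the previous line is final now, so print it
  x = $0
}
END { if (NR > 0) print x }'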

# 3  
Old 06-15-2012
Alternatively try:
Code:
awk '{k=substr($0,1,6)} NR==FNR{A[k]++; next} A[k]>1{sub(/..$/,"00")}1' input input

(Note: the input file is deliberately named twice, so awk reads it in two passes.)
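
As written, the one-liner above zeroes every row whose key occurs more than once, including pairs whose last 2 digits already match (the 123112 rows in the sample would become 1200 instead of staying 1214). A minimal sketch of a two-pass variant that zeroes a group only when its endings differ could look like this. The function name `zero_mismatched_tails` and the file name `sample_input.txt` are hypothetical, and the sketch treats all rows sharing a key as one group, which is an assumption about the requirement:

```shell
# Hypothetical sample file holding the input from the first post.
cat > sample_input.txt <<'EOF'
1251233Y1234
1221249N8821
1231116Y9945
1231113Y2123
1231109Y3212
1231123N1214
1231126N1214
EOF

# Hypothetical helper: reads the named file twice.
# Pass 1 records, per 6-character key, whether any row ends differently
# from the first row seen with that key.
# Pass 2 prints rows in their original order, zeroing the last 2
# characters only for keys flagged in pass 1.
zero_mismatched_tails() {
  awk '
  { key = substr($0, 1, 6); tail = substr($0, length($0) - 1) }
  NR == FNR {
      if (!(key in first)) first[key] = tail
      else if (tail != first[key]) differ[key] = 1
      next
  }
  (key in differ) { sub(/..$/, "00") }
  { print }
  ' "$1" "$1"
}

zero_mismatched_tails sample_input.txt
```

Unlike the sort-based approach, this keeps the rows in input order, at the cost of reading the file twice.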
# 4  
Old 06-16-2012
Code:
# awk '{x0=$0;;x=split($1,a,"");xlast=substr($1,x-1,2);x1=substr($1,1,6);if(x2==x1){if(xlast!=x2last){
if(zz<1){print substr(x00,1,x-2)"00" RS substr(x0,1,x-2)"00";z=0;zz++;}}else{zz=0;if(z==0)print x00;
if(z==1)print x00 RS x0;z=1}}else{if(z!=0)print x00;z=1;zz=0};x00=$0;x2last=substr(x00,x-1,2);x2=substr($1,1,6);}' infile
1251233Y1234
1221249N8821
1231116Y9900
1231113Y2100
1231109Y3212
1231123N1214
1231126N1214

Another test file:
Code:
# cat try2
1251233Y1234
1251234Y1235
1221249N8821
1231116Y9945
1231113Y2123
1231109Y3212
1231123N1214
1231126N1215
1231127N1216
1231128N1216
1231129N1218
12311X7N1217
12311X8N1217

Code:
# awk '{x0=$0;;x=split($1,a,"");xlast=substr($1,x-1,2);x1=substr($1,1,6);if(x2==x1){if(xlast!=x2last){
if(zz<1){print substr(x00,1,x-2)"00" RS substr(x0,1,x-2)"00";z=0;zz++;}}else{zz=0;if(z==0)print x00;
if(z==1)print x00 RS x0;z=1}}else{if(z!=0)print x00;z=1;zz=0};x00=$0;x2last=substr(x00,x-1,2);x2=substr($1,1,6);}' try2
1251233Y1200
1251234Y1200
1221249N8821
1231116Y9900
1231113Y2100
1231109Y3212
1231123N1200
1231126N1200
1231127N1216
1231128N1200
1231129N1200
12311X7N1217
12311X8N1217

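Note: the code compares each line only with the line immediately before it (a one-to-one check). A line that has already been printed unchanged is not revisited, so 1231127N1216 keeps its ending even though later rows with the same key are set to 00.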

regards
ygemici
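
For comparison, here is a sketch of a group-wise check (every row sharing a key is zeroed if any two endings in that group differ), written to read from standard input so it also works in a pipeline. The function name `zero_groupwise` is hypothetical, and whether group-wise or pairwise behaviour is wanted is an assumption the original question does not settle. It buffers the whole input in memory, so it suits small to medium files:

```shell
# Hypothetical helper: single pass over stdin, rows buffered in memory.
zero_groupwise() {
  awk '
  {
      line[NR] = $0                        # remember rows in input order
      key  = substr($0, 1, 6)
      tail = substr($0, length($0) - 1)    # last 2 characters
      if (!(key in first)) first[key] = tail
      else if (tail != first[key]) differ[key] = 1
  }
  END {
      for (i = 1; i <= NR; i++) {
          out = line[i]
          k = substr(out, 1, 6)
          if (k in differ) sub(/..$/, "00", out)
          print out
      }
  }'
}

# Usage with the try2 data shown above.
printf '%s\n' 1251233Y1234 1251234Y1235 1221249N8821 1231116Y9945 \
  1231113Y2123 1231109Y3212 1231123N1214 1231126N1215 1231127N1216 \
  1231128N1216 1231129N1218 12311X7N1217 12311X8N1217 | zero_groupwise
```

On the try2 data this sets all five 123112 rows to 00, whereas the pairwise output above leaves 1231127N1216 unchanged.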