Remove duplicates and update last 2 digits of the original row with 0's


 
# 1  
06-13-2012

Hi,

I need to remove duplicates from a file based on the first 8 characters of each row (it is a fixed-width file with 10-character rows). Whenever a duplicate row is found, the last 2 characters of the original row should be set to 0's.


I thought of using
Code:
sort -u -k 1.1,1.8 inputfile

but that only removes the duplicates; the row that survives keeps its original last 2 digits instead of having them set to 00.

Here is the sample input and the expected output:

Quote:
input:
1251233Y34
1221249N21
1231116Y45
1231116Y23
1231116N12

output should be:
1251233Y34
1221249N21
1231116Y00
1231116N12

Any help achieving the above result using either awk or sed would be greatly appreciated.

Thanks,
Faraway
# 2  
06-13-2012
Try:
Code:
awk '{a[substr($0,1,8)]++;b[substr($0,1,8)]=$0}END{for (i in a){if (a[i]>1) {print i"00"}else print b[i]}}' file

This User Gave Thanks to bartus11 For This Post:
# 3  
06-13-2012
awk

Hi,

Try this one,
Code:
awk '{k=substr($0,1,8);if(k in a){a[k]=k"00";next;}a[k]=$0;}END{for(i in a)print a[i];}' file

Cheers,
Ranga:-)
This User Gave Thanks to rangarasan For This Post:
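
Both awk one-liners above print from a `for (i in a)` loop in the END block, and awk does not guarantee the order of that traversal, so the rows may come out in a different order than the input. If input order matters, a minimal sketch along the same lines keeps a second array of keys in first-seen order. The function name `dedupe_zero` and the array names `first` and `order` are made up for illustration:

```shell
# Hypothetical helper: reads fixed-width rows from stdin or from the files given as arguments.
dedupe_zero() {
    awk '
    {
        key = substr($0, 1, 8)      # first 8 characters identify a record
        if (key in first) {         # duplicate: zero out the stored original
            first[key] = key "00"
            next
        }
        first[key] = $0             # first time this key is seen
        order[++n] = key            # remember first-seen order
    }
    END {
        for (i = 1; i <= n; i++)    # print in input order, not hash order
            print first[order[i]]
    }' "$@"
}
```

Run it as `dedupe_zero inputfile`. On the sample input from post #1 it prints the four rows in the same order as the expected output.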
# 4  
06-13-2012
Assuming that whitespace does not occur in those 10 characters:
Code:
sed 's/\(.*\)\(..\)/\2 \1/' file | sort -k2,2 | uniq -cf1 | awk '$1>1 {$2="00"} {print $3$2}'

Regards,
Alister

Last edited by alister; 06-14-2012 at 09:55 PM.. Reason: original suggestions were incorrect
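
For anyone reading the pipeline above, here is the same command split into one stage per line with comments. It is only a sketch for reading purposes; the wrapper function name `zero_dups_sorted` is illustrative and not part of the original post. Because of the `sort` stage, rows come out ordered by their 8-character key rather than in input order.

```shell
# Hypothetical wrapper around the sed | sort | uniq | awk pipeline, one stage per line.
zero_dups_sorted() {
    sed 's/\(.*\)\(..\)/\2 \1/' "$@" |  # "1231116Y45" -> "45 1231116Y": suffix first, key second
    sort -k2,2 |                         # bring rows with the same key next to each other
    uniq -c -f1 |                        # ignore the suffix field and count rows per key
    awk '$1 > 1 { $2 = "00" } { print $3 $2 }'  # count > 1: suffix becomes 00; print key + suffix
}
```

Run it as `zero_dups_sorted file`. As the post says, this assumes no whitespace inside the 10 characters, since the later stages split fields on blanks.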
# 5  
06-14-2012
Thank you, it works as expected.
