Delete duplicated fields in a line


 
# 8  
Old 03-17-2014
Thanks Scrutinizer, it works!

Can you explain what the code means?
# 9  
Old 03-17-2014
Code:
awk '{delete a; delete b; for(i = 1; i <= NF; i++) {a[i] = $i; b[$i]++}; for(i = 1; i <= length(a); i++) {if(b[$i] == 1) {printf "%s%s", a[i], FS}}; print ""}' file

# 10  
Old 03-17-2014
Good job SriniShoo, your code works too. Can you explain it, please?
# 11  
Old 03-17-2014
Code:
delete a; delete b

clears arrays a and b before each line, so counts from the previous line do not carry over.
Code:
for(i = 1; i <= NF; i++) {a[i] = $i; b[$i]++}

Loop over the line and store each field value in two different arrays:
a - to print the output in the original order
b - to check for duplicates
Code:
for(i = 1; i <= length(a); i++) {if(b[$i] == 1) {printf "%s%s", a[i], FS}}

After reading the line, I print each value from array a only if array b says it occurs exactly once.
Code:
printf "%s%s", a[i], FS

formats the output: each kept field is printed followed by the field separator FS.
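
The steps above can be sketched as a single commented script. This is only an illustration, not SriniShoo's exact code: the function name drop_repeated_fields and the sample input are made up, the output is joined with OFS so there is no trailing separator, and split("", count) is used as a portable way to empty the array.

```shell
# Hypothetical helper: reads lines on stdin and prints, for each line,
# only the fields whose value occurs exactly once on that line.
drop_repeated_fields() {
  awk '{
    split("", count)                 # empty the array for this line
    for (i = 1; i <= NF; i++)        # pass 1: count each field value
      count[$i]++
    out = ""
    for (i = 1; i <= NF; i++)        # pass 2: keep values counted once
      if (count[$i] == 1)
        out = (out == "" ? $i : out OFS $i)
    print out
  }'
}

# Made-up sample line: "a" occurs twice, so only "b" and "c" remain.
printf '%s\n' 'a b a c' | drop_repeated_fields
# prints: b c
```

Because the second pass walks the fields in their original positions, the order of the surviving fields is preserved without a separate array for the values.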

Last edited by SriniShoo; 03-17-2014 at 02:44 AM.. Reason: tags
# 12  
Old 03-17-2014
Small addition to my old code, which I missed yesterday:
Code:
$ awk '{delete B;for(i=1;i<=NF;i++){if($i in B){$i=$(B[$i])=x}B[$i]=i}$0=$0;$1=$1}1' file

SEKK101 1C23.delay sequence=1 >>> sequence=0 >>>done.
SEKK106 1C22.delay sequence=1 >>> sequence=0 >>>done.
SEKK102 1C24.delay sequence=1 >>> sequence=0 >>>done.
SEKK101 1C20.delay sequence=1 >>> sequence=0 >>>done.
SEKK104 1C10.delay sequence=1 >>> sequence=0 >>>done.
SEKK104 1C11.delay sequence=1 >>> sequence=0 >>>done.
SEKK101 1C12.delay upThresh=10 >>> upThresh=11 >>>done.
SEKK101 1C15.delay thresHold=10 >>> thresHold=11 >>>done.
SEKK106 1C16.delay upThresh=10 >>> upThresh=11 >>>done.
SEKK106 1C17.delay thresHold=10 >>> thresHold=11 >>>done.
SEKK102 1C18.delay upThresh=10 >>> upThresh=11 >>>done.

---------- Post updated at 02:48 PM ---------- Previous update was at 02:44 PM ----------

Adding delete a to bartus11's approach makes it work as well. Here is the modified version of bartus11's code:

Code:
$ awk '{delete a;for (i=1;i<=NF;i++) a[$i]++;for (i=1;i<=NF;i++) if (a[$i]==1) printf "%s ", $i;printf "\n"}' file
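
To see why the reset matters, here is a small sketch with a made-up two-line input. Awk arrays persist across input lines, so without delete a the counts from the first line leak into the second. The variable names and the input are assumptions for illustration only.

```shell
# Made-up input: "x" occurs once on each line.
input='x y
x z'

# No reset: a[] still holds the counts from earlier lines.
no_reset=$(printf '%s\n' "$input" | awk '{
  for (i = 1; i <= NF; i++) a[$i]++
  out = ""
  for (i = 1; i <= NF; i++)
    if (a[$i] == 1) out = (out == "" ? $i : out " " $i)
  print out
}')

# Reset at the start of every line, as in the modified version.
with_reset=$(printf '%s\n' "$input" | awk '{
  delete a
  for (i = 1; i <= NF; i++) a[$i]++
  out = ""
  for (i = 1; i <= NF; i++)
    if (a[$i] == 1) out = (out == "" ? $i : out " " $i)
  print out
}')

printf 'no reset:\n%s\n' "$no_reset"       # second line loses "x"
printf 'with reset:\n%s\n' "$with_reset"   # both lines keep "x"
```

Without the reset, "x" has a count of 2 by the time the second line is processed, so it is dropped even though it is not duplicated within that line.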

# 13  
Old 03-17-2014
Quote:
Originally Posted by Gr4wk
Thanks Scrutinizer, it works!

Can you explain what the code means?
Sure:
Code:
awk '
{                              # For every line in file "file"
  for(i=1; i<NF; i++)          # Iterate "i" from 1 to the number of fields minus 1
    for(j=i+1; j<=NF; j++)     # Iterate "j" from i+1 to the number of fields
      if($i==$j) $i=$j=x       # If the two fields are equal, set both to "" (x is an unset variable)
  $0=$0                        # Re-split the record: fields that were set to ""
                               # disappear, so there are now fewer fields
  $1=$1                        # Rebuild the record, so that any amount of spacing between
                               # fields is converted to OFS, which is a single space
}
1                              # Print the record
' file                         # Read the file "file"

Hope this helps..
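
Two small checks, using made-up input, may make the last steps clearer. The first shows what $0=$0 and $1=$1 do to NF and to the spacing once a field has been blanked. The second runs the same pairwise logic on a value that occurs three times: matches are blanked two at a time, so one copy is left behind, unlike the counting approach in post #9.

```shell
# 1) Blank the middle field, then watch NF and the spacing.
recalc=$(printf '%s\n' 'a  b   c' |
  awk '{ $2 = ""; print NF; $0 = $0; print NF; $1 = $1; print }')
printf '%s\n' "$recalc"
# prints 3, then 2, then: a c

# 2) Pairwise comparison on a value that occurs three times.
triple=$(printf '%s\n' 'a a a b' | awk '{
  for (i = 1; i < NF; i++)
    for (j = i + 1; j <= NF; j++)
      if ($i == $j) $i = $j = x      # x is unset, so both become ""
  $0 = $0; $1 = $1
} 1')
printf '%s\n' "$triple"
# prints: a b
```

If values can repeat an odd number of times in your data, a counting version is the safer choice.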

Last edited by Scrutinizer; 03-17-2014 at 07:58 PM..