Print unique names in each row of a specific column using awk


# 1  

Is it possible to remove the duplicate names within each row of the 4th column? The names are separated by ";".


input

Code:
cqWE    100    200    singapore;singapore
AZO    300    400    brazil;america;germany;ireland;germany
....
....

output
Code:
cqWE    100    200    singapore
AZO    300    400    brazil;america;germany;ireland

# 2  
With Perl you could write something like this:

Code:
perl -lpe'
  %_ = ();
  s/[^\s]+$/join ";", grep !$_{$_}++, split ";", $&/e
  ' infile

With awk the code will be noisier.

Last edited by radoulov; 12-12-2012 at 02:20 PM..
# 3  
A 'noisy' awk version. Run it with: awk -f quincy.awk myFile
quincy.awk:
Code:
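{
  split("", t)                 # clear the "seen" table for this row
  n = split($4, a, ";")        # break the 4th column into names
  $4 = ""                      # assigning to $4 rebuilds $0 using OFS
  for (i = 1; i <= n; i++)
    if (!(a[i] in t)) {        # first time this name appears in the row
      $4 = (i == 1) ? a[i] : $4 ";" a[i]
      t[a[i]]                  # referencing the element marks it as seen
    }
  print
}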
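
One thing to watch with the script above: assigning to $4 makes awk rebuild the record with OFS, so with the defaults the columns come out separated by single spaces. If the input is tab-separated (an assumption, since the sample does not say), a sketch like the following keeps the tabs. The function name dedup_col4 and the variable names are made up for illustration.

```shell
# Sketch only: assumes tab-separated input; dedup_col4 is a hypothetical name.
dedup_col4() {
  awk 'BEGIN { FS = OFS = "\t" }   # read and write tabs
  {
    split("", seen)                # clear the per-row lookup table
    n = split($4, names, ";")
    out = ""; k = 0
    for (i = 1; i <= n; i++)
      if (!(names[i] in seen)) {
        seen[names[i]]             # referencing the element creates it
        out = (k++ ? out ";" : "") names[i]
      }
    $4 = out                       # record is rebuilt with OFS, i.e. tabs
    print
  }' "$@"
}

# Usage: pass file names, or pipe data in.
printf 'AZO\t300\t400\tbrazil;america;germany;ireland;germany\n' | dedup_col4
```

The first occurrence of each name is kept, in its original order, which matches the requested output.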

This User Gave Thanks to vgersh99 For This Post:
# 5  
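If you prefer awk and have a recent GNU awk, you can rebuild each record after the change
with its original separators (even when their width varies from field to field)
and so preserve the original formatting. This relies on the four-argument split(), a GNU awk extension,
and it deduplicates the last field, which is the 4th column in the sample input: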

Code:
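awk '{
  split($0, t, FS, s)            # gawk only: s[i] holds the separator after field i
  for (i = 0; ++i < NF;)         # print fields 1..NF-1 with their original separators
    printf "%s", $i s[i]
  n = split(t[i], tt, fs)        # i == NF here: split the last field on fs
  delete _; lf = x               # reset the seen table; x is unset, i.e. empty
  for (i = 0; ++i <= n;)
    lf = lf (_[tt[i]]++ ? x : tt[i] fs)   # append each name once, followed by fs
  print substr(lf, 1, length(lf) - 1)     # drop the trailing fs
  }' fs=\; infile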
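
If GNU awk is not available, roughly the same effect can be had with plain match() and substr(). This is only a sketch under the same assumption as the code above, namely that the column to clean up is the last field on the line, plus the assumption that the line has no trailing whitespace. The function name dedup_last_field is made up for illustration.

```shell
# Sketch only: dedup_last_field is a hypothetical name.
# Assumes the names are in the LAST field and the line has no trailing blanks.
dedup_last_field() {
  awk '{
    # Locate the last run of non-blank characters; leave other lines untouched.
    if (!match($0, /[^ \t]+$/)) { print; next }
    head = substr($0, 1, RSTART - 1)       # everything before the last field, verbatim
    n = split(substr($0, RSTART), names, ";")
    split("", seen); out = ""; k = 0
    for (i = 1; i <= n; i++)
      if (!(names[i] in seen)) {
        seen[names[i]]                     # mark the name as seen
        out = (k++ ? out ";" : "") names[i]
      }
    print head out                         # original spacing is kept as-is
  }' "$@"
}

# Usage: mixed spaces and tabs before the last field survive unchanged.
printf 'cqWE    100  200\tsingapore;singapore\n' | dedup_last_field
```

Because the record is never rebuilt from its fields, whatever mix of spaces and tabs separates the columns is printed back exactly as it came in.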
