how to identify duplicate columns in a row


 
# 1  
Old 11-12-2009
Hi,

How can I identify the rows in which the same value appears in more than one column?

Input data (a row may have up to 30 columns):
Code:
9211480750 LK 120070417 920091030
9211480893 AZ 120070607
9205323621 O7 120090914 120090914 1420090914 2020090914 2020090914
9211479568 AZ 120070327 320090730
9211479571 MM 120070326
9211480892 MM 120070324
9211479945 AZ 120070306 320091109 920091002
9211480855 AZ 120070330 920090913
9211479857 AZ 120070306 920090916
9211480863 MM 120070314
9211479935 MM 120070306
9211479588 AZ 120070323
9211479565 MM 120070311
9289819968 OD null
9211479947 AZ 120070306 120070306
9211479939 ID 120070306 220091105 920091031 1220091105

Expected output:
Code:
9205323621 O7 120090914 120090914 1420090914 2020090914 2020090914
9211479947 AZ 120070306 120070306


Last edited by Franklin52; 11-12-2009 at 08:16 AM.. Reason: Please use code tags!
# 2  
Old 11-12-2009
Try this:

Code:
awk '{
  for(i=1;i<=NF;i++){
    for(j=i+1;j<=NF;j++){
      if($i==$j){print; next}
    }
  }
}' file

# 3  
Old 11-12-2009
With Perl:

Code:
perl -ane'
  grep $_{$_}++, @F and print; undef %_
  ' infile

And another one with AWK:

Code:
awk '{
  for (i=1; i<=NF; i++)
    if (_[$i]++) { print; break }   # value already seen in this row
  split(x, _)                       # x is unset, so this empties _ for the next row
}' infile

If your AWK implementation supports delete <array>:

Code:
awk '{
  for (i=1; i<=NF; i++)
    if (_[$i]++) { print; break }   # value already seen in this row
  delete _                          # empty the array for the next row
}' infile
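
One difference from the nested-loop version in post #2 may matter if the columns look like numbers. Array subscripts are always strings, so the array approach treats `1` and `1.0` as different values, whereas `$i == $j` compares two numeric-looking fields as numbers. A small sketch of the contrast follows; the function names `dup_rows_seen` and `dup_rows_pairwise` and the array name `seen` are only illustrative, and the two test rows are made up:

```shell
# Hypothetical helper names; both read the files given as arguments, or stdin.

# Seen-array approach: subscripts are strings, so fields are compared as text.
dup_rows_seen() {
  awk '{
    split("", seen)                      # portable way to empty the array
    for (i = 1; i <= NF; i++)
      if (seen[$i]++) { print; break }
  }' "$@"
}

# Pairwise approach: == is numeric when both fields look like numbers.
dup_rows_pairwise() {
  awk '{
    for (i = 1; i <= NF; i++)
      for (j = i + 1; j <= NF; j++)
        if ($i == $j) { print; next }
  }' "$@"
}

printf '%s\n' 'x 1 1.0' 'y 2 2' | dup_rows_seen       # prints only "y 2 2"
printf '%s\n' 'x 1 1.0' 'y 2 2' | dup_rows_pairwise   # prints both rows
```

For the sample data in the first post the two approaches should give the same result, since a repeated value there is repeated character for character.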


Last edited by radoulov; 11-12-2009 at 09:53 AM..
# 4  
Old 11-16-2009
Quote:
Originally Posted by radoulov
Code:
awk '{
  for (i=1; i<=NF; i++)
    if (_[$i]++) { print; break }
  split(x, _)
}' infile

Nice use of split to clear the array. Here is the same code with a more readable array name:

Code:
awk '{
  for (i=1; i<=NF; i++)
    if (A[$i]++) { print; break }
  split("", A)   # splitting an empty string empties A
}' infile
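
When split is used only to clear an array, the string passed to it matters: splitting an empty string leaves no elements, while splitting a non-empty value such as 0 leaves one element behind. A minimal sketch to check this on your own awk; the function name `count_after_split` is made up for the demo:

```shell
# Hypothetical demo: count the elements left in A after split(s, A).
count_after_split() {
  awk -v s="$1" 'BEGIN {
    A["x"] = 1; A["y"] = 2     # pre-populate the array
    split(s, A)                # split() deletes old elements, then stores the pieces of s
    n = 0
    for (k in A) n++
    print n
  }'
}

count_after_split ""    # prints 0: the array is empty
count_after_split 0     # prints 1: A[1] now holds "0"
```

With `split(0, A)` the leftover element happens to be harmless for this duplicate check, because its value is numerically zero, but passing an empty string avoids relying on that.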
