Unique words in each line


 
# 1  
Old 01-27-2013

A word may be repeated within a row. I want to delete the repetitions and keep a single occurrence of each word.

Example:
Code:
a+b+c ab+c ab+c
abbb+c ab+bbc a+bbbc
aaa aaa aaa

Output:
Code:
a+b+c ab+c 
abbb+c ab+bbc a+bbbc
aaa

# 2  
Old 01-27-2013
One way:
Code:
awk 'function cleanarray (   i) {for(i in a) delete a[i]}
function printarray (   i) {for(i in a) printf("%s ",i);print ""}
NF{cleanarray()
for(i=1;i<=NF;i++) a[$i]
printarray()}' infile

# 3  
Old 01-27-2013
That is a perfect solution if you don't care about the order of the fields. If you do, it may disappoint you, because the order in which a for (i in a) loop visits the elements is not guaranteed. In that case you may want to try:
Code:
awk '{delete a
      for (i=1; i<=NF; i++)
        {for (j=1; j<=i; j++) {if ($i == a[j]) break}
         if (j>i) a[++ix]=$i
        }
      for (i=1; i<=ix; i++) printf("%s ", a[i]); print ""; ix=0
     }
    ' file

The delete array statement is NOT supported in all awks; if yours lacks it, fall back to the previous proposal. This proposal is clumsier than the previous solution, but it keeps the order of the fields and suppresses only their later occurrences.
# 4  
Old 01-27-2013
If delete array is not supported, the workaround is split("", array).
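For illustration, here is a minimal sketch that combines the split("", array) workaround with an order-preserving pass. It is not one of the solutions posted above; the function name uniq_words and the variable names seen and out are made up for this example. It records each word as an array subscript the first time it appears and appends it to the output line only then.

```shell
# uniq_words: hypothetical helper, reads lines on stdin
uniq_words() {
  awk '{
    split("", seen)               # portable way to empty the array
    out = ""
    for (i = 1; i <= NF; i++)
      if (!($i in seen)) {        # first occurrence of this word on the line
        seen[$i]                  # referencing the element creates it
        out = (out == "" ? "" : out " ") $i
      }
    print out                     # words in original order, single spaces
  }'
}

printf '%s\n' 'a+b+c ab+c ab+c' 'aaa aaa aaa' | uniq_words
```

Array subscripts are always strings, so words such as 1 and 1.0 stay distinct here, whereas an == comparison between two fields may treat them as numerically equal.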
# 5  
Old 01-27-2013
Or just:
Code:
awk '{for(i=1; i<NF; i++) for(j=i+1; j<=NF; j++) if($i==$j) $j=x; $0=$0; $1=$1 }1' file

# 6  
Old 01-27-2013
Brilliant!
But wouldn't an assignment to $j suffice to rebuild $0? So why the $0 and $1 assignments? I tried this:
Code:
$ awk '{for(i=1; i<NF; i++) for(j=i+1; j<=NF; j++) if($i==$j) $j=x}1' file

and it's working, too.
# 7  
Old 01-27-2013
Hi. That would work, yes, but it would leave excess whitespace. An extra recalculation of the fields ( $0=$0 ) first reduces the number of fields (if duplicates were removed); recalculating the record afterwards ( $1=$1 ) then removes the excess whitespace.
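A small sketch may make the difference visible. The two function names below are made up for this comparison, and the input line is just a sample; the spaces are turned into underscores with sed so that the doubled separator shows up.

```shell
# blank_only: hypothetical name; blanks later duplicates, nothing more
blank_only() {
  awk '{for(i=1; i<NF; i++) for(j=i+1; j<=NF; j++) if($i==$j) $j=x}1'
}

# blank_and_rebuild: hypothetical name; also re-splits and rebuilds the record
blank_and_rebuild() {
  awk '{for(i=1; i<NF; i++) for(j=i+1; j<=NF; j++) if($i==$j) $j=x; $0=$0; $1=$1}1'
}

# The emptied third field still has a separator on each side
printf '%s\n' 'a b a c' | blank_only | sed 's/ /_/g'         # a_b__c

# After $0=$0 the empty field is gone, and $1=$1 rejoins with single spaces
printf '%s\n' 'a b a c' | blank_and_rebuild | sed 's/ /_/g'  # a_b_c
```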

Last edited by Scrutinizer; 01-28-2013 at 01:59 AM..