Select Distinct on multiple fields


 
# 1  
Old 10-08-2009

How do I create a script that counts the distinct values of each field in a pipe-delimited file, using commonly available UNIX commands (sh or awk)?

Field1|Field2|Field3|Field4
AAA|BBB|CCC|DDD
111|222|333|777
AAA|EEE|ZZZ|EEE
111|555|333|444
AAA|EEE|CCC|DDD
111|222|555|444

For the above file, the result I am looking for would be:

Field1
AAA(3)
111(3)
Field2
BBB(1)
222(2)
EEE(2)
555(1)
Field3
CCC(2)
333(2)
ZZZ(1)
555(1)
Field4
DDD(2)
777(1)
EEE(1)
444(2)

Thank you in advance for your assistance.
# 2  
Old 10-08-2009
Code:
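awk ' # count how often each value occurs in fields 1 to 4
      { arr1[$1]++
        arr2[$2]++
        arr3[$3]++
        arr4[$4]++ }
      END {
        # print each distinct value and its count, one field at a time
        # (for-in visits the array keys in an unspecified order)
        for ( i in arr1 ) { print i, arr1[i] }
        for ( i in arr2 ) { print i, arr2[i] }
        for ( i in arr3 ) { print i, arr3[i] }
        for ( i in arr4 ) { print i, arr4[i] }
      } ' inputfile > outputfile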

# 3  
Old 10-08-2009
You may need to specify the field separator (FS="|"), because the input is pipe-delimited and awk splits on whitespace by default, as in
Code:
awk 'BEGIN {FS="|"}{ arr1[$1]++
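
Putting the two replies together, here is one possible sketch that also prints the field names and the value(count) layout asked for in the first post. It is not the script from either reply: the function name count_distinct is made up for illustration, and it assumes the first line of the input is a header row and that values should be listed in the order they first appear.

```shell
#!/bin/sh
# count_distinct is a hypothetical helper name, not something from the thread.
# Assumptions: pipe-delimited input, first line holds the field names.
count_distinct() {
    awk -F'|' '
    NR == 1 {                          # header row: remember the field names
        nf = NF
        for (f = 1; f <= NF; f++) name[f] = $f
        next
    }
    {
        for (f = 1; f <= nf; f++) {
            if (!((f, $f) in count))   # value not seen before in field f
                order[f, ++seen[f]] = $f   # record first-seen position
            count[f, $f]++
        }
    }
    END {
        for (f = 1; f <= nf; f++) {
            print name[f]
            for (k = 1; k <= seen[f]; k++) {
                v = order[f, k]
                print v "(" count[f, v] ")"
            }
        }
    }' "$@"
}

# Example run on the sample data from the first post
count_distinct <<'EOF'
Field1|Field2|Field3|Field4
AAA|BBB|CCC|DDD
111|222|333|777
AAA|EEE|ZZZ|EEE
111|555|333|444
AAA|EEE|CCC|DDD
111|222|555|444
EOF
```

Keeping a separate order array is there because a plain for (i in arr) loop returns keys in no guaranteed order. The function also accepts file names, e.g. count_distinct inputfile > outputfile, and it is not limited to four fields since the field count is taken from the header line.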