distinct values of all the fields


 
# 1  
Old 10-12-2010

I am a beginner to scripting, please help me in this regard.

How do I create a script that counts the distinct values of every field in a pipe-delimited file? I have 20 different files, each with multiple columns. I need a generic script that either takes the number of columns as a parameter or works out the number of columns itself from the delimiter. The script needs to generate output like the following.

Sample data

Field1|Field2|Field3|Field4
AAA|BBB|CCC|DDD
111|222|333|777
AAA|EEE|ZZZ|EEE
111|555|333|444
AAA|EEE|CCC|DDD
111|222|555|444

For the above file, the result I am looking for would be:

Field1
AAA(3)
111(3)

Field2
BBB(1)
222(2)
EEE(2)
555(1)

Field3
CCC(2)
333(2)
ZZZ(1)
555(1)

Field4
DDD(2)
777(1)
EEE(1)
444(2)

Thank you in advance for your assistance.
# 2  
Old 10-12-2010
Probably not too efficient, especially for large files, but it is straightforward:

Code:
awk -F '|' '
        {
                for( i = 1; i <= NF; i++ )
                {
                        count[i " " $(i)]++;    # count by field number and field value
                        uniq[$(i)] = 1;         # save a list of unique strings
                }
                if( NF > fields )
                        fields = NF;            # in case a variable number in file; capture max
        }
        END {
                for( i = 1; i <= fields; i++ )
                {
                        printf( "field %d\n", i );
                        for( x in uniq )
                                if( (i " " x) in count )     # membership test; does not create empty entries
                                        printf( "%s (%d)\n", x, count[i " " x] );  # print by field and value
                        printf( "\n" );
                }
        }
' <input-filename

# 3  
Old 10-12-2010
Code:
awk -F \| '{for (i=1;i<=NF;i++) a[i FS $i]++}END {for (i in a) print i,a[i] |"sort -n" }  ' infile

1|111 3
1|AAA 3
2|222 2
2|555 1
2|BBB 1
2|EEE 2
3|333 2
3|555 1
3|CCC 2
3|ZZZ 1
4|444 2
4|777 1
4|DDD 2
4|EEE 1
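
Note that the listing above has no Field1..Field4 entries, so it presumably came from data without the header line. A possible variant of the same one-liner is sketched below: it skips the first line with NR > 1 (assuming that line is a header) and moves the sort outside awk so each column is ordered by descending count. The scratch file and the three-part "field|value|count" output format are assumptions made for this illustration.

```shell
#!/bin/sh
# Sketch only: assumes line 1 is a header row and "|" is the delimiter.
sample=$(mktemp)                       # hypothetical scratch file holding the sample data
cat > "$sample" <<'EOF'
Field1|Field2|Field3|Field4
AAA|BBB|CCC|DDD
111|222|333|777
AAA|EEE|ZZZ|EEE
111|555|333|444
AAA|EEE|CCC|DDD
111|222|555|444
EOF

# Emit field|value|count, then sort by field number (ascending),
# count (descending) and finally value, using byte order via LC_ALL=C.
out=$(awk -F'|' '
NR > 1 {                               # skip the header row
    for (i = 1; i <= NF; i++) a[i FS $i]++
}
END {
    for (k in a) print k FS a[k]
}' "$sample" | LC_ALL=C sort -t'|' -k1,1n -k3,3nr -k2,2)
rm -f "$sample"
printf '%s\n' "$out"
```

Keeping the count as its own "|"-separated column is what lets sort use it as a key.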