Extracting unique values of a column from a feed file


 
# 1  
Old 03-23-2014

Hi Folks,

I have the feed file below, named abc1.txt. It has a header row followed by the corresponding data rows, and it is entirely pipe-delimited.

Code:
ABC_ID|AMOUNT|ABC_DATE|ABC_CODE|ABD_ID|ABDE_ID|ABEFF_DATE|ABAL_AMOUNT|ab_ON|AB_ODE|TY_CODE|RITY_TE|CR_OKER|SYS_FLAG|CRT_MENT|ADM_ID|ERG_ID|ASH_ADE
0|0.00|24-Jun-14|SSRD|1677|82588|20-Mar-14|100004.00|0|Serest|TRRS|24-Mar-19||true|Receive|861|0|3862880
1|0.00|24-Sep-14|SRSD|1477|85288|20-Mar-14|100003.00|0|Serest|TYRS|24-Mar-19||true|Receive|831|0|3828680
2|0.00|24-Dec-14|HHSD|1777|82858|20-Mar-14|100006.00|0|Serest|UIRS|24-Mar-19||true|Receive|811|0|3862880
2|0.00|24-Dec-14|ESJD|1877|82885|20-Mar-14|100009.00|0|Serest|OPRS|24-Mar-19||true|Receive|861|0|3682880




This feed file is generated regularly by a process and is kept on a Unix box in /usr/cft/str, so the full path is /usr/cft/str/abc1.txt.
Toward the right-hand side you can see a column named ADM_ID.
Can you please advise a script or command that will extract the unique ADM_ID values and store them in a newly created file named Unique_ADM, in the same location?


The newly created Unique_ADM file should then contain the following data, extracted from the feed file above:

Code:
861
831
811

Please advise how to achieve this.
# 2  
Old 03-23-2014
You could try something like:
Code:
#!/bin/ksh
cd /usr/cft/str/  || exit 1
awk -F'|' -v id='ADM_ID' '
NR == 1 {
        for(f = 1; f <= NF; f++)
                if($f == id)
                        break
        if(f > NF) {
                printf("Column header \"%s\" not found.\n", id)
                exit 2
        }
        next
}
x[$f]++ == 0 {
        print $f
}' abc1.txt > Unique_ADM

This was tested using the Korn shell, but it will work with any shell that recognizes basic Bourne shell syntax.
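For anyone who wants to try this without touching /usr/cft/str, here is a minimal sketch that runs the same awk logic against the sample rows from post #1. The scratch directory and the `unique_column` wrapper function are my own assumptions for illustration, not part of the original script.

```shell
# Assumption: work in a scratch directory instead of /usr/cft/str.
workdir=$(mktemp -d)
cat > "$workdir/abc1.txt" <<'EOF'
ABC_ID|AMOUNT|ABC_DATE|ABC_CODE|ABD_ID|ABDE_ID|ABEFF_DATE|ABAL_AMOUNT|ab_ON|AB_ODE|TY_CODE|RITY_TE|CR_OKER|SYS_FLAG|CRT_MENT|ADM_ID|ERG_ID|ASH_ADE
0|0.00|24-Jun-14|SSRD|1677|82588|20-Mar-14|100004.00|0|Serest|TRRS|24-Mar-19||true|Receive|861|0|3862880
1|0.00|24-Sep-14|SRSD|1477|85288|20-Mar-14|100003.00|0|Serest|TYRS|24-Mar-19||true|Receive|831|0|3828680
2|0.00|24-Dec-14|HHSD|1777|82858|20-Mar-14|100006.00|0|Serest|UIRS|24-Mar-19||true|Receive|811|0|3862880
2|0.00|24-Dec-14|ESJD|1877|82885|20-Mar-14|100009.00|0|Serest|OPRS|24-Mar-19||true|Receive|861|0|3682880
EOF

# Hypothetical wrapper: $1 is the column header to look up, $2 is the feed file.
unique_column() {
    awk -F'|' -v id="$1" '
    NR == 1 {
        # Find the field number whose header matches id.
        for (f = 1; f <= NF; f++)
            if ($f == id)
                break
        if (f > NF) {
            printf("Column header \"%s\" not found.\n", id) > "/dev/stderr"
            exit 2
        }
        next
    }
    # Print a value only the first time it is seen, keeping input order.
    !seen[$f]++ { print $f }' "$2"
}

unique_column ADM_ID "$workdir/abc1.txt" > "$workdir/Unique_ADM"
cat "$workdir/Unique_ADM"    # prints 861, 831, 811, one per line
```

Because values are printed on first sight, the output keeps the order in which they appear in the feed, which matches the order requested in post #1.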
# 3  
Old 03-23-2014
This one is specific to ADM_ID (field 16), so it is not as flexible as Don's code:
Code:
awk -F'|' 'NR>1 && $0=$16' infile | sort -u

# 4  
Old 03-23-2014
@ahamed, there would be an issue if $16 is 0: the value of the assignment is used as the condition, so that line would not be printed. There was also an issue with BSD awk, which did not print everything. Rearranging your script like this worked, though:
Code:
awk -F'|' '{$0=$16}NR>1' infile | sort -u
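To see the zero-value issue concretely, here is a small sketch. The sample rows and the `make_row` helper are made up for illustration, and I have put parentheses around the assignment in the first form so the grouping is explicit; the truth-value behaviour is what matters.

```shell
# Hypothetical helper: emit an 18-field row whose 16th field is $1.
make_row() {
    printf 'a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|%s|q|r\n' "$1"
}

sample=$(mktemp)
{
    make_row ADM_ID    # header row
    make_row 861
    make_row 0         # an ADM_ID of zero
    make_row 861
} > "$sample"

# Pattern form: the assigned value doubles as the condition,
# so the row whose field 16 is 0 tests false and is dropped.
pattern_form=$(awk -F'|' 'NR>1 && ($0=$16)' "$sample" | sort -u)

# Action form: assign unconditionally, then filter on NR only.
action_form=$(awk -F'|' '{$0=$16}NR>1' "$sample" | sort -u)

printf 'pattern form:\n%s\n' "$pattern_form"    # 861
printf 'action form:\n%s\n' "$action_form"      # 0 and 861

rm -f "$sample"
```

An empty field 16 would be dropped by the pattern form for the same reason, since an empty string also tests false.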

# 5  
Old 03-23-2014

Code:
sed '1 d' fo|cut -'d|' -f16|sort|uniq
811
831
861

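This is the easiest way to do it. In case you want an explanation of what is happening:

sed '1 d' deletes the first (header) line and passes everything else through. The rest is fairly self-explanatory: cut takes field 16 using | as the delimiter, and sort followed by uniq removes the duplicates.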
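Here is the same pipeline broken out stage by stage, as a rough sketch. The sample file is made up (18 columns with ADM_ID in column 16) and the temporary file names are assumptions.

```shell
# Hypothetical 18-column sample with the ADM_ID values in column 16.
feed=$(mktemp)
for id in ADM_ID 861 831 811 861; do
    printf 'a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|%s|q|r\n' "$id"
done > "$feed"

sed '1d' "$feed" |      # drop the header line
    cut -d'|' -f16 |    # keep only field 16
    sort |              # bring duplicates together; uniq only collapses adjacent lines
    uniq > "$feed.uniq"

cat "$feed.uniq"        # prints 811, 831, 861, one per line
```

Note that the result comes out sorted, not in the order the values first appear in the feed. If the original order matters, the awk approach from post #2 keeps it.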