converting unique identifiers in a column using conversion file


 
# 1  
Old 01-21-2011

Hello,

I often have this problem:

I have a file with a column of unique identifiers, e.g. file1 below, which has an id column and one or more data columns, with p rows:
Code:
 
cat data1
dog data2
cow data3
.
.
.
elephant datap-1
horse datap

and I have a conversion file, file2, with n < p rows and 2 columns: a column of the original ids and a column of the converted ids, e.g.
Code:
 
horse earl
cat maggi
.
.
.
elephant jim

I want to produce the following output file, file3, with n rows:

Code:
 
earl datap
maggi data1
.
.
.
jim datap-1

Note that the n rows in the conversion file, file2, are not in the same order as in file1.

I.e., I want file3 to be a subset of file1 in which the unique ids in the first column are replaced with the converted ids from the conversion file.

Thanks much!
# 2  
Old 01-21-2011
Try:
Code:
awk 'NR==FNR{A[$1]=$2;next}{print $2,A[$1]}' file1 file2
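
For anyone who wants to try the one-liner above, here is a minimal sketch on made-up sample data (the file contents are assumptions for illustration, not the original poster's data). The first block of the awk program runs only while file1 is being read, because NR==FNR holds only for the first file, and stores each id's data field in the array A. The second block runs for file2 and prints the converted id followed by the stored data, so the output comes out in the order of file2.

```shell
# Hypothetical sample files, invented for this illustration.
printf '%s\n' 'cat data1' 'dog data2' 'cow data3' 'elephant data4' 'horse data5' > file1
printf '%s\n' 'horse earl' 'cat maggi' 'elephant jim' > file2

# Pass 1 (file1): A[id] = data.  Pass 2 (file2): print new id, then A[old id].
awk 'NR==FNR{A[$1]=$2;next}{print $2,A[$1]}' file1 file2 > file3

cat file3
# earl data5
# maggi data1
# jim data4
```

Rows of file1 whose id is not listed in file2 (dog and cow here) are simply never printed, which gives the n-row subset asked for.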

This User Gave Thanks to Scrutinizer For This Post:
# 3  
Old 01-21-2011
Thanks, this is really helpful! What if file1 had several fields after the first one, and I wanted to print out all of the fields (not just $2)?
# 4  
Old 01-21-2011
Something like this:
Code:
awk 'NR==FNR{p=$1;sub($1 FS,x);A[p]=$0;next}{print $2,A[$1]}' file1 file2
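
One caveat with the command above: sub($1 FS,x) uses the id as a regular expression, so an id containing characters such as + or . may not be stripped as intended, and the match also assumes a single space after the id. The following is only a sketch of one possible alternative, not the method from this thread: it blanks the first field instead, and it additionally skips ids in file2 that do not occur in file1 (that skip is an added assumption about the desired behaviour). The sample data is made up. Note that assigning to $1 rebuilds the line, so runs of whitespace between fields collapse to single spaces.

```shell
# Hypothetical sample files; "a+b" is included to show an id with a regex metacharacter.
printf '%s\n' 'cat data1 x1 y1' 'a+b data2 x2 y2' 'horse data3 x3 y3' > file1
printf '%s\n' 'horse earl' 'a+b plus' 'zebra none' > file2

# Pass 1 (file1): remember the id, blank field 1, drop the leading separator,
#                 and store the remaining fields under the id.
# Pass 2 (file2): print only ids that were seen in file1 (assumed behaviour).
awk 'NR==FNR{p=$1; $1=""; sub(/^ /,""); A[p]=$0; next}
     ($1 in A){print $2, A[$1]}' file1 file2 > file3

cat file3
# earl data3 x3 y3
# plus data2 x2 y2
```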

This User Gave Thanks to Scrutinizer For This Post:
# 5  
Old 01-21-2011
It works very well, thanks!
 