Adding info to end of line if two columns match from files with different separators


 
# 1  
Old 04-12-2013

I have two files (a CSV and a VCF) which look exactly like this:

S1.csv
Code:
func,gene,start,info
"exonic","AL","2309","het"
"exonic","NEF","6912","hom"

S1.vcf
Code:
##fileinfo
#CHROM POS ID INFO
chr1      4567     rs323211     1/1:84,104,99
chr4      2309     rs346742     1/1:27,213,90
chr6      5834     rs234492     0/1:22,765,22
chr8      6912     rs239299     1/1:56,765,13

I want to add the fourth column of the second file to the end of the line in the first file if the number in column 2 (file 2) matches the number in column 3 (file 1).

Also, the first file is separated using , and " (except for the first line), while the second is tab-separated.

The final file should look like:

Code:
func,gene,start,info
"exonic","AL","2309","het","1/1:27,213,90"
"exonic","NEF","6912","hom","1/1:56,765,13"

I have to do this for 200 files; each pair has the same name, with either a .csv or a .vcf extension.
# 2  
Old 04-12-2013
Try this:
Code:
awk 'NR==FNR{a[$2]=$4; next}a[$6]{$NF="," FS a[$6] FS}1' S1.vcf OFS=\" FS=\" S1.csv
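
For the 200 pairs mentioned in post #1, the same one-liner could be wrapped in a shell loop. This is only a sketch: it assumes every X.csv sits next to an X.vcf and that the real files have the same layout as the samples. The folder name pairs_demo and the merged/ output folder are made up for illustration, and the sample pair is recreated here so the loop can be tried as-is.

```shell
dir=pairs_demo                      # hypothetical folder holding the .csv/.vcf pairs
mkdir -p "$dir/merged"

# Sample pair from post #1, recreated so the loop can be run directly.
printf '%s\n' 'func,gene,start,info' \
    '"exonic","AL","2309","het"' \
    '"exonic","NEF","6912","hom"' > "$dir/S1.csv"
{
    printf '##fileinfo\n#CHROM\tPOS\tID\tINFO\n'
    printf '%s\t%s\t%s\t%s\n' \
        chr1 4567 rs323211 '1/1:84,104,99' \
        chr4 2309 rs346742 '1/1:27,213,90' \
        chr6 5834 rs234492 '0/1:22,765,22' \
        chr8 6912 rs239299 '1/1:56,765,13'
} > "$dir/S1.vcf"

for csv in "$dir"/*.csv; do
    base=${csv%.csv}                # strip the extension, e.g. pairs_demo/S1
    vcf=$base.vcf
    if [ ! -f "$vcf" ]; then
        echo "skipping $csv: no matching $vcf" >&2
        continue
    fi
    # Same awk program as above; results go to merged/ so the inputs stay untouched.
    awk 'NR==FNR{a[$2]=$4; next}a[$6]{$NF="," FS a[$6] FS}1' \
        "$vcf" OFS=\" FS=\" "$csv" > "$dir/merged/${base##*/}.csv"
done
```

Writing to a separate folder means a second run does not pick up its own output as input.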

# 3  
Old 04-12-2013
That works for the exact files I gave you, but not for my actual files.

The main difference is that my actual files are bigger and the columns of the first file are in different places: the number to be searched for is in column 34.

Also, the column I want copied over is actually column 10. I assume I change $4 to $10 to get this right?

What's the $6 for?
# 4  
Old 04-12-2013
Quote:
Originally Posted by Sarah_19
The main difference is that my actual files are bigger and the columns of the first file are in different places: the number to be searched for is in column 34.

Also, the column I want copied over is actually column 10. I assume I change $4 to $10 to get this right?
Yes.

Quote:
Originally Posted by Sarah_19
What's the $6 for?
For the file S1.csv I've used the double quote as the field separator. The 3rd field with a comma separator is the 6th field with the double quote as the field separator.
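
To see where the 6 comes from, one sample row can be split on the double quote and each field printed with its number. This is just an illustration; the file name field_demo.txt is arbitrary.

```shell
# Print every field of one sample row when the double quote is the separator.
printf '%s\n' '"exonic","AL","2309","het"' |
awk -F'"' '{ for (i = 1; i <= NF; i++) printf "$%d = [%s]\n", i, $i }' |
tee field_demo.txt
```

The line `$6 = [2309]` shows the start position landing in field 6. Fields 3, 5 and 7 hold only the commas between the quoted values, and fields 1 and 9 are empty.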
# 5  
Old 04-12-2013
OK, so as it turns out, because the CSV comes from Excel, my separators look more like this:

"exonic","ALS",4567,112,44,text,"moretext","34","text,with,comma"

I got the code to work some of the time using a comma instead of a quotation mark as the separator. For example, if I change $6 to $34, three lines get the info added correctly; if I use $36, a different five lines get the info added correctly.

Is there a way to say that the number could be in any of several columns?

Thanks in advance
# 6  
Old 04-13-2013
Try this. It's based on the file samples in your first post. For the second file on the command line (S1.csv), the field separator is set to a comma:
Code:
awk 'NR==FNR{a[$2]=$4; next}{s=$0;gsub("\"",x)}a[$3]{$0=s ",\"" a[$3] "\""}1' S1.vcf FS=, S1.csv
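
Two caveats with the command above. Because gsub() changes $0, lines without a match are printed with their quotes stripped. And when a quoted field itself contains commas, as in the Excel line from post #5, splitting on every comma shifts the field number from row to row, which may be why $34 works for some rows and $36 for others. Below is a sketch of a quote-aware split in plain awk. The names csvsplit, keycol and valcol are made up for illustration, and the "note" column and the XYZ row are hypothetical test data. It does not handle escaped quotes ("") inside a field or a trailing empty field.

```shell
# Illustrative input: rows modelled on post #1 plus a hypothetical quoted field with commas.
printf '%s\n' 'func,gene,note,start,info' \
    '"exonic","AL","a,b",2309,"het"' \
    '"exonic","NEF","c",6912,"hom"' \
    '"exonic","XYZ","d,e,f",1111,"het"' > demo.csv
{
    printf '##fileinfo\n#CHROM\tPOS\tID\tINFO\n'
    printf '%s\t%s\t%s\t%s\n' \
        chr4 2309 rs346742 '1/1:27,213,90' \
        chr8 6912 rs239299 '1/1:56,765,13'
} > demo.vcf

awk -v keycol=4 -v valcol=4 '
# Split a CSV line into arr[], treating "..." as one field; returns the field count.
function csvsplit(line, arr,    n) {
    split("", arr)                                    # clear fields left from the previous line
    n = 0
    while (line != "") {
        if (match(line, /^"[^"]*"/))
            arr[++n] = substr(line, 2, RLENGTH - 2)   # drop the surrounding quotes
        else {
            match(line, /^[^,]*/)
            arr[++n] = substr(line, 1, RLENGTH)
        }
        line = substr(line, RLENGTH + 2)              # skip the field and its comma
    }
    return n
}
# VCF pass: skip header lines, map POS (column 2) to the wanted column.
NR == FNR { if ($0 !~ /^#/) { split($0, v, "\t"); a[v[2]] = v[valcol] } next }
# CSV pass: $0 is never modified unless there is a match, so quotes survive.
{
    csvsplit($0, f)
    key = f[keycol]
    if (key in a) $0 = $0 ",\"" a[key] "\""
    print
}' demo.vcf demo.csv > demo.merged.csv
```

Here keycol counts CSV columns with each quoted field as a single column, and valcol is the VCF column to copy; for the real files described in post #3, valcol would be 10.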
