Merging and Adding colon to columns


 
# 1  
Old 04-12-2018

Hello,
I have a tab-delimited file that looks like this:

Code:
CHROM    POS    ID    REF    ALT    ID    HGVS_C    HGVS_P
1    17319011    rs2076603    G    A    NM_022089.3,NM_001141973.2,NM_001141974.2    c.1815C>T,c.1800C>T,c.1800C>T    p.Pro605Pro,p.Pro600Pro,p.Pro600Pro
1    20960230    rs45530340    C    T    NM_032409.2,NR_106732.1    c.189C>T,n.59C>T    p.Leu63Leu,.
1    20964328    rs2298298    A    G    NM_032409.2,NR_106732.1,NR_046507.1    c.388-7A>G,n.*4047A>G,n.*4822T>C    .,.,.
1    20972048    rs3131713    G    A    NM_032409.2,NR_046507.1    c.960-5G>A,n.3981+30C>T    .,.
1    43395635    rs2229682    C    T    NM_006516.2    c.588G>A    p.Pro196Pro
1    43396414    rs11537641    G    A    NM_006516.2    c.399C>T    p.Cys133Cys
1    43408966    rs1385129    G    A    NM_006516.2    c.45C>T    p.Ala15Ala

I need the output file to look like this, where the second ID column is merged with each of the last two columns. The columns do not hold a consistent number of values: sometimes there is just a single value, sometimes many, and sometimes the value is blank.

Code:
CHROM    POS    ID    REF    ALT    ID:HGVS_C    ID:HGVS_P
1    17319011    rs2076603    G    A    NM_022089.3:c.1815C>T,NM_001141973.2:c.1800C>T,NM_001141974.2:c.1800C>T    NM_022089.3:p.Pro605Pro,NM_001141973.2:p.Pro600Pro,NM_001141974.2:p.Pro600Pro
1    20960230    rs45530340    C    T    NM_032409.2:c.189C>T,NR_106732.1:n.59C>T    NM_032409.2:p.Leu63Leu,.
1    20964328    rs2298298    A    G    NM_032409.2:c.388-7A>G,NR_106732.1:n.*4047A>G,NR_046507.1:n.*4822T>C    
1    20972048    rs3131713    G    A    NM_032409.2:c.960-5G>A,NR_046507.1:n.3981+30C>T    
1    43395635    rs2229682    C    T    NM_006516.2:c.588G>A    NM_006516.2:p.Pro196Pro
1    43396414    rs11537641    G    A    NM_006516.2:c.399C>T    NM_006516.2:p.Cys133Cys
1    43408966    rs1385129    G    A    NM_006516.2:c.45C>T    NM_006516.2:p.Ala15Ala

I tried this:

Code:
awk '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6":"$7"\t"$6":"$8}' input.txt > output.txt

It works for single values (i.e. the last three rows), but I am not able to pair up multiple values (the first few rows).
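To make the failure concrete, here is a small sketch that feeds one multi-value row (copied from the input above) through the same print statement. The printf pipeline and the variable names row and attempt are only illustrative assumptions, not part of the original attempt. The whole comma-separated list in column 6 is glued to the whole list in column 7 (and 8), instead of the values being paired one by one.

```shell
# One multi-value row from the sample input, tab-separated.
row=$(printf '1\t20960230\trs45530340\tC\tT\tNM_032409.2,NR_106732.1\tc.189C>T,n.59C>T\tp.Leu63Leu,.')

# Same print statement as the attempt above, reading the row from stdin.
attempt=$(printf '%s\n' "$row" |
  awk '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6":"$7"\t"$6":"$8}')

# Show only the two merged columns (6 and 7).
printf '%s\n' "$attempt" | cut -f6,7
```

The ID list is joined to the value list as a single unit, so the fields need to be split on commas and recombined element by element.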

Any help or suggestions are appreciated.

Thank you.
# 2  
Old 04-12-2018
Like so:
Code:
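awk '
        {n = split ($6, T, ",")         # transcript IDs
         split ($7, V, ",")             # HGVS_C values
         split ($8, W, ",")             # HGVS_P values
         $6 = $7 = $8 = ""              # $8 stays as an empty field, hence the trailing tab below
         for (i=1; i<=n; i++)   {$6 = $6 T[i] ":" V[i] (i==n?"":",")
                                 $7 = $7 T[i] ":" W[i] (i==n?"":",")
                                }
        }
1                                       # always true: print the rebuilt record
' OFS="\t" file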
CHROM	POS	ID	REF	ALT	ID:HGVS_C	ID:HGVS_P	
1	17319011	rs2076603	G	A	NM_022089.3:c.1815C>T,NM_001141973.2:c.1800C>T,NM_001141974.2:c.1800C>T	NM_022089.3:p.Pro605Pro,NM_001141973.2:p.Pro600Pro,NM_001141974.2:p.Pro600Pro	
1	20960230	rs45530340	C	T	NM_032409.2:c.189C>T,NR_106732.1:n.59C>T	NM_032409.2:p.Leu63Leu,NR_106732.1:.	
1	20964328	rs2298298	A	G	NM_032409.2:c.388-7A>G,NR_106732.1:n.*4047A>G,NR_046507.1:n.*4822T>C	NM_032409.2:.,NR_106732.1:.,NR_046507.1:.	
1	20972048	rs3131713	G	A	NM_032409.2:c.960-5G>A,NR_046507.1:n.3981+30C>T	NM_032409.2:.,NR_046507.1:.	
1	43395635	rs2229682	C	T	NM_006516.2:c.588G>A	NM_006516.2:p.Pro196Pro	
1	43396414	rs11537641	G	A	NM_006516.2:c.399C>T	NM_006516.2:p.Cys133Cys	
1	43408966	rs1385129	G	A	NM_006516.2:c.45C>T	NM_006516.2:p.Ala15Ala

?
This User Gave Thanks to RudiC For This Post:
# 3  
Old 04-12-2018
Hi, try:
Code:
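awk '
  {
    n=split($6,F,/,/)          # transcript IDs
    split($7,G,/,/)            # HGVS_C values
    split($8,H,/,/)            # HGVS_P values
    $6=$7=$8=""
    for(i=1; i<=n; i++) {
      s=(i>1)?",":""           # comma before every pair except the first
      $6=$6 s F[i] ":" G[i]
      $7=$7 s F[i] ":" H[i]
    }
    print $1,$2,$3,$4,$5,$6,$7 # seven fields only, so no trailing tab
  }
' FS='\t' OFS='\t' file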


Last edited by Scrutinizer; 04-12-2018 at 08:34 AM..
# 4  
Old 04-12-2018
Thank you so much. Both the solutions worked!
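
Both posted solutions prefix every value with its ID, including the "." placeholders (giving e.g. NR_106732.1:.), whereas the second data row of the requested output keeps a bare ".". A minimal sketch of a variant that leaves "." unprefixed might look like the following. It assumes that a lone "." always means a missing value; the function name merge_ids and the variable names are hypothetical and do not come from the thread.

```shell
# Hypothetical helper: reads tab-separated rows from stdin, writes 7 columns.
merge_ids() {
  awk '
    BEGIN { FS = OFS = "\t" }
    {
      n = split($6, id, ",")      # transcript IDs
      split($7, c, ",")           # HGVS_C values
      split($8, p, ",")           # HGVS_P values
      col6 = col7 = ""
      for (i = 1; i <= n; i++) {
        sep = (i > 1) ? "," : ""
        # Assumption: a lone "." is a missing-value placeholder and stays unprefixed.
        col6 = col6 sep (c[i] == "." ? "." : id[i] ":" c[i])
        col7 = col7 sep (p[i] == "." ? "." : id[i] ":" p[i])
      }
      print $1, $2, $3, $4, $5, col6, col7
    }
  '
}

# Usage: merge_ids < input.txt > output.txt
```

Building the merged columns in separate variables and printing exactly seven fields avoids the empty eighth field. Rows whose HGVS_P values are all "." come out as ".,.,." rather than blank, so a further step would be needed if those cells should be empty.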