Merge row based on replicates ID


 
# 1  
Old 05-20-2016

Dear All,
I was wondering if you could help me with an issue.
I would like to merge rows based on column 1.

input file:
Code:
b1 ggg b2 fff NA NA hhh NA NA NA NA NA
a1 xxx a2 yyy NA NA zzz NA NA NA NA NA
a1 xxx NA NA a3 ttt NA ggg NA NA NA NA

output file:
Code:
b1 ggg b2 fff NA NA hhh NA NA NA NA NA
a1 xxx a2 yyy a3 ttt zzz ggg NA NA NA NA

Basically, if two rows have the same ID in column 1 (there are never more than two rows with the same ID), I would like to replace each NA value in the first replicate row with the value from the same column of the second replicate row.
If both values are NA, or both hold the same value, leave them as they are. In short: within replicates (same ID in column 1), replace an NA value with the non-NA value from the same column.

I think the explanation is poor, but the example should make it clear.

Please let me know if you need further details.

Thank you as always for your help.

Giuliano
# 2  
Old 05-20-2016
Code:
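awk '
!($1 in f) {order[++n]=$1}                                                # remember each id in first-seen order
{
   if (NF > f[$1]) f[$1]=NF                                               # widest row seen for this id
   for (i=2; i<=NF; i++)
      if ($i != "NA" && !(($1 FS i) in b)) b[$1 FS i]=$i                  # keep the first non-NA value per id and column
}
END {
   for (k=1; k<=n; k++) {                                                 # loop thru ids in input order
      id=order[k]; l=id;                                                  # initialize output string
      for (j=2; j<=f[id]; j++) {                                          # loop thru the columns of this id
         l=l FS (((id FS j) in b) ? b[id FS j] : "NA");                   # stored value, or NA if no row had one (a plain truth test would lose "0")
      }
      print l;                                                            # print output string
   }
}' infile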

# 3  
Old 05-20-2016
That works perfectly!
thanks!

G
# 4  
Old 05-20-2016
Hi Giuliano,
The code rdrtx1 suggested looks like it will work fine for input files similar to the sample you provided, but it doesn't take into account the part of your description saying that there are no more than two lines with the same ID.

What should happen if there are three or more lines in your input file that have the same string in column 1?
# 5  
Old 05-21-2016
Don, Giuliano means there are not more than two lines with the same ID in column 1.
And even if there were more, rdrtx1's solution would handle them well.
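As a quick check of that claim, here is a minimal sketch of the same array-based idea run against three lines sharing one ID. The function name merge_by_id and the sample rows are made up for illustration, and this variant keeps the first non-NA value per column and prints IDs in input order, which is an assumption about the desired behaviour rather than something the thread states.

```shell
# Hypothetical helper: merge rows that share the ID in column 1.
merge_by_id() {
  awk '
  !($1 in nf) { order[++n] = $1 }            # remember IDs in first-seen order
  {
    if (NF > nf[$1]) nf[$1] = NF             # widest row seen for this ID
    for (i = 2; i <= NF; i++)
      if ($i != "NA" && !(($1, i) in val))   # first non-NA value wins
        val[$1, i] = $i
  }
  END {
    for (k = 1; k <= n; k++) {
      id = order[k]; line = id
      for (j = 2; j <= nf[id]; j++)          # rebuild the row column by column
        line = line FS (((id, j) in val) ? val[id, j] : "NA")
      print line
    }
  }' "$@"
}

# Three rows share the ID a1; each one contributes a different column.
printf '%s\n' \
  'a1 xxx NA NA' \
  'b1 ggg hhh NA' \
  'a1 NA yyy NA' \
  'a1 NA NA zzz' | merge_by_id
```

With this made-up input the three a1 rows collapse into the single line "a1 xxx yyy zzz", and the b1 row passes through unchanged.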
--
If the duplicate IDs are on adjacent lines, the following saves some memory, because only one row is held at a time:
Code:
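awk '
function printP(   i, o){ o=P[1]; for (i=2; i<=n; i++) o=(o FS P[i]); print o }   # n = field count of the buffered row
function updateP(   i){
  for (i=2; i<=NF; i++) if (i>n || P[i]=="NA") P[i]=$i    # fill NA (or missing) columns from the current row
  if (NF>n) n=NF
}
NR>1 {
  if ($1==P[1]) {      # same ID as the buffered row: merge into it
    updateP()
    next
  }
  printP()             # ID changed: print the buffered row
}
{ n=split ($0,P) }     # buffer the current row
END { if (NR) printP() }   # print the last row, unless the input was empty
' infile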
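If the duplicate IDs are not adjacent, one option is to sort on column 1 first so that they become adjacent. The sketch below assumes a sort that supports the -s (stable) option, which is available in GNU and BSD sort but is not required by POSIX; the function name merge_adjacent is made up for illustration. Note that the output then comes out ordered by ID rather than in the original input order.

```shell
# Hypothetical helper: sort by ID, then merge adjacent rows with the same ID.
merge_adjacent() {
  # LC_ALL=C gives a plain byte-wise ordering; -s keeps equal IDs in input order.
  LC_ALL=C sort -s -k1,1 "$@" | awk '
  function emit(   i, o) { o = P[1]; for (i = 2; i <= n; i++) o = o FS P[i]; print o }
  NR > 1 && $1 == P[1] {                     # same ID as the buffered row
    for (i = 2; i <= NF; i++)
      if (i > n || P[i] == "NA") P[i] = $i   # fill gaps from this row
    if (NF > n) n = NF
    next
  }
  NR > 1 { emit() }                          # ID changed: print the buffered row
  { n = split($0, P) }                       # buffer the current row
  END { if (NR) emit() }'
}

# The sample rows from post #1, with the two a1 rows separated.
printf '%s\n' \
  'a1 xxx a2 yyy NA NA zzz NA NA NA NA NA' \
  'b1 ggg b2 fff NA NA hhh NA NA NA NA NA' \
  'a1 xxx NA NA a3 ttt NA ggg NA NA NA NA' | merge_adjacent
```

Sorting costs extra time, but sort can spill to temporary files, so the awk step still only holds one row in memory.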

# 6  
Old 05-21-2016
Yes, my specific case had no more than 2 replicates.
Thank you for your help!

Best