

Merging lines based on one column


 
# 1  

Hi,
I have a file in which I'd like to merge lines that have the same value in one column, while keeping the information from the other columns. Here is a simplified example:
Input file:
Code:
ESR1	ANASTROZOLE	NA	FDA_approved
ESR1	CISPLATIN	NA	FDA_approved
ESR1	DANAZOL	agonist	NA
ESR1	EXEMESTANE	NA	FDA_approved
FOXA1	Cisplatin	NA	NA
GATA3	NA	NA	NA

Now I want to merge the lines that have the same string in the first column:
Code:
ESR1	ANASTROZOLE(NA/FDA_approved);CISPLATIN(NA/FDA_approved);DANAZOL(agonist/NA);EXEMESTANE(NA/FDA_approved)
FOXA1	Cisplatin(NA/NA)
GATA3	NA(NA/NA)

Thanks in advance
# 2  
An awk approach, if order doesn't matter:
Code:
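awk '
        {
                # Append "col2(col3/col4)" to the entry for this key;
                # the first time a key is seen, start the entry with the key and a tab.
                A[$1] = A[$1] ? A[$1] ";" $2 "(" $3 "/" $4 ")" : $1 "\t" $2 "(" $3 "/" $4 ")"
        }
        END {
                # Keys are visited in an unspecified order.
                for ( k in A )
                        print A[k]
        }
' file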
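
If the original line order is not needed but a reproducible order is, one option is to pipe the unordered result through `sort`. The sketch below wraps the same awk program in a shell function; the name `merge_sorted` and the choice of `LC_ALL=C` (plain byte-order sorting) are assumptions for illustration, not part of the post above.

```shell
#!/bin/sh
# Same merge as above; sort makes the line order reproducible.
# merge_sorted is a hypothetical helper name; $1 is the input file.
merge_sorted() {
    awk '
        {
            # Build "key<TAB>col2(col3/col4);col2(col3/col4);..." per key.
            A[$1] = A[$1] ? A[$1] ";" $2 "(" $3 "/" $4 ")" : $1 "\t" $2 "(" $3 "/" $4 ")"
        }
        END {
            for (k in A)
                print A[k]
        }
    ' "$1" | LC_ALL=C sort
}
```

Call it as `merge_sorted file`. Because every output line starts with its key, sorting whole lines orders the result by key.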

# 3  
If order matters, try this variation on Yoda's proposal:
Code:
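awk '
        !B[$1]  { A[++i] = $1 }                                 # record each key once, in first-seen order
                { B[$1] = B[$1] ";" $2 "(" $3 "/" $4 ")" }      # append this row to the entry for its key
        END     {
                # substr(..., 2) drops the leading ";"
                for (j = 1; j <= i; j++)
                        print A[j] "\t" substr(B[A[j]], 2)
        }
' file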
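
To check the order-preserving approach against the sample data from post #1, a self-contained sketch follows. It makes a few assumptions that are not in the posts: the input is strictly tab-separated, so it sets `-F'\t'` instead of relying on default whitespace splitting; the array names `order` and `merged` and the function name `merge_by_first_column` are illustrative; and the sample file is written to a temporary path from `mktemp`.

```shell
#!/bin/sh
# Write the sample input from post #1 to a temporary, tab-separated file.
# printf reuses its format for every group of four arguments.
input=$(mktemp)
printf '%s\t%s\t%s\t%s\n' \
    ESR1 ANASTROZOLE NA FDA_approved \
    ESR1 CISPLATIN NA FDA_approved \
    ESR1 DANAZOL agonist NA \
    ESR1 EXEMESTANE NA FDA_approved \
    FOXA1 Cisplatin NA NA \
    GATA3 NA NA NA > "$input"

# Merge rows that share column 1, keeping keys in first-seen order.
# merge_by_first_column is a hypothetical helper name; $1 is the input file.
merge_by_first_column() {
    awk -F'\t' '
        {
            item = $2 "(" $3 "/" $4 ")"
            if ($1 in merged) {
                merged[$1] = merged[$1] ";" item    # key seen before: append
            } else {
                order[++n] = $1                     # new key: remember its position
                merged[$1] = item
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i] "\t" merged[order[i]]
        }
    ' "$1"
}

result=$(merge_by_first_column "$input")
printf '%s\n' "$result"
rm -f "$input"
```

The `if`/`else` form tests key membership before any assignment touches `merged[$1]`, which sidesteps questions about evaluation order inside a single assignment expression. Splitting on tabs only also keeps fields intact if a value ever contains a space.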

# 4  
Thank you both.
 
