Merging lines based on one column


 
# 1  

Hi,
I have a file in which I'd like to merge lines that share the same value in one column, while keeping the information from the other columns. Here is a simplified example:
File
Code:
ESR1	ANASTROZOLE	NA	FDA_approved
ESR1	CISPLATIN	NA	FDA_approved
ESR1	DANAZOL	agonist	NA
ESR1	EXEMESTANE	NA	FDA_approved
FOXA1	Cisplatin	NA	NA
GATA3	NA	NA	NA

Now I want to merge the lines that have the same string in the first column:
Code:
ESR1	ANASTROZOLE(NA/FDA_approved);CISPLATIN(NA/FDA_approved);DANAZOL(agonist/NA);EXEMESTANE(NA/FDA_approved)
FOXA1	Cisplatin(NA/NA)
GATA3	NA(NA/NA)

Thanks in advance
# 2  
An awk approach, if order doesn't matter:
Code:
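awk '
        {
                # Append ";$2($3/$4)" to the entry for this key; the first
                # time a key is seen, start the entry with the key and a tab.
                A[$1] = A[$1] ? A[$1] ";" $2 "(" $3 "/" $4 ")" : $1 "\t" $2 "(" $3 "/" $4 ")"
        }
        END {
                # for-in visits the keys in an unspecified order
                for ( k in A )
                        print A[k]
        }
' file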
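
A side note on the command above, as a minimal sketch rather than a definitive version: it assumes the fields are tab-separated (as they appear in the sample) and that output sorted by key is acceptable. The `-F'\t'` option and the `sort` stage are additions made here and are not part of the posted command; `sample_input` is a hypothetical helper that only reproduces the data from post #1.

```shell
# Hypothetical helper: emits the sample data from post #1, tab-separated.
sample_input() {
    printf 'ESR1\tANASTROZOLE\tNA\tFDA_approved\n'
    printf 'ESR1\tCISPLATIN\tNA\tFDA_approved\n'
    printf 'ESR1\tDANAZOL\tagonist\tNA\n'
    printf 'ESR1\tEXEMESTANE\tNA\tFDA_approved\n'
    printf 'FOXA1\tCisplatin\tNA\tNA\n'
    printf 'GATA3\tNA\tNA\tNA\n'
}

# Same merge as above, with an explicit tab field separator (assumption)
# and a sort stage so the line order no longer depends on the for-in loop.
merged=$(sample_input | awk -F'\t' '
    { A[$1] = A[$1] ? A[$1] ";" $2 "(" $3 "/" $4 ")" : $1 "\t" $2 "(" $3 "/" $4 ")" }
    END { for (k in A) print A[k] }
' | sort)

printf '%s\n' "$merged"
```

With the sample data this should print the three merged lines requested in post #1, in key order (ESR1, FOXA1, GATA3). Splitting on tabs only matters if a field can contain spaces; for the sample as posted, the default whitespace splitting gives the same result.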

# 3  
If order matters, try this variation on Yoda's proposal:
Code:
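awk '
        # First time a key is seen: record it so keys print in input order.
        !B[$1]  { A[++i] = $1 }
        # Every line: append ";$2($3/$4)" to the string collected for this key.
                { B[$1] = B[$1] ";" $2 "(" $3 "/" $4 ")" }
        # Print keys in first-seen order; substr(..., 2) drops the leading ";".
        END     { for (j = 1; j <= i; j++) print A[j] "\t" substr(B[A[j]], 2) }
' file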
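
To illustrate what "order" means for the command above, here is a small sketch run on made-up input in which the rows for one key are not adjacent. The data and the `shuffled_input` helper are hypothetical, the `-F'\t'` option assumes tab-separated fields, and the `!($1 in B)` test is a stylistic substitute for `!B[$1]` that should behave the same way here.

```shell
# Hypothetical input: same four columns as the thread example, but with
# the ESR1 rows interleaved with other keys.
shuffled_input() {
    printf 'GATA3\tNA\tNA\tNA\n'
    printf 'ESR1\tDANAZOL\tagonist\tNA\n'
    printf 'FOXA1\tCisplatin\tNA\tNA\n'
    printf 'ESR1\tCISPLATIN\tNA\tFDA_approved\n'
}

ordered=$(shuffled_input | awk -F'\t' '
    # Record each key once, in the order it first appears.
    !($1 in B) { A[++i] = $1 }
    # Collect ";$2($3/$4)" for every line of that key.
               { B[$1] = B[$1] ";" $2 "(" $3 "/" $4 ")" }
    # Emit keys in first-seen order, dropping the leading ";".
    END        { for (j = 1; j <= i; j++) print A[j] "\t" substr(B[A[j]], 2) }
')

printf '%s\n' "$ordered"
```

On this input the keys come out as GATA3, ESR1, FOXA1, which is the order of first appearance, and both ESR1 rows are merged into one line even though they were not adjacent.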

# 4  
Thank you both.
 
