Help with reformat data set


 
# 1  
Old 10-31-2012

Input file
Code:
4CL1	O24145	CoA1	
4CL1	P31684	CoA1	
4CL1	Q54P77	CoA_1	
73	O36421	Unknown	
4CL3	Q9S777	coumarate	
4CL3	Q54P79	coumarate	
4CL3	QP7932	coumarate

Desired output result
Code:
4CL1	O24145#P31684	CoA1	
4CL1	Q54P77	CoA_1	
73	O36421	Unknown	
4CL3	Q9S777#Q54P79#QP7932	coumarate

I have a long input file like the one shown above.
If column 1 and column 3 of several lines have exactly the same content, I would like to merge their column 2 values with a "#" and print the result between the column 1 and column 3 info.

Thanks for any advice.

Last edited by perl_beginner; 10-31-2012 at 02:00 AM..
# 2  
Old 10-31-2012
I have a similar question too.
# 3  
Old 10-31-2012
If the file is sorted, a quick and dirty one:
Code:
awk '{if ( a[$1] != $3 ) {if ( L ){print " "L};printf "%s %s",$1,$2}else{printf "#%s",$2};L=$3;a[$1]=L}END{print " "L}' infile

Code:
4CL1 O24145#P31684 CoA1
4CL1 Q54P77 CoA_1
73 O36421 Unknown
4CL3 Q9S777#Q54P79#QP7932 coumarate
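
For anyone following along, here is a more readable sketch of the same streaming idea. It assumes the input is tab-separated and already grouped, so that lines to be merged are adjacent. The wrapper name `merge_adjacent` is made up for this example. It compares column 1 and column 3 together and writes tab-separated output:

```shell
# Sketch only: merge column 2 for consecutive lines that share columns 1 and 3.
# Assumptions: tab-separated input, matching lines are adjacent (e.g. sorted).
# merge_adjacent is a hypothetical helper name, not something from the thread.
merge_adjacent() {
  awk -F'\t' -v OFS='\t' '
    {
      key = $1 FS $3                     # group on column 1 AND column 3
      if (NR > 1 && key == prev) {
        ids = ids "#" $2                 # same group: append this ID
      } else {
        if (NR > 1) print c1, ids, c3    # new group: flush the previous one
        c1 = $1; ids = $2; c3 = $3       # start collecting the new group
      }
      prev = key
    }
    END { if (NR > 0) print c1, ids, c3 }  # flush the last group
  ' "$@"
}
```

Call it as `merge_adjacent infile`. If the file is not grouped, sorting it first on columns 1 and 3 (for example `sort -t "$(printf '\t')" -k1,1 -k3,3 infile`) should make matching lines adjacent, at the cost of the original line order.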

This User Gave Thanks to Klashxx For This Post:
# 4  
Old 10-31-2012
Code:
awk '{a[$3]=a[$3]?a[$3]"#"$2:$1" "$2}END{for(i in a){print a[i],i}}' file
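
Two things to be aware of with the one-liner above: it groups on column 3 only (fine for the sample, where column 3 always goes with the same column 1), and `for (i in a)` returns the entries in an unspecified order. A possible variant that keys on both columns and keeps first-seen order is sketched below. It assumes tab-separated input; `merge_by_key` and the array names are made up for this example:

```shell
# Sketch only: merge column 2 for lines sharing columns 1 and 3, in any input order.
# Assumption: tab-separated input. merge_by_key is a hypothetical helper name.
merge_by_key() {
  awk -F'\t' -v OFS='\t' '
    {
      key = $1 FS $3                     # composite key: column 1 + column 3
      if (key in ids) {
        ids[key] = ids[key] "#" $2       # seen before: append this ID
      } else {
        order[++n] = key                 # remember the order of first appearance
        c1[n] = $1; c3[n] = $3
        ids[key] = $2
      }
    }
    END {
      for (i = 1; i <= n; i++)           # numeric loop keeps first-seen order
        print c1[i], ids[order[i]], c3[i]
    }
  ' "$@"
}
```

Usage would be `merge_by_key file`. It holds one entry per distinct column 1 / column 3 pair in memory, which should be fine for a long list of this kind.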

This User Gave Thanks to pamu For This Post:
# 5  
Old 10-31-2012
Quote:
Originally Posted by pamu
Code:
awk '{a[$3]=a[$3]?a[$3]"#"$2:$1" "$2}END{for(i in a){print a[i],i}}' file

Pamu, could you explain how this works:
Code:
a[$3]=a[$3]?a[$3]"#"$2:$1" "$2

I never really understood what that does exactly...
# 6  
Old 10-31-2012
Quote:
Originally Posted by Subbeh
Code:
a[$3]=a[$3]?a[$3]"#"$2:$1" "$2

I never really understood what that does exactly...
Please check this

Just replace s with a[$3] here.
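
In case it helps, `cond ? x : y` in awk is the conditional expression: it yields `x` when `cond` is true and `y` otherwise. Below is a sketch of the same one-liner written with a plain if/else, which should behave the same way. The wrapper name `merge_col3` is made up for this example:

```shell
# Sketch only: the ternary one-liner spelled out as if/else.
# merge_col3 is a hypothetical helper name, not something from the thread.
merge_col3() {
  awk '
    {
      if (a[$3])                  # non-empty: this column 3 value was seen before
        a[$3] = a[$3] "#" $2      # so append "#" and the new column 2 value
      else
        a[$3] = $1 " " $2         # first time: start with column 1, a space, column 2
    }
    END { for (i in a) print a[i], i }   # note: output order is unspecified
  ' "$@"
}
```

So on the first line for a given column 3 the array entry is empty (false) and gets initialised, and on every later line it is non-empty (true) and gets extended.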
These 2 Users Gave Thanks to pamu For This Post: