Help with file manipulation


 
# 1  
Old 04-12-2011

Dear All,
I have a question. For the following sample file, I would like to collect the entries in $F[0], $F[1], $F[3] and $F[4], grouped by position ($F[1]), to achieve the expected output shown below.
Code:
SK1.chr01    854    levure5    A    G    225    .    DP=407;AF1=0.5;CI95=0.5,0.5;DP4=142,103,72,68;MQ=31;FQ=225;PV4=0.24,1,1,1    GT:PL:GQ;telomere;ID=TEL01L;Name=TEL01L
SK1.chr01    854    levure6    A    G    199    .    DP=360;AF1=0.5;CI95=0.5,0.5;DP4=138,80,70,48;MQ=31;FQ=202;PV4=0.48,1,1,1    GT:PL:GQ;telomere;ID=TEL01L;Name=TEL01L
SK1.chr01    854    levure7    A    G    225    .    DP=163;AF1=0.5;CI95=0.5,0.5;DP4=60,30,37,19;MQ=31;FQ=225;PV4=1,0.26,1,1    GT:PL:GQ;telomere;ID=TEL01L;Name=TEL01L
SK1.chr01    854    levure8    A    G    225    .    DP=194;AF1=0.5;CI95=0.5,0.5;DP4=66,46,40,28;MQ=31;FQ=225;PV4=1,1,1,1    GT:PL:GQ;telomere;ID=TEL01L;Name=TEL01L
SK1.chr01    12745    levure5    C    G    185    .    DP=125;AF1=0.5;CI95=0.5,0.5;DP4=45,21,25,20;MQ=23;FQ=155;PV4=0.23,1,1,1    GT:PL:GQ
SK1.chr01    12745    levure6    C    G    197    .    DP=85;AF1=0.5;CI95=0.5,0.5;DP4=30,15,18,18;MQ=23;FQ=153;PV4=0.17,1,1,1    GT:PL:GQ
SK1.chr01    12745    levure7    C    G    152    .    DP=36;AF1=0.5;CI95=0.5,0.5;DP4=10,7,11,6;MQ=22;FQ=42;PV4=1,1,1,1    GT:PL:GQ
SK1.chr01    12745    levure8    C    G    173    .    DP=63;AF1=0.5;CI95=0.5,0.5;DP4=21,16,12,14;MQ=23;FQ=98;PV4=0.45,1,1,1    GT:PL:GQ
SK1.chr02    16511    levure5    G    A    148    .    DP=43;AF1=1;CI95=1,1;DP4=2,1,16,19;MQ=24;FQ=-85;PV4=0.59,5.9e-05,1,1    GT:PL:GQ
SK1.chr02    16511    levure6    G    A    127    .    DP=35;AF1=0.5;CI95=0.5,0.5;DP4=4,3,7,16;MQ=25;FQ=30;PV4=0.37,0.0035,0.24,1    GT:PL:GQ

Expected output:
Code:
chr01    854     AAAA     GGGG
chr01    12745   CCCC     GGGG
chr02    16511   GG       AA

Could someone help me figure out a way to do this?
Cheers, and have a nice day.
# 2  
Old 04-12-2011
Code:
nawk '{x=$1;sub(".*[.]","",x);A[$2]=x;B[$2]=B[$2]$4""}END{for(i in A) print A[i],i,B[i]}' inputfile

Oops, I missed $5. Here you go:

Code:
nawk '{x=$1;sub(".*[.]","",x);A[$2]=x;B[$2]=B[$2]$4"";C[$2]=C[$2]$5""}END{for(i in A) print A[i],i,B[i],C[i]}' inputfile
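
A note on the one-liner above: it keys the arrays on $2 alone, so two sites that share a position on different chromosomes would be merged, and `for(i in A)` visits the keys in an unspecified order. Below is a minimal sketch of a variant that keys on chromosome plus position and prints in input order. It assumes whitespace-separated columns as in the sample; the function name `collapse_alleles` and the array names are hypothetical, not from the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): one output line per
# chromosome/position pair, in the order the pairs first appear.
# Usage: collapse_alleles inputfile
collapse_alleles() {
    awk -v OFS='\t' '
    {
        chr = $1
        sub(/.*[.]/, "", chr)        # "SK1.chr01" -> "chr01"
        key = chr SUBSEP $2          # key on chromosome AND position
        if (!(key in ref))           # first time this site is seen
            order[++n] = key         # remember the input order
        ref[key] = ref[key] $4       # append column 4
        alt[key] = alt[key] $5       # append column 5
    }
    END {
        for (j = 1; j <= n; j++) {
            split(order[j], part, SUBSEP)
            print part[1], part[2], ref[order[j]], alt[order[j]]
        }
    }' "$@"
}
```

The output is tab-separated because of `-v OFS='\t'`; drop that option to get single spaces as in the original command.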

# 3  
Old 04-12-2011
Thanks, ctsgnb.
nawk was not supported, but the awk version worked.
Cheers

---------- Post updated at 10:28 AM ---------- Previous update was at 10:25 AM ----------

Could you please comment the code? If you have a Perl version, I would appreciate that too.
# 4  
Old 04-12-2011
Perl solution:
Code:
perl -alne '$F[0]=~s/.*\.//;push @{$a{$F[1]}},$F[3];push @{$b{$F[1]}},$F[4];$c{$F[1]}=$F[0];END{for $i (keys %a){print "$c{$i}\t$i\t",@{$a{$i}},"\t",@{$b{$i}}}}' file

# 5  
Old 04-13-2011
Thanks, Bartus.
Have a nice day.
# 6  
Old 04-13-2011
nawk '{                              # call nawk
x=$1                                 # store the first field in variable x
sub(".*[.]","",x)                    # delete everything up to and including the last dot in x, leaving only the chr<nn> part
A[$2]=x                              # associative array A, indexed by field 2, stores the value of x
B[$2]=B[$2]$4""                      # append field 4 to the element of B indexed by field 2 (the trailing "" forces the value to be treated as a string)
C[$2]=C[$2]$5""                      # append field 5 to the element of C indexed by field 2
}END{                                # runs once the whole file has been scanned and the arrays are built
for(i in A)                          # loop over the indexes of A, i.e. the $2 values found earlier
print A[i],i,B[i],C[i]}' inputfile   # print the wanted output
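
To see what the `sub(".*[.]","",x)` step does on its own, here is a small sketch. The function name `strip_prefix` and the extra test strings are made up for illustration. Because `.*` is greedy, everything up to and including the last dot is removed, and a string without a dot is left unchanged.

```shell
#!/bin/sh
# Hypothetical helper: print the first field with everything up to
# and including its last dot removed.
strip_prefix() {
    awk '{ x = $1; sub(/.*[.]/, "", x); print x }'
}

printf 'SK1.chr01\na.b.chr02\nchr03\n' | strip_prefix
# prints:
# chr01
# chr02
# chr03
```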
# 7  
Old 04-13-2011
Thanks for the comments, ctsgnb.
Have a nice day.
 