Reformatting a file for biological purpose


 
# 1  
Old 04-08-2013

Dear ALL,
I would really appreciate it if you could help me reformat a file as follows.

The file contains a list of genomic coordinates. Each line holds a score value, and the associated chromosome is given in the preceding line starting with chrom.

If a chrom line is followed by several scores, the first score starts at start, and each following score starts one position further (start plus n for the score n lines later):

original file
Code:
chrom=chr1 start=3000306
0.006
0.010
0.014
chrom=chrX start=40000306
0.014
chrom=chr9 start=80000306 
0.1
0.2

I would like to obtain a file like this:
Code:
#chr #start #end #score

chr1 3000306 3000307 0.006
chr1 3000307 3000308 0.010
chr1 3000308 3000309 0.014
chrX 40000306 40000307 0.014
chr9 80000306 80000307 0.1
chr9 80000307 80000308 0.2

Let me know if you can help me; the original file is up to 1 GB and is impossible to edit by hand.
Thanks,
Paolo

Last edited by paolo.kunder; 04-08-2013 at 07:42 AM..
# 2  
Old 04-08-2013
Try this:
Code:
awk '/^chrom/{split($1,a,"=");split($2,b,"=");next} { printf "%s\t%10d\t%10d\t%f\n",a[2],b[2],b[2]+1,$1;b[2]++}' filename
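Note that the printf format above pads the coordinates to 10 characters and prints the score with six decimals (0.006 becomes 0.006000). If the output should match the requested layout exactly, a variant along these lines might work. It is only a sketch: the function name to_intervals is made up for illustration, and it assumes every header line carries both chrom= and start= and that blank lines can be skipped.

```shell
# Hypothetical helper: reads the input from file arguments, or stdin if none are given.
to_intervals() {
  awk '
    BEGIN { print "#chr", "#start", "#end", "#score" }
    /^chrom=/ {
      # Parse the key=value fields of a header line (assumes chrom= and start= are present).
      for (i = 1; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == "chrom") chr = kv[2]
        if (kv[1] == "start") pos = kv[2]
      }
      next
    }
    # Any non-empty line is a score: emit one 1-bp interval, then advance the position.
    NF { print chr, pos, pos + 1, $1; pos++ }
  ' "$@"
}
```

Usage would be something like `to_intervals filename > output.txt`. awk reads the file line by line, so a 1 GB input should not need to fit in memory. The score is printed as it appears in the input, so 0.010 stays 0.010.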

This User Gave Thanks to pravin27 For This Post:
# 3  
Old 04-08-2013
Amazing, thanks again for all your support!