Run a program-print parameters to output file-replace op file contents with max 4th col


 
# 1  
Old 01-23-2013

Hi Friends,

This is the only approach I can see for my task, so any help is highly appreciated.

I have a file

Code:
cat input1.bed

chr1 100 200 abc
chr1 120 300 def
chr1 145 226 ghi
chr2 567 600 unix

I also have another file named

Code:
input2.bed

This is a binary file, so it cannot be read in the terminal. However, there is a program used in our field that takes input2.bed as its input file:

Code:
program input_file -chrom -start -end output_file

My task is this:

1. Read each record of input1.bed.

2. Feed each record to the program as shown here, so that the program runs once per record in input1.bed and writes an output file named after that record:

Code:
program input2.bed -chrom=chr1 -start=100 -end=200 chr1_100_200_op.bed
program input2.bed -chrom=chr1 -start=120 -end=300 chr1_120_300_op.bed
program input2.bed -chrom=chr1 -start=145 -end=226 chr1_145_226_op.bed
program input2.bed -chrom=chr2 -start=567 -end=600 chr2_567_600_op.bed

3. For example, take the first output file, chr1_100_200_op.bed:

Code:
cat chr1_100_200_op.bed

chr1 110 120 45.67
chr1 177 189 98.50
chr1 195 200 111.11

4. Ignore the first three columns of this output file and find the maximum value in the fourth column, which is 111.11 here. Then replace the entire contents of chr1_100_200_op.bed with a single line holding the file name (without the _op.bed suffix) and that maximum:

Code:
cat chr1_100_200_op.bed

chr1_100_200 111.11

That's it. Please ask as many questions as you need to arrive at a better solution. Thanks a ton for all your time.
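Steps 3 and 4 can be sketched on the sample data as follows. This is only an illustration: it recreates the example chr1_100_200_op.bed in a scratch directory instead of running the real program, and the variable names and the .tmp file are assumptions, not part of the original task.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the sample output file from the post (stand-in for the real program's output).
cat > chr1_100_200_op.bed <<'EOF'
chr1 110 120 45.67
chr1 177 189 98.50
chr1 195 200 111.11
EOF

f=chr1_100_200_op.bed

# Keep the largest 4th-column value, then print "<name without _op.bed> <max>".
awk -v name="${f%_op.bed}" '
    NR == 1 || $4 + 0 > max + 0 { max = $4 }
    END { print name, max }
' "$f" > "$f.tmp" && mv "$f.tmp" "$f"

cat "$f"
```

With the three sample rows, the file ends up holding the single line chr1_100_200 111.11.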
# 2  
Old 01-23-2013
Code:
while read -r CHROM START END NAME
do
        OUT="${CHROM}_${START}_${END}_op.bed"

        # Create the bed file
        program input2.bed -chrom="$CHROM" -start="$START" -end="$END" "$OUT"

        # Replace column 1 with filename,
        # column 2 with the last column,
        # reduce it to 2 columns,
        # and print all lines.
        awk '{ $1=F; $2=$NF; NF=2 } 1' F="${CHROM}_${START}_${END}" "$OUT" > /tmp/$$
        cat /tmp/$$ > "$OUT"
done < input1.bed
# Remove temporary file
rm -f /tmp/$$

For 3 and 4, you start with 3 lines and end with 1 line. Is this intended? I've assumed it's not, that you want 3 lines out for 3 lines in.
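To make the "3 lines out for 3 lines in" behaviour concrete, here is a small sketch that runs the same awk program on the sample rows from post #1. The file names sample_op.bed and sample_out.txt are made up for the illustration.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

cat > sample_op.bed <<'EOF'
chr1 110 120 45.67
chr1 177 189 98.50
chr1 195 200 111.11
EOF

# Column 1 becomes F, column 2 becomes the last column,
# the record is cut down to 2 fields, and every line is printed.
awk '{ $1=F; $2=$NF; NF=2 } 1' F="chr1_100_200" sample_op.bed > sample_out.txt

cat sample_out.txt
```

Every input row survives, each reduced to the file name plus its own fourth column, so no maximum is taken at this stage.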
This User Gave Thanks to Corona688 For This Post:
# 3  
Old 01-23-2013
Hi Corona,

Thanks for your time.

Yes, it is intended. For steps 3 and 4, the output file usually has thousands of records, but I only want the maximum value of the fourth column, printed next to the file name.

So the three records are dropped and only one record remains, as in the example.
# 4  
Old 01-23-2013
Code:
while read -r CHROM START END NAME
do
        OUT="${CHROM}_${START}_${END}_op.bed"

        # Create the bed file
        program input2.bed -chrom="$CHROM" -start="$START" -end="$END" "$OUT"

        # Track the maximum of the last column in M
        # (NR==1 seeds it, so a maximum of 0 or below is handled too),
        # then print a single line: filename and maximum.
        awk 'NR==1 || $NF+0 > M+0 { M=$NF } END { print F, M }' F="${CHROM}_${START}_${END}" "$OUT" > /tmp/$$
        cat /tmp/$$ > "$OUT"
done < input1.bed
# Remove temporary file
rm -f /tmp/$$
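For anyone who wants to try the loop without the real tool, here is a dry-run sketch. The program function below is a hypothetical stand-in that writes made-up rows; it is not the real program and does not read input2.bed. The mktemp calls and the two-record input1.bed are also assumptions made for the test.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical stub: ignores input2.bed and writes invented rows
# to the output file, which is the fifth argument.
program() {
    out=$5
    case $2 in
        -chrom=chr1) printf 'chr1 110 120 45.67\nchr1 195 200 111.11\n' > "$out" ;;
        *)           printf 'chr2 570 580 7.5\n' > "$out" ;;
    esac
}

printf 'chr1 100 200 abc\nchr2 567 600 unix\n' > input1.bed

tmp=$(mktemp)
while read -r CHROM START END NAME
do
    base="${CHROM}_${START}_${END}"
    program input2.bed -chrom="$CHROM" -start="$START" -end="$END" "${base}_op.bed"
    # Reduce the output file to one line: file name and maximum of the last column.
    awk 'NR==1 || $NF+0 > M+0 { M=$NF } END { print F, M }' F="$base" "${base}_op.bed" > "$tmp"
    cat "$tmp" > "${base}_op.bed"
done < input1.bed
rm -f "$tmp"

cat chr1_100_200_op.bed chr2_567_600_op.bed
```

One caveat that holds for any loop of this shape: if the real program reads standard input, it will consume lines from input1.bed, so it may need its stdin redirected from /dev/null.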

This User Gave Thanks to Corona688 For This Post:
# 5  
Old 01-24-2013
Thanks, Corona, for your quick solution. It took me a while to prepare my input files and cross-check the output files.

The only problem I am seeing is that, for some combinations of start and end, there is no data in my input2.bed.

In that case the fourth column of the output is left blank, for example:

Code:
cat output.bed
chr1 100 200 45.09999
chr1 120 130 
chr1 145 178 78.999

How do I replace the empty fourth column with "ND"?

My desired output would be

Code:
cat output.bed
chr1 100 200 45.09999
chr1 120 130 ND
chr1 145 178 78.999

# 6  
Old 01-24-2013
Try sed -i 's/ $/ ND/' output.bed
This User Gave Thanks to Corona688 For This Post:
# 7  
Old 01-24-2013
Quote:
Originally Posted by Corona688
Try sed -i 's/ $/ ND/' output.bed
It's not generating any output.
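One possible explanation, offered as a guess and not confirmed anywhere in the thread: sed -i rewrites the file in place and prints nothing to the terminal, so the change only shows up when output.bed is displayed again. The pattern also matches only lines that still end in a space. The sketch below assumes output.bed looks exactly like the sample in post #5, trailing space included. The .bak suffix and the awk variant are additions for illustration.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

# Sample from post #5; the second line ends in a space.
printf 'chr1 100 200 45.09999\nchr1 120 130 \nchr1 145 178 78.999\n' > output.bed

# Edits the file in place and prints nothing; a backup is kept as output.bed.bak.
sed -i.bak 's/ $/ ND/' output.bed
cat output.bed

# Variant that does not depend on a trailing space:
# fill in column 4 whenever a line has fewer than 4 fields.
awk 'NF < 4 { $4 = "ND" } 1' output.bed.bak > output_nd.bed
cat output_nd.bed
```

Both commands turn the second line into chr1 120 130 ND on this sample. The awk form is the safer choice if an editor or another tool has already stripped the trailing space.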