How to get this script to work on multiple input files


 
# 1  
Old 08-26-2010

Hello guys!

I would like to use awk to perform data extraction from several files. The data files look like this:

Code:
 DWT26R 1 PEP1 CA 1 OH2 SKIPPED: 0 STEP: 1
0.29000E+01 0.55005E-02 0.60012E-03
0.30000E+01 0.11149E+00 0.13603E-01
0.31000E+01 0.39719E+00 0.63013E-01
0.32000E+01 0.94264E+00 0.18784E+00
0.33000E+01 0.17744E+01 0.43749E+00
0.35000E+01 0.32350E+01 0.13273E+01
0.36000E+01 0.34913E+01 0.19104E+01
.
.
.

The first line is unique to each file and contains information I would like to add to the output. Specifically, I need to search for the highest value in $2 and print it together with the first line of that file. Then the next file needs to be processed in the same way.

This works fine for a single file, but how can I do it with multiple files? I think I somehow need to associate the information from the unique first line with the values of each file and store it in an array. At the end I simply need to print that array containing this information. However, I could not get it to work so far.

The current code that works for a single file is:

Code:
BEGIN     {
    print "trajectory= traj molecules= mol Peptide= pep resid(CA?)= res contact= so (max)solv/sphere= n Radius(A)= r";
    print "traj", "mol", "pep", "res", "co", "n", "     r"; #just a header for the output
    }



# read substrings of the field to rebuild the number from mantissa and exponent
    {
    if (NR==1)    {
            expo=0;
            comp=0;
            co=0;
            max=0;
            maxline=0;    
            traj=$2;
            mol=$1;
            pep=$3;
            res=$5;
            so=$6;
            } #saving file information and resetting comparison set
    else         {
            expo=10^(substr($2,9,3)); #extract exponent
            comp=(substr($2,3,5)/100000); 
            co=comp*expo;
            if (co > max) {max=co; maxline=substr($1,3,5)/100000*10^(substr($1,9,3))} # extract highest value from file
            }
    }



END     { 
    print traj, mol, pep, res, so, max, maxline; #print highest value and information from the first line
    }

Hope you guys can help me out.

Cheers,
Daniel
# 2  
Old 08-26-2010
Like this?
Code:
awk ' BEGIN { h=0 } FNR == 1 { if(header!="") { printf "%s\n%d\n", header, h } header=$0; next  } \
          $2>h { h=$2 } END { printf "%s\n%d\n", header, h } ' file1 file2 file3 ....

# 3  
Old 08-26-2010
Thanks for your suggestion. I will try this as soon as I am back at the lab. However, I am not sure whether I understand everything in your code.

Will this produce output like this:
Code:
First_line_of_file1 $2_of_file1_with_highest_value_in_file1
First_line_of_file2 $2_of_file2_with_highest_value_in_file2
First_line_of_file3 $2_of_file3_with_highest_value_in_file3
.
.
.

cheers,
Daniel
# 4  
Old 08-26-2010
It will produce the following if the three files are identical:
Code:
 DWT26R 1 PEP1 CA 1 OH2 SKIPPED: 0 STEP: 1
3
 DWT26R 1 PEP1 CA 1 OH2 SKIPPED: 0 STEP: 1
3
 DWT26R 1 PEP1 CA 1 OH2 SKIPPED: 0 STEP: 1
3
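
For readers wondering how `FNR == 1` picks out the first line of every file: awk keeps two record counters. `NR` counts lines across all input, while `FNR` restarts at 1 whenever a new file begins. A minimal sketch, using two made-up files (`demo_a.txt` and `demo_b.txt` are hypothetical names, not from the thread):

```shell
# Create two small sample files (hypothetical names and contents).
printf 'header A\n1\n2\n' > demo_a.txt
printf 'header B\n3\n' > demo_b.txt

# NR keeps counting across files; FNR restarts at 1 for each file,
# so FNR == 1 is true exactly once per file, on its first line.
awk '{ print FILENAME, NR, FNR }' demo_a.txt demo_b.txt
```

The line for `header B` should show `NR` as 4 but `FNR` as 1, which is why a `FNR == 1` rule is a natural place for any per-file setup.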

# 5  
Old 08-26-2010
Looks good! I am just wondering where the 3 comes from (is the value rounded?). In the example, the highest value of $2 is 0.34913E+01.

And another question, just for my understanding: where is it specified that h is reset to 0 when a new file starts, so that the highest value of that specific file is extracted?

I am trying to grasp as much as possible, which is why I ask so much :)

Cheers,
Daniel



Edit:
Oh, maybe I got it! Right at the start, in BEGIN, you set h=0.

Edit2:

Oh, now I got it: printf "%s\n%d\n" prints header as a string and h as a decimal integer, hence the 3 :)

Last edited by Daniel8472; 08-26-2010 at 11:47 AM..
# 6  
Old 08-26-2010
Correct.

"BEGIN { h=0 }" initializes h (the highest value so far) to zero so that it can be compared with every $2 in the file.

In fact, you can use printf "%e", number to print numbers in exponential notation, and printf "%f", number to print them in fixed-point notation.
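
A quick way to compare the three conversions side by side is the sketch below. It uses the value 3.4913 from the example data; `LC_ALL=C` is an assumption added here so that the decimal separator is a point regardless of the system locale.

```shell
LC_ALL=C awk 'BEGIN {
    h = 3.4913
    printf "%d\n", h    # integer conversion, fraction dropped: 3
    printf "%e\n", h    # exponential notation: 3.491300e+00
    printf "%f\n", h    # fixed-point, six decimals by default: 3.491300
}'
```

A precision can be added if fewer digits are wanted, for example `%.4f`.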
# 7  
Old 08-27-2010
Good morning!

I am back in the lab and have just used the script (I needed to modify some parts because some files contain "," instead of ".").

One problem remains so far: when a new file is read, the value of "h" is not reset to zero. So unless the next file contains a higher value in $2, the highest $2 value of the previous files is kept and printed.

Code:
BEGIN { h=0 }

FNR == 1 {
    if (header != "") {
        #printf "%s\n%d\n", header, h;
        print header, h;
    }
    header=$0; next;
}

substr($2,3,5)/100000*10^(substr($2,9,3)) > h {
    h=substr($2,3,5)/100000*10^(substr($2,9,3))
}

END {
    print header, h;
    #printf "%s\n%d\n", header, h;
}

Output:


Code:
 DWT26R 1 PEP1 CA 10 OH2 SKIPPED: 0 STEP: 1 3,2829
 DWT26R 1 PEP2 CA 10 OH2 SKIPPED: 0 STEP: 1 3,5248
 DWT26R 1 PEP1 CA 11 OH2 SKIPPED: 0 STEP: 1 4,3229
 DWT26R 1 PEP2 CA 11 OH2 SKIPPED: 0 STEP: 1 4,3229
 DWT26R 1 PEP1 CA 12 OH2 SKIPPED: 0 STEP: 1 6,8575
 DWT26R 1 PEP2 CA 12 OH2 SKIPPED: 0 STEP: 1 6,8575
 DWT26R 1 PEP1 CA 13 OH2 SKIPPED: 0 STEP: 1 6,8575
 DWT26R 1 PEP2 CA 13 OH2 SKIPPED: 0 STEP: 1 6,8575
 DWT26R 1 PEP1 CA 14 OH2 SKIPPED: 0 STEP: 1 6,8575
 DWT26R 1 PEP2 CA 14 OH2 SKIPPED: 0 STEP: 1 6,8575
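
The repeated 6,8575 in the output above is consistent with h being set only once, in BEGIN, which runs before any file is read. One possible fix, offered as a sketch rather than as the thread's confirmed solution, is to reset h in the `FNR == 1` rule, right after the previous file's result has been printed. The file names `rdf_1.dat` and `rdf_2.dat`, the script name `max_per_file.awk`, the helper `report` and the variable `v` are all hypothetical. The sketch also assumes that the values in $2 are non-negative, that a decimal comma can simply be replaced by a point, and that awk's normal string-to-number conversion (in place of the substr arithmetic) is acceptable. `LC_ALL=C` is set so that numbers are parsed and printed with a point.

```shell
# Two hypothetical sample files in the format from the first post;
# the second one uses a decimal comma.
cat > rdf_1.dat <<'EOF'
 DWT26R 1 PEP1 CA 10 OH2 SKIPPED: 0 STEP: 1
0.29000E+01 0.55005E-02 0.60012E-03
0.36000E+01 0.34913E+01 0.19104E+01
EOF
cat > rdf_2.dat <<'EOF'
 DWT26R 1 PEP2 CA 10 OH2 SKIPPED: 0 STEP: 1
0,29000E+01 0,11149E+00 0,13603E-01
0,30000E+01 0,94264E+00 0,18784E+00
EOF

cat > max_per_file.awk <<'EOF'
# Print the finished file's header and maximum, if a file has been read.
function report() { if (header != "") printf "%s %.4f\n", header, h }

# First line of each file: emit the previous result, store the new
# header, and reset h so that the maximum is tracked per file.
FNR == 1 { report(); header = $0; h = 0; next }

{
    v = $2
    sub(/,/, ".", v)    # accept a decimal comma as well as a point
    v += 0              # numeric conversion of e.g. 0.34913E+01
    if (v > h) h = v
}

# The last file has no following header line, so report it here.
END { report() }
EOF

LC_ALL=C awk -f max_per_file.awk rdf_1.dat rdf_2.dat
```

With these two sample files the second line of output should end in 0.9426 rather than repeating the 3.4913 of the first file, because h starts from zero again for `rdf_2.dat`.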
