[awk] line by line processing the same file


 
# 1  
Old 10-02-2012

Hey, I'm not too good at this, so I only managed a clumsy and SLOW solution to my problem, and it needs a drastic speed-up. Any ideas how to write the following in awk only?

What the code is supposed to do:
For every line, read the column values $6, $7 and $8 and compute the distance to the same columns of every other line in the same file. If the conditions are met (distance non-zero and within the cutoff), write the two atom types and the distance out to a file.

CODE:
Code:
while read -r line; do
    XI=$(echo "$line" | awk '{print $6}')
    YI=$(echo "$line" | awk '{print $7}')
    ZI=$(echo "$line" | awk '{print $8}')
    ATOM_TYPE=$(echo "$line" | awk '{print $3}')
    awk -v xi="$XI" -v yi="$YI" -v zi="$ZI" -v atom="$ATOM_TYPE" -v cutoff="$DISTCUT" '{dist=sqrt((xi-$6)^2 + (yi-$7)^2 + (zi-$8)^2); if (dist <= cutoff && dist != 0) print atom, $3, dist}' sub_oxy_high >> oxy_dist_all
done < sub_oxy_high

INPUT:
Code:
ATOM   5202   C3  TB   347      47.749   6.795 193.827
ATOM   5203   C4  TB   347      46.729   7.915 193.597
ATOM   5204   O5  TB   347      47.109   9.075 193.407
ATOM   5205   O6  TB   347      45.329   7.594 193.517
...

OUTPUT:
Code:
C3 C4 9.999
C3 O5 9.999
C3 O6 9.999
...

# 2  
Old 10-02-2012
And what's the value of DISTCUT for the output posted?

Try:
Code:
awk '{atom[NR]=$3;xi[NR]=$6;yi[NR]=$7;zi[NR]=$8}
END{
for(i=1;i<=NR;i++)
 for(j=1;j<=NR;j++)
 {
  if(j==i) continue
  dist=sqrt((xi[i]-xi[j])^2 + (yi[i]-yi[j])^2 + (zi[i]-zi[j])^2)
  if(dist!=0 && dist<=cutoff)
   print atom[i],atom[j],dist
 }
}' cutoff="$DISTCUT" sub_oxy_high > oxy_dist_all
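
Side note on the cutoff="$DISTCUT" at the end of that command: awk accepts variable assignments as operands after the program text, not only via -v. Under POSIX awk such an assignment takes effect when awk reaches it in the argument list, so the value is not yet set in BEGIN but is available in the main rules and in END (which is all the script above needs). A minimal sketch of the difference; x, y and out are throwaway names, and /dev/null just stands in for an empty input file:

```shell
# y is passed with -v (set before BEGIN runs);
# x is passed as an operand (set only when awk reaches it in the argument list).
out=$(awk -v y=2 'BEGIN { print "BEGIN: x=" x " y=" y }
                  END   { print "END: x=" x " y=" y }' x=1 /dev/null)
printf '%s\n' "$out"
# BEGIN: x= y=2
# END: x=1 y=2
```

If the cutoff were ever needed inside a BEGIN block, -v cutoff="$DISTCUT" would be the form to use.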


Last edited by elixir_sinari; 10-02-2012 at 07:48 AM..
# 3  
Old 10-02-2012
Code:
awk '{for (i=3; i<=NF; i++) TMP[NR,i]=$i}    # store fields 3..NF of every line
     END {
       for (i=1; i<=NR; i++)
         for (j=NR; j>i; j--) {                # visit each pair once, j running from the last line down to i+1
           dist = sqrt((TMP[i,6]-TMP[j,6])^2 + (TMP[i,7]-TMP[j,7])^2 + (TMP[i,8]-TMP[j,8])^2)
           if (dist != 0 && dist <= co) print TMP[i,3], TMP[j,3], dist
         }
     }' co="$DISTCUT" sub_oxy_high

With the data from your example:
Code:
C3 O6 2.56728
C3 O5 2.40508
C3 C4 1.53222
C4 O6 1.43856
C4 O5 1.23535
O5 O6 2.31816

@elixir_sinari: too fast for me! But - you're outputting each pair of atoms twice; not sure if that's desired...
# 4  
Old 10-02-2012
Quote:
Originally Posted by RudiC
@elixir_sinari: too fast for me! But - you're outputting each pair of atoms twice; not sure if that's desired...
Is it? But then, that's a "faithful" conversion of that loop to an awk script.
Code:
C3 C4 1.53222
C3 O5 2.40508
C3 O6 2.56728
C4 C3 1.53222
C4 O5 1.23535
C4 O6 1.43856
O5 C3 2.40508
O5 C4 1.23535
O5 O6 2.31816
O6 C3 2.56728
O6 C4 1.43856
O6 O5 2.31816

is the output for the sample.
# 5  
Old 10-02-2012
Quote:
Originally Posted by elixir_sinari
Is it?
Yes: e.g.
Code:
C3 C4 1.53222
C4 C3 1.53222

But maybe that's desired?
# 6  
Old 10-02-2012
Quote:
Originally Posted by RudiC
But maybe that's desired?
If it is not desired, a slight tweak will do the trick.
Code:
awk '{atom[NR]=$3;xi[NR]=$6;yi[NR]=$7;zi[NR]=$8}
END{
for(i=1;i<=NR;i++)
 for(j=i+1;j<=NR;j++)
 {
  dist=sqrt((xi[i]-xi[j])^2 + (yi[i]-yi[j])^2 + (zi[i]-zi[j])^2)
  if(dist!=0 && dist <=cutoff)
   print atom[i],atom[j],dist
 }
}' cutoff="$DISTCUT" sub_oxy_high > oxy_dist_all

# 7  
Old 10-02-2012
You guys are awesome, thanks all around... Double entries were not desired, I just left the issue out because I didn't want to cause confusion.

DISTCUT=3.5 by the way, a geometric hydrogen bonding criterion in angstrom...

This forum is so good.
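
A quick way to check the script from post #6 against the four sample lines from post #1 with DISTCUT=3.5. This is only a sketch: the scratch file from mktemp and the shell variable names sample and pairs are made up for the test and are not part of the original setup.

```shell
# Recreate the four sample lines from post #1 in a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ATOM   5202   C3  TB   347      47.749   6.795 193.827
ATOM   5203   C4  TB   347      46.729   7.915 193.597
ATOM   5204   O5  TB   347      47.109   9.075 193.407
ATOM   5205   O6  TB   347      45.329   7.594 193.517
EOF

DISTCUT=3.5   # cutoff in angstrom, as given in post #7

# Script from post #6: store atom type and coordinates per line,
# then compare every line i with every later line j (each pair once).
pairs=$(awk '{atom[NR]=$3; xi[NR]=$6; yi[NR]=$7; zi[NR]=$8}
END {
  for (i=1; i<=NR; i++)
    for (j=i+1; j<=NR; j++) {
      dist = sqrt((xi[i]-xi[j])^2 + (yi[i]-yi[j])^2 + (zi[i]-zi[j])^2)
      if (dist != 0 && dist <= cutoff)
        print atom[i], atom[j], dist
    }
}' cutoff="$DISTCUT" "$sample")

printf '%s\n' "$pairs"
rm -f "$sample"
```

With the sample data this lists the same six pairs as the outputs posted above, each pair once, in ascending line order (C3 C4 first, O5 O6 last). All six distances are below 3.5, so the cutoff filters nothing here.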