AWK code for finding distances between atoms in two different files


 
# 1  
Old 09-15-2009

Hi,

I have two separate data files (.xyz type) and I want to find the distances between the coordinates of the atoms in the two files. For example, my first file contains:
1 1 1 11.50910000 5.17730000 16.49360000
3 1 2 11.21790000 6.36062000 15.60660000
6 1 2 11.43950000 7.66053000 16.07200000
2 1 3 11.87750000 7.81529000 17.04670000
where the last three columns are the coordinates, the 1st column is the atom ID, the 2nd column is the molecule ID and the 3rd column is the atom type. I have another file that has the coordinates of different atoms, like:
14 1 7 9.22151000 9.21624000 11.08350000
21 1 8 8.24299000 10.25310000 11.12120000
7 1 6 9.68004000 8.92467000 9.65365000
22 1 2 11.06970000 3.75903000 16.75830000

I want to write an awk script that reads both files, finds the distance between atom ID 1 and atom ID 14, and prints the atom IDs and the distance, like
1 14 7.1284841
where 7.1284841 is the distance between the coordinates of atom ID 1 and atom ID 14. The formula for the distance is
SQRT((X1-X2)^2 + (Y1-Y2)^2 + (Z1-Z2)^2)
I want the code to check the distance between every atom in file 1 and all the atoms in file 2. That is, it should output the distances between atom ID 1 from file 1 and atom IDs 14, 21, 7 and 22 from file 2, and then do the same for atom IDs 3, 6 and 2 from file 1. I do not know how to proceed with writing code for this. I am new to awk.
Please suggest.
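
As a quick sanity check of the formula, here is a minimal sketch that computes one distance with the coordinates of atom 1 and atom 14 typed in by hand from the sample lines above. The variable names (x1, y1, z1, x2, y2, z2, dist_1_14) are only for illustration, and it prints 8 decimal places rather than the 7 shown in the example output.

```shell
# Distance between atom 1 (first file) and atom 14 (second file).
# Coordinates are copied by hand from the sample data; nothing is read from files.
dist_1_14=$(awk 'BEGIN {
  x1 = 11.50910000; y1 = 5.17730000; z1 = 16.49360000   # atom 1
  x2 =  9.22151000; y2 = 9.21624000; z2 = 11.08350000   # atom 14
  printf "%d %d %.8f\n", 1, 14, sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)
}')
echo "$dist_1_14"
```

The `^` operator is the exponentiation operator that POSIX awk defines, so this should behave the same in the common awk implementations.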

Last edited by ananyob; 09-15-2009 at 11:48 PM..
# 2  
Old 09-15-2009
I'd be happy if someone is generous enough to write this for you, but it is quite long.
If you're new to awk, I suggest some reading up:
The GNU Awk User's Guide
# 3  
Old 09-15-2009
Thanks ryandegreat25. I had a look at the manual and I will search it again.

I think I can make it easier by putting all the coordinates in one file, appending one set to the end of the other. In that case I will have 8 lines, and I can calculate the distance between the coordinates in columns 4, 5 and 6 of line 1 and those in the same columns of lines 5, 6, 7 and 8, and print them accordingly. I am still trying to figure out how to do it.
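
The single-file idea could be sketched roughly as follows. This is only an illustration under some assumptions: the shell function name pair_dist_onefile, the variable nfirst (how many lines at the top of the file belong to the first set) and the file name all.xyz are all made up here, and the whole file is held in memory until the END block runs.

```shell
# pair_dist_onefile FILE NFIRST
# FILE holds both coordinate sets, one appended after the other.
# The first NFIRST lines are the first set (NFIRST is an assumed parameter).
pair_dist_onefile() {
  awk -v nfirst="$2" '
    # remember the atom id and the xyz columns of every line
    { id[NR] = $1; x[NR] = $4; y[NR] = $5; z[NR] = $6 }
    END {
      # pair every line of the first set with every line of the second set
      for (i = 1; i <= nfirst; i++)
        for (j = nfirst + 1; j <= NR; j++)
          printf "%d %d %.8f\n", id[i], id[j],
                 sqrt((x[i]-x[j])^2 + (y[i]-y[j])^2 + (z[i]-z[j])^2)
    }' "$1"
}

# Example (hypothetical file names):
#   cat f1 f2 > all.xyz
#   pair_dist_onefile all.xyz 4
```

With the two 4-line sample sets this prints 16 lines, one per pair. The drawback is that the number of lines in the first set has to be passed in by hand, which reading the two files separately avoids.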

Any suggestions?
# 4  
Old 09-16-2009
Quote:
Originally Posted by ananyob
...
i am still trying to figure out how to do it
any suggestions...
Here's a way to do it in awk:

Code:
$ 
$ 
$ cat f1
1 1 1 11.50910000 5.17730000 16.49360000
3 1 2 11.21790000 6.36062000 15.60660000
6 1 2 11.43950000 7.66053000 16.07200000
2 1 3 11.87750000 7.81529000 17.04670000
$ 
$ cat f2
14 1 7 9.22151000 9.21624000 11.08350000
21 1 8 8.24299000 10.25310000 11.12120000
7 1 6 9.68004000 8.92467000 9.65365000
22 1 2 11.06970000 3.75903000 16.75830000
$ 
$ cat calcdist.awk
{
  # create associative arrays that store relevant information
  # from both files - atom id and xyz co-ordinates,
  # indexed by line number within each file
  if (NR == FNR) {
    x[FNR] = $1":"$4":"$5":"$6
    n = FNR      # number of atoms read from the first file
  } else {
    y[FNR] = $1":"$4":"$5":"$6
    m = FNR      # number of atoms read from the second file
  }
}
END {
  # The distance calculating formula is
  # SQRT ((X1-X2)^2 + (Y1-Y2)^2 + (Z1-Z2)^2)
  # (^ is the portable exponentiation operator; ** is not in every awk)
  print "Distance between corresponding atom pairs"
  print "-----------------------------------------"
  # pair line i of file 1 with line i of file 2, up to the shorter file
  k = (n < m) ? n : m
  for (i=1; i<=k; i++) {
    split(x[i], a, ":")
    split(y[i], b, ":")
    printf("%2d %2d %.8f\n",a[1],b[1],sqrt((a[2]-b[2])^2 + (a[3]-b[3])^2 + (a[4]-b[4])^2))
  }
  print "Distance between cross-product of atom pairs"
  print "--------------------------------------------"
  for (i=1; i<=n; i++) {
    split(x[i], a, ":")
    # loop over all m atoms of file 2, which may differ from n
    for (j=1; j<=m; j++) {
      split(y[j], b, ":")
      printf("%2d %2d %.8f\n",a[1],b[1],sqrt((a[2]-b[2])^2 + (a[3]-b[3])^2 + (a[4]-b[4])^2))
    }
  }
}

$ 
$ awk -f calcdist.awk f1 f2
Distance between corresponding atom pairs
-----------------------------------------
 1 14 7.12848415
 3 21 6.64231159
 6  7 6.77413951
 2 22 4.14595714
Distance between cross-product of atom pairs
--------------------------------------------
 1 14 7.12848415
 1 21 8.08046422
 1  7 8.01081509
 1 22 1.50818707
 3 14 5.70951594
 3 21 6.64231159
 3  7 6.66160487
 3 22 2.84897291
 6 14 5.67669318
 6 21 6.43812985
 6  7 6.77413951
 6 22 3.97862564
 2 14 6.67657832
 2 21 7.36641913
 2  7 7.79209489
 2 22 4.14595714
$ 
$

HTH,
tyler_durden
# 5  
Old 09-16-2009
Thanks a lot for the code. Now I will try to understand what each line of it means.
I will edit the code a little bit and use it for my long list of coordinates.
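
For a long list of coordinates, one possible variation (a sketch, not the code posted above) is to keep only the second file in memory and read the first file line by line, printing each distance as soon as it is computed. The function name cross_dist and the array names are assumptions made for this example. The files are handed to awk in reverse order so that the output still comes out grouped by the atoms of the first file.

```shell
# cross_dist FILE1 FILE2
# Prints "id1 id2 distance" for every atom in FILE1 against every atom
# in FILE2. Only FILE2 is held in memory; FILE1 is read line by line.
cross_dist() {
  awk '
    NF < 6 { next }                  # skip blank or short lines
    NR == FNR {                      # first argument to awk: FILE2, store it
      m++
      id[m] = $1; x[m] = $4; y[m] = $5; z[m] = $6
      next
    }
    {                                # second argument to awk: FILE1, stream it
      for (j = 1; j <= m; j++)
        printf "%d %d %.8f\n", $1, id[j],
               sqrt(($4-x[j])^2 + ($5-y[j])^2 + ($6-z[j])^2)
    }' "$2" "$1"
}

# Example: cross_dist f1 f2
```

One caveat: the NR == FNR test cannot tell the two files apart if FILE2 is completely empty, so this sketch assumes both files contain at least one line.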