Assign a particular value to all items in a group that have the same identifier


 
# 1  
Old 07-12-2017

I have a pdb file with the following format:
Code:
ATOM     11  N   PRO A   1      23.223  20.197  14.441  1.00 12.21           N
ATOM     12  CA  PRO A   1      21.881  20.749  14.227  1.00 11.37           C
ATOM     13  C   PRO A   1      21.929  21.556  12.903  1.00 10.73           C
ATOM     14  O   PRO A   1      22.872  22.308  12.668  1.00 12.80           O
ATOM     15  CB  PRO A   1      21.641  21.649  15.437  1.00 10.87           C
ATOM     16  CG  PRO A   1      22.595  21.065  16.473  1.00 11.96           C
ATOM     17  CD  PRO A   1      23.844  20.738  15.680  1.00 12.56           C
ATOM     18  N   GLN A   2      20.920  21.358  12.090  1.00 10.01           N
ATOM     19  CA  GLN A   2      20.848  22.096  10.790  1.00 10.36           C
ATOM     20  C   GLN A   2      20.523  23.577  11.104  1.00  9.48           C
ATOM     21  O   GLN A   2      19.483  23.784  11.751  1.00  9.88           O
ATOM     22  CB  GLN A   2      19.839  21.439   9.860  1.00  9.46           C
ATOM     23  CG  GLN A   2      19.997  22.014   8.451  1.00 10.39           C
ATOM     24  CD  GLN A   2      19.124  21.359   7.433  1.00 11.31           C
ATOM     25  OE1 GLN A   2      18.853  20.153   7.480  1.00 11.96           O
ATOM     26  NE2 GLN A   2      18.609  22.130   6.468  1.00 10.45           N
ATOM     27  N   ALA A   3      21.334  24.475  10.596  1.00  9.29           N
ATOM     28  CA  ALA A   3      21.051  25.904  10.815  1.00  9.15           C
ATOM     29  C   ALA A   3      20.012  26.344   9.800  1.00  9.68           C
ATOM     30  O   ALA A   3      20.253  26.155   8.562  1.00 11.79           O
ATOM     31  CB  ALA A   3      22.339  26.688  10.684  1.00 11.79           C
ATOM     32  N   ILE A   4      18.911  26.884  10.201  1.00  8.39           N
ATOM     33  CA  ILE A   4      17.818  27.322   9.338  1.00  8.72           C
ATOM     34  C   ILE A   4      17.469  28.769   9.682  1.00  8.88           C
ATOM     35  O   ILE A   4      17.202  29.056  10.870  1.00 10.24           O
ATOM     36  CB  ILE A   4      16.576  26.401   9.508  1.00  9.53           C
ATOM     37  CG1 ILE A   4      16.904  24.950   9.073  1.00 10.08           C
ATOM     38  CG2 ILE A   4      15.347  26.971   8.765  1.00 10.36           C
ATOM     39  CD1 ILE A   4      15.720  23.972   9.288  1.00 11.15           C

Columns 61-66 hold the B-factor;
columns 23-26 hold the residue number; and
columns 13-15 hold the atom name.

I need to take the B-factor (columns 61-66) for atom CA (columns 13-15) that corresponds to each residue number (columns 23-26), and write that value down in columns 68 to 73 for all rows with the matching residue number.

CA is always the second atom within each residue, but the complication is that the number of atoms per residue varies.

For example, for the data above, I need to have the following output:
Code:
ATOM     11  N   PRO A   1      23.223  20.197  14.441  1.00 12.21 11.37     N    
ATOM     12  CA  PRO A   1      21.881  20.749  14.227  1.00 11.37 11.37     C    
ATOM     13  C   PRO A   1      21.929  21.556  12.903  1.00 10.73 11.37     C    
ATOM     14  O   PRO A   1      22.872  22.308  12.668  1.00 12.80 11.37     O    
ATOM     15  CB  PRO A   1      21.641  21.649  15.437  1.00 10.87 11.37     C    
ATOM     16  CG  PRO A   1      22.595  21.065  16.473  1.00 11.96 11.37     C    
ATOM     17  CD  PRO A   1      23.844  20.738  15.680  1.00 12.56 11.37     C    
ATOM     18  N   GLN A   2      20.920  21.358  12.090  1.00 10.01 10.36     N    
ATOM     19  CA  GLN A   2      20.848  22.096  10.790  1.00 10.36 10.36     C    
ATOM     20  C   GLN A   2      20.523  23.577  11.104  1.00  9.48 10.36     C    
ATOM     21  O   GLN A   2      19.483  23.784  11.751  1.00  9.88 10.36     O    
ATOM     22  CB  GLN A   2      19.839  21.439   9.860  1.00  9.46 10.36     C    
ATOM     23  CG  GLN A   2      19.997  22.014   8.451  1.00 10.39 10.36     C    
ATOM     24  CD  GLN A   2      19.124  21.359   7.433  1.00 11.31 10.36     C    
ATOM     25  OE1 GLN A   2      18.853  20.153   7.480  1.00 11.96 10.36     O    
ATOM     26  NE2 GLN A   2      18.609  22.130   6.468  1.00 10.45 10.36     N    
ATOM     27  N   ALA A   3      21.334  24.475  10.596  1.00  9.29  9.15     N    
ATOM     28  CA  ALA A   3      21.051  25.904  10.815  1.00  9.15  9.15     C    
ATOM     29  C   ALA A   3      20.012  26.344   9.800  1.00  9.68  9.15     C    
ATOM     30  O   ALA A   3      20.253  26.155   8.562  1.00 11.79  9.15     O    
ATOM     31  CB  ALA A   3      22.339  26.688  10.684  1.00 11.79  9.15     C    
ATOM     32  N   ILE A   4      18.911  26.884  10.201  1.00  8.39  8.72     N    
ATOM     33  CA  ILE A   4      17.818  27.322   9.338  1.00  8.72  8.72     C    
ATOM     34  C   ILE A   4      17.469  28.769   9.682  1.00  8.88  8.72     C    
ATOM     35  O   ILE A   4      17.202  29.056  10.870  1.00 10.24  8.72     O    
ATOM     36  CB  ILE A   4      16.576  26.401   9.508  1.00  9.53  8.72     C    
ATOM     37  CG1 ILE A   4      16.904  24.950   9.073  1.00 10.08  8.72     C    
ATOM     38  CG2 ILE A   4      15.347  26.971   8.765  1.00 10.36  8.72     C    
ATOM     39  CD1 ILE A   4      15.720  23.972   9.288  1.00 11.15  8.72     C

Can anyone help me? Your time is very much appreciated.
# 2  
Old 07-13-2017
Hi, try:
Code:
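awk '
  # First pass (NR==FNR): store the B-factor (columns 61-66) of each CA atom,
  # keyed by residue number (field 6)
  NR==FNR {
    if($3=="CA")
      A[$6]=substr($0,61,6)
    next
  }
  # Second pass: put the stored value into columns 67-72
  {
    print substr($0,1,66) A[$6] substr($0,73)
  }
'  file.pdb file.pdb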

--
Note: the file needs to be specified twice, since the script reads it in two passes.
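
A possible variant, in case the whitespace-separated fields ever run together (PDB is a fixed-column format, so for example a four-digit residue number ends up glued to the chain ID): the sketch below picks the atom name and the residue key by column position with substr() instead of $3 and $6. It assumes the atom name sits in columns 13-16 with CA written as " CA ", as in the sample above, and it keys on chain ID plus residue number (columns 22-26) so that equal residue numbers in different chains stay separate. The file names sample.pdb and sample_out.pdb are made up for the demo.

```shell
#!/bin/sh
# Hypothetical sample file; the name sample.pdb is only for illustration
cat > sample.pdb <<'EOF'
ATOM     11  N   PRO A   1      23.223  20.197  14.441  1.00 12.21           N
ATOM     12  CA  PRO A   1      21.881  20.749  14.227  1.00 11.37           C
ATOM     18  N   GLN A   2      20.920  21.358  12.090  1.00 10.01           N
ATOM     19  CA  GLN A   2      20.848  22.096  10.790  1.00 10.36           C
EOF

awk '
  # Key on chain ID (column 22) plus residue number (columns 23-26)
  { key = substr($0, 22, 5) }
  # First pass: remember the B-factor (columns 61-66) of each CA atom
  NR == FNR {
    if (substr($0, 13, 4) == " CA ")
      bfac[key] = substr($0, 61, 6)
    next
  }
  # Second pass: overwrite columns 67-72 with the stored value
  {
    print substr($0, 1, 66) bfac[key] substr($0, 73)
  }
' sample.pdb sample.pdb > sample_out.pdb
```

For the sample data in this thread both versions should give the same result; the column-based one is only meant as a safeguard for files where the fields are not cleanly separated by spaces.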
This User Gave Thanks to Scrutinizer For This Post:
# 3  
Old 07-13-2017
Thanks very much! It works great.
Is it possible to have it write to a new file, rather than overwrite the current file? I tried to modify the command trivially, but couldn't get anything to work.
If it is not easy to do, that is fine; I can just make two files.
Very much appreciated.
# 4  
Old 07-13-2017
You are welcome. Yes, you can simply redirect the output to a new file:
Code:
  }
'  file.pdb file.pdb > new_file.pdb

Otherwise, I am not sure I understand what you mean. The awk script itself does not overwrite any file; it reads the same file twice, which is why the file name needs to be specified twice.
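
To illustrate the two-pass reading, here is a small sketch with a made-up two-line file (demo.txt and demo_out.txt are hypothetical names). NR counts records over all input files, while FNR restarts at 1 for each file, so NR==FNR is true only while the first file argument is being read (assuming that file is not empty).

```shell
#!/bin/sh
# Hypothetical two-line file, only to show how the counters behave
printf 'first\nsecond\n' > demo.txt

# Print which pass each record belongs to, together with NR and FNR
awk '
  {
    pass = (NR == FNR) ? "pass1" : "pass2"
    print pass, NR, FNR, $0
  }
' demo.txt demo.txt > demo_out.txt

cat demo_out.txt
```

The first two output lines are tagged pass1 and the last two pass2, and the input file itself is never modified; only the redirected output file is written.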
# 5  
Old 07-13-2017
That worked! Thanks very much.