Assign a particular value to all items in a group that have the same identifier
I have a pdb file with the following format:
Code:
ATOM 11 N PRO A 1 23.223 20.197 14.441 1.00 12.21 N
ATOM 12 CA PRO A 1 21.881 20.749 14.227 1.00 11.37 C
ATOM 13 C PRO A 1 21.929 21.556 12.903 1.00 10.73 C
ATOM 14 O PRO A 1 22.872 22.308 12.668 1.00 12.80 O
ATOM 15 CB PRO A 1 21.641 21.649 15.437 1.00 10.87 C
ATOM 16 CG PRO A 1 22.595 21.065 16.473 1.00 11.96 C
ATOM 17 CD PRO A 1 23.844 20.738 15.680 1.00 12.56 C
ATOM 18 N GLN A 2 20.920 21.358 12.090 1.00 10.01 N
ATOM 19 CA GLN A 2 20.848 22.096 10.790 1.00 10.36 C
ATOM 20 C GLN A 2 20.523 23.577 11.104 1.00 9.48 C
ATOM 21 O GLN A 2 19.483 23.784 11.751 1.00 9.88 O
ATOM 22 CB GLN A 2 19.839 21.439 9.860 1.00 9.46 C
ATOM 23 CG GLN A 2 19.997 22.014 8.451 1.00 10.39 C
ATOM 24 CD GLN A 2 19.124 21.359 7.433 1.00 11.31 C
ATOM 25 OE1 GLN A 2 18.853 20.153 7.480 1.00 11.96 O
ATOM 26 NE2 GLN A 2 18.609 22.130 6.468 1.00 10.45 N
ATOM 27 N ALA A 3 21.334 24.475 10.596 1.00 9.29 N
ATOM 28 CA ALA A 3 21.051 25.904 10.815 1.00 9.15 C
ATOM 29 C ALA A 3 20.012 26.344 9.800 1.00 9.68 C
ATOM 30 O ALA A 3 20.253 26.155 8.562 1.00 11.79 O
ATOM 31 CB ALA A 3 22.339 26.688 10.684 1.00 11.79 C
ATOM 32 N ILE A 4 18.911 26.884 10.201 1.00 8.39 N
ATOM 33 CA ILE A 4 17.818 27.322 9.338 1.00 8.72 C
ATOM 34 C ILE A 4 17.469 28.769 9.682 1.00 8.88 C
ATOM 35 O ILE A 4 17.202 29.056 10.870 1.00 10.24 O
ATOM 36 CB ILE A 4 16.576 26.401 9.508 1.00 9.53 C
ATOM 37 CG1 ILE A 4 16.904 24.950 9.073 1.00 10.08 C
ATOM 38 CG2 ILE A 4 15.347 26.971 8.765 1.00 10.36 C
ATOM 39 CD1 ILE A 4 15.720 23.972 9.288 1.00 11.15 C
Columns 61-66 hold the B-factor;
columns 23-26 hold the residue number; and
columns 13-15 hold the atom name.
I need to take the B-factor (columns 61-66) for atom CA (columns 13-15) that corresponds to each residue number (columns 23-26), and write that value down in columns 68 to 73 for all rows with the matching residue number.
CA is always the second atom within each residue, but the complication is that the number of atoms per residue varies.
For example, for the data above, I need to have the following output:
Code:
ATOM 11 N PRO A 1 23.223 20.197 14.441 1.00 12.21 11.37 N
ATOM 12 CA PRO A 1 21.881 20.749 14.227 1.00 11.37 11.37 C
ATOM 13 C PRO A 1 21.929 21.556 12.903 1.00 10.73 11.37 C
ATOM 14 O PRO A 1 22.872 22.308 12.668 1.00 12.80 11.37 O
ATOM 15 CB PRO A 1 21.641 21.649 15.437 1.00 10.87 11.37 C
ATOM 16 CG PRO A 1 22.595 21.065 16.473 1.00 11.96 11.37 C
ATOM 17 CD PRO A 1 23.844 20.738 15.680 1.00 12.56 11.37 C
ATOM 18 N GLN A 2 20.920 21.358 12.090 1.00 10.01 10.36 N
ATOM 19 CA GLN A 2 20.848 22.096 10.790 1.00 10.36 10.36 C
ATOM 20 C GLN A 2 20.523 23.577 11.104 1.00 9.48 10.36 C
ATOM 21 O GLN A 2 19.483 23.784 11.751 1.00 9.88 10.36 O
ATOM 22 CB GLN A 2 19.839 21.439 9.860 1.00 9.46 10.36 C
ATOM 23 CG GLN A 2 19.997 22.014 8.451 1.00 10.39 10.36 C
ATOM 24 CD GLN A 2 19.124 21.359 7.433 1.00 11.31 10.36 C
ATOM 25 OE1 GLN A 2 18.853 20.153 7.480 1.00 11.96 10.36 O
ATOM 26 NE2 GLN A 2 18.609 22.130 6.468 1.00 10.45 10.36 N
ATOM 27 N ALA A 3 21.334 24.475 10.596 1.00 9.29 9.15 N
ATOM 28 CA ALA A 3 21.051 25.904 10.815 1.00 9.15 9.15 C
ATOM 29 C ALA A 3 20.012 26.344 9.800 1.00 9.68 9.15 C
ATOM 30 O ALA A 3 20.253 26.155 8.562 1.00 11.79 9.15 O
ATOM 31 CB ALA A 3 22.339 26.688 10.684 1.00 11.79 9.15 C
ATOM 32 N ILE A 4 18.911 26.884 10.201 1.00 8.39 8.72 N
ATOM 33 CA ILE A 4 17.818 27.322 9.338 1.00 8.72 8.72 C
ATOM 34 C ILE A 4 17.469 28.769 9.682 1.00 8.88 8.72 C
ATOM 35 O ILE A 4 17.202 29.056 10.870 1.00 10.24 8.72 O
ATOM 36 CB ILE A 4 16.576 26.401 9.508 1.00 9.53 8.72 C
ATOM 37 CG1 ILE A 4 16.904 24.950 9.073 1.00 10.08 8.72 C
ATOM 38 CG2 ILE A 4 15.347 26.971 8.765 1.00 10.36 8.72 C
ATOM 39 CD1 ILE A 4 15.720 23.972 9.288 1.00 11.15 8.72 C
Can anyone help me? Your time is very much appreciated.
Thanks very much! It works great.
Is it possible to have it write to a new file rather than overwrite the current file? I tried to modify the command trivially, but couldn't get anything to work.
If it is not easy to do, that is fine; I can just make two files.
Very much appreciated.
You are welcome. Yes, you can simply redirect the output to a new file:
Code:
}
' file.pdb file.pdb > new_file.pdb
Otherwise, I am not sure I understand what you mean. The awk script itself was not overwriting a file; it was reading the same file twice, which is why the file name needs to be specified twice.
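For anyone reading this without the full script in front of them, the "read the same file twice" idea can be sketched roughly as below. This is not the script posted earlier in the thread (only its tail is quoted above), just a minimal sketch that assumes a standard fixed-column PDB layout: atom name in columns 13-16, residue number in columns 23-26, B-factor in columns 61-66. The file names sample.pdb and new_sample.pdb, the temporary directory and the array name bfac are made up for illustration.

```shell
# Hypothetical scratch directory and file names, used only for this sketch.
workdir=$(mktemp -d)

# Build a tiny fixed-column sample: two residues, two atoms each.
fmt='ATOM  %5s %-4s %3s %1s%4s    %8s%8s%8s%6s%6s          %2s\n'
printf "$fmt" \
    11 ' N  ' PRO A 1 23.223 20.197 14.441 1.00 12.21 N \
    12 ' CA ' PRO A 1 21.881 20.749 14.227 1.00 11.37 C \
    18 ' N  ' GLN A 2 20.920 21.358 12.090 1.00 10.01 N \
    19 ' CA ' GLN A 2 20.848 22.096 10.790 1.00 10.36 C \
    > "$workdir/sample.pdb"

awk '
NR == FNR {                       # pass 1: first read of the file
    name = substr($0, 13, 4)      # atom name field
    gsub(/ /, "", name)
    if (/^ATOM/ && name == "CA")  # remember the CA B-factor per residue number
        bfac[substr($0, 23, 4)] = substr($0, 61, 6)
    next
}
/^ATOM/ {                         # pass 2: second read of the same file
    res = substr($0, 23, 4)
    if (res in bfac)              # put the stored value into columns 68-73
        $0 = sprintf("%-67s%6s%s", substr($0, 1, 67), bfac[res], substr($0, 74))
}
{ print }                         # every line is printed, changed or not
' "$workdir/sample.pdb" "$workdir/sample.pdb" > "$workdir/new_sample.pdb"

cat "$workdir/new_sample.pdb"
```

The file name appears twice so that awk sees the whole file once to collect the CA values and then a second time to print each line with the value filled in; the redirect at the end sends the result to a separate file and leaves the input untouched. If a file contains several chains with overlapping residue numbers, the lookup key would probably need to include the chain ID as well, for example substr($0, 22, 5).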