how I can add a constant to a field without changing the file format


 
# 1  
Old 05-26-2010

Hi, I need to edit a Protein Data Bank (pdb) file and then open it with the program VMD, but when I edit the file with awk, it changes the pdb format and VMD cannot read it.
I need to subtract 34 from field 6 ($6).
This is a pdb file:
Code:
ATOM    918  N   GLY B 103     -11.855   8.675 -11.404  1.00  0.00           N  
ATOM    919  CA  GLY B 103     -13.297   8.831 -11.189  1.00  0.00           C  
ATOM    920  C   GLY B 103     -13.977   7.468 -11.253  1.00  0.00           C  
ATOM    921  O   GLY B 103     -14.817   7.213 -12.116  1.00  0.00           O  
ATOM    922  H   GLY B 103     -11.342   7.827 -11.132  1.00  0.00           H  
ATOM    923  HA2 GLY B 103     -13.483   9.303 -10.207  1.00  0.00           H  
ATOM    924  HA3 GLY B 103     -13.734   9.500 -11.952  1.00  0.00           H  
ATOM    925  N   MET B 104     -13.601   6.588 -10.332  1.00  0.00           N  
ATOM    926  CA  MET B 104     -14.127   5.222 -10.341  1.00  0.00           C  
ATOM    927  C   MET B 104     -15.232   5.094  -9.286  1.00  0.00           C  
ATOM    928  O   MET B 104     -16.396   4.872  -9.620  1.00  0.00           O  
ATOM    929  CB  MET B 104     -12.929   4.298 -10.093  1.00  0.00           C  
ATOM    930  CG  MET B 104     -13.316   2.824 -10.092  1.00  0.00           C  
ATOM    931  SD  MET B 104     -14.028   2.373 -11.684  1.00  0.00           S  
ATOM    932  CE  MET B 104     -14.384   0.636 -11.368  1.00  0.00           C  
ATOM    933  H   MET B 104     -12.859   6.880  -9.688  1.00  0.00           H  
ATOM    934  HA  MET B 104     -14.557   4.991 -11.333  1.00  0.00           H  
ATOM    935  HB2 MET B 104     -12.161   4.469 -10.870  1.00  0.00           H  
ATOM    936  HB3 MET B 104     -12.447   4.551  -9.131  1.00  0.00           H  
ATOM    937  HG2 MET B 104     -12.421   2.209  -9.894  1.00  0.00           H  
ATOM    938  HG3 MET B 104     -14.041   2.627  -9.283  1.00  0.00           H  
ATOM    939  HE1 MET B 104     -15.076   0.527 -10.513  1.00  0.00           H  
ATOM    940  HE2 MET B 104     -14.852   0.168 -12.252  1.00  0.00           H  
ATOM    941  HE3 MET B 104     -13.456   0.082 -11.135  1.00  0.00           H

when I try with this script:
Code:
#!/bin/bash

awk  '$6= $6 - 34'  > pba10.

I get this

Code:
ATOM 918 N GLY B 69 -11.855 8.675 -11.404 1.00 0.00 N
ATOM 919 CA GLY B 69 -13.297 8.831 -11.189 1.00 0.00 C
ATOM 920 C GLY B 69 -13.977 7.468 -11.253 1.00 0.00 C
ATOM 921 O GLY B 69 -14.817 7.213 -12.116 1.00 0.00 O
ATOM 922 H GLY B 69 -11.342 7.827 -11.132 1.00 0.00 H
ATOM 923 HA2 GLY B 69 -13.483 9.303 -10.207 1.00 0.00 H
ATOM 924 HA3 GLY B 69 -13.734 9.500 -11.952 1.00 0.00 H
ATOM 925 N MET B 70 -13.601 6.588 -10.332 1.00 0.00 N
ATOM 926 CA MET B 70 -14.127 5.222 -10.341 1.00 0.00 C
ATOM 927 C MET B 70 -15.232 5.094 -9.286 1.00 0.00 C
ATOM 928 O MET B 70 -16.396 4.872 -9.620 1.00 0.00 O
ATOM 929 CB MET B 70 -12.929 4.298 -10.093 1.00 0.00 C
ATOM 930 CG MET B 70 -13.316 2.824 -10.092 1.00 0.00 C
ATOM 931 SD MET B 70 -14.028 2.373 -11.684 1.00 0.00 S
ATOM 932 CE MET B 70 -14.384 0.636 -11.368 1.00 0.00 C
ATOM 933 H MET B 70 -12.859 6.880 -9.688 1.00 0.00 H
ATOM 934 HA MET B 70 -14.557 4.991 -11.333 1.00 0.00 H
ATOM 935 HB2 MET B 70 -12.161 4.469 -10.870 1.00 0.00 H
ATOM 936 HB3 MET B 70 -12.447 4.551 -9.131 1.00 0.00 H
ATOM 937 HG2 MET B 70 -12.421 2.209 -9.894 1.00 0.00 H
ATOM 938 HG3 MET B 70 -14.041 2.627 -9.283 1.00 0.00 H
ATOM 939 HE1 MET B 70 -15.076 0.527 -10.513 1.00 0.00 H
ATOM 940 HE2 MET B 70 -14.852 0.168 -12.252 1.00 0.00 H
ATOM 941 HE3 MET B 70 -13.456 0.082 -11.135 1.00 0.00 H

and this pdb file cannot be read by VMD. I have tried replacing the spaces with "\t", but the result is no good.
# 2  
Old 05-26-2010
Looks like you're working with a fixed-width format on the input file, but not accounting for it on the way out. By default awk splits fields on runs of whitespace, so it doesn't keep track of how many spaces separated them; when you assign to $6 it rebuilds the line with a single space between fields, hence the output shifting on you.
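
The effect is easy to reproduce on a single line. A minimal sketch (the function name collapse_demo is made up for illustration): assigning to any field makes awk rebuild the record from its fields, joined by the output field separator OFS, which defaults to one space.

```shell
# Assigning to a field makes awk rebuild $0, joining the fields
# with OFS (a single space by default), so the padding is lost.
collapse_demo() {
    printf 'ATOM    918  N   GLY B 103\n' | awk '{ $6 = $6 - 34 } 1'
}

collapse_demo   # prints: ATOM 918 N GLY B 69
```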

Are you familiar with the control file or file layout for the .pdb file? You can process things by column position with substr(), or rebuild the line with printf(), with much better results...
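
As a sketch of the substr()/printf() idea, assuming the standard fixed-column PDB layout in which the residue number sits right-justified in columns 23-26 (the sample lines in the first post are consistent with that): cut the line by column position, change only those four columns, and copy everything else verbatim. The function name shift_resseq is hypothetical.

```shell
# shift_resseq N: subtract N from the residue number (columns 23-26)
# of ATOM/HETATM records read on stdin. Assumes the standard PDB columns.
shift_resseq() {
    awk -v n="$1" '
        /^(ATOM|HETATM)/ {
            # columns 1-22 verbatim, new number right-justified in 4 columns,
            # then column 27 to end of line verbatim
            printf "%s%4d%s\n", substr($0, 1, 22), substr($0, 23, 4) - n, substr($0, 27)
            next
        }
        { print }   # any other record is passed through unchanged
    '
}

printf '%s\n' 'ATOM    918  N   GLY B 103     -11.855   8.675 -11.404' | shift_resseq 34
```

With a real file this would be used as `shift_resseq 34 < data.pdb > newdata.pdb`. Because nothing is split on whitespace, the line length and the position of every other column stay as they were.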
# 3  
Old 05-26-2010
Thanks for your help. Any other suggestions?
# 4  
Old 05-26-2010
Does this work?:
Code:
$ awk '{X = $6 - 34; sub($6, X)}1' file

# 5  
Old 05-26-2010
I'm guessing at field defs here, but something like-a-dis:
Code:
$ awk '{printf "%-4s%7d  %-4s%-4s%-2s%3.d\n",$1,$2,$3,$4,$5,($6-34)}' Edit8
ATOM    918  N   GLY B  69
ATOM    919  CA  GLY B  69
ATOM    920  C   GLY B  69
ATOM    921  O   GLY B  69
ATOM    922  H   GLY B  69
ATOM    923  HA2 GLY B  69
ATOM    924  HA3 GLY B  69
ATOM    925  N   MET B  70
ATOM    926  CA  MET B  70
ATOM    927  C   MET B  70
ATOM    928  O   MET B  70
ATOM    929  CB  MET B  70
ATOM    930  CG  MET B  70
ATOM    931  SD  MET B  70
ATOM    932  CE  MET B  70
ATOM    933  H   MET B  70
ATOM    934  HA  MET B  70
ATOM    935  HB2 MET B  70
ATOM    936  HB3 MET B  70
ATOM    937  HG2 MET B  70
ATOM    938  HG3 MET B  70
ATOM    939  HE1 MET B  70
ATOM    940  HE2 MET B  70
ATOM    941  HE3 MET B  70

or, if you prefer concise and/or succinct...scottn posted something short and sweet. Only seems to shift $6 just a tad left...
Code:
$ awk '{X = $6 - 34; sub($6, X)}1' Edit8
ATOM    918  N   GLY B 69     -11.855   8.675 -11.404  1.00  0.00           N
ATOM    919  CA  GLY B 69     -13.297   8.831 -11.189  1.00  0.00           C
ATOM    920  C   GLY B 69     -13.977   7.468 -11.253  1.00  0.00           C
ATOM    921  O   GLY B 69     -14.817   7.213 -12.116  1.00  0.00           O
ATOM    922  H   GLY B 69     -11.342   7.827 -11.132  1.00  0.00           H
ATOM    923  HA2 GLY B 69     -13.483   9.303 -10.207  1.00  0.00           H
ATOM    924  HA3 GLY B 69     -13.734   9.500 -11.952  1.00  0.00           H
ATOM    925  N   MET B 70     -13.601   6.588 -10.332  1.00  0.00           N
ATOM    926  CA  MET B 70     -14.127   5.222 -10.341  1.00  0.00           C
ATOM    927  C   MET B 70     -15.232   5.094  -9.286  1.00  0.00           C
ATOM    928  O   MET B 70     -16.396   4.872  -9.620  1.00  0.00           O
ATOM    929  CB  MET B 70     -12.929   4.298 -10.093  1.00  0.00           C
ATOM    930  CG  MET B 70     -13.316   2.824 -10.092  1.00  0.00           C
ATOM    931  SD  MET B 70     -14.028   2.373 -11.684  1.00  0.00           S
ATOM    932  CE  MET B 70     -14.384   0.636 -11.368  1.00  0.00           C
ATOM    933  H   MET B 70     -12.859   6.880  -9.688  1.00  0.00           H
ATOM    934  HA  MET B 70     -14.557   4.991 -11.333  1.00  0.00           H
ATOM    935  HB2 MET B 70     -12.161   4.469 -10.870  1.00  0.00           H
ATOM    936  HB3 MET B 70     -12.447   4.551  -9.131  1.00  0.00           H
ATOM    937  HG2 MET B 70     -12.421   2.209  -9.894  1.00  0.00           H
ATOM    938  HG3 MET B 70     -14.041   2.627  -9.283  1.00  0.00           H
ATOM    939  HE1 MET B 70     -15.076   0.527 -10.513  1.00  0.00           H
ATOM    940  HE2 MET B 70     -14.852   0.168 -12.252  1.00  0.00           H
ATOM    941  HE3 MET B 70     -13.456   0.082 -11.135  1.00  0.00           H
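
The leftward shift comes from the replacement ("69") being shorter than the text it replaces ("103"). One possible tweak, offered as a sketch rather than a tested fix for every pdb file, is to pad the new value to the width of the old one. Note that sub() treats $6 as a pattern and replaces its first match anywhere in the line, so a line whose atom serial number happens to contain the same digits would be changed in the wrong place. The function name pad_sub is made up.

```shell
# Same idea as the sub() one-liner, but the new value is padded to the
# width of the old one so the columns to its right do not move.
pad_sub() {
    awk '{
        new = sprintf("%" length($6) "d", $6 - 34)   # "103" becomes " 69"
        sub($6, new)                                  # replaces the FIRST match of $6 in the line
    } 1'
}

printf '%s\n' 'ATOM    918  N   GLY B 103     -11.855   8.675 -11.404' | pad_sub
```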

# 6  
Old 05-26-2010
Quote:
Originally Posted by curleb
or, if you prefer concise and/or succinct...scottn posted something short and sweet. Only seems to shift $6 just a tad left...

Ah, good point :)
# 7  
Old 05-26-2010
Code:
$ cat data.pdb
ATOM    918  N   GLY B 103     -11.855   8.675 -11.404  1.00  0.00           N  
ATOM    919  CA  GLY B 103     -13.297   8.831 -11.189  1.00  0.00           C  
ATOM    920  C   GLY B 103     -13.977   7.468 -11.253  1.00  0.00           C  
ATOM    921  O   GLY B 103     -14.817   7.213 -12.116  1.00  0.00           O  
ATOM    922  H   GLY B 103     -11.342   7.827 -11.132  1.00  0.00           H  
ATOM    923  HA2 GLY B 103     -13.483   9.303 -10.207  1.00  0.00           H  
ATOM    924  HA3 GLY B 103     -13.734   9.500 -11.952  1.00  0.00           H  
ATOM    925  N   MET B 104     -13.601   6.588 -10.332  1.00  0.00           N  
ATOM    926  CA  MET B 104     -14.127   5.222 -10.341  1.00  0.00           C  
ATOM    927  C   MET B 104     -15.232   5.094  -9.286  1.00  0.00           C  
ATOM    928  O   MET B 104     -16.396   4.872  -9.620  1.00  0.00           O  
ATOM    929  CB  MET B 104     -12.929   4.298 -10.093  1.00  0.00           C  
ATOM    930  CG  MET B 104     -13.316   2.824 -10.092  1.00  0.00           C  
ATOM    931  SD  MET B 104     -14.028   2.373 -11.684  1.00  0.00           S  
ATOM    932  CE  MET B 104     -14.384   0.636 -11.368  1.00  0.00           C  
ATOM    933  H   MET B 104     -12.859   6.880  -9.688  1.00  0.00           H  
ATOM    934  HA  MET B 104     -14.557   4.991 -11.333  1.00  0.00           H  
ATOM    935  HB2 MET B 104     -12.161   4.469 -10.870  1.00  0.00           H  
ATOM    936  HB3 MET B 104     -12.447   4.551  -9.131  1.00  0.00           H  
ATOM    937  HG2 MET B 104     -12.421   2.209  -9.894  1.00  0.00           H  
ATOM    938  HG3 MET B 104     -14.041   2.627  -9.283  1.00  0.00           H  
ATOM    939  HE1 MET B 104     -15.076   0.527 -10.513  1.00  0.00           H  
ATOM    940  HE2 MET B 104     -14.852   0.168 -12.252  1.00  0.00           H  
ATOM    941  HE3 MET B 104     -13.456   0.082 -11.135  1.00  0.00           H
$ ./modpdb.pl
ATOM    918  N   GLY B 69      -11.855   8.675 -11.404  1.00  0.00           N
ATOM    919  CA  GLY B 69      -13.297   8.831 -11.189  1.00  0.00           C
ATOM    920  C   GLY B 69      -13.977   7.468 -11.253  1.00  0.00           C
ATOM    921  O   GLY B 69      -14.817   7.213 -12.116  1.00  0.00           O
ATOM    922  H   GLY B 69      -11.342   7.827 -11.132  1.00  0.00           H
ATOM    923  HA2 GLY B 69      -13.483   9.303 -10.207  1.00  0.00           H
ATOM    924  HA3 GLY B 69      -13.734   9.500 -11.952  1.00  0.00           H
ATOM    925  N   MET B 70      -13.601   6.588 -10.332  1.00  0.00           N
ATOM    926  CA  MET B 70      -14.127   5.222 -10.341  1.00  0.00           C
ATOM    927  C   MET B 70      -15.232   5.094  -9.286  1.00  0.00           C
ATOM    928  O   MET B 70      -16.396   4.872  -9.620  1.00  0.00           O
ATOM    929  CB  MET B 70      -12.929   4.298 -10.093  1.00  0.00           C
ATOM    930  CG  MET B 70      -13.316   2.824 -10.092  1.00  0.00           C
ATOM    931  SD  MET B 70      -14.028   2.373 -11.684  1.00  0.00           S
ATOM    932  CE  MET B 70      -14.384   0.636 -11.368  1.00  0.00           C
ATOM    933  H   MET B 70      -12.859   6.880  -9.688  1.00  0.00           H
ATOM    934  HA  MET B 70      -14.557   4.991 -11.333  1.00  0.00           H
ATOM    935  HB2 MET B 70      -12.161   4.469 -10.870  1.00  0.00           H
ATOM    936  HB3 MET B 70      -12.447   4.551  -9.131  1.00  0.00           H
ATOM    937  HG2 MET B 70      -12.421   2.209  -9.894  1.00  0.00           H
ATOM    938  HG3 MET B 70      -14.041   2.627  -9.283  1.00  0.00           H
ATOM    939  HE1 MET B 70      -15.076   0.527 -10.513  1.00  0.00           H
ATOM    940  HE2 MET B 70      -14.852   0.168 -12.252  1.00  0.00           H
ATOM    941  HE3 MET B 70      -13.456   0.082 -11.135  1.00  0.00           H
$

Code:
#!/usr/bin/perl

use strict;
use warnings;

my $infile = 'data.pdb';

open(my $in, '<', $infile) or die "Error opening input file $infile: $!\n";
while (my $line = <$in>) {
    chomp $line;
    # Split on runs of whitespace; this assumes no two columns run together.
    my @arr = split ' ', $line;
    # Field 6 (index 5) is the residue number.
    $arr[5] -= 34;
    # Re-pad every field to approximate the original column layout.
    printf("%-7s %-4s %-3s %-3s %-1s %-7s %7s %7s %7s %5s %5s %11s\n", @arr);
}
close($in);

You could simply redirect the output to a new file, instead of having it printed on the terminal:
Code:
$ ./modpdb.pl > newdata.pdb
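
Before loading the new file in VMD, it may be worth checking that the coordinates have not moved. A rough sketch, assuming the standard PDB layout with x, y and z in columns 31-54 and the same number of lines in both files (the function name check_columns is hypothetical):

```shell
# check_columns ORIGINAL MODIFIED: report every line whose coordinate
# block (columns 31-54) differs between the two files.
check_columns() {
    awk 'NR == FNR { ref[FNR] = substr($0, 31, 24); next }
         substr($0, 31, 24) != ref[FNR] { print "line " FNR ": columns 31-54 differ"; bad = 1 }
         END { exit bad + 0 }' "$1" "$2"
}
```

For example, `check_columns data.pdb newdata.pdb && echo "coordinates still aligned"` prints the message only when no line is reported.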


Last edited by pseudocoder; 05-26-2010 at 07:47 PM.. Reason: optimized code, got rid of a half ton of $arr[0] $arr[1] ,....