AWK print and retain original format


 
# 1  
Old 03-28-2012

I have a file with very specific column spacing.

I wish to do the following:

Code:
awk '{print $1, $2, $3, $4, $5, $6, $19-$7, $20-$8, $21-$9, $10, $11, $12}' merge.pdb > vector.pdb

but the format gets ruined.


I have tried printf, but to no avail.
# 2  
Old 03-28-2012
Please post sample before and after data, making it clear whether this is a fixed-length record and whether fields are left or right justified.
# 3  
Old 03-29-2012
Code:
ATOM      1  N   ALA B   1      -9.995  -6.835   2.255  0.00  0.00      BH      ATOM      1  N   ALA B   1     -13.079 -16.435   0.105  0.00  0.00      BH
ATOM      2  HT2 ALA B   1     -10.828  -7.444   2.585  0.00  0.00      BH      ATOM      2  HT2 ALA B   1     -12.045 -16.716  -0.054  0.00  0.00      BH
ATOM      3  HT3 ALA B   1     -10.119  -6.623   1.230  0.00  0.00      BH      ATOM      3  HT3 ALA B   1     -13.318 -16.970   0.909  0.00  0.00      BH
ATOM      4  CA  ALA B   1     -10.201  -5.652   3.107  0.00  0.00      BH      ATOM      4  CA  ALA B   1     -12.997 -14.938   0.408  0.00  0.00      BH
ATOM      5  HA  ALA B   1     -10.804  -4.989   2.587  0.00  0.00      BH      ATOM      5  HA  ALA B   1     -13.464 -14.356  -0.413  0.00  0.00      BH
ATOM      6  CB  ALA B   1     -10.761  -6.301   4.361  0.00  0.00      BH      ATOM      6  CB  ALA B   1     -13.865 -14.850   1.756  0.00  0.00      BH
ATOM      7  HB1 ALA B   1     -11.677  -6.864   4.178  0.00  0.00      BH      ATOM      7  HB1 ALA B   1     -14.845 -15.340   1.646  0.00  0.00      BH
ATOM      8  HB2 ALA B   1     -10.016  -7.019   4.786  0.00  0.00      BH      ATOM      8  HB2 ALA B   1     -13.341 -15.381   2.572  0.00  0.00      BH
ATOM      9  HB3 ALA B   1     -11.114  -5.604   5.113  0.00  0.00      BH      ATOM      9  HB3 ALA B   1     -14.123 -13.823   2.072  0.00  0.00      BH
ATOM     10  C   ALA B   1      -8.943  -4.890   3.362  0.00  0.00      BH      ATOM     10  C   ALA B   1     -11.715 -14.231   0.505  0.00  0.00      BH
ATOM     11  O   ALA B   1      -8.904  -3.716   3.057  0.00  0.00      BH      ATOM     11  O   ALA B   1     -10.624 -14.820   0.363  0.00  0.00      BH
ATOM     12  N   ALA B   2      -7.940  -5.583   3.917  0.00  0.00      BH      ATOM     12  N   ALA B   2     -11.810 -12.904   0.758  0.00  0.00      BH
ATOM     13  HN  ALA B   2      -8.114  -6.501   4.241  0.00  0.00      BH      ATOM     13  HN  ALA B   2     -12.700 -12.511   0.709  0.00  0.00      BH
ATOM     14  CA  ALA B   2      -6.616  -5.044   4.344  0.00  0.00      BH      ATOM     14  CA  ALA B   2     -10.704 -12.044   0.971  0.00  0.00      BH
ATOM     15  HA  ALA B   2      -6.828  -4.512   5.228  0.00  0.00      BH      ATOM     15  HA  ALA B   2     -10.083 -12.526   1.731  0.00  0.00      BH

Yes, fields are left-justified.

---------- Post updated at 09:51 AM ---------- Previous update was at 09:50 AM ----------

My output loses the format:

Code:
ATOM 1 N ALA B 1 -3.084 -9.6 -2.15 0.00 0.00 BH
ATOM 2 HT2 ALA B 1 -1.217 -9.272 -2.639 0.00 0.00 BH
ATOM 3 HT3 ALA B 1 -3.199 -10.347 -0.321 0.00 0.00 BH
ATOM 4 CA ALA B 1 -2.796 -9.286 -2.699 0.00 0.00 BH
ATOM 5 HA ALA B 1 -2.66 -9.367 -3 0.00 0.00 BH
ATOM 6 CB ALA B 1 -3.104 -8.549 -2.605 0.00 0.00 BH
ATOM 7 HB1 ALA B 1 -3.168 -8.476 -2.532 0.00 0.00 BH
ATOM 8 HB2 ALA B 1 -3.325 -8.362 -2.214 0.00 0.00 BH
ATOM 9 HB3 ALA B 1 -3.009 -8.219 -3.041 0.00 0.00 BH
ATOM 10 C ALA B 1 -2.772 -9.341 -2.857 0.00 0.00 BH
ATOM 11 O ALA B 1 -1.72 -11.104 -2.694 0.00 0.00 BH
ATOM 12 N ALA B 2 -3.87 -7.321 -3.159 0.00 0.00 BH
ATOM 13 HN ALA B 2 -4.586 -6.01 -3.532 0.00 0.00 BH
ATOM 14 CA ALA B 2 -4.088 -7 -3.373 0.00 0.00 BH
ATOM 15 HA ALA B 2 -3.255 -8.014 -3.497 0.00 0.00 BH
ATOM 16 CB ALA B 2 -4.279 -5.691 -5.06 0.00 0.00 BH
ATOM 17 HB1 ALA B 2 -3.994 -5.943 -4.604 0.00 0.00 BH
ATOM 18 HB2 ALA B 2 -5.97 -5.762 -6.094 0.00 0.00 BH

---------- Post updated 03-29-12 at 02:02 AM ---------- Previous update was 03-28-12 at 09:51 AM ----------

Here is an example of a script that deals with this kind of format:

Code:
# This file is fixpdb.awk.
# Usage: awk -f fixpdb.awk [segid=wxyz] [chainID=X]   <pdbfile.in >file.out
#                                       [resname=abc] 
# Extracts segments from pdb files and converts to a format acceptable by charmm.
# In command line can specify up to a four character segid with wxyz, e.g. prot. This 
#  field is ignored by current CHARMM versions, but needed for older versions. 
# Can specify a one character chainID. If it is specified on the command line, extracts
#  only lines whose character in column 22 matches chainID X. Use to extract specific 
#  subunit from pdb file.
# Instead, can specify a three character resname to select HOH or ligands like ARA.
# If resname is specified, extracts only lines whose resname in columns 18-20 
#  matches resname abc value.
# Writes header line as a remark.
# Ignores all other lines not beginning with ATOM or HETATM.
# If a single coordinate value for an atom is present, takes that. 
# If multiple coordinates are present, signified by A, B, .. in column 17, takes only A.
# If protein and HOH lines are present and protein lacks a chainID, takes the 
#  protein lines only.
# Converts HOH to TIP and adds a 3, making TIP3, HIS to HSD, CD1 to CD_ for ILE, 
#  adds the segid in columns 73-76. Converts OXT or OCT1 to OT1 and OCT2 to OT2.
# Renumbers atoms starting from 1.
# Fields: Atom, Atom No, Space, Atom name, Alt Conf indic, Resname, Space, 
#  Chain Ident, Res Seq No, Spaces, x, y, z, Occup, Temp fact, Spaces, Segment ID

BEGIN {FIELDWIDTHS=" 6 5 1 4 1 3 1 1 4 1 3 8 8 8 6 6 6 4"} 
{
	if ($1 == "HEADER")
		print "REMARK" substr($0, 7, 69)
	if ($1 != "ATOM  " && $1 != "HETATM")
		endif	
	else if ($5 != " " && $5 != "A")
		endif
	else if ($6 == resname || $8 == chainID || ($8 == " " && $1 != "HETATM")) 
	{
		atomno++
		if ($6 == "HOH")
		{	$4 = " OH2"
			$6 = "TIP"
			$7 = "3"
		}
		if ($1 == "HETATM")
			$1 = "ATOM  "
		if ($6 == "HIS")
			$6 = "HSD"
		if ($6 == "ILE" && $4 == " CD1")
			$4 = " CD "
		if ($4 == " OXT" || $4 == "OCT1") 
			$4 = " OT2"
		if ($4 == "OCT2")
			$4 = " OT1"
		printf "%6s",$1
		printf "%5d", atomno
		printf "%1s", " "
		printf "%4s", $4
		printf "%1s", " "
		printf "%3s", $6
		printf "%1s", $7
		printf "%1s", " "
		printf "%4s", $9
		printf "%4s", "    "
		printf "%8s", $12
		printf "%8s", $13
		printf "%8s", $14
		printf "%6s", $15
		printf "%6s", $16
		printf "%6s", "      "
		printf "%4s\n", segid
	}

}
END {printf "%3s\n", "END"}

# 4  
Old 03-29-2012
Why not use printf, as in your example script?
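For context, `print $1, $2, ...` joins the fields with a single space (the default OFS), which is why the columns collapse. A minimal sketch of the printf route follows. It assumes the file splits cleanly on whitespace as in the sample above (first record in $1..$12, second record's x, y, z in $19..$21), that every line has a chain ID, and that atom names are at most four characters. The function name pdb_diff_printf is made up for illustration; the widths follow the ATOM column layout described in the fixpdb.awk comments.

```shell
# Rebuild each ATOM line with explicit widths so every field lands in its column.
# Assumption: whitespace-separated input, second record's x, y, z in $19..$21.
pdb_diff_printf() {
  awk '{
    # Atom names shorter than 4 characters start in column 14, so pad on the left.
    name = (length($3) < 4) ? " " $3 : $3
    printf "%-6s%5d %-4s %-3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f      %s\n",
           $1, $2, name, $4, $5, $6,
           $19 - $7, $20 - $8, $21 - $9, $10, $11, $12
  }' "$@"
}

# Usage, with the file names from the first post:
#   pdb_diff_printf merge.pdb > vector.pdb
```

This breaks down if two coordinates ever run together without a space (for example a value of -100.000 or wider next to its neighbour), because the whitespace split then yields the wrong field count.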
# 5  
Old 03-29-2012
Why don't you try another program, or ask other people who specialize in IT? I'm sure they can help you.
# 6  
Old 03-29-2012
Ok I tried writing the script:

Code:
BEGIN {FIELDWIDTHS=" 6 5 1 4 1 3 1 1 4 1 3 8 8 8 6 6 6 13 6 5 1 4 1 3 1 1 4 1 3 8 8 8 6 6 6 4"}
{ printf "%6s",$1
                printf "%6d", $1
                printf "%5d", $2
                printf "%1s", $3
                printf "%4s", $4
                printf "%1s", $5
                printf "%3s", $6
                printf "%1s", $7
                printf "%1s", $8
                printf "%4s", $9
                printf "%1s", $10
                printf "%3s", $11
                printf "%8s", $12-$30
                printf "%8s", $13-$31
                printf "%8s", $14-$32
                printf "%6s", $15
                printf "%6s", $16
                printf "%6s", $17
                printf "%13s\n", $18
}
END {printf "%3s\n", "END"}


But the output is still poor:

Code:
  ATOM     0    1N ALAB  1-9.995-6.8352.2550.000.00       0       0       1     N   ALA     B            1
  ATOM     0    2HT2 ALAB  1-10.828-7.4442.5850.000.00       0       0       2   HT2   ALA     B            1
  ATOM     0    3HT3 ALAB  1-10.119-6.6231.2300.000.00       0       0       3   HT3   ALA     B            1
  ATOM     0    4CA ALAB  1-10.201-5.6523.1070.000.00       0       0       4    CA   ALA     B            1
  ATOM     0    5HA ALAB  1-10.804-4.9892.5870.000.00       0       0       5    HA   ALA     B            1
  ATOM     0    6CB ALAB  1-10.761-6.3014.3610.000.00       0       0       6    CB   ALA     B            1
  ATOM     0    7HB1 ALAB  1-11.677-6.8644.1780.000.00       0       0       7   HB1   ALA     B            1
  ATOM     0    8HB2 ALAB  1-10.016-7.0194.7860.000.00       0       0       8   HB2   ALA     B            1
  ATOM     0    9HB3 ALAB  1-11.114-5.6045.1130.000.00       0       0       9   HB3   ALA     B            1
  ATOM     0   10C ALAB  1-8.943-4.8903.3620.000.00       0       0      10     C   ALA     B            1
  ATOM     0   11O ALAB  1-8.904-3.7163.0570.000.00       0       0      11     O   ALA     B            1
  ATOM     0   12N ALAB  2-7.940-5.5833.9170.000.00       0       0      12     N   ALA     B            2
  ATOM     0   13HN ALAB  2-8.114-6.5014.2410.000.00       0       0      13    HN   ALA     B            2
  ATOM     0   14CA ALAB  2-6.616-5.0444.3440.000.00       0       0      14    CA   ALA     B            2
  ATOM     0   15HA ALAB  2-6.828-4.5125.2280.000.00       0       0      15    HA   ALA     B            2
END
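The output above is what default whitespace splitting would produce, which suggests FIELDWIDTHS was not in effect. It is a gawk extension, and other awks treat it as an ordinary variable. The script also prints $1 twice and formats the differences with %8s rather than %8.3f. A portable sketch that avoids FIELDWIDTHS by slicing columns with substr() is shown here. It assumes the second record starts at column 81, as it appears to in the sample; check this against the real file, since forum formatting may have altered the spacing. The function name pdb_diff_substr is hypothetical.

```shell
# Replace columns 31-54 (x, y, z) with the coordinate differences and
# copy everything else from the first record unchanged.
# Assumption: the second record starts at column 81, so its x, y, z are in 111-134.
pdb_diff_substr() {
  awk '{
    # Leading blanks are skipped when awk converts these substrings to numbers.
    dx = substr($0, 111, 8) - substr($0, 31, 8)
    dy = substr($0, 119, 8) - substr($0, 39, 8)
    dz = substr($0, 127, 8) - substr($0, 47, 8)
    # Columns 1-30 and 55-76 are passed through as they are.
    printf "%s%8.3f%8.3f%8.3f%s\n", substr($0, 1, 30), dx, dy, dz, substr($0, 55, 22)
  }' "$@"
}

# Usage, with the file names from the first post:
#   pdb_diff_substr merge.pdb > vector.pdb
```

Because the untouched columns are copied rather than reformatted, atom names, chain IDs and segment IDs keep whatever justification they had in the input.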

# 7  
Old 03-29-2012
Alternatively, try this:
Code:
awk 'sub(" *"$7" *"$8" *"$9,sprintf("%12.3f%8.3f%8.3f",$19-$7, $20-$8, $21-$9))' infile | cut -c-74
