Removal of extra spaces in *.log files to allow extraction of frequencies


 
# 1  
Old 04-21-2013

Our university has upgraded its version of a computational chemistry program that our group uses quite regularly. In the past we have been able to extract frequency spectra from log files that are generated. Since the upgrade, the viewing program errors out. I've been able to trace down the changes between the old and new log file formats. The new program adds two extra spaces to the following lines:
 
 
Code:
  Atom  AN X Y Z X Y Z X Y Z
     1 30 0.00 0.00 0.03 0.09 -0.02 0.04 0.10 -0.01 -0.03
     2 8 0.03 -0.01 -0.12 -0.14 -0.08 -0.06 -0.14 -0.05 0.06

 
 
One space before Atom and one of the two spaces between Atom and AN need to be removed, so that only one space precedes each. On the following lines there are five spaces before the atom number (the number that starts the second and each successive line) where there should be only three. The log file lists the frequencies in blocks of three columns, so depending on how complex the molecular system is, fixing this by hand becomes an impossible job across thousands of files. Once the format is corrected, everything opens fine and the frequencies can be extracted.

The software developer is aware of this issue but advises generating a special portable file that would normally be used to transport data across platforms. The data above had the tabs between its columns removed, as the formatting was lost when I copied and pasted the contents. There are multiple instances of the above lines, each followed by a varying number of atoms listed below each block of three columns, and the entire data set ends with a blank line where the thermochemistry data starts. Since these job files are already processed by a bash script to extract thermodynamics and electronic energies, I thought it would be fairly simple to incorporate any new commands into the Torque execution script. Any help would be greatly appreciated.

Last edited by Scrutinizer; 04-22-2013 at 01:22 AM.. Reason: code tags
# 2  
Old 04-21-2013
Hi, can you post exact sample data and the output you are looking for, using code tags? That would make it much easier to solve your problem.
# 3  
Old 04-21-2013
The following was copied and pasted directly out of the log file, but all of the tabs and spaces were removed in transit for some reason. I put the data in code blocks and re-added the spaces that I need edited. If the whole data block needs to be reformatted manually, let me know.
Code:
1 2 3
A A A
Frequencies -- 73.6186 95.0148 177.9910
Red. masses -- 2.5506 3.7026 3.3055
Frc consts -- 0.0081 0.0197 0.0617
IR Inten -- 7.9374 9.9457 8.1890
  Atom  AN X Y Z X Y Z X Y Z
     1 30 0.00 0.00 0.02 -0.07 -0.08 0.00 0.04 -0.02 0.00
     2 8 0.00 0.00 -0.11 0.28 0.09 0.00 0.12 -0.01 0.00
     3 8 0.00 0.00 0.25 -0.06 0.00 0.00 -0.13 0.24 0.00
     4 1 0.00 0.00 -0.07 -0.12 -0.07 0.00 0.40 -0.10 0.00
     5 6 0.00 0.00 -0.19 0.00 0.24 0.00 -0.24 -0.14 0.00
     6 1 0.00 0.00 -0.23 0.32 0.36 0.00 0.14 0.14 0.00
     7 1 0.00 0.00 -0.89 0.21 0.30 0.00 -0.58 -0.24 0.00
     8 1 0.00 0.00 0.12 -0.15 0.39 0.00 0.02 -0.40 0.00
     9 1 0.00 0.00 0.18 0.53 -0.04 0.00 0.26 -0.08 0.00
4 5 6
A A A
Frequencies -- 231.0559 251.4928 255.6673
Red. masses -- 2.8839 1.1192 1.0754
Frc consts -- 0.0907 0.0417 0.0414
IR Inten -- 82.8162 113.2879 160.2404
  Atom  AN X Y Z X Y Z X Y Z
     1 30 0.10 -0.01 0.00 0.00 0.00 0.00 0.00 0.00 -0.01
     2 8 -0.03 0.13 0.00 0.00 0.00 0.06 0.00 0.00 0.06
     3 8 -0.18 -0.13 0.00 0.00 0.00 0.05 0.00 0.00 0.01
     4 1 -0.85 0.21 0.00 0.00 0.00 -0.29 0.00 0.00 0.09
     5 6 -0.14 0.01 0.00 0.00 0.00 -0.04 0.00 0.00 0.00
     6 1 -0.04 0.02 0.00 0.00 0.00 -0.75 0.00 0.00 0.01
     7 1 -0.01 0.05 0.00 0.00 0.00 0.31 0.00 0.00 -0.27
     8 1 -0.25 0.12 0.00 0.00 0.00 -0.46 0.00 0.00 0.27
     9 1 -0.15 0.19 0.00 0.00 0.00 -0.18 0.00 0.00 -0.92

# 4  
Old 04-21-2013
Does the original file have tabs? Or just space characters?

The space characters will transfer fine with copy / paste.

The tab characters I'm not sure.
# 5  
Old 04-21-2013
They are all spaces. There aren't any tabs that I can find. I have been grabbing the text with WinSCP's internal editor. WordPad also shows everything aligned via spaces.
# 6  
Old 04-21-2013
Take a look at this and see if I am following your logic correctly:
Code:
$ cat atoms.txt
  Atom  AN X Y Z X Y Z X Y Z
     1 30 0.00 0.00 0.02 -0.07 -0.08 0.00 0.04 -0.02 0.00
     2 8 0.00 0.00 -0.11 0.28 0.09 0.00 0.12 -0.01 0.00
     3 8 0.00 0.00 0.25 -0.06 0.00 0.00 -0.13 0.24 0.00
     4 1 0.00 0.00 -0.07 -0.12 -0.07 0.00 0.40 -0.10 0.00
     5 6 0.00 0.00 -0.19 0.00 0.24 0.00 -0.24 -0.14 0.00
     6 1 0.00 0.00 -0.23 0.32 0.36 0.00 0.14 0.14 0.00
     7 1 0.00 0.00 -0.89 0.21 0.30 0.00 -0.58 -0.24 0.00
     8 1 0.00 0.00 0.12 -0.15 0.39 0.00 0.02 -0.40 0.00
     9 1 0.00 0.00 0.18 0.53 -0.04 0.00 0.26 -0.08 0.00

Code:
$ sed -e "s/ Atom /Atom/" -e "s/^     /   /" atoms.txt
 Atom AN X Y Z X Y Z X Y Z
   1 30 0.00 0.00 0.02 -0.07 -0.08 0.00 0.04 -0.02 0.00
   2 8 0.00 0.00 -0.11 0.28 0.09 0.00 0.12 -0.01 0.00
   3 8 0.00 0.00 0.25 -0.06 0.00 0.00 -0.13 0.24 0.00
   4 1 0.00 0.00 -0.07 -0.12 -0.07 0.00 0.40 -0.10 0.00
   5 6 0.00 0.00 -0.19 0.00 0.24 0.00 -0.24 -0.14 0.00
   6 1 0.00 0.00 -0.23 0.32 0.36 0.00 0.14 0.14 0.00
   7 1 0.00 0.00 -0.89 0.21 0.30 0.00 -0.58 -0.24 0.00
   8 1 0.00 0.00 0.12 -0.15 0.39 0.00 0.02 -0.40 0.00
   9 1 0.00 0.00 0.18 0.53 -0.04 0.00 0.26 -0.08 0.00

The first substitution removes one space before and one space after "Atom".
The second substitution changes five blanks at the beginning of a line to three blanks.
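
In a complete log file, other lines may also start with five blanks or contain " Atom " somewhere in the middle. As a rough sketch, the same two substitutions can be anchored so they only fire on the header and on lines where a digit follows the five blanks. The function name `fix_spacing` is made up for illustration, and the patterns assume the spacing shown in the sample above; lines with two-digit atom numbers may be aligned differently in the real files and are not covered here.

```shell
# Hypothetical helper: stricter version of the two substitutions.
# Reads the named files, or standard input when no file is given.
fix_spacing() {
    # 1st expression: header line only, two blanks -> one before Atom and AN.
    # 2nd expression: exactly five leading blanks followed by a digit -> three.
    sed -e 's/^  Atom  AN/ Atom AN/' \
        -e 's/^     \([0-9]\)/   \1/' "$@"
}
```

Called as `fix_spacing atoms.txt`, it prints the corrected text and leaves the file itself unchanged.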
# 7  
Old 04-21-2013
Your set of commands performs the necessary corrections perfectly. Now I need a command set that can be put into a bash script and will search through the log file and make the corrections automatically in the log file so that when it is opened it has the correct formatting.
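
One possible way to do that, offered as a sketch and not as something tested against the real log files: wrap the two substitutions from post #6 in a function that rewrites each file and keeps a backup copy. The name `fix_logs` and the `.bak` suffix are assumptions made for illustration. The `grep` check skips files whose header is already in the old format, which matters because running the first substitution a second time would glue "Atom" and "AN" together.

```shell
#!/bin/bash
# Hypothetical helper: correct the spacing in each log file named on the
# command line, keeping the untouched original as <file>.bak.
fix_logs() {
    local f
    for f in "$@"; do
        [ -f "$f" ] || continue                  # skip unmatched globs and missing files
        grep -q '^  Atom  AN' "$f" || continue   # already in the old format, nothing to do
        cp -p -- "$f" "$f.bak" || return 1       # keep a backup before rewriting
        # Same two substitutions as in post #6, read from the backup copy.
        sed -e 's/ Atom /Atom/' -e 's/^     /   /' "$f.bak" > "$f" || return 1
    done
}

# Example call, e.g. at the end of the job script (path is a placeholder):
#   fix_logs /path/to/job/*.log
```

Because the result is written with plain redirection instead of `sed -i`, this does not depend on a particular sed implementation. The `.bak` files can be deleted once the corrected logs have been checked in the viewer.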