Extract a block of text??


 
# 1  
Old 01-25-2011

Hello all,

I have a large output file from which I would like to extract a single block of text.

An example block of text is shown below:

Code:
      ***** EQUILIBRIUM GEOMETRY LOCATED *****
 COORDINATES OF ALL ATOMS ARE (ANGS)
   ATOM   CHARGE       X              Y              Z
 ------------------------------------------------------------
 MOLYBDENUM 42.0   5.9067578125   5.0087332497  17.4699146400
 SULFUR     16.0   7.9742837782   3.7588015097  17.3910898169
 SULFUR     16.0   5.0973219622   3.0091611327  16.3724427108
 SULFUR     16.0   3.8536412225   4.7600928861  18.7261323168
 SULFUR     16.0   6.7241053728   5.6252659948  19.6631739883
 SULFUR     16.0   4.4480017991   6.0998251866  15.8770432027
 SULFUR     16.0   7.3603883558   6.8227401283  16.8054397187
 FLUORINE    9.0   5.8587551406  -0.4318887949  16.7077822115
 FLUORINE    9.0   4.8684829005   0.5366582782  15.0410139777
 FLUORINE    9.0   6.9652096608   0.0213710608  14.8874157686
 FLUORINE    9.0   9.8286766391   1.6474365190  17.5253067335
 FLUORINE    9.0   9.3721734932   1.1952324810  15.4562302461
 FLUORINE    9.0   2.3592720767   6.5544854427  21.3357174293
 FLUORINE    9.0   3.1699656631   4.9028713258  22.5014597961
 FLUORINE    9.0   1.9319720986   4.5053494937  20.7732346644
 FLUORINE    9.0   4.7451572178   7.0343764808  22.6351068043
 FLUORINE    9.0   6.8559715258   6.9105198521  22.1746011766
 FLUORINE    9.0   5.8455988240   5.1957182885  23.0296196726
 CARBON      6.0   7.6418735826   2.1732663119  16.7663636854
 CARBON      6.0   6.3910488857   1.8518385954  16.3063208287
 CARBON      6.0   8.8417049356   1.2443576492  16.7001587395
 CARBON      6.0   6.0232071075   0.4940281964  15.7351792136
 CARBON      6.0   4.1288361674   5.3066236659  20.3512668012
 CARBON      6.0   5.3747193873   5.7159036535  20.7514082208
 CARBON      6.0   2.9007274452   5.3180135016  21.2426525584
 CARBON      6.0   5.7024203684   6.2124662475  22.1483988647
 CARBON      6.0   5.3190614021   7.4034151598  15.1297202687
 CARBON      6.0   6.5982269835   7.7062300006  15.5200187491
 FLUORINE    9.0   8.5364306907  -0.0238472200  17.0558426396
 CARBON      6.0   4.5748883318   8.1326876115  14.0248882589
 FLUORINE    9.0   4.7379257177   9.4734646857  14.0919813199
 FLUORINE    9.0   4.9873657426   7.7398577115  12.7976197748
 FLUORINE    9.0   3.2469470625   7.9053664599  14.0777358797
 CARBON      6.0   7.4147633291   8.8527083808  14.9501654595
 FLUORINE    9.0   7.0294168591  10.0458152013  15.4588563333
 FLUORINE    9.0   8.7276249402   8.7205894905  15.2268083921
 FLUORINE    9.0   7.3114031179   8.9380862348  13.6046918391

What I need is the text under the line
Code:
      ***** EQUILIBRIUM GEOMETRY LOCATED *****

up to the next blank line, i.e. the blank line right after
Code:
 FLUORINE    9.0   7.3114031179   8.9380862348  13.6046918391

Also, the columns of text should be separated by spaces or tabs, i.e. the first column should be the atom name ("MOLYBDENUM"), the second column the atomic number ("42.0"), etc.


Thanks in advance

Last edited by marcozd; 01-25-2011 at 04:15 PM..
# 2  
Old 01-25-2011
Ok, I can do the first thing...
Code:
 awk '/EQUILIBRIUM GEOMETRY LOCATED/ { while ((getline) > 0 && $0 !~ /^[ \t]*$/) print }' filename.txt

However, I don't understand what you mean by "each of the columns of text should be separated"... by what? Tabs, spaces (justified?), commas?

Cheers...
# 3  
Old 01-25-2011
It is always helpful to know what system you're on, so getting in the habit of posting it would be good so we don't have to ask... grep -m makes starting at a certain point easy but AIX and Solaris may not have it.
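
Presumably the idea is something like the sketch below: let grep -n -m 1 report the line number of the first match, then print the file from that line on. The function name from_first_match is made up for illustration, and -m is an extension (GNU grep has it), which is why it may be missing on AIX or Solaris.

```shell
#!/bin/sh
# Sketch only: print FILE from the first line matching PATTERN to the end.
# Usage: from_first_match PATTERN FILE   (hypothetical helper name)
from_first_match() {
    # grep -n prefixes the match with "LINENO:"; -m 1 stops after the first hit
    start=$(grep -n -m 1 -- "$1" "$2" | cut -d: -f1)
    if [ -n "$start" ]; then
        # tail -n +N starts output at line N
        tail -n +"$start" "$2"
    fi
}
```

For example, from_first_match 'EQUILIBRIUM GEOMETRY LOCATED' filename.txt would print the header line and everything after it; trimming at the blank line still has to be done separately.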
# 4  
Old 01-25-2011
Using sed:

Code:
sed -n '/EQUILIBRIUM GEOMETRY LOCATED/,/^ *$/p' infile
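
A possible follow-up, as a sketch rather than something tested on the original poster's file: the sed range prints the header line and the closing blank line too, and leaves the spacing as it is. The awk function below (the name extract_geometry is made up for illustration) prints only the atom rows of the first block, with a tab between columns. It assumes the layout shown in post #1: the EQUILIBRIUM line, two title lines, a dashed line, then the rows up to the first blank line.

```shell
#!/bin/sh
# Sketch only: tab-separated atom rows from the first EQUILIBRIUM block.
# Usage: extract_geometry FILE   (hypothetical helper name)
extract_geometry() {
    awk -v OFS='\t' '
        /EQUILIBRIUM GEOMETRY LOCATED/ { inblock = 1; next }  # block starts here
        inblock && /^[ \t]*$/          { exit }               # blank line ends it
        inblock && /^[ \t]*-+[ \t]*$/  { rows = 1; next }     # dashed line: rows follow
        rows                           { print $1, $2, $3, $4, $5 }
    ' "$1"
}
```

Something like extract_geometry filename.txt > geometry.tsv would then give one atom per line. If spaces are preferred over tabs, changing OFS to a single space should be enough.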

# 5  
Old 01-25-2011
Quote:
Originally Posted by Corona688
It is always helpful to know what system you're on, so getting in the habit of posting it would be good so we don't have to ask... grep -m makes starting at a certain point easy but AIX and Solaris may not have it.
uname -a gives:

Linux 2.6.18-194.21.1.e15 x86_64 GNU/LINUX
# 6  
Old 01-25-2011
A perl solution that includes some column spacing:
Code:
$
$ cat form
#! /usr/bin/perl -wn
BEGIN {$state="skip";};
/^\s*-+\s*$/     and do {$state="proc"; print; next LINE};
/EQUIL/          and do {$state="copy";};
$state eq "skip" and do {next LINE;};
/^\s*$/          and do {last LINE;};
$state eq "copy" and do {print ; next LINE;};
$state eq "proc" and do { printf "%-10s   %6s  %16s %16s %16s\n", split(" ", $_);};
$
$
$
$ ./form < datafile | head
***** EQUILIBRIUM GEOMETRY LOCATED *****
COORDINATES OF ALL ATOMS ARE (ANGS)
ATOM CHARGE X Y Z
------------------------------------------------------------
MOLYBDENUM     42.0      5.9067578125     5.0087332497    17.4699146400
SULFUR         16.0      7.9742837782     3.7588015097    17.3910898169
SULFUR         16.0      5.0973219622     3.0091611327    16.3724427108
SULFUR         16.0      3.8536412225     4.7600928861    18.7261323168
SULFUR         16.0      6.7241053728     5.6252659948    19.6631739883
$


Last edited by Perderabo; 01-25-2011 at 04:13 PM.. Reason: Remove that "NOW COPY" debugging statement.
# 7  
Old 01-25-2011
Quote:
Originally Posted by citaylor
Ok, I can do the first thing...
Code:
 awk '/EQUILIBRIUM GEOMETRY LOCATED/ { while ((getline) > 0 && $0 !~ /^[ \t]*$/) print }' filename.txt

However, I don't understand what you mean by "each of the columns of text should be separated"... by what? Tabs, spaces (justified?), commas?

Cheers...

Hi

Thanks very much for your reply but it doesn't seem to work for me.

Nothing happens when I type in this line of code. The sed version doesn't do anything either.

Thanks though

---------- Post updated at 03:12 PM ---------- Previous update was at 03:03 PM ----------

Quote:
Originally Posted by Perderabo
A perl solution that includes some column spacing:
Code:
$
$ cat form
#! /usr/bin/perl -wn
BEGIN {$state="skip";};
/^-+$/           and do {$state="proc"; print; next LINE};
/EQUIL/          and do {print "NOW COPY\n";$state="copy";};
$state eq "skip" and do {next LINE;};
/^\s*$/          and do {last LINE;};
$state eq "copy" and do {print ; next LINE;};
$state eq "proc" and do { printf "%-10s   %6s  %16s %16s %16s\n", split(" ", $_);};
$
$
$
$ ./form < datafile | head
NOW COPY
***** EQUILIBRIUM GEOMETRY LOCATED *****
COORDINATES OF ALL ATOMS ARE (ANGS)
ATOM CHARGE X Y Z
------------------------------------------------------------
MOLYBDENUM     42.0      5.9067578125     5.0087332497    17.4699146400
SULFUR         16.0      7.9742837782     3.7588015097    17.3910898169
SULFUR         16.0      5.0973219622     3.0091611327    16.3724427108
SULFUR         16.0      3.8536412225     4.7600928861    18.7261323168
SULFUR         16.0      6.7241053728     5.6252659948    19.6631739883
$


How do I use this? What part do I place in the shell script to make this usable?