Using awk to find next word available

02-08-2010

Try using perl as below:

Code:

perl  -wnl -e ' /Start/ ... /[a-zA-Z]/ and /\d+\s+\d+/ and print  ;'  infile.txt

Or, even better:

Code:

perl  -wnl -e ' /Start/ ... /^\D+/ and /\d+\s+\d+/ and print  ;'  infile.txt

ahmad.diab

02-08-2010

NickC, in your sample the lines that need to be printed all start with a digit. I am testing for that: /^[a-zA-Z]/ tests whether the first character is a letter, and /^[^0-9]/ tests whether it is not a digit.

IMO, the construct that you suggest would not work, since it would print the labels as well and would not print every occurrence.

Scrutinizer

02-08-2010

ahmad.diab: the perl script worked well, but it only printed out the first instance of the number string after /Start/.

Scrutinizer: I'm sorry - what I failed to mention was that when the number string (that includes the exponents) is finished, the next word does not always start in the first column. There might be one or two blank characters before the letters begin. I don't know if your code is looking for something in the first column.

This is why I thought it might be easier to start the search with /Start/ and stop the next time awk encounters a line that does NOT contain the string "E+" in it - and then print out all the lines in between!

And then repeat this throughout the file.

For more clarity, I generated some output files. So here are a couple of samples (the lines I want printed, shown in red in the original post, are the numeric lines that follow each START header):

Code:

               EPSILON              EXTERNAL WORK      START
                1         -5.1077400E-06          2.3338035E+03
                2         -1.2286651E-05          2.6901846E+03
                3         -2.4254409E-06          4.0875334E+03   
 *** USER INFORMATION MESSAGE 4114 (OUTPX2)
     DATA BLOCK OUGV1    WRITTEN ON FORTRAN UNIT 12, TRL =

Code:

         EPSILON              EXTERNAL WORK      START
               1         -4.3371111E-09          8.6807975E+05 
1    OUTPUT FILES                  PAGE    22

0     BASELINE                                                                                   
 *** SYSTEM INFORMATION MESSAGE 6916 (DFMSYN)
     DECOMP ORDERING
 *** USER INFORMATION MESSAGE 5293 (SSG3A)
    FOR DATA BLOCK KLL
 EPSILON              EXTERNAL WORK      START
                2         -3.1694217E-10          2.4290496E+07 
1    OUTPUT FILES                         PAGE    23

0     BASELINE                                                                            
 *** SYSTEM INFORMATION MESSAGE 6916 (DFMSYN)
     DECOMP ORDERING 
 *** USER INFORMATION MESSAGE 5293 (SSG3A)
    FOR DATA BLOCK KLL
 EPSILON              EXTERNAL WORK      START
                3         -1.1569729E-09          1.5256892E+07 
1    OUTPUT FILES            PAGE    24

0     BASELINE                                                                            
 *** SYSTEM INFORMATION MESSAGE 6916 (DFMSYN)
     DECOMP ORDERING

It is the differing formats that made it difficult for me. Initially, I had it working with a simple awk script that just searched between /START/ and /USER/ and printed the lines in between. But then I came across an output format like the second example! Thanks again for everyone's patience!

Last edited by NickC; 02-08-2010 at 02:05 PM..

NickC

02-08-2010

That is quite different from your original input spec.

Try this:

Code:

awk '$1=="START" {flag=1;next} !/^  / {flag=0} flag' infile

Try this to remove the spacing:

Code:

awk '!/^  /{p=0}{$1=$1}p;$1=="START"{p=1}' infile
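As a side note on the second command: assigning a field to itself (`$1=$1`) makes awk rebuild the record from its fields, joined by the output field separator. With the default settings this collapses runs of blanks into single spaces and trims both ends. A minimal sketch on a made-up line (the function name `squeeze_fields` is only for illustration and does not appear in the thread):

```shell
# squeeze_fields is a hypothetical helper name, used here for illustration only.
squeeze_fields() {
    # $1 = $1 forces awk to rebuild $0 from its fields using OFS (one space by default)
    awk '{ $1 = $1; print }'
}

printf '      1    -5.1077400E-06     2.3338035E+03   \n' | squeeze_fields
```

With the default field separators this prints `1 -5.1077400E-06 2.3338035E+03`.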

Scrutinizer

02-08-2010

Yeah, sorry - when I posted the first thread I didn't think it made such a big difference but now I know! Live and learn...

I tried those two commands but I'm getting no output at all. Not sure if the !/^ sequence depends on the version of UNIX (like I said above, I'm using AIX and ksh).

Going back to my question above - would it be too complicated or difficult to start the search at START - print out every line that has E+ in it, and stop when a line without E+ is encountered? From what I can tell, that sequence should capture what I'm trying to get. I was trying something like :

awk '/START/ {flag=1;next} /^E\+/ {flag=0} flag {print}' <filename>

hoping that would work - where the E\+ would mean "look for E+" without treating + as a special character, and the ^ would say "until E+ is no longer found". But I think you said above that this command wouldn't work. And you're right that it didn't!

But I'm not sure why.
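
A note on why that attempt fails: inside a regular expression, `^` anchors the match to the start of the line; it does not mean "not". So `/^E\+/` only matches lines that begin with "E+", which never happens in this output, and the flag is never cleared. In awk, negation is written with `!` in front of the pattern. Below is a minimal sketch of the approach described above, assuming every wanted line contains the literal string "E+". The sample file and the name `print_eplus_blocks` are made up for illustration, and `index()` is used instead of a regex so that `+` needs no escaping.

```shell
# Hypothetical sample file, modelled on the listings earlier in the thread.
sample=${TMPDIR:-/tmp}/eplus_sample.txt
cat > "$sample" <<'EOF'
   EPSILON   EXTERNAL WORK   START
      1   -5.1077400E-06   2.3338035E+03
      2   -1.2286651E-05   2.6901846E+03
 *** USER INFORMATION MESSAGE 4114 (OUTPX2)
   EPSILON   EXTERNAL WORK   START
      3   -1.1569729E-09   1.5256892E+07
1    OUTPUT FILES   PAGE    24
EOF

# print_eplus_blocks is a hypothetical name, used here for illustration only.
print_eplus_blocks() {
    awk '
    /START/                      { flag = 1; next }  # header: start with the next line
    flag && index($0, "E+") == 0 { flag = 0 }        # first line without "E+": stop
    flag                                             # print while inside a block
    ' "$1"
}

print_eplus_blocks "$sample"
```

On this sample the sketch prints the three numeric lines and skips the headers, the USER message and the page line. Because the flag is set again at every START, the same logic repeats through the whole file.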

NickC

02-08-2010

Try this one:

Code:

awk '/START/{p=1;next}p && $2 !~ /E-..$/{p=0} p' file
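
Spelled out with comments, the one-liner reads as follows. The behaviour is meant to be identical; only the layout and the comments are added, and the sample file and the name `print_after_start` are made up for illustration. Note that the test looks at the second field and expects it to end in `E-` followed by any two characters, which holds for the samples posted above but is an assumption about the data in general.

```shell
# Hypothetical sample file, modelled on the listings earlier in the thread.
sample=${TMPDIR:-/tmp}/start_sample.txt
cat > "$sample" <<'EOF'
   EPSILON   EXTERNAL WORK   START
      1   -4.3371111E-09   8.6807975E+05
1    OUTPUT FILES   PAGE    22

 *** SYSTEM INFORMATION MESSAGE 6916 (DFMSYN)
 EPSILON   EXTERNAL WORK   START
      2   -3.1694217E-10   2.4290496E+07
1    OUTPUT FILES   PAGE    23
EOF

# print_after_start is a hypothetical name, used here for illustration only.
print_after_start() {
    awk '
    /START/            { p = 1; next }  # header line: switch printing on, skip the header
    p && $2 !~ /E-..$/ { p = 0 }        # inside a block, field 2 is not like "...E-09": switch off
    p                                   # bare pattern: print the line while p is non-zero
    ' "$1"
}

print_after_start "$sample"
```

On this sample it prints the two numeric lines, one per START header. Since `$2` is the second whitespace-separated field, leading blanks on a line do not matter.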

Franklin52

02-08-2010

Got it - that works!!

Now I have to figure out just what you did!

Thanks again all of you for your patience and help!

Last edited by NickC; 02-08-2010 at 04:45 PM..

NickC
