The UNIX and Linux Forums  
  #1 (permalink)  
smb_uk, 12-21-2007
AWK pattern matching, first and last

In a nutshell, I need to work out how to return the last match from an awk range pattern (/start/,/end/). I can bring back the first, but I am unsure how to obtain the last; a simple tail won't work because the match can span multiple lines.

Secondly, I would like to pattern-match within an already matched section. The program should return the first section delimited by /*H and H## from a file, and then perform multiple searches for other tags within it (#P,P# and #C,C#, for example). I did think of capturing the /*H section in a variable and searching that, but I think that converts it to a string without line breaks, and I want to preserve the formatting of the input.

The code below works in part (although not as described above, and the last #M,M# entry is handled incorrectly), but there must be a much more efficient way to do it, since it scans the full file once per tag.
Any general advice on optimising the code would also be useful, as I appreciate there's a lot of piping going on.

The current code, input and output follow:

Code:
#!/bin/ksh 

for file in `find . -type f -name "*.code"` 
do 
  echo ">>>>>>> PROGRAM: " $file 
  echo ================================================================================ 
  awk '/#P/ && ++m==1,/P#/ {print $0}' $file | sed 's/^#[A-Z#] //g' | sed 's/ [A-Z#]#$//g' 
  awk '/#A/ && ++m==1,/A#/ {print $0}' $file | sed 's/^#[A-Z#] //g' | sed 's/ [A-Z#]#$//g' 
  awk '/#C/ && ++m==1,/C#/ {print $0}' $file | sed 's/^#[A-Z#] //g' | sed 's/ [A-Z#]#$//g' 
  awk '/#I/ && ++m==1,/I#/ {print $0}' $file | sed 's/^#[A-Z#] //g' | sed 's/ [A-Z#]#$//g' 
  echo -------------------------------------------------------------------------------- 
  awk '/#D/ && ++m==1,/D#/ {print $0}' $file | sed 's/^#[A-Z#] //g' | sed 's/ [A-Z#]#$//g' 
  awk '/#M/,/M#/ {print $0}' $file | tail -1 | sed 's/^#[A-Z#] //g' | sed 's/ [A-Z#]#$//g' 
  echo 
done
Code:
/*H#############################################################################
##                                                                            ## 
#P Purpose : Creates generation datasets from the original source file        P# 
##                                                                            ## 
#A Author  : D Turpin                                                         A# 
#C Date    : 1st November 2007                                                C# 
##                                                                            ## 
#I Inputs  : data.table1                                                      ## 
##           data.table2                                                      I# 
#O Outputs : data.table3                                                      O# 
##                                                                            ## 
################################################################################ 
## Change History                                                             ## 
#D Who When       Why                                                Version  D# 
##                                                                            ## 
#M DT  01.11.2007 Initial Development                                1.00     M# 
#M DT  15.11.2007 Modified to include new reqs                       1.01     ## 
##                and other things                                            M# 
##                                                                            ## 
#############################################################################H##
*...+....1....+....2....+....3....+....4....+....5....+....6....+....7....+...*/
Code:
>>>>>>> PROGRAM:  ./header.code 
================================================================================ 
Purpose : Creates generation datasets from the original source file 
Author  : D Turpin 
Date    : 1st November 2007 
Inputs  : data.table1 
          data.table2 
-------------------------------------------------------------------------------- 
Who When       Why                                                Version 
               and other things
--------------------------------------------------------------------------------
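One way to get the last #M,M# entry rather than the first is to buffer each entry as it is read and print whatever is left in the buffer at END. The following is a minimal sketch, not tested against the real files. The function name last_tag is made up for illustration, and the sketch assumes a tag opens with "#X " at the start of a line and closes with "X#" at the end of a line (optionally followed by blanks), as in the sample header above.

```shell
#!/bin/sh
# last_tag TAG [FILE...]  (hypothetical helper, not from the original post)
# Prints only the last "#TAG ... TAG#" entry, which may span several lines.
last_tag() {
    tag=$1; shift
    awk -v tag="$tag" '
        $0 ~ ("^#" tag " ")   { buf = ""; inblock = 1 }   # new entry: drop the previous one
        inblock               { buf = buf $0 "\n" }       # collect every line of the entry
        $0 ~ (tag "#[ \t]*$") { inblock = 0 }             # closing marker ends the entry
        END                   { printf "%s", buf }        # only the last entry survives
    ' "$@"
}

# Small demonstration on inline data (reads stdin when no file is given)
last_tag M <<'EOF'
#M DT  01.11.2007 Initial Development      1.00 M#
#M DT  15.11.2007 Modified to include reqs 1.01 ##
##                and other things              M#
EOF
```

Because the whole entry is buffered, the two-line change from the 15th comes back intact, which tail -1 cannot do. The markers are left in place here so the existing sed clean-up can still be piped after it.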
  #2 (permalink)  
shamrock (Forum Advisor), 12-21-2007
Code:
#!/bin/ksh 

for file in `find . -type f -name "*.code"` 
do
   echo ">>>>>>> PROGRAM: " $file 
   echo ================================================================================
   awk '/^#P/ || /P#/ {gsub("#P|P#|##",""); print}
        /^#A/ || /A#/ {gsub("#A|A#|##",""); print}
        /^#C/ || /C#/ {gsub("#C|C#|##",""); print}
        /^#I/ || /I#/ {gsub("#I|I#|##",""); print}
        /^#D/ || /D#/ {gsub("#D|D#|##",""); print}
        /^#M/ || /M#/ {gsub("#M|M#|##",""); print}' $file
done
  #3 (permalink)  
smb_uk, 12-22-2007
That's much cleaner, and I assume quicker, as it does a single scan. I can't test this at the moment, but I assume it addresses the single pass and the clean-up, and not returning either the first or the last match?
The main remaining problem is returning the last #M,M# (or the nth, if possible, for future reference), as I want to show just the last modification to a file.

Furthermore, is there any way to confine the search to the top section of the file, between /*H and H##? With many large files to document, that would make it much quicker.

Thanks for the help so far.
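On confining the search to the header: one option is to let awk stop reading as soon as the closing H## line has been seen, so the rest of a large file is never scanned. This is only a sketch. The name header_block is hypothetical, and it assumes the block opens with a line starting "/*H" and closes with a line ending in "H##", as in the sample posted earlier.

```shell
#!/bin/sh
# header_block [FILE...]  (hypothetical helper)
# Prints the lines from the one starting "/*H" through the one ending "H##",
# then exits so awk reads no further input.
header_block() {
    awk '
        /^\/\*H/              { inhdr = 1 }   # opening line of the header block
        inhdr                 { print }       # echo every line while inside it
        inhdr && /H##[ \t]*$/ { exit }        # closing line: stop reading here
    ' "$@"
}

# Demonstration on inline data
header_block <<'EOF'
some code before the header
/*H##########
#P Purpose : demo P#
##########H##
more code after the header
EOF
```

The output can be piped into any of the tag-matching awk programs in this thread, or the same inhdr/exit rules can be folded directly into them to keep everything in a single pass.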
  #4 (permalink)  
Franklin52 (Forum Staff, Moderator), 12-22-2007
This should give the desired output without trailing spaces:


Code:
#!/bin/ksh

for file in `find . -type f -name "*.code"` 
do
  echo ">>>> Program :" $file
  echo "========================================================================="
  sed -n '/\#P /,/H..$/p' $file |
  sed 's/.*###.*/-------------------------------------------------------------------------/
  s/^#. //g
  s/.#$//g
  s/[ \t]*$//g
  /^$/d'
done

Regards
  #5 (permalink)  
smb_uk, 12-22-2007
This solution loses the ability to handle each tag in a custom manner; it simply displays what's already there in a different format. Maybe I should have pointed that requirement out earlier.

Also, I still cannot return only the last #M,M# tag.

In summary, I want to extract the /*H,H## section and individually retrieve tags within that block, choosing all, the first, or the last occurrence of a tag (or the nth/-nth if possible), e.g. only the modification on the 15th.
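The nth/-nth selection described above could be sketched by storing each occurrence of a tag in an array and choosing one at END, with a negative index counting back from the last. This is an illustration under assumptions rather than a tested solution. nth_tag is a made-up name, and the marker layout ("#X " at line start, "X#" at line end, block between "/*H" and "H##") is taken from the sample header.

```shell
#!/bin/sh
# nth_tag TAG N [FILE...]  (hypothetical helper)
# Prints the Nth "#TAG ... TAG#" entry inside the /*H ... H## block.
# N = 1 is the first entry, N = -1 the last, N = -2 the one before it.
nth_tag() {
    tag=$1; n=$2; shift 2
    awk -v tag="$tag" -v n="$n" '
        /^\/\*H/                         { inhdr = 1 }
        inhdr && $0 ~ ("^#" tag " ")     { count++; inblock = 1 }   # a new entry begins
        inblock {
            line = $0
            sub(/^#. /, "", line)            # strip the leading marker
            sub(/ *.#[ \t]*$/, "", line)     # strip the trailing marker and padding
            blocks[count] = blocks[count] line "\n"
        }
        inblock && $0 ~ (tag "#[ \t]*$") { inblock = 0 }            # entry closed
        inhdr && /H##[ \t]*$/            { exit }                   # end of header: stop reading
        END {
            n = n + 0                        # force a numeric index
            if (n < 0) n = count + n + 1     # negative N counts from the end
            if (n in blocks) printf "%s", blocks[n]
        }
    ' "$@"
}
```

With the sample file, nth_tag M -1 header.code would be the call for the most recent modification and nth_tag M 1 header.code for the first. An out-of-range N prints nothing.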
  #6 (permalink)  
fpmurphy (Forum Staff, Moderator), 12-22-2007
Why not simply use CVS or some other version control software?
  #7 (permalink)  
shamrock (Forum Advisor), 12-22-2007
Quote:
Originally Posted by smb_uk
That's much cleaner, and I assume quicker, as it does a single scan. I can't test this at the moment, but I assume it addresses the single pass and the clean-up, and not returning either the first or the last match?
The main remaining problem is returning the last #M,M# (or the nth, if possible, for future reference), as I want to show just the last modification to a file.

Furthermore, is there any way to confine the search to the top section of the file, between /*H and H##? With many large files to document, that would make it much quicker.

Thanks for the help so far.
So out of all the #M,M# tags present, you want to show just the last one, as it is the last modification made to the file.

Code:
#!/bin/ksh

for file in `find . -type f -name "*.code"`
do
   echo ">>>>>>> PROGRAM: " $file
   echo ================================================================================
   awk '/^#P/ || /P#/ { gsub("#P|P#|##",""); print }
        /^#A/ || /A#/ { gsub("#A|A#|##",""); print }
        /^#C/ || /C#/ { gsub("#C|C#|##",""); print }
        /^#I/ || /I#/ { gsub("#I|I#|##",""); print }
        /^#D/ || /D#/ { gsub("#D|D#|##",""); print }
        /^#M/ || /M#/ { gsub("#M|M#|##",""); mtag["M"] = $0 }   # overwrite: keeps only the most recent M line
        END { print mtag["M"] }' $file
done
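Note that storing $0 in mtag keeps only the final physical line that matches, so for the sample header it would print just "and other things" and lose the first line of the 15.11.2007 entry. A possible single-pass variant that keeps multi-line entries together and stays inside the header block is sketched below. It is an assumption-laden illustration, not a drop-in replacement: summarise_header and clean are invented names, the set of tags printed (P, A, C, I, D) is taken from the original script, and the dashed separator line is not reproduced.

```shell
#!/bin/sh
# summarise_header [FILE...]  (hypothetical helper)
# Single pass over the /*H ... H## block: prints every P, A, C, I and D entry
# as it is read, and only the last (possibly multi-line) M entry at the end.
summarise_header() {
    awk '
        # clean(): strip the two-character markers and the padding before the trailing one
        function clean(s) { sub(/^#. /, "", s); sub(/ *.#[ \t]*$/, "", s); return s }

        /^\/\*H/          { inhdr = 1; next }    # header opens
        !inhdr            { next }               # ignore anything before it
        /H##[ \t]*$/      { exit }               # header closes: stop reading
        /^#[A-Z] /        { cur = substr($0, 2, 1); if (cur == "M") last = "" }
        cur ~ /^[PACID]$/ { print clean($0) }    # printed immediately
        cur == "M"        { last = last clean($0) "\n" }   # buffered, reset per entry
        cur != "" && $0 ~ (cur "#[ \t]*$") { cur = "" }    # closing marker of current tag
        END               { printf "%s", last }
    ' "$@"
}
```

Inside the existing loop this would be called as summarise_header "$file" in place of the awk command. Other tags such as #O are recognised but not printed, which is where per-tag custom handling could be added.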