The UNIX and Linux Forums  

#1  03-18-2008
Registered User, Join Date: Mar 2008, Posts: 4
extract multiple sections of file

I have a file from which I need to extract multiple sections.

The file contains multiple lines that start with ST (followed by a bunch of data),
then multiple lines that start with SE (followed by a bunch of data), for example:

SE*30*0001
ST*810*0002

I need all of the lines between and including these.
They are invoices.
The invoice starts with the ST line and ends with the SE line.

I need to break out all of the invoices into separate files.

Can someone please help me? I know grep, sed, or awk can do this, but I am not sure how.
Thank you


Here is an example:
ST*810*0001
BIG*20080315*1220680417**SUPPLY***DI
N1*SF*MCLANE HIGH PLAINS*92*46120004
N1*ST*SWC 7-11 #57134*91*571315
N3*2712 E 8TH ST
N4*ODESSA*TX*79761
REF*ST*000134
ITD*05*3*****7*****NET 7
IT1**1*CA*20.09**CB*649251*PI*093*UP*099299711018*RA*NA
TXI*ZZ*1.53****2
CTP**RES*0***CSR*1
PID*F****7-11 T-SHIRT BAG 1/7 BBL
PO4*1000
IT1**1*EA*33.72**CB*834861*PI*093*UP*012253022401*RA*NA
TXI*ZZ*2.57****2
CTP**RES*0***CSR*1
PID*F****KIT CONCRETE CHAMP
PO4*1
IT1**1*EA*0.03**CB*192849*PI*093*UP*000000192842*RA*NA
CTP**RES*0***CSR*1
PID*F****SCS 711 BK 200
PO4*1
IT1**30*EA*2.59**CB*001511*PI*093*UP*025215102776*RA*NA
CTP**RES*0***CSR*1
PID*F****MAXELL T-160 PLUS VIDEO
PO4*1
TDS*18454
SAC*C*G740***5300*******06***SERVICE
CTT*4
SE*30*0001
#2  03-18-2008
Registered User, Join Date: Oct 2007, Location: USA, Posts: 567
Code:
awk '/^ST/,/^SE/' file
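
For anyone trying this against real data, here is a minimal sketch of what the range pattern does. The file names sample.edi and invoices.txt and the HEADER, BETWEEN and TRAILER lines are made up for illustration, and the patterns are tightened to `^ST\*` and `^SE\*` on the assumption that `*` always follows the segment name:

```shell
# Hypothetical sample: two invoices plus lines that belong to neither.
cat > sample.edi <<'EOF'
HEADER
ST*810*0001
N1*ST*SWC 7-11 #57134*91*571315
SE*3*0001
BETWEEN
ST*810*0002
N3*2712 E 8TH ST
SE*3*0002
TRAILER
EOF

# Range pattern: start printing at a line that begins with "ST*" and
# stop after the next line that begins with "SE*".
awk '/^ST\*/,/^SE\*/' sample.edi > invoices.txt
cat invoices.txt
```

Only the lines from each ST through its SE should come out. The lines outside those ranges are dropped, and lines that merely contain ST somewhere in the middle do not start a new range.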
#3  03-18-2008
Registered User, Join Date: Mar 2008, Posts: 4
Thank you for your prompt response.

It did what I wanted. However, the three sections need to be written to separate files.

So you have
ST
data
SE
This should go to file 1
ST
data
SE
This should go to file 2

Etc.

Also I noticed that the ST and SE are numbered.

ST*810*0004
Then
SE*(Number)*0004
Thank you

Last edited by rgentis; 03-18-2008 at 05:07 PM. Reason: Added something
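
Since the trailing number on the SE line repeats the one on its ST line, one possible approach is to name each output file after that number and compare the pair as a sanity check. A rough sketch, assuming `*` is always the field separator. The names sample.edi and invoice_NNNN.txt are hypothetical:

```shell
# Hypothetical sample with two short invoices.
cat > sample.edi <<'EOF'
ST*810*0001
BIG*20080315*1220680417**SUPPLY***DI
SE*3*0001
ST*810*0002
TDS*18454
SE*3*0002
EOF

awk -F'*' '
# ST opens an invoice; field 3 is the number that the SE line repeats.
$1 == "ST" { ctl = $3; out = "invoice_" ctl ".txt"; inside = 1 }
# Copy every line of the current invoice, ST and SE included.
inside     { print > out }
# SE closes the invoice; warn if its number differs from the ST number.
$1 == "SE" {
    if (inside && $3 != ctl)
        print "warning: ST " ctl " closed by SE " $3 > "/dev/stderr"
    if (inside) close(out)
    inside = 0
}
' sample.edi

ls invoice_*.txt
```

Closing each file at its SE line keeps the number of open files small when there are many invoices. Note that if the same number appeared on two ST lines, the second invoice would overwrite the first.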
#4  03-18-2008
Registered User, Join Date: Jun 2007, Location: Beijing China, Posts: 488
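Code:
nawk '
# An ST line opens a new invoice: bump the counter and start collecting.
/^ST/ { n++; f = 1; invoice[n] = $0; next }
# Inside an invoice, append each line (the SE line included).
f     { invoice[n] = invoice[n] "\n" $0 }
# An SE line closes the current invoice.
/^SE/ { f = 0 }
END {
    # Append each invoice to a file named after its sequence number.
    for (i = 1; i <= n; i++) {
        print invoice[i] >> i
        close(i)
    }
}' filename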
#5  03-18-2008
drl, Registered User, Join Date: Apr 2007, Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris, Posts: 550
Hi.

An alternate awk solution:
Code:
#!/usr/bin/env sh

# @(#) s1       Demonstrate extraction of range to separate files.

#  ____
# /
# |   Infrastructure BEGIN

echo
set -o nounset

debug=":"
debug="echo"

## The shebang using "env" line is designed for portability. For
#  higher security, use:
#
#  #!/bin/sh -

## Use local command version for the commands in this demonstration.

set +o nounset
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) awk my-nl
set -o nounset

# Use nawk or /usr/xpg4/bin/awk on Solaris.

echo

FILE=${1-data1}
echo " Input file $FILE:"
cat $FILE

# |   Infrastructure END
# \
#  ---

echo
echo " Results from processing:"
awk '
BEGIN   { i = 0 }
# Anchored with ^ because real EDI lines such as "N1*ST*..." also contain ST.
/^ST/           { i++ ; name = "file" i }
/^ST/,/^SE/     { print > name }
' "$FILE"

my-nl file?

exit 0
Producing:
Code:
% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
GNU Awk 3.1.4
my-nl (local) 296

 Input file data1:
ST
first invoice
SE
ST
second invoice
SE
ST
third invoice
SE

 Results from processing:

==> file1 <==

  1 ST
  2 first invoice
  3 SE

==> file2 <==

  1 ST
  2 second invoice
  3 SE

==> file3 <==

  1 ST
  2 third invoice
  3 SE
Choose the base file name you wish in variable "name" ... cheers, drl
#6  03-18-2008
Registered User, Join Date: Jan 2008, Posts: 8
extract multiple sections of file

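Code:
#-- Use ST values as output filename.
awk -v out="/dev/null" '
# Build the name from a copy so the ST line itself is written unchanged.
/^ST/ { name = $0; gsub(/\*/, "-", name); out = name ".txt" }
{ print >> out }
# Close each invoice file once its SE line has been written.
/^SE/ { close(out) }
' "$INFILE"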

Output files will be named
ST-810-0001.txt
and so on ...

-Ramesh
#7  03-24-2008
Registered User, Join Date: Mar 2008, Posts: 4
I wanted to thank all of you for your responses.

One issue: I am using a Windows port of the awk utility, so I do not think all of the functionality is there.
For instance, when I used Ramesh's example, I received numerous errors.
Here is the code:

c:\tools\gnuwin32\bin\awk -v '/^ST/ {gsub("\\*","-",$0); out=$0".txt"}
/^SE/ { close(out) }
{ printf "%s\n",$0 >> out }
' %input%edifile.dat

Here is the result:
awk: `/ST/' argument to `-v' not in `var=value' form

Usage: awk [POSIX or GNU style options] -f progfile [--] file .
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options: GNU long options:
-f progfile --file=progfile
-F fs --field-separator=fs
-v var=val --assign=var=val
-m[fr] val
-W compat --compat
-W copyleft --copyleft
-W copyright --copyright
-W dump-variables[=file] --dump-variables[=file]
-W exec=file --exec=file
-W gen-po --gen-po
-W help --help
-W lint[=fatal] --lint[=fatal]
-W lint-old --lint-old
-W non-decimal-data --non-decimal-data
-W profile[=file] --profile[=file]
-W posix --posix
-W re-interval --re-interval
-W source=program-text --source=program-text
-W traditional --traditional
-W usage --usage
-W use-lc-numeric --use-lc-numeric
-W version --version

To report bugs, see node `Bugs' in `gawk.info', which is
section `Reporting Problems and Bugs' in the printed version.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
gawk '{ sum += $1 }; END { print sum }' file
gawk -F: '{ print $1 }' /etc/passwd
'/SE/' is not recognized as an internal or external command,
operable program or batch file.
'{' is not recognized as an internal or external command,
operable program or batch file.
''' is not recognized as an internal or external command,
operable program or batch file.


Thank you again for your help.
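
A note on the errors above: the `-v` option expects a `var=value` argument right after it. Here the `out="/dev/null"` part was dropped while `-v` stayed, so awk tried to read the program text as an assignment. The later "not recognized" messages suggest that cmd.exe does not treat single quotes as quoting and ends the command at the first line break. One way around both problems is to put the program in a file and run it with `-f`, which the usage text above lists as supported. A minimal sketch, shown in shell so it can be tried anywhere. The file name split.awk and the shortened sample data are made up for illustration:

```shell
# split.awk (hypothetical name): the program lives in a file, so the
# command line carries no quotes or line breaks for the shell to mangle.
cat > split.awk <<'EOF'
# Name the output after the ST line, with * replaced by -.
/^ST/ { name = $0; gsub(/\*/, "-", name); out = name ".txt" }
# Write lines only while an invoice is open.
out != "" { print > out }
# The SE line ends the invoice: close the file and stop writing.
/^SE/ { if (out != "") close(out); out = "" }
EOF

# Shortened sample input for the demonstration.
cat > edifile.dat <<'EOF'
ST*810*0001
TDS*18454
SE*3*0001
ST*810*0002
CTT*4
SE*3*0002
EOF

awk -f split.awk edifile.dat
ls ST-*.txt
```

On Windows the equivalent call would presumably be `c:\tools\gnuwin32\bin\awk -f split.awk %input%edifile.dat`, untested here. The sketch also drops the `/dev/null` default, since that path may not exist on Windows. Lines before the first ST are simply skipped.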