    #1  
Old 03-18-2008
extract multiple sections of file

I have a file from which I need to extract multiple sections.

The file contains multiple lines that start with ST (followed by a bunch of data).
It also contains multiple lines that start with SE (followed by a bunch of data).

SE*30*0001
ST*810*0002

I need all of the lines between and including these.
They are invoices.
The invoice starts with the ST line and ends with the SE line.

I need to break out all of the invoices into separate files.

Can someone please help me? I know grep, sed, or awk can do this, but I am not sure how.
Thank you


Here is an example:
ST*810*0001
BIG*20080315*1220680417**SUPPLY***DI
N1*SF*MCLANE HIGH PLAINS*92*46120004
N1*ST*SWC 7-11 #57134*91*571315
N3*2712 E 8TH ST
N4*ODESSA*TX*79761
REF*ST*000134
ITD*05*3*****7*****NET 7
IT1**1*CA*20.09**CB*649251*PI*093*UP*099299711018*RA*NA
TXI*ZZ*1.53****2
CTP**RES*0***CSR*1
PID*F****7-11 T-SHIRT BAG 1/7 BBL
PO4*1000
IT1**1*EA*33.72**CB*834861*PI*093*UP*012253022401*RA*NA
TXI*ZZ*2.57****2
CTP**RES*0***CSR*1
PID*F****KIT CONCRETE CHAMP
PO4*1
IT1**1*EA*0.03**CB*192849*PI*093*UP*000000192842*RA*NA
CTP**RES*0***CSR*1
PID*F****SCS 711 BK 200
PO4*1
IT1**30*EA*2.59**CB*001511*PI*093*UP*025215102776*RA*NA
CTP**RES*0***CSR*1
PID*F****MAXELL T-160 PLUS VIDEO
PO4*1
TDS*18454
SAC*C*G740***5300*******06***SERVICE
CTT*4
SE*30*0001
    #2  
Old 03-18-2008

Code:
awk '/^ST/,/^SE/' file
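
The one-liner above relies on an awk range pattern: printing switches on at a line starting with ST and stays on through the next line starting with SE, both inclusive. A minimal sketch of that behaviour, assuming made-up sample lines (the ISA and GE lines are only stand-ins for data outside an invoice):

```shell
# Build a small sample: two invoices with unrelated lines around them.
sample=$(printf '%s\n' 'ISA*header' 'ST*810*0001' 'BIG*20080315' 'SE*3*0001' \
                       'GE*trailer' 'ST*810*0002' 'BIG*20080316' 'SE*3*0002')

# With no action given, awk prints every line selected by the range pattern.
selected=$(printf '%s\n' "$sample" | awk '/^ST/,/^SE/')

printf '%s\n' "$selected"
```

Lines outside an ST..SE pair are dropped, but both invoices still land in one stream, so this alone does not split them into separate files.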

    #3  
Old 03-18-2008
Thank you for your prompt response.

It did what I wanted. However, the three sections need to be written to separate files.

So you have
ST
data
SE
This should be taken to file 1
ST
data
SE
This should be taken to file 2

ETC.....

Also I noticed that the ST and SE are numbered.

ST*810*0004
Then
SE*(Number)*0004
Thank you

Last edited by rgentis; 03-18-2008 at 07:07 PM.. Reason: Added something
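
Since the ST and SE lines carry the same number in their third field, that number can name the output file and double as a sanity check. This is only a sketch under assumptions: the `invoice_` prefix, the `ctl` and `out` variables, and the two-invoice sample are invented for illustration, and writing to `/dev/stderr` assumes an awk or system that provides it.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical sample with control numbers 0001 and 0004.
printf '%s\n' 'ST*810*0001' 'BIG*first' 'SE*3*0001' \
              'ST*810*0004' 'BIG*second' 'SE*3*0004' > edifile.dat

# Split on "*" so $1 is the segment name and $3 the control number.
awk -F'*' '
$1 == "ST" { ctl = $3; out = "invoice_" ctl ".txt" }   # start a new invoice
out != ""  { print > out }                             # copy lines while inside one
$1 == "SE" {
    if ($3 != ctl)
        print "control number mismatch near line " NR > "/dev/stderr"
    if (out != "") close(out)
    out = ""                                           # skip lines until the next ST
}
' edifile.dat

ls
```

Comparing `$1` to "ST" instead of matching /ST/ anywhere keeps lines such as `N1*ST*SWC 7-11 #57134*91*571315` from being mistaken for the start of an invoice. If the same control number appeared twice in one run, the second invoice would overwrite the first.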
    #4  
Old 03-18-2008
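Code:
nawk '
# Collect each ST..SE block in memory, then append block n to a file named n.
BEGIN { n = 1 }
/^ST/ { f = 1 }
# Join lines with newlines, without a leading blank line.
f     { invoice[n] = (invoice[n] == "" ? "" : invoice[n] "\n") $0 }
/^SE/ { f = 0; n++ }
END {
    for (i in invoice) {
        print invoice[i] >> i
        close(i)    # close each file inside the loop, not only the last one
    }
}' filename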
    #5  
Old 03-18-2008
Hi.

An alternate awk solution:

Code:
#!/usr/bin/env sh

# @(#) s1       Demonstrate extraction of range to separate files.

#  ____
# /
# |   Infrastructure BEGIN

echo
set -o nounset

debug=":"
debug="echo"

## The shebang using "env" line is designed for portability. For
#  higher security, use:
#
#  #!/bin/sh -

## Use local command version for the commands in this demonstration.

set +o nounset
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) awk my-nl
set -o nounset

# Use nawk or /usr/xpg4/bin/awk on Solaris.

echo

FILE=${1-data1}
echo " Input file $FILE:"
cat $FILE

# |   Infrastructure END
# \
#  ---

echo
echo " Results from processing:"
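awk '
BEGIN   { i = 0 }
/^ST/           { i++ ; name = "file" i }   # anchored, so N1*ST*... does not match
/^ST/,/^SE/     { print > name }
/^SE/           { close(name) }             # avoid running out of open files
' "$FILE"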

my-nl file?

exit 0

Producing:

Code:
% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
GNU Awk 3.1.4
my-nl (local) 296

 Input file data1:
ST
first invoice
SE
ST
second invoice
SE
ST
third invoice
SE

 Results from processing:

==> file1 <==

  1 ST
  2 first invoice
  3 SE

==> file2 <==

  1 ST
  2 second invoice
  3 SE

==> file3 <==

  1 ST
  2 third invoice
  3 SE

Choose the base file name you wish in variable "name" ... cheers, drl
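
One way to choose the base name without editing the program is to pass it in with `-v`. A minimal sketch, assuming a hypothetical `prefix` variable and a zero-padded counter (neither is part of the script above):

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Same shape as the data1 sample, shortened to two invoices.
printf '%s\n' ST 'first invoice' SE ST 'second invoice' SE > data1

awk -v prefix="invoice_" '
/^ST/        { i++; name = sprintf("%s%03d.txt", prefix, i) }  # invoice_001.txt, ...
/^ST/,/^SE/  { print > name }
/^SE/        { close(name) }
' data1

ls
```

Zero-padding keeps the files in order in a directory listing once there are more than nine invoices.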
    #6  
Old 03-18-2008

Code:
#-- Use ST values as output filename.
awk -v out="/dev/null" '
/^ST/ { name = $0; gsub(/\*/, "-", name); out = name ".txt" }  # leave $0 untouched
{ print >> out }
/^SE/ { close(out); out = "/dev/null" }   # close after the SE line is written
' "$INFILE"

Output files will be named after the ST line, for example:
ST-810-0001.txt
and so on ...

-Ramesh
    #7  
Old 03-24-2008
I wanted to thank all of you for your responses.

One issue: I am running a Windows port of awk (GnuWin32), so I do not think all of the functionality is there.
For instance, when I used Ramesh's example, I received numerous errors.
Here is the code:

c:\tools\gnuwin32\bin\awk -v '/^ST/ {gsub("\\*","-",$0); out=$0".txt"}
/^SE/ { close(out) }
{ printf "%s\n",$0 >> out }
' %input%edifile.dat

Here is the result:
awk: `/ST/' argument to `-v' not in `var=value' form

Usage: awk [POSIX or GNU style options] -f progfile [--] file .
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options: GNU long options:
-f progfile --file=progfile
-F fs --field-separator=fs
-v var=val --assign=var=val
-m[fr] val
-W compat --compat
-W copyleft --copyleft
-W copyright --copyright
-W dump-variables[=file] --dump-variables[=file]
-W exec=file --exec=file
-W gen-po --gen-po
-W help --help
-W lint[=fatal] --lint[=fatal]
-W lint-old --lint-old
-W non-decimal-data --non-decimal-data
-W profile[=file] --profile[=file]
-W posix --posix
-W re-interval --re-interval
-W source=program-text --source=program-text
-W traditional --traditional
-W usage --usage
-W use-lc-numeric --use-lc-numeric
-W version --version

To report bugs, see node `Bugs' in `gawk.info', which is
section `Reporting Problems and Bugs' in the printed version.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
gawk '{ sum += $1 }; END { print sum }' file
gawk -F: '{ print $1 }' /etc/passwd
'/SE/' is not recognized as an internal or external command,
operable program or batch file.
'{' is not recognized as an internal or external command,
operable program or batch file.
''' is not recognized as an internal or external command,
operable program or batch file.
C:\tools>edi
awk: `'/ST/' argument to `-v' not in `var=value' form



Thank you again for your help.
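
The errors above look like quoting problems, not missing awk features: cmd.exe does not treat single quotes as quoting characters, it consumes `^` as its own escape character (which is why the message shows `/ST/`), each later line of the program is run as a separate command, and `-v` was given without a `var=value` argument. The usage text itself lists `-f progfile`, which sidesteps all of this. A hedged sketch, shown in sh for testing; `split.awk` is a hypothetical file name and the sample data is invented:

```shell
work=$(mktemp -d)
cd "$work" || exit 1

printf '%s\n' 'ST*810*0001' 'BIG*first' 'SE*3*0001' \
              'ST*810*0002' 'BIG*second' 'SE*3*0002' > edifile.dat

# split.awk holds the program, so the command line needs no quoting at all.
cat > split.awk <<'EOF'
/^ST/     { name = $0; gsub(/\*/, "-", name); out = name ".txt" }
out != "" { print > out }
/^SE/     { if (out != "") close(out); out = "" }
EOF

awk -f split.awk edifile.dat

ls
```

On Windows, split.awk would be created in a text editor, and the call would presumably become `c:\tools\gnuwin32\bin\awk -f split.awk %input%edifile.dat`. The `out != ""` guard replaces the `/dev/null` default, which may not exist there.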