extract multiple sections of file

# 1  
Old 03-18-2008

I have a file from which I need to extract multiple sections.

The file contains multiple lines that start with ST (followed by a bunch of data)
and multiple lines that start with SE (followed by a bunch of data), for example:

SE*30*0001
ST*810*0002

I need all of the lines between and including these.
They are invoices.
The invoice starts with the ST line and ends with the SE line.

I need to break out all of the invoices into separate files.

Can someone please help me? I know grep, sed, or awk can do this, but I am not sure how.
Thank you


Here is an example:
ST*810*0001
BIG*20080315*1220680417**SUPPLY***DI
N1*SF*MCLANE HIGH PLAINS*92*46120004
N1*ST*SWC 7-11 #57134*91*571315
N3*2712 E 8TH ST
N4*ODESSA*TX*79761
REF*ST*000134
ITD*05*3*****7*****NET 7
IT1**1*CA*20.09**CB*649251*PI*093*UP*099299711018*RA*NA
TXI*ZZ*1.53****2
CTP**RES*0***CSR*1
PID*F****7-11 T-SHIRT BAG 1/7 BBL
PO4*1000
IT1**1*EA*33.72**CB*834861*PI*093*UP*012253022401*RA*NA
TXI*ZZ*2.57****2
CTP**RES*0***CSR*1
PID*F****KIT CONCRETE CHAMP
PO4*1
IT1**1*EA*0.03**CB*192849*PI*093*UP*000000192842*RA*NA
CTP**RES*0***CSR*1
PID*F****SCS 711 BK 200
PO4*1
IT1**30*EA*2.59**CB*001511*PI*093*UP*025215102776*RA*NA
CTP**RES*0***CSR*1
PID*F****MAXELL T-160 PLUS VIDEO
PO4*1
TDS*18454
SAC*C*G740***5300*******06***SERVICE
CTT*4
SE*30*0001
# 2  
Old 03-18-2008
Code:
awk '/^ST/,/^SE/' file

# 3  
Old 03-18-2008
Thank you for your prompt response.

It did what I wanted. However, the three sections need to be written to different files.

So you have
ST
data
SE
This should be taken to file 1
ST
data
SE
This should be taken to file 2

etc.

Also, I noticed that the ST and SE lines are numbered.

ST*810*0004
Then
SE*(Number)*0004
Thank you

# 4  
Old 03-18-2008
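nawk '
/^ST/ { f = 1; n++ }                        # an ST line opens invoice number n
f     { invoice[n] = invoice[n] $0 "\n" }   # collect every line of the open invoice
/^SE/ { f = 0 }                             # the SE line was collected above; stop here
END {
    for (i = 1; i <= n; i++) {
        out = i ""                          # output files are named 1, 2, 3, ...
        printf "%s", invoice[i] > out
        close(out)                          # close inside the loop, one file at a time
    }
}' filename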
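
Building on the numbering noted in post #3: in the sample from post #1, the third field of the SE line repeats the third field of the ST line, and the second field of SE (30) equals the number of lines from ST through SE. A minimal sketch, assuming that layout holds for every invoice, names each output file after that number and prints a warning on a mismatch. The names sample.edi and invoice_NNNN.txt are made up for illustration, and the two short invoices are not real data.

```shell
# Hypothetical two-invoice sample; real input would be the EDI file itself.
cat > sample.edi <<'EOF'
ST*810*0001
BIG*20080315*1220680417
CTT*4
SE*4*0001
ST*810*0002
BIG*20080316*1220680418
SE*3*0002
EOF

awk -F'[*]' '
/^ST\*/ { id = $3; count = 0; out = "invoice_" id ".txt" }   # start a new invoice
out     { print > out; count++ }                              # copy lines while inside one
/^SE\*/ && out {
    # Assumed layout: SE*<line count>*<same number as the ST line>
    if ($3 != id)    print "control number mismatch in " out
    if ($2 != count) print "line count mismatch in " out
    close(out); out = ""
}
' sample.edi
```

Nothing is printed when every invoice is consistent; the invoices land in invoice_0001.txt and invoice_0002.txt. Lines outside an ST...SE pair are skipped because out is empty there.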
# 5  
Old 03-19-2008
Hi.

An alternate awk solution:
Code:
#!/usr/bin/env sh

# @(#) s1       Demonstrate extraction of range to separate files.

#  ____
# /
# |   Infrastructure BEGIN

echo
set -o nounset

debug=":"
debug="echo"

## The shebang using "env" line is designed for portability. For
#  higher security, use:
#
#  #!/bin/sh -

## Use local command version for the commands in this demonstration.

set +o nounset
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) awk my-nl
set -o nounset

# Use nawk or /usr/xpg4/bin/awk on Solaris.

echo

FILE=${1-data1}
echo " Input file $FILE:"
cat $FILE

# |   Infrastructure END
# \
#  ---

echo
echo " Results from processing:"
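awk '
BEGIN           { i = 0 }
# Anchor both patterns to the start of the line so that ST or SE
# appearing inside a data line does not start or end a range.
/^ST/           { if (name) close(name) ; i++ ; name = "file" i }
/^ST/,/^SE/     { print > name }
' "$FILE"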

my-nl file?

exit 0

Producing:
Code:
% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
GNU Awk 3.1.4
my-nl (local) 296

 Input file data1:
ST
first invoice
SE
ST
second invoice
SE
ST
third invoice
SE

 Results from processing:

==> file1 <==

  1 ST
  2 first invoice
  3 SE

==> file2 <==

  1 ST
  2 second invoice
  3 SE

==> file3 <==

  1 ST
  2 third invoice
  3 SE

Choose the base file name you wish in variable "name" ... cheers, drl
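
One caution when running this on the full invoice from post #1: a pattern without the `^` anchor, such as /ST/, also matches data lines like N1*ST*... and N3*2712 E 8TH ST, so the file counter would advance in the middle of an invoice. A minimal sketch comparing the two patterns, using a few lines copied from that invoice into a hypothetical file named check.edi:

```shell
# A handful of lines from the invoice in post #1.
cat > check.edi <<'EOF'
ST*810*0001
N1*ST*SWC 7-11 #57134*91*571315
N3*2712 E 8TH ST
REF*ST*000134
SAC*C*G740***5300*******06***SERVICE
SE*30*0001
EOF

# Count the lines each pattern would treat as the start of an invoice.
awk '/ST/  { loose++ }
     /^ST/ { strict++ }
     END   { print "unanchored:", loose, "anchored:", strict }' check.edi
# prints: unanchored: 4 anchored: 1
```

The same applies to the end pattern: /SE/ also matches the SERVICE on the SAC line, which would close the range one line early.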
# 6  
Old 03-19-2008
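#-- Use ST values as output filename.
awk -v out="/dev/null" '
# Build the name from a copy so the ST line itself keeps its asterisks.
/^ST/ { name = $0; gsub(/\*/, "-", name); out = name ".txt" }
{ print >> out }
# Close only after the SE line has been written.
/^SE/ { close(out); out = "/dev/null" }
' "$INFILE"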

The output files will be named
ST-810-0001.txt
and so on ...

-Ramesh
# 7  
Old 03-25-2008
I wanted to thank all of you for your responses.

One issue: I am running awk on Windows (the GnuWin32 port), so I do not think all of the functionality is there.
For instance, when I used Ramesh's example, I received numerous errors.
Here is the code:

c:\tools\gnuwin32\bin\awk -v '/^ST/ {gsub("\\*","-",$0); out=$0".txt"}
/^SE/ { close(out) }
{ printf "%s\n",$0 >> out }
' %input%edifile.dat

Here is the result:
awk: `/ST/' argument to `-v' not in `var=value' form

Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options: GNU long options:
-f progfile --file=progfile
-F fs --field-separator=fs
-v var=val --assign=var=val
-m[fr] val
-W compat --compat
-W copyleft --copyleft
-W copyright --copyright
-W dump-variables[=file] --dump-variables[=file]
-W exec=file --exec=file
-W gen-po --gen-po
-W help --help
-W lint[=fatal] --lint[=fatal]
-W lint-old --lint-old
-W non-decimal-data --non-decimal-data
-W profile[=file] --profile[=file]
-W posix --posix
-W re-interval --re-interval
-W source=program-text --source=program-text
-W traditional --traditional
-W usage --usage
-W use-lc-numeric --use-lc-numeric
-W version --version

To report bugs, see node `Bugs' in `gawk.info', which is
section `Reporting Problems and Bugs' in the printed version.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
gawk '{ sum += $1 }; END { print sum }' file
gawk -F: '{ print $1 }' /etc/passwd
'/SE/' is not recognized as an internal or external command,
operable program or batch file.
'{' is not recognized as an internal or external command,
operable program or batch file.
''' is not recognized as an internal or external command,
operable program or batch file.
C:\tools>edi
awk: `'/ST/' argument to `-v' not in `var=value' form

Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options: GNU long options:
-f progfile --file=progfile
-F fs --field-separator=fs
-v var=val --assign=var=val
-m[fr] val
-W compat --compat
-W copyleft --copyleft
-W copyright --copyright
-W dump-variables[=file] --dump-variables[=file]
-W exec=file --exec=file
-W gen-po --gen-po
-W help --help
-W lint[=fatal] --lint[=fatal]
-W lint-old --lint-old
-W non-decimal-data --non-decimal-data
-W profile[=file] --profile[=file]
-W posix --posix
-W re-interval --re-interval
-W source=program-text --source=program-text
-W traditional --traditional
-W usage --usage
-W use-lc-numeric --use-lc-numeric
-W version --version

To report bugs, see node `Bugs' in `gawk.info', which is
section `Reporting Problems and Bugs' in the printed version.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
gawk '{ sum += $1 }; END { print sum }' file
gawk -F: '{ print $1 }' /etc/passwd
'/SE/' is not recognized as an internal or external command,
operable program or batch file.
'{' is not recognized as an internal or external command,
operable program or batch file.
''' is not recognized as an internal or external command,
operable program or batch file.


Thank you again for your help.
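
A note on the errors above, offered as a likely explanation rather than a tested fix. The `out="/dev/null"` assignment from the original command is missing after `-v`, so awk took the program text as the argument to `-v`. In addition, cmd.exe does not treat single quotes as quoting, so each later line of the program was run as a separate command. One way to sidestep both is to keep the program in its own file and pass it with `-f progfile`, which the usage message lists. The sketch below writes a hypothetical split.awk and a small stand-in data file from a POSIX shell only so that it is self-contained; on Windows the same program text could be saved with an editor and run as `awk -f split.awk edifile.dat`. It also avoids /dev/null, which may not exist there.

```shell
# split.awk: hypothetical program file; writes one output file per invoice.
cat > split.awk <<'EOF'
/^ST/ { name = $0; gsub(/\*/, "-", name); out = name ".txt" }
out != "" { print > out }
/^SE/ && out != "" { close(out); out = "" }
EOF

# demo.dat: small stand-in for the real EDI file.
cat > demo.dat <<'EOF'
ST*810*0001
CTT*4
SE*3*0001
ST*810*0002
SE*2*0002
EOF

awk -f split.awk demo.dat
ls ST-810-*.txt
```

Because the program lives in a file, no shell quoting is involved at all. This assumes every ST line is unique; a repeated ST line would overwrite the earlier file of the same name.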