#1
extract multiple sections of file
I have a file from which I need to extract multiple sections.

The file contains multiple lines that start with ST
(a bunch of data)
Then the file contains multiple lines that start with SE
(a bunch of data)
SE*30*0001
ST*810*0002

I need all of the lines between and including these. They are invoices: each invoice starts with the ST line and ends with the SE line. I need to break out all of the invoices into separate files. Can someone please help me? I know grep, sed, or awk can do this, but I am not sure how.

Thank you

Here is an example:
ST*810*0001
BIG*20080315*1220680417**SUPPLY***DI
N1*SF*MCLANE HIGH PLAINS*92*46120004
N1*ST*SWC 7-11 #57134*91*571315
N3*2712 E 8TH ST
N4*ODESSA*TX*79761
REF*ST*000134
ITD*05*3*****7*****NET 7
IT1**1*CA*20.09**CB*649251*PI*093*UP*099299711018*RA*NA
TXI*ZZ*1.53****2
CTP**RES*0***CSR*1
PID*F****7-11 T-SHIRT BAG 1/7 BBL
PO4*1000
IT1**1*EA*33.72**CB*834861*PI*093*UP*012253022401*RA*NA
TXI*ZZ*2.57****2
CTP**RES*0***CSR*1
PID*F****KIT CONCRETE CHAMP
PO4*1
IT1**1*EA*0.03**CB*192849*PI*093*UP*000000192842*RA*NA
CTP**RES*0***CSR*1
PID*F****SCS 711 BK 200
PO4*1
IT1**30*EA*2.59**CB*001511*PI*093*UP*025215102776*RA*NA
CTP**RES*0***CSR*1
PID*F****MAXELL T-160 PLUS VIDEO
PO4*1
TDS*18454
SAC*C*G740***5300*******06***SERVICE
CTT*4
SE*30*0001
#2
Code:
awk '/^ST/,/^SE/' file
#3
Thank you for your prompt response.
It did what I wanted. However, the three sections need to be written to different files. So you have:
ST
data
SE
This should be taken to file 1
ST
data
SE
This should be taken to file 2
etc.

Also, I noticed that the ST and SE lines are numbered:
ST*810*0004
Then
SE*(Number)*0004

Thank you
Last edited by rgentis; 03-18-2008 at 05:07 PM. Reason: Added something
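Since the ST and SE lines carry the same number in their third field, one option is to name each output file after that number. What follows is only a rough sketch under assumptions: the directory `invoice_demo`, the `invoice_` file prefix, and the sample data are made up for illustration, and the fields are assumed to be separated by `*` as in the example above.

```shell
#!/bin/sh
# Hypothetical demo directory and sample data (not taken from the thread).
demo=invoice_demo
mkdir -p "$demo"
printf '%s\n' \
  'ST*810*0001' 'BIG*20080315*1' 'N1*ST*STORE A' 'SE*4*0001' \
  'ST*810*0002' 'BIG*20080316*2' 'SE*3*0002' > "$demo/edifile.dat"

awk -F'[*]' -v dir="$demo" '
# An ST segment opens an invoice; field 3 is its control number.
$1 == "ST" { ctl = $3; out = dir "/invoice_" ctl ".txt"; inside = 1 }
# Copy every line of the open invoice, including the ST and SE lines.
inside { print > out }
# An SE segment closes the invoice; warn if its control number differs.
$1 == "SE" && inside {
    if ($3 != ctl)
        print "control number mismatch: ST " ctl ", SE " $3 > "/dev/stderr"
    close(out)
    inside = 0
}
' "$demo/edifile.dat"
```

Comparing the first field with `$1 == "ST"` rather than matching `ST` anywhere on the line keeps segments such as `N1*ST*...` from being mistaken for the start of an invoice. Closing each file at its SE line also keeps the number of open files small when there are many invoices.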
#4
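nawk 'BEGIN { n = 1 }
# An ST line opens an invoice; start collecting.
/^ST/ { f = 1 }
# Collect every line of the open invoice, without a leading blank line.
f { invoice[n] = (invoice[n] == "" ? $0 : invoice[n] "\n" $0) }
# An SE line closes the invoice; move on to the next one.
/^SE/ { f = 0; n++ }
END {
    # Write invoice number i to a file named i (1, 2, 3, ...).
    for (i in invoice) {
        print invoice[i] >> i
        close(i)
    }
}' filename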
#5
Hi.
An alternate awk solution: Code:
#!/usr/bin/env sh
# @(#) s1 Demonstrate extraction of range to separate files.
# ____
# /
# | Infrastructure BEGIN
echo
set -o nounset
debug=":"
debug="echo"
## The shebang using "env" line is designed for portability. For
# higher security, use:
#
# #!/bin/sh -
## Use local command version for the commands in this demonstration.
set +o nounset
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) awk my-nl
set -o nounset
# Use nawk or /usr/xpg4/bin/awk on Solaris.
echo
FILE=${1-data1}
echo " Input file $FILE:"
cat $FILE
# | Infrastructure END
# \
# ---
echo
echo " Results from processing:"
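awk '
BEGIN { i = 0 }
# Anchor the patterns: real EDI data has lines such as N1*ST*... that contain "ST".
/^ST/ { i++ ; name = "file" i }
/^ST/,/^SE/ { print > name }
# Close each finished file so long inputs do not run out of open files.
/^SE/ { close(name) }
' "$FILE"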
my-nl file?
exit 0
Code:
% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
GNU Awk 3.1.4
my-nl (local) 296

 Input file data1:
ST
first invoice
SE
ST
second invoice
SE
ST
third invoice
SE

 Results from processing:

==> file1 <==
1 ST
2 first invoice
3 SE

==> file2 <==
1 ST
2 second invoice
3 SE

==> file3 <==
1 ST
2 third invoice
3 SE
#6
extract multiple sections of file
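#-- Use ST values as output filename.
awk -v out="/dev/null" '
# Build the file name from a copy so the ST line itself is written unchanged.
/^ST/ { name = $0; gsub(/\*/, "-", name); out = name ".txt" }
{ printf "%s\n", $0 >> out }
# Close the file only after the SE line has been written.
/^SE/ { close(out) }
' "$INFILE"

Output will be ST-810-0001.txt and so on ...
-Ramesh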
#7
I wanted to thank all of you for your responses.
One issue: I am porting the awk utility to Windows, so I do not think all of the functionality is there. For instance, when I used Ramesh's example, I received numerous errors.

Here is the code:
c:\tools\gnuwin32\bin\awk -v '/^ST/ {gsub("\\*","-",$0); out=$0".txt"}
/^SE/ { close(out) }
{ printf "%s\n",$0 >> out }
' %input%edifile.dat

Here is the result:
awk: `/ST/' argument to `-v' not in `var=value' form
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:              GNU long options:
    -f progfile                 --file=progfile
    -F fs                       --field-separator=fs
    -v var=val                  --assign=var=val
    -m[fr] val
    -W compat                   --compat
    -W copyleft                 --copyleft
    -W copyright                --copyright
    -W dump-variables[=file]    --dump-variables[=file]
    -W exec=file                --exec=file
    -W gen-po                   --gen-po
    -W help                     --help
    -W lint[=fatal]             --lint[=fatal]
    -W lint-old                 --lint-old
    -W non-decimal-data         --non-decimal-data
    -W profile[=file]           --profile[=file]
    -W posix                    --posix
    -W re-interval              --re-interval
    -W source=program-text      --source=program-text
    -W traditional              --traditional
    -W usage                    --usage
    -W use-lc-numeric           --use-lc-numeric
    -W version                  --version

To report bugs, see node `Bugs' in `gawk.info', which is
section `Reporting Problems and Bugs' in the printed version.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
    gawk '{ sum += $1 }; END { print sum }' file
    gawk -F: '{ print $1 }' /etc/passwd
'/SE/' is not recognized as an internal or external command,
operable program or batch file.
'{' is not recognized as an internal or external command,
operable program or batch file.
''' is not recognized as an internal or external command,
operable program or batch file.

C:\tools>edi
awk: `'/ST/' argument to `-v' not in `var=value' form
(followed by the same usage text and "is not recognized" errors as above)

Thank you again for your help.
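Two things in the output above look like quoting problems rather than missing awk features. The `-v` option is followed directly by the program instead of `out="/dev/null"`, hence the "not in `var=value' form" message. The Windows command interpreter also does not treat single quotes as quoting, so the later lines of the program are run as separate commands. One way around both is to keep the program in its own file and pass it with `-f`, which the usage text lists. A minimal sketch, assuming a program file called `split.awk` and a demo directory `awkfile_demo` (both names, the `dir` variable, and the sample data are made up here):

```shell
#!/bin/sh
# Hypothetical demo directory, program file name, and sample data.
demo=awkfile_demo
mkdir -p "$demo"

# The awk program lives in a file, so no command-line quoting is involved.
cat > "$demo/split.awk" <<'EOF'
# Name each output file after its ST line, with "*" replaced by "-".
/^ST/ { name = $0; gsub(/\*/, "-", name); out = dir "/" name ".txt" }
# Write lines only while an invoice is open.
out != "" { print > out }
# Close the file once the SE line has been written.
/^SE/ && out != "" { close(out); out = "" }
EOF

printf '%s\n' 'ST*810*0001' 'BIG*1' 'SE*3*0001' \
  'ST*810*0002' 'SE*2*0002' > "$demo/edifile.dat"

awk -v dir="$demo" -f "$demo/split.awk" "$demo/edifile.dat"
```

On Windows the call would presumably take the same shape, something like `awk -v dir=. -f split.awk edifile.dat`, with no quotes needed on the command line. I have not tried it with the GnuWin32 build.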