Here is a similar script that does a little more. It reads the input file, counts the <BASIC> tags it finds, and compares that count to the data in the <Basic-Record-Count> tag. It splits your input file into any number of output files (not just three) and sets the data in the <Basic-Record-Count> tag in each output file to match the number of <BASIC> tags in that file. It also reads the input file only once. The script is long, but it is mostly comments. See the definition of the Usage variable for a man page for this script:
Code:
#!/usr/bin/ksh
# Set variables:
IAm=${0##*/}
ERF="$IAm.stderr.$$"
INPUT_PATHNAME=${1:-CWS.xml}
MINIMUM_BASIC=${3:-11}
NUM_OUTPUT_FILES=${2:-3}
Usage="Usage: %s [input_file [#_of_output_files [minimum_to_split]]]
DESCRIPTION:
Split the <BASIC> tags in a given XML input file into a given number
of XML output files with each output file containing the same number
of <BASIC> tags (adjusted for rounding). Output files will be named
output<seq#>.xml where <seq#> is a 3 digit, leading zero filled
sequence number starting with 001 that is incremented for each
subsequent output file. The input file will not be split if less
than a given number of <BASIC> tags are found in the input file.
The data in the <Basic-Record-Count> tag in each output file's trailer
will be updated to correctly specify the number of <BASIC> tags
present in that output file.
OPERANDS:
input_file pathname of the input file (default ./CWS.xml)
#_of_output_files number of output files to create (default 3)
minimum_to_split minimum # of <BASIC> tags that must be in the
input file before it will be split (default 11)
EXIT STATUS:
0 All requested output files were successfully created
1 Command syntax invalid
2 The input file doesn't exist, is not a regular file, or is unreadable
3 The input file didn't contain enough <BASIC> tags
>3 I/O error reading the input file or creating or writing output files"
# Set up to remove the error log file on exit:
trap 'rm -rf "$ERF"' EXIT
# Verify command line arguments.
if [ $# -gt 3 ] || [ "${1#-}" != "$1" ]
then printf "$Usage\n" "$IAm" >&2
exit 1
fi
if [ ! -f "$INPUT_PATHNAME" ] || [ ! -r "$INPUT_PATHNAME" ]
then printf "%s: input (%s) is not a readable regular file\n" "$IAm" \
"$INPUT_PATHNAME" >&2
exit 2
fi
# Verifying that $MINIMUM_BASIC and $NUM_OUTPUT_FILES are positive numeric
# strings is left as an exercise for the reader... Any failures on these tests
# should use exit code 2.
# Use awk to identify the header segment, <BASIC>...</BASIC> segments, and
# trailer segment; to verify the input <Basic-Record-Count> tag data in the
# trailer; and to create and write the header, selected <BASIC>...</BASIC>
# segments, and the trailer segment for each output file with the output
# <Basic-Record-Count> tag data for each output file set correctly for the
# number of <BASIC>...</BASIC> segments present in that output file.
awk -v IAm="$IAm" -v MBT="$MINIMUM_BASIC" -v NOF="$NUM_OUTPUT_FILES" -v ERF="$ERF" '
/<BASIC>/ {
# Increment BASIC record count:
brc++
}
/<TRAILER>/ {
# Note that we have found the start of the trailer lines and save the
# input file name for diagnostic messages in the END section.
trailer++
ifn = FILENAME
}
brc == 0 {
# We have not seen the 1st BASIC tag yet; add line to Header:
Header = Header $0 "\n"
next
}
trailer == 0 {
# We have seen a BASIC tag and we have not seen the TRAILER tag; add
# current line to the current BASIC record:
Basic[brc] = Basic[brc] $0 "\n"
next
}
{ # To get here we must have seen the TRAILER tag; add current line to
# Trailer:
Trailer = Trailer $0 "\n"
}
END { # Verify that <Basic-Record-Count> matches # of <BASIC> tags found:
BRCstart = index(Trailer, "<Basic-Record-Count>") + 20
BRClength = index(Trailer, "</Basic-Record-Count>") - BRCstart
if((BRC = substr(Trailer, BRCstart, BRClength)) != brc)
printf("%s: WARNING: %s <Basic-Record-Count>(%d)%s(%d)\n",
IAm, ifn, BRC, "!=<BASIC> tag count", brc) > ERF
# Get 1st and last part of Trailer (before and after the data in the
# <Basic-Record-Count> field).
Trailer1 = substr(Trailer, 1, BRCstart - 1)
Trailer3 = substr(Trailer, BRCstart + BRClength)
# Calculate <BASIC> records per output file and number of output files
# that need an additional record.
nbpf = int(brc / NOF)
nrem = brc % NOF
# Verify that # of <BASIC> tags found is >= MBT:
if(brc < MBT) {
printf("%s: ERROR: <BASIC> tags found(%d) < minimum(%d)\n",
IAm, brc, MBT) > ERF
exit 3
}
# Create output files:
obrc = 1
for(i = 1; i <= NOF; i++) {
# Break out if we have more output files than BASIC records:
if(obrc > brc) break
# Create output filename.
ofn = sprintf("output%03d.xml", i)
# Copy file header to output file.
printf("%s", Header) > ofn
# Note that the standards do not specify any way to detect
# errors when creating and writing output files. (In these
# cases, awk will print a diagnostic and exit with an
# unspecified value. The awk system() function could be used
# to use touch to create output files and test to verify that a
# newly created file is writable and set the awk exit code to
# 4 or more (as specified in the EXIT STATUS section of the
# Usage message for this utility), and close() could be checked
# for failure, but all of this is left as an exercise for the
# reader.)
# Copy BASIC records to output file.
for(j = 1; j <= nbpf + (i <= nrem); j++)
printf("%s", Basic[obrc++]) > ofn
# Copy file trailer (with updated <Basic-Record-Count> tag
# data) to output file.
printf("%s%d%s", Trailer1, nbpf + (i <= nrem), Trailer3) > ofn
close(ofn)
printf("%s: STATUS: %s created with %d BASIC record(s) from %s\n",
IAm, ofn, nbpf + (i <= nrem), ifn)
}
}' "$INPUT_PATHNAME"
exit_code=$?
# Print diagnostics and warnings if there are any:
[ -s "$ERF" ] && cat "$ERF" >&2
exit $exit_code
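The operand checks that the script leaves as an exercise could be sketched roughly as follows. This is only one possible approach, not part of the original script: the helper name is_positive_int is made up for illustration, and the exit code 2 simply follows the comment in the script. It would go right after the input file test and before the awk command; the defaults are repeated here only so the fragment runs on its own.

```shell
# Hypothetical helper (not in the original script): succeed only if $1 is
# a non-empty string of decimal digits that contains a non-zero digit.
is_positive_int() {
	case "$1" in
	(''|*[!0-9]*)	return 1;;	# empty, or contains a non-digit
	esac
	case "$1" in
	(*[1-9]*)	return 0;;	# all digits and at least one is not 0
	esac
	return 1			# all zeros
}

# Check both numeric operands; the defaults match the ones in the script.
for v in "${MINIMUM_BASIC:-11}" "${NUM_OUTPUT_FILES:-3}"
do	if ! is_positive_int "$v"
	then	printf '%s: operand (%s) is not a positive integer\n' \
		    "${IAm:-${0##*/}}" "$v" >&2
		exit 2
	fi
done
```

Using only case patterns avoids any question about how a particular shell's test or arithmetic treats strings with leading zeros such as 011.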
If you want to try this on a Solaris/SunOS system use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of the default /usr/bin/awk.
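To see how the records are spread across the output files, the nbpf and nrem arithmetic from the END section can be tried on its own. The function name split_counts below is made up for this illustration; brc, NOF, nbpf, and nrem are the names used in the script.

```shell
# Print the number of <BASIC> records each output file would receive.
# $1 = total number of <BASIC> records, $2 = number of output files.
split_counts() {
	awk -v brc="$1" -v NOF="$2" 'BEGIN {
		nbpf = int(brc / NOF)	# base number of records per file
		nrem = brc % NOF	# first nrem files get one extra record
		for(i = 1; i <= NOF; i++)
			printf("output%03d.xml %d\n", i, nbpf + (i <= nrem))
	}'
}

split_counts 11 3
# output001.xml 4
# output002.xml 4
# output003.xml 3
```

Unlike the full script, this sketch prints a line with a count of 0 when there are more output files than records; the script breaks out of its loop instead and never creates those files.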
This User Gave Thanks to Don Cragun For This Post: