Splitting up a large file

Dear All,

I have a very large file which I would like to split into individual frames every time a line ends with "ENDMDL", naming the pieces frame1.pdb, frame2.pdb, etc.
Can anyone give me a few suggestions? Ideally I would like to have ENDMDL at the end of each frame, or not present at all.

An example of the file is below:
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
atom 1 thx 8
atom 1 thx 8
atom 1 thx 8
atom 1 thx 7
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6


You can do something like this:
awk '
BEGIN    { frame = 1 }                        # number the frames from 1
         { print > ("frame" frame ".pdb") }   # write every line, ENDMDL included
/ENDMDL/ { frame++ }                          # the next line starts a new frame
' inputfile

Another one, just in case; this one drops the ENDMDL lines:
awk '/ENDMDL/{i++;next}{name="frame"i".pdb";print > name}' i=1 file
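
Since the input is described as very large, it may be worth closing each frame file as soon as its ENDMDL line has been written; some awk implementations limit how many files can be open at once. A minimal sketch, assuming a made-up three-frame sample (sample.pdb and its contents are illustrative only, not the poster's data):

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch real data.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input: three small frames, each terminated by ENDMDL.
cat > sample.pdb <<'EOF'
atom 1 thx 5
atom 2 thx 5
ENDMDL
atom 1 thx 8
ENDMDL
atom 1 thx 6
atom 2 thx 6
ENDMDL
EOF

awk '
BEGIN    { frame = 1 }
         { out = "frame" frame ".pdb"; print > out }   # ENDMDL stays in its frame
/ENDMDL/ { close(out); frame++ }                       # release the file, move on
' sample.pdb

ls frame*.pdb
```

Because a new file name is only used once another line arrives, no empty trailing file is created when the input ends with ENDMDL.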

Standard utility csplit is designed for this kind of task:
#!/usr/bin/env bash

# @(#) s1	Demonstrate csplit, "context split".

set +o nounset
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) csplit
set -o nounset

# Input file: first argument, defaulting to data1.
FILE=${1-data1}

# Remove debris from previous run.
rm -f frame*

echo " Data file $FILE:"
cat "$FILE"

echo " Results:"
csplit --silent -z -k --prefix=frame --suffix-format="%d.pdb" "$FILE" /ENDMDL/+1 '{*}'

echo " Files created:"
ls frame*

echo " Sample: frame2.pdb:"
cat frame2.pdb

exit 0

% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
csplit (GNU coreutils) 6.10

 Data file data1:
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
atom 1 thx 8
atom 1 thx 8
atom 1 thx 8
atom 1 thx 7
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6


 Files created:
frame0.pdb  frame1.pdb	frame2.pdb

 Sample: frame2.pdb:
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6

If you are not using Linux (e.g. you may be using Solaris), some of these options may not be available ... cheers, drl
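
One way to check a csplit run like the one above is to concatenate the pieces and compare them with the original. This is only a sketch: it assumes GNU csplit, uses a made-up sample.pdb, and zero-pads the suffix (%03d instead of %d) so that the shell glob lists the frames in numeric order.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch real data.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input: three small frames, each terminated by ENDMDL.
cat > sample.pdb <<'EOF'
atom 1 thx 5
atom 2 thx 5
ENDMDL
atom 1 thx 8
ENDMDL
atom 1 thx 6
atom 2 thx 6
ENDMDL
EOF

# Same GNU options as in the script above, except for the zero-padded suffix.
csplit --silent -z -k --prefix=frame --suffix-format="%03d.pdb" \
    sample.pdb '/ENDMDL/+1' '{*}'

# The frames, joined in order, should be byte-identical to the input.
if cat frame*.pdb | cmp -s - sample.pdb; then
    echo "round trip OK"
else
    echo "round trip FAILED"
fi
```

Note that csplit numbers its pieces from 0, so the first frame here is frame000.pdb rather than frame1.pdb.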
Use the csplit command to do this (context-based splitting of a file):

man csplit

