Splitting up a large file


 
# 1  
Old 10-08-2009

Dear All,

I have a very large file that I would like to split into individual frames every time a line ends with "ENDMDL", and then name the pieces frame1.pdb, frame2.pdb, etc.
Can anyone give me a few suggestions? Ideally I would like to have ENDMDL at the end of each frame, or not present at all.

An example of the file is below:
Code:
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
ENDMDL
atom 1 thx 8
atom 1 thx 8
atom 1 thx 8
atom 1 thx 7
ENDMDL
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
ENDMDL


thanks


Mish
# 2  
Old 10-08-2009
You can do something like this:
Code:
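awk '
BEGIN    { frame = 1; out = "frame1.pdb" }
1        { print > out }    # write every line, ENDMDL included
/ENDMDL/ { close(out); frame++; out = "frame" frame ".pdb" }    # finish this frame, name the next
' inputfile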

Jean-Pierre.
# 3  
Old 10-08-2009
Another one, just in case :)
Code:
awk '/ENDMDL/{close(name);i++;next}{name="frame"i".pdb";print > name}' i=1 file
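
The two awk answers above differ in one way that matters for the original question: the first keeps the ENDMDL line at the end of every frame, while the second skips it so that no frame contains it. Here is a rough sketch that runs both styles side by side on a shortened copy of the sample data. The scratch directory, the file name sample.txt and the keep/drop output prefixes are made up for the demonstration, and the close() calls are my own assumption that a very large input could otherwise hit the open-file limit of some awk versions.

```shell
# Hypothetical scratch directory and sample file, used only for this demonstration.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

printf '%s\n' \
  'atom 1 thx 5' 'atom 1 thx 5' 'ENDMDL' \
  'atom 1 thx 8' 'atom 1 thx 7' 'ENDMDL' \
  'atom 1 thx 6' 'atom 1 thx 6' 'ENDMDL' > sample.txt

# Variant A: every frame keeps its trailing ENDMDL line.
awk '
BEGIN    { n = 1; out = "keep1.pdb" }
         { print > out }
/ENDMDL/ { close(out); n++; out = "keep" n ".pdb" }
' sample.txt

# Variant B: ENDMDL lines are skipped, so no frame contains one.
awk '
/ENDMDL/ { close(out); n++; next }
         { out = "drop" n ".pdb"; print > out }
' n=1 sample.txt

# Sanity check: there should be one output file per ENDMDL line in the input.
markers=$(grep -c 'ENDMDL' sample.txt)
kept=$(ls keep*.pdb | wc -l | tr -d ' ')
dropped=$(ls drop*.pdb | wc -l | tr -d ' ')
echo "markers=$markers keep_files=$kept drop_files=$dropped"
```

Comparing the ENDMDL count with the number of files is a cheap way to confirm that nothing was lost when the real input is too big to inspect by eye.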

# 4  
Old 10-08-2009
Hi.

Standard utility csplit is designed for this kind of task:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate csplit, "context split".

echo
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) csplit
set -o nounset
echo

FILE=${1-data1}

# Remove debris from previous run.
rm -f frame*

echo " Data file $FILE:"
cat $FILE

echo
echo " Results:"
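# Split after each ENDMDL line; -z drops empty pieces, -k keeps the output files if an error occurs.
csplit --silent -z -k --prefix=frame --suffix-format="%d.pdb" "$FILE" /ENDMDL/+1 '{*}'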

echo
echo " Files created:"
ls frame*

echo
echo " Sample: frame2.pdb:"
cat frame2.pdb

exit 0

Producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
csplit (GNU coreutils) 6.10

 Data file data1:
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
atom 1 thx 5
ENDMDL
atom 1 thx 8
atom 1 thx 8
atom 1 thx 8
atom 1 thx 7
ENDMDL
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
ENDMDL

 Results:

 Files created:
frame0.pdb  frame1.pdb	frame2.pdb

 Sample: frame2.pdb:
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
atom 1 thx 6
ENDMDL

If you are not using Linux (e.g. you may be using Solaris), some of these options may not be available ... cheers, drl
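
Note that csplit numbers its pieces from 0 (frame0.pdb, frame1.pdb, frame2.pdb in the run above), whereas the question asked for frame1.pdb, frame2.pdb and so on. A minimal sketch of one way to shift the numbering afterwards, assuming GNU csplit and assuming the frame files are numbered contiguously from 0. The scratch directory and the shortened data1 file are made up for the demonstration.

```shell
# Hypothetical scratch directory and sample file, used only for this demonstration.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
printf '%s\n' \
  'atom 1 thx 5' 'ENDMDL' \
  'atom 1 thx 8' 'ENDMDL' \
  'atom 1 thx 6' 'ENDMDL' > data1

# Same GNU csplit invocation as in the script above: frame0.pdb, frame1.pdb, ...
csplit --silent -z -k --prefix=frame --suffix-format="%d.pdb" data1 /ENDMDL/+1 '{*}'

# Shift the numbering up by one, highest index first so no file is overwritten.
count=$(ls frame*.pdb | wc -l | tr -d ' ')
i=$count
while [ "$i" -gt 0 ]; do
  mv "frame$((i - 1)).pdb" "frame$i.pdb"
  i=$((i - 1))
done

ls frame*.pdb
```

Renaming from the highest index downwards matters: going upwards would move frame0.pdb onto frame1.pdb before frame1.pdb had been moved out of the way.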
# 5  
Old 10-08-2009
Use the csplit command to do this (context-based splitting of a file):

man csplit

Last edited by jambesh; 10-08-2009 at 01:39 PM.. Reason: add