Split a large file with patterns and size


 
# 1  
Old 06-23-2008
Split a large file with patterns and size

Hi,
I have a large file with a repeating pattern in it. I want to split it into files that each hold whole blocks of that pattern, with a specified number of lines in each file.
For example, the file looks like this:

1...
2...
2...
3...
1...
2...
3...
1...
2...
2...
2...
2...
2...
3...

where a line starting with 1 marks the start of a block and a line starting with 3 marks its end.
I want the file split on block boundaries. Each output file can hold more than one block, but no more than the specified number of lines.

So far I have split the file using:
Code:
awk '/^1/{close("file"f);f++}{print $0 > "file"f}'  testfile

But this gives one file per block. I would then need to concatenate those files to get files with a specified number of lines, say 10000.

Please suggest an efficient way to do this. I think it can be done within the awk command alone.
Thanks,
CSS

Last edited by Yogesh Sawant; 06-24-2008 at 04:36 AM.. Reason: added code tags
# 2  
Old 06-24-2008
Code:
awk 'NR%10000==1{close("file"f);f++}{print $0 > "file"f}'  testfile

Regards
# 3  
Old 06-24-2008
That writes 10000 lines per file, but it does not split on the pattern.
# 4  
Old 06-25-2008
Maybe this will help:

Code:
awk  '/^1/{f=1} f{ print $0 > ("file_" n) ; c++} c==x+1{ close("file_" n); n++; c=1 } /^3/{f=0}'  x=10000 c=1 n=1  filename

Change x to the number of lines you need per file. c and n are just counters and don't need to be changed.
# 5  
Old 06-25-2008
Quote:
Originally Posted by rubin
Maybe this will help:

Code:
awk  '/^1/{f=1} f{ print $0 > ("file_" n) ; c++} c==x+1{ close("file_" n); n++; c=1 } /^3/{f=0}'  x=10000 c=1 n=1  filename

Change x to the number of lines you need per file. c and n are just counters and don't need to be changed.
Even this gives files of 10000 lines each, but I want to split on the pattern.
# 6  
Old 06-25-2008
Success at Last

Quote:
Originally Posted by sudhamacs
Even this gives files of 10000 lines each, but I want to split on the pattern.
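Thanks Rubin,
I had to make some minor changes, and this worked. It starts a new file only once the current file holds at least 10000 lines and the block-ending line (3) has been reached, so no block is split across files.

Code:
awk '/^1/{f=1} f{ print $0 > ("file_" n) ; c++} c>10000 && /^3/ { close("file_" n); n++; c=1 }' c=1 n=1 testfile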
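
The command above lets each file grow past 10000 lines until the current block ends. The original request asked for "specified no. of lines or less", so here is a minimal sketch of one way to get that instead, by buffering each block and deciding where it goes before writing it. The function name split_blocks, its arguments, and the default file_ prefix are hypothetical names chosen for illustration. It assumes, as in the sample data, that a block starts at a line beginning with 1 and ends at a line beginning with 3.

```shell
# split_blocks FILE MAX [PREFIX]   (hypothetical helper, not from the thread)
# Writes whole blocks to PREFIX1, PREFIX2, ... keeping each file at MAX
# lines or fewer. A single block longer than MAX cannot be split without
# breaking it, so it is written to a file of its own and exceeds MAX.
split_blocks() {
    awk -v max="$2" -v prefix="${3:-file_}" '
        # Write the buffered block, moving to the next file first
        # if the block would push the current file past max lines.
        function flush_block(   i) {
            if (len == 0) return
            if (count > 0 && count + len > max) {
                close(prefix n)
                n++
                count = 0
            }
            for (i = 1; i <= len; i++) print buf[i] > (prefix n)
            count += len
            len = 0
        }
        BEGIN   { n = 1 }
        /^1/    { flush_block(); inblock = 1 }  # new block: emit any unfinished one
        inblock { buf[++len] = $0 }             # collect the lines of the current block
        /^3/    { flush_block(); inblock = 0 }  # end of block: emit it
        END     { flush_block() }               # emit a trailing block with no end line
    ' "$1"
}

# Example call (assumes a file named testfile in the current directory):
#   split_blocks testfile 10000
```

To check the result, `wc -l file_*` shows the line count of each output file, and the first line of every file should begin with 1.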