Sed: Splitting a large file into smaller files based on recursive regular expression match
I will simplify the explanation a bit: I need to parse through an 87m file.
I have a single text file in the form of:
I want to extract <NAME>, </script>, and all lines between the two and place them into respective files,
ending up with:
file1.txt
file2.txt
file3.txt
I have searched sed one-liners, used the search feature here, and looked in my O'Reilly sed/awk pocket guide, but nothing really provides a solution.
Thanks in advance. Sorry for the re-edit!
Last edited by Scrutinizer; 03-29-2013 at 06:07 PM..
Reason: code tags
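Since the sample input did not survive in the post above, what follows is only a rough sketch of one way this could be done. It uses awk rather than sed, because awk can write to a file name computed at run time. It assumes every block starts on a line containing the literal text <NAME> and ends on a line containing </script>. The function name split_blocks is hypothetical, and the output names file1.txt, file2.txt, ... simply follow the desired result listed in the question.

```shell
# Hypothetical helper: split_blocks INPUT
# Assumption: a block starts on a line containing <NAME> and ends on a
# line containing </script>. Lines outside any block are discarded.
split_blocks() {
    awk '
        # Start of a block: pick the next output file name.
        /<NAME>/ {
            if (inside) close(out)
            n++
            out = "file" n ".txt"
            inside = 1
        }
        # Copy every line of the current block, markers included.
        inside { print > out }
        # End of a block: close the file so open handles do not pile up.
        /<\/script>/ {
            if (inside) close(out)
            inside = 0
        }
    ' "$1"
}
```

It would be called as `split_blocks bigfile.txt`, where bigfile.txt stands in for the real input. awk reads the input one line at a time, so the size of the file should not be a memory concern, and closing each output file as soon as its block ends avoids running into the limit on open files when there are many blocks.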
Hi all,
I'm new to this forum, so excuse me if I get anything wrong.
I have a file containing 600 MB of data. When I parse the data in a Perl program, I get an out-of-memory error,
so I am planning to split the file into smaller files and process them one by one.
Can anyone tell me what the code is... (1 Reply)
Hi Everyone,
I am using a CentOS 5.2 server as an sflow log collector on my network. Currently I am using InMon's free sflowtool to collect the packets sent by my switches. I have a bash script running in an infinite loop to stop and start the log collection at set intervals - currently one... (2 Replies)
I have a file with a simple list of ids, 750,000 rows. I have to break it down into multiple 50,000-row files to submit in a batch process. Is there an easy script I could write to accomplish this task? (2 Replies)
I need to write a shell script for the scenario below.
My input file has data in format:
qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43
qwerty0101CFG 12345... (19 Replies)
Dear all,
I have a specific problem that I don't quite understand how to solve. I have two files, both of the same format:
XXXXXX_FIND1 bla bla bla
bla
bla
bla
bla
bla
bla
bla
bla
bla
========
(return)
XXXXXX_FIND2 bla bla bla
bla
bla
bla (10 Replies)
Hi Experts,
I have to split a huge file based on a pattern to create smaller files. The pattern expected in the file is:
Master.....
First...
second....
second...
third..
third...
Master...
First..
second...
third...
Master...
First...
second..
second..
second..... (2 Replies)
Hi,
I'm trying to split a large file into several smaller files.
The script will have two input arguments: argument1=filename and argument2=number of files to split into.
In my large input file I have a header followed by 100009 records
The first line is a header; I want this header in all my... (9 Replies)
I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However the newer database comes as one large file, containing over 10,000 records. I therefore need to split this file. It looks like this:
HMMER3/b
NAME 1-cysPrx_C
ACC ... (2 Replies)
Hi Everybody!
I need some help with a regular expression in Perl that will match files named messages, but also files named message.1, message.2 and so on. So really I need one that will find messages and messages that might be followed by a period and a digit without matching other files like... (2 Replies)
Help needed urgently please.
I have a large file - a few hundred thousand lines.
Sample
CP START ACCOUNT
1234556
name 1
CP END ACCOUNT
CP START ACCOUNT
2224444
name 1
CP END ACCOUNT
CP START ACCOUNT
333344444
name 1
CP END ACCOUNT
I need to split this file each time "CP START... (7 Replies)
Discussion started by: frustrated1