Help needed - Split large file into smaller files based on pattern match

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

splitting the large file into smaller files

Hi all, I'm new to this forum, so excuse me if I get anything wrong. I have a file containing 600 MB of data. When I parse the data in a Perl program, I get an out-of-memory error, so I am planning to split the file into smaller files and process them one by one. Can anyone tell me what the code is...

2. Shell Programming and Scripting

Split a file into multiple files based on the input pattern

I have a file with lines something like. ...... 123_start ...... ....... 123_end .... ..... 456_start ...... ..... 456_end .... ..... 789_start .... .... 789_end

3. Shell Programming and Scripting

Splitting large file into multiple files in unix based on pattern

I need to write a shell script for the scenario below. My input file has data in this format: qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26 qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43 qwerty0101CFG 12345...

4. Shell Programming and Scripting

Split large file into smaller file

Hi guys, I need some help here. I have a file with more than 800,000 lines in it. I need to split this file into smaller files with 25,000 lines each. Please help. Thanks.

5. Shell Programming and Scripting

split XML file into multiple files based on pattern

Hello, I am using awk to split a file into multiple files using command: nawk '{ if ( $1 == "<process" ) { n=split($2, arr, "\""); file=arr } print > file }' processes.xml <process name="Process1.process"> ...

6. Shell Programming and Scripting

Sed: Splitting A large File into smaller files based on recursive Regular Expression match

I will simplify the explanation a bit. I need to parse through an 87m file. I have a single text file in the form of: <NAME>house........ SOMETEXT SOMETEXT SOMETEXT . . . . </script> MORETEXT MORETEXT . . .

7. UNIX for Dummies Questions & Answers

Split a huge 7 GB File Based on Pattern into 4 files

Hi, I have a huge 7 GB file with around 1 million records. I want to split this file into 4 files containing around 250k messages each. Please help, as the split command cannot work here because it might miss tags. The format of the file is as below ...

8. Shell Programming and Scripting

Split Large Files Based On Row Pattern..

Hi all. I've tried searching the web but could not find a problem similar to mine. I have one large file to be split into several files based on the matching pattern found in each row. For example, let's say the file content: ...

9. UNIX for Dummies Questions & Answers

Split large file to smaller fastly

Hi, I have a requirement. Input file: 1 1111111111111 108 1 1111111111111 109 1 1111111111111 109 1 1111111111111 110 1 1111111111111 111 1 1111111111111 111 1 1111111111111 111 1 1111111111111 112 1 1111111111111 112 1 1111111111111 112 The output should be,

10. UNIX for Beginners Questions & Answers

Split large file into smaller files without disturbing the entry chunks

Dears, I need your help with the file manipulation below. I want to split the file into 8 smaller files without cutting or disturbing the entries (meaning every small file should start with an entry and end with an empty line). It would be helpful if you could provide a one-liner command for this...

LEARN ABOUT SUNOS

egrep

egrep(1)																  egrep(1)

NAME

       egrep - search a file for a pattern using full regular expressions

SYNOPSIS

       /usr/bin/egrep [-bchilnsv] [-e pattern_list] [-f file] [strings] [file...]

       /usr/xpg4/bin/egrep [-bchilnsvx] [-e pattern_list] [-f file] [strings] [file...]

DESCRIPTION

       The egrep (expression grep) utility searches files for a pattern of characters and prints all lines that contain that pattern. egrep uses
       full regular expressions (expressions that have string values that use the full set of alphanumeric and special characters) to match the
       patterns. It uses a fast deterministic algorithm that sometimes needs exponential space.

       If no files are specified, egrep assumes standard input. Normally, each line found is copied to the standard output. The file name is
       printed before each line found if there is more than one input file.

   /usr/bin/egrep
       The /usr/bin/egrep utility accepts full regular expressions as described on the regexp(5) manual page, except for \( and \), \{ and \},
       \< and \>, and \n, and with the addition of:

       1.  A full regular expression followed by + that matches one or more occurrences of the full regular expression.

       2.  A full regular expression followed by ? that matches 0 or 1 occurrences of the full regular expression.

       3.  Full regular expressions separated by | or by a NEWLINE that match strings that are matched by any of the expressions.

       4.  A full regular expression that can be enclosed in parentheses () for grouping.

       Be careful using the characters $, *, [, ^, |, (, ), and \ in a full regular expression, because they are also meaningful to the shell.
       It is safest to enclose the entire full regular expression in single quotes '...'.

       The order of precedence of operators is [], then *?+, then concatenation, then | and NEWLINE.
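
The operators and precedence rules above can be tried out with a short sketch. It assumes a POSIX shell and uses grep -E, which this page names as the portable equivalent of egrep; the sample words and variable names are made up for illustration.

```shell
# Sample input (hypothetical words, one per line).
tmp=$(mktemp)
printf '%s\n' 'color' 'colour' 'colouur' 'gray' 'grey' 'ab' 'abab' > "$tmp"

# ? : zero or one occurrence of the preceding expression
optional=$(grep -E '^colou?r$' "$tmp")

# + : one or more occurrences of the preceding expression
repeated=$(grep -E '^colou+r$' "$tmp")

# | : alternation, with () limiting its scope to the two letters
alternation=$(grep -E '^gr(a|e)y$' "$tmp")

# + binds tighter than concatenation, so ab+ means a(b+);
# parentheses are needed to repeat the whole string "ab".
ungrouped=$(grep -E '^ab+$' "$tmp")
grouped=$(grep -E '^(ab)+$' "$tmp")

printf 'optional:\n%s\n' "$optional"
printf 'repeated:\n%s\n' "$repeated"
printf 'alternation:\n%s\n' "$alternation"
printf 'ungrouped:\n%s\n' "$ungrouped"
printf 'grouped:\n%s\n' "$grouped"

rm -f "$tmp"
```

Every pattern is wrapped in single quotes so that the shell passes $, |, (, and ) through to grep unchanged.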

   /usr/xpg4/bin/egrep
       The /usr/xpg4/bin/egrep utility uses the regular expressions described in the EXTENDED REGULAR EXPRESSIONS  section of the regex(5)  manual
       page.

OPTIONS

       The following options are supported for both /usr/bin/egrep and /usr/xpg4/bin/egrep:

       -b	       Precede each line by the block number on which it was found. This can be useful in locating block numbers by context (first
		       block is 0).

       -c	       Print only a count of the lines that contain the pattern.

       -e pattern_list Search for a pattern_list (full regular expression that begins with a -).

       -f file	       Take the list of full regular expressions from file.

       -h	       Suppress printing of filenames when searching multiple files.

       -i	       Ignore upper/lower case distinction during comparisons.

       -l	       Print the names of files with matching lines once, separated by NEWLINEs. Does not repeat the names of files when the
		       pattern is found more than once.

       -n	       Precede each line by its line number in the file (first line is 1).

       -s	       Work silently, that is, display nothing except error messages. This is useful for checking the error status.

       -v	       Print all lines except those that contain the pattern.

   /usr/xpg4/bin/egrep
       The following option is supported for /usr/xpg4/bin/egrep only:

       -x	Consider only input lines that use all characters in the line to match an entire fixed string or regular expression to be matching
		lines.
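
As a rough illustration of several of the options above, the following sketch runs them against two throwaway files. It assumes a POSIX shell and uses grep -E in place of egrep; the file names, log lines, and variable names are invented for the example. The -b, -h, and -s options are left out because their behavior differs between grep implementations.

```shell
# Scratch directory with two hypothetical log files.
dir=$(mktemp -d)
printf '%s\n' 'ERROR disk full' 'info started' 'error timeout' 'ERROR' 'exit code -1' > "$dir/app.log"
printf '%s\n' 'info ok' > "$dir/other.log"
printf '%s\n' 'disk' 'timeout' > "$dir/patterns.txt"

# -c: count matching lines; -i: ignore case
count=$(grep -E -c 'ERROR' "$dir/app.log")
nocase=$(grep -E -c -i 'error' "$dir/app.log")

# -n: prefix each matching line with its line number (first line is 1)
numbered=$(grep -E -n 'timeout' "$dir/app.log")

# -v: print the lines that do NOT match
inverted=$(grep -E -v -i 'error' "$dir/app.log")

# -x: the pattern must match the entire line
whole=$(grep -E -x 'ERROR' "$dir/app.log")

# -e: needed when the pattern itself begins with a -
dash=$(grep -E -e '-1' "$dir/app.log")

# -f: read the list of patterns from a file, one per line
fromfile=$(grep -E -c -f "$dir/patterns.txt" "$dir/app.log")

# -l: print only the names of files that contain a match
files=$(cd "$dir" && grep -E -l 'ERROR' app.log other.log)

printf '%s\n' "$count" "$nocase" "$numbered" "$inverted" "$whole" "$dash" "$fromfile" "$files"

rm -rf "$dir"
```

Without -e, a pattern such as -1 would be parsed as an option rather than as a regular expression.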

OPERANDS

       The following operands are supported:

       file	       A path name of a file to be searched for the patterns. If no file operands are specified, the standard input is used.

   /usr/bin/egrep
       pattern	       Specify a pattern to be used during the search for input.

   /usr/xpg4/bin/egrep
       pattern	       Specify one or more patterns to be used during the search for input. This operand is treated as if  it  were  specified	as
		       -epattern_list.

USAGE

       See largefile(5) for the description of the behavior of egrep when encountering files greater than or equal to 2 Gbyte (2**31 bytes).

ENVIRONMENT VARIABLES

       See environ(5) for descriptions of the following environment variables that affect the execution of egrep: LC_COLLATE, LC_CTYPE,
       LC_MESSAGES, and NLSPATH.

EXIT STATUS

       The following exit values are returned:

       0	If any matches are found.

       1	If no matches are found.

       2	For syntax errors or inaccessible files (even if matches were found).
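
A script can branch on these exit values. The sketch below assumes a POSIX shell and grep -E as a stand-in for egrep; the input words and variable names are hypothetical. Output is discarded with redirections instead of -s, because -s does not mean the same thing in every grep implementation.

```shell
tmp=$(mktemp)
printf '%s\n' 'alpha' 'beta' > "$tmp"

# 0: at least one line matched
found=0
grep -E 'alpha|gamma' "$tmp" >/dev/null 2>&1 || found=$?

# 1: no line matched
missing=0
grep -E 'delta' "$tmp" >/dev/null 2>&1 || missing=$?

# 2: the file cannot be opened
failed=0
grep -E 'alpha' "$tmp.does-not-exist" >/dev/null 2>&1 || failed=$?

printf 'found=%s missing=%s failed=%s\n' "$found" "$missing" "$failed"

rm -f "$tmp"
```

Capturing the status with `|| var=$?` keeps the script running even when the shell is set to exit on the first failing command.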

ATTRIBUTES

       See attributes(5) for descriptions of the following attributes:

   /usr/bin/egrep
       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE	     |	    ATTRIBUTE VALUE	   |
       +-----------------------------+-----------------------------+
       |Availability		     |SUNWcsu			   |
       +-----------------------------+-----------------------------+
       |CSI			     |Not Enabled		   |
       +-----------------------------+-----------------------------+

   /usr/xpg4/bin/egrep
       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE	     |	    ATTRIBUTE VALUE	   |
       +-----------------------------+-----------------------------+
       |Availability		     |SUNWxcu4			   |
       +-----------------------------+-----------------------------+
       |CSI			     |Enabled			   |
       +-----------------------------+-----------------------------+

SEE ALSO

       fgrep(1), grep(1), sed(1), sh(1), attributes(5), environ(5), largefile(5), regex(5), regexp(5), XPG4(5)

NOTES

       Ideally there should be only one grep command, but there is not a single algorithm that spans a wide enough range of space-time tradeoffs.

       Lines are limited only by the size of the available virtual memory.

   /usr/xpg4/bin/egrep
       The /usr/xpg4/bin/egrep utility is identical to /usr/xpg4/bin/grep -E (see grep(1)). Portable applications  should  use	/usr/xpg4/bin/grep
       -E.

								    23 May 2005 							  egrep(1)
