Shell Programming and Scripting: Can I split a 10GB file into 1 GB sizes using my repeating data pattern
Post 302333001 by vgersh99 on Friday 10th of July 2009 02:10:26 PM
Code:
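nawk '
   # After every "chunk" input lines, flag that the current piece is full.
   # The parentheses matter: unary ! binds tighter than %.
   !(FNR % chunk) {limit=1}
   # Start a new piece on the first line, or at the first line beginning
   # with 100 once the limit is reached, so no record is cut in half.
   FNR==1 || (limit && /^100/) {close(out); out=FILENAME "_" (++cnt); limit=0}
   { print > out }' chunk=100000 myHugeFile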
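
Before pointing this at a 10GB file, it may be worth trying the logic on a tiny sample. The following is only a sketch: the sample records, the file name sample.txt and chunk=6 are made up for the demo, and plain awk is used in place of nawk. The awk program writes the modulo test as !(FNR % chunk) with explicit parentheses, since unary ! binds tighter than % in awk, and it checks /^100/ on the current line so that a piece only ever starts on a record boundary.

```shell
#!/bin/sh
# Sketch only: sample data, file name and chunk size are invented for this demo.
# Plain awk is assumed here; substitute nawk on Solaris.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 12-line sample: three 4-line records, each starting with a "100" line.
for rec in 1 2 3; do
    printf '100 header %s\n200 a\n200 b\n900 trailer\n' "$rec"
done > sample.txt

awk '
   # Flag the current piece as full after every "chunk" lines.
   !(FNR % chunk) {limit=1}
   # Switch output files on line 1, or at the next "100" line once full.
   FNR==1 || (limit && /^100/) {close(out); out=FILENAME "_" (++cnt); limit=0}
   { print > out }' chunk=6 sample.txt

# Show how many lines ended up in each piece.
wc -l sample.txt_*
```

With these settings the limit is hit at line 6, in the middle of the second record, so the switch is deferred to line 9, where the third record starts. The first piece should therefore hold two whole records and the second piece one.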

 

cut(1)                           User Commands                           cut(1)

NAME
       cut - cut out selected fields of each line of a file

SYNOPSIS
       cut -b list [-n] [file]...
       cut -c list [file]...
       cut -f list [-d delim] [-s] [file]...

DESCRIPTION
       Use the cut utility to cut out columns from a table or fields from each line of a file; in data base parlance, it implements the projection of a relation. The fields as specified by list can be fixed length, that is, character positions as on a punched card (-c option) or the length can vary from line to line and be marked with a field delimiter character like TAB (-f option). cut can be used as a filter.

       Either the -b, -c, or -f option must be specified.

       Use grep(1) to make horizontal "cuts" (by context) through a file, or paste(1) to put files together column-wise (that is, horizontally). To reorder columns in a table, use cut and paste.

OPTIONS
       The following options are supported:

       list
           A comma-separated or blank-character-separated list of integer field numbers (in increasing order), with optional - to indicate ranges (for instance, 1,4,7; 1-3,8; -5,10 (short for 1-5,10); or 3- (short for third through last field)).

       -b list
           The list following -b specifies byte positions (for instance, -b1-72 would pass the first 72 bytes of each line). When -b and -n are used together, list is adjusted so that no multi-byte character is split.

       -c list
           The list following -c specifies character positions (for instance, -c1-72 would pass the first 72 characters of each line).

       -d delim
           The character following -d is the field delimiter (-f option only). Default is tab. Space or other characters with special meaning to the shell must be quoted. delim can be a multi-byte character.

       -f list
           The list following -f is a list of fields assumed to be separated in the file by a delimiter character (see -d); for instance, -f1,7 copies the first and seventh field only. Lines with no field delimiters will be passed through intact (useful for table subheadings), unless -s is specified.

       -n
           Do not split characters. When -b list and -n are used together, list is adjusted so that no multi-byte character is split.

       -s
           Suppresses lines with no delimiter characters in case of -f option. Unless specified, lines with no delimiters will be passed through untouched.

OPERANDS
       The following operands are supported:

       file
           A path name of an input file. If no file operands are specified, or if a file operand is -, the standard input will be used.

USAGE
       See largefile(5) for the description of the behavior of cut when encountering files greater than or equal to 2 Gbyte (2^31 bytes).

EXAMPLES
       Example 1 Mapping user IDs

       A mapping of user IDs to names follows:

           example% cut -d: -f1,5 /etc/passwd

       Example 2 Setting current login name

       To set name to current login name:

           example$ name=`who am i | cut -f1 -d' '`

ENVIRONMENT VARIABLES
       See environ(5) for descriptions of the following environment variables that affect the execution of cut: LANG, LC_ALL, LC_CTYPE, LC_MESSAGES, and NLSPATH.

EXIT STATUS
       The following exit values are returned:

       0
           All input files were output successfully.

       >0
           An error occurred.

ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

       +-----------------------------+-----------------------------+
       |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
       +-----------------------------+-----------------------------+
       |Availability                 |SUNWcsu                      |
       +-----------------------------+-----------------------------+
       |CSI                          |Enabled                      |
       +-----------------------------+-----------------------------+
       |Interface Stability          |Standard                     |
       +-----------------------------+-----------------------------+

SEE ALSO
       grep(1), paste(1), attributes(5), environ(5), largefile(5), standards(5)

DIAGNOSTICS
       cut: -n may only be used with -b

       cut: -d may only be used with -f

       cut: -s may only be used with -f

       cut: cannot open <file>
           Either file cannot be read or does not exist. If multiple files are present, processing continues.

       cut: no delimiter specified
           Missing delim on -d option.

       cut: invalid delimiter

       cut: no list specified
           Missing list on -b, -c, or -f option.

       cut: invalid range specifier

       cut: too many ranges specified

       cut: range must be increasing

       cut: invalid character in range

       cut: internal error processing input

       cut: invalid multibyte character

       cut: unable to allocate enough memory

SunOS 5.11                         29 Apr 1999                           cut(1)
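
The following is not part of the man page. It is a small sketch of how the -f, -d, -s and -c options described above behave on a few lines of input. The sample records are invented for the demo; only the options themselves come from the page.

```shell
#!/bin/sh
# Illustrative only: the sample records below are invented for this demo.
sample='root:x:0:0:Super-User:/root
no delimiter here
daemon:x:1:1::/'

# -d/-f: select fields 1 and 5 of colon-delimited lines.
# A line with no delimiter is passed through intact.
fields=$(printf '%s\n' "$sample" | cut -d: -f1,5)

# -s: additionally suppress lines that contain no delimiter.
strict=$(printf '%s\n' "$sample" | cut -d: -f1,5 -s)

# -c: select character positions 1-4 of every line.
chars=$(printf '%s\n' "$sample" | cut -c1-4)

printf '%s\n---\n%s\n---\n%s\n' "$fields" "$strict" "$chars"
```

The first result keeps the undelimited line untouched, the second drops it because of -s, and the third ignores delimiters entirely and works on character positions.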