Shell Programming and Scripting: Post 302332622 by john091 on Thursday 9th of July 2009 03:41:30 PM
Can I split a 10 GB file into 1 GB pieces using my repeating data pattern?

I'm not a Unix guy, so excuse my ignorance... I'm the database ETL guy.

I'm trying to be proactive and devise a plan B for an ETL process where I expect a file 10x larger than what I process daily, for a recast job. The ETL tool may handle it, but I just don't know.

This file may need to be split, and we don't want to lose related data. I assume it would be easier to do it at the Unix level rather than in the ETL tool, provided the Unix commands have no file size limitations.

The file will most likely be 10 GB, give or take a few GB. The exact size is unknown at this time.

The basic file format is shown below, with the first 3 characters being the record type (100, 401, 404, 410, 411).

The file must be split into segments equal to a daily run, approximately 1 GB in size, and each split has to occur just before a 100 record, as all the rows that follow a 100 belong together.

1001104vvbvnbvd
4011104ghghghgh
404111kjdkfjkdf
404111kjdkfjkdf
404111kjdkfjkdf
404111kjdkfjkdf
4103445kkjkljlk
4103445kkjkljlk
4113445kkjkljlk
4043445kkjkljlk
10011ffgfgg1250
4011104fffhghgh
404111kjddfjkdf
404111kjdkrtrdf
etc...

Thanks in advance. I think we use HP-UX.
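
The splitting rule described above (cut only just before a 100 record, once the current piece is about 1 GB) could be sketched with awk roughly as follows. This is a minimal sketch, not a tested HP-UX solution: the function name `split_at_100`, the output prefix, and the byte limit are hypothetical names chosen for illustration. It assumes newline-terminated records, single-byte characters (so `length($0) + 1` approximates the bytes per line), and an awk that accepts `-v`. Whether the awk on a given HP-UX release copes with files larger than 2 GB is worth checking on a small test first.

```shell
# split_at_100 INPUT LIMIT PREFIX   (hypothetical helper, for illustration)
# Writes PREFIX001, PREFIX002, ... A piece is closed only when a new
# "100" record arrives AND the piece has already reached LIMIT bytes,
# so a 100 record and the rows that follow it always stay together.
split_at_100() {
    awk -v limit="$2" -v prefix="$3" '
        BEGIN { n = 1; out = sprintf("%s%03d", prefix, n) }

        # Candidate split point: record type 100 in the first 3 characters.
        substr($0, 1, 3) == "100" && bytes >= limit + 0 {
            close(out)                      # release the finished piece
            n++
            out = sprintf("%s%03d", prefix, n)
            bytes = 0
        }

        {
            print > out
            bytes += length($0) + 1         # +1 for the newline (assumes single-byte data)
        }
    ' "$1"
}
```

A call such as `split_at_100 bigfile.dat 1000000000 part_` (file name assumed) would then produce part_001, part_002, and so on. Each piece ends up slightly over the limit rather than under it, because the cut waits for the next 100 record. Concatenating the pieces in order should reproduce the original file.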
 
