Split Large Files Based On Row Pattern..
Post 302876633 by Don Cragun on Monday 25th of November 2013 11:00:36 PM
Old 11-26-2013
As long as an input file doesn't produce more than about 10 different output files, the following awk script should do what you want:
Code:
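awk -F, '
# First line of the input file: save the header and go on to the next line.
FNR == 1 {
        h = $0
        next
}
# Every other line: the output file is named after field 6.
{       ofile = $6".TXT"
        if(!(ofile in ofiles)) {
                # New output file: remember it and start it with the header.
                ofiles[ofile]
                print h > ofile
        }
        print > ofile
}' file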

If you have a lot of output files, you'll need to keep track of how many files are open and close and reopen files as needed. Since your sample input only produces three output files, there was no need to keep track of open files (other than to print the header in each new output file).
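
If you do end up with many output files, a rough sketch of that bookkeeping could look like the following. This is not part of the script above, just one way to do it under some assumptions: the function name split_by_field6, the array names seen and isopen, and the default limit of 8 open files are all made up for illustration. When the limit is reached it simply closes every open output file, and it always appends with >> after writing the header so that a reopened file is not truncated.

```shell
# Hypothetical helper: split a CSV on field 6 while capping open output files.
#   $1 = input file
#   $2 = maximum number of output files kept open at once (assumed default: 8)
split_by_field6() {
    awk -F, -v max="${2:-8}" '
    # First line: save the header.
    FNR == 1 { h = $0; next }
    {
        ofile = $6 ".TXT"
        if (!(ofile in seen)) {
            # First time this name appears: create (or truncate) the file
            # with the header, then close it so later writes can append.
            seen[ofile]
            print h > ofile
            close(ofile)
        }
        if (!(ofile in isopen)) {
            if (nopen >= max) {
                # Limit reached: close everything and start counting again.
                for (f in isopen) close(f)
                split("", isopen)    # portable way to empty an array
                nopen = 0
            }
            isopen[ofile]
            nopen++
        }
        # Append, so a file that was closed and reopened keeps its contents.
        print >> ofile
    }' "$1"
}
```

You would call it as split_by_field6 file 8. Closing everything at once is crude; if your data is sorted on field 6, it is simpler to close the previous file whenever the field changes.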

If you want to run this script on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
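
If the same script has to run on both Solaris and other systems, one possible way to pick the awk binary is sketched below. The variable names AWK and candidate are only illustrative, and the paths are the ones listed above; on systems where neither path exists it falls back to whatever awk is on your PATH.

```shell
# Prefer a standards-conforming awk where Solaris provides one;
# otherwise use the awk found on PATH.
AWK=awk
for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk; do
    if [ -x "$candidate" ]; then
        AWK=$candidate
        break
    fi
done
echo "using: $AWK"
```

You can then invoke the script as "$AWK" -F, '...' file instead of hard-coding awk.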