Post 302327123 by asriva on Friday 19th of June 2009 05:58:36 PM
split large file based on field criteria

I have a file containing date/time sorted data of the form
...
2009/06/10,20:59:59.950,XAG/USD,Q,1,1115, 14.3025,100,1,1
2009/06/10,20:59:59.950,XAG/USD,Q,1,1116, 14.3026,125,1,1
2009/06/10,20:59:59.950,XAG/USD,R,0,0, , 0,0,0
2009/06/10,20:59:59.950,XAG/USD,R,1,0, 14.1910,100,1,1
2009/06/10,20:59:59.950,XAG/USD,A,0,, 14.3011,100,1
2009/06/10,21:00:00.100,CHF/JPY,Q,0,0, , 0,0,0
2009/06/10,21:00:00.100,CHF/JPY,Q,1,0, 70.26, 60,2,2
2009/06/10,21:00:00.150,CHF/JPY,D,0, 70.14, 20,XC05, ,NYD9,US,NYA1
...

I want to split this file into exactly two files based on the date/time criteria: all lines with timestamps less than or equal to 21:00:00.000 should go to 'file1', and all lines with timestamps greater than 21:00:00.000 should go to 'file2'.

I wrote a simple script that uses a while loop to read each line and match it against the criteria.
The script works fine, but since the data files are huge (gigabytes), the processing takes forever.

Is there a better way (sed, awk, egrep or even split) to do this more efficiently?

Thanks.
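
One possible approach, offered as a minimal sketch rather than a definitive answer: a single awk pass over the file usually avoids most of the per-line overhead of a shell while loop. The helper name `split_by_time` and the input name `input.csv` are hypothetical. The sketch assumes the time is always the second comma-separated field in fixed-width HH:MM:SS.mmm form (so plain string comparison orders it correctly) and that all lines share one date, as in the sample.

```shell
#!/bin/sh
# split_by_time INPUT CUTOFF OUT_LE OUT_GT
# (hypothetical helper name; arguments are the input file, the cutoff
#  time, and the two output files)
split_by_time() {
    awk -F, -v cutoff="$2" -v le="$3" -v gt="$4" '
        # Appending "" forces a string comparison on the time field.
        ($2 "") <= cutoff { print > le; next }   # at or before the cutoff
        { print > gt }                           # strictly after the cutoff
    ' "$1"
}

# Example call (input.csv is an assumed file name):
# split_by_time input.csv 21:00:00.000 file1 file2
```

Note that awk only creates an output file once a line is written to it, so if no line falls on one side of the cutoff, that file will not exist afterwards. If the data spans more than one date, the comparison would need to include the first field as well.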
 
