03-29-2006
Need to split a large data file using a Unix script
Greetings all:
I am still new to the Unix environment and I need help with the following requirement.
I have a large sequential file sorted on a field (say store#) that is being split into several smaller files, one for each store. That means if there are 500 stores, there will be 500 files. This is currently done with an SQR program. How can it be done with a Unix script? Any pseudocode will be appreciated.
In the example below, the first two records are written to one file, and when the store# changes, the next record is written to another file. The files are named lgXXX, where XXX is the store number (i.e., lg002, lg003 and so on).
Format of the input file:
Store# City ZIP
--------------------
002 XXX 01601 ..> written to lg002 file
002 YYY 01601 ..> written to lg002 file
003 AAA 11111 ..> written to lg003 file
004 BBB 11222 ..> written to lg004 file
:
:
:
555 XYZ 99999 ..> written to lg555 file
Thank you!
SaiK
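One possible approach, as a minimal sketch rather than a definitive answer: let awk build the output file name from the first field and print each record to it. This assumes the fields are separated by whitespace, the store number is a purely numeric first column, and the "..> written to" notes in the example are annotations that are not part of the real data. The function name split_by_store is made up for illustration.

```shell
#!/bin/sh
# split_by_store FILE  (hypothetical helper name)
# Writes every record whose first field is numeric to a file named lg<store#>
# in the current directory. Assumes FILE is already sorted on the store number.
split_by_store() {
    awk '
        # Skip the header and separator lines: they do not start with a number.
        $1 ~ /^[0-9]+$/ {
            out = "lg" $1
            if (out != prev) {
                # Store number changed: close the previous output file so
                # only one output file is open at any time.
                if (prev != "") close(prev)
                prev = out
            }
            print > out
        }
    ' "$1"
}
```

It could be called as `split_by_store stores.txt`, where stores.txt stands in for the real input file. Closing the previous file on each change of store# matters with 500 stores, because some awk implementations limit how many files can be open at once. It also relies on the input being sorted: if a store number reappeared later in the file, reopening its file with `>` would overwrite the earlier records.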