Extract data from large file 80+ million records
Post 302321978 by cfajohnson on Tuesday 2nd of June 2009 12:48:37 PM
Quote:
Originally Posted by learner16s
I have got one file with 120+ million records (35 GB in size). I have to extract some relevant data from the file based on some parameter and generate another output file.

...

I tried to use grep, but it took a lot of time: nearly 45 minutes to give me the output file.

With a file that size, anything is going to take a long time. There's not going to be anything faster than grep, with the possible exception of a filter written in C that does nothing but what you want.
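If grep is the tool, it is often worth checking that it is doing as little work as possible. The sketch below assumes the search parameter is a plain string rather than a regular expression; the pipe-delimited sample records, the `|ACTIVE|` pattern, and the file names are all made up for illustration and are not from the original question. Whether this is noticeably faster on a 35 GB file depends on the grep version, the locale, and the disk, so time it on a sample first.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real 35 GB input file.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
input="$workdir/records.txt"
output="$workdir/matches.txt"

printf '%s\n' \
    '1001|ACTIVE|alpha' \
    '1002|CLOSED|beta' \
    '1003|ACTIVE|gamma' > "$input"

# LC_ALL=C makes grep compare raw bytes instead of locale-aware characters.
# -F treats the pattern as a fixed string, not a regular expression.
LC_ALL=C grep -F '|ACTIVE|' "$input" > "$output"

cat "$output"
```

With a file this large, most of the elapsed time may simply be spent reading 35 GB from disk, in which case no change to the grep invocation will help much.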

With that much data, you might want to look at using a DBMS, e.g., PostgreSQL.
 
