Post 302321975 by learner16s on Tuesday 2nd of June 2009 12:41:08 PM (Shell Programming and Scripting)
Extract data from large file 80+ million records

Hello,

I have a file with more than 120 million records (35 GB in size). I need to extract the relevant records from it based on a parameter and write them to a separate output file.

What is the best and fastest way to extract the new file?

Sample file format:
++++++7777jjjjjjj0000000000 (header record)
2098 POCG 0000 KKKK
2097 KOLL 0F00 KLLL
2095 LKJH 0L99 L0IU
.
.
.
.

********66666666666**** (trailer record)

Now suppose I enter 2098 as the key (the first field is the key). All records with 2098 as the first field should then be moved to the new file.

**********************************************

I tried to use grep, but it took a long time: nearly 45 minutes to produce the output file.
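For reference, the extraction described above can be sketched as a single pass with awk. This is only a minimal sketch, not a measured solution: the function name extract_by_key and the file names in the usage line are hypothetical, and it assumes the key is the first whitespace-separated field of each data record. Whether it beats the 45-minute grep run depends on the system, since reading 35 GB may be limited mostly by disk speed.

```shell
#!/bin/sh
# extract_by_key KEY INPUT OUTPUT   (hypothetical helper name)
# Copies every record whose first whitespace-separated field is exactly KEY
# from INPUT to OUTPUT, reading INPUT once from start to end.
extract_by_key() {
    key=$1
    input=$2
    output=$3
    # LC_ALL=C asks awk to treat the data as plain bytes instead of
    # locale-aware multibyte text, which often helps on very large files.
    # Appending "" to $1 forces a string comparison, so 2098 does not
    # match fields such as 2098.0 or 02098.
    # The header and trailer records are skipped naturally because their
    # first field never equals the key.
    LC_ALL=C awk -v key="$key" '($1 "") == key' "$input" > "$output"
}
```

Usage would look like `extract_by_key 2098 bigfile.dat out_2098.dat`. If the key is always followed by a single space, an anchored pattern such as `LC_ALL=C grep '^2098 ' bigfile.dat` is another option worth timing, since anchoring at the start of the line keeps grep from matching 2098 elsewhere in a record.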
 
