Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Speeding/Optimizing GREP search on CSV files Post 302450961 by Whit3H0rse on Sunday 5th of September 2010 04:36:01 AM
Old 09-05-2010
Speeding/Optimizing GREP search on CSV files

Hi all,

I have problem with searching hundreds of CSV files, the problem is that search is lasting too long (over 5min).

Csv files are "," delimited, and have 30 fields each line, but I always grep same 4 fields - so is there a way to grep just those 4 fields to speed-up search.

Example:
1,22,11,44,55,7,45,55,55,555,55,66,6,66,book12horror,book34horror,book24horror,book45horror,22,44,55 ..etc.

grep -h "book34" /home/data/books/*

Also I wanted to know how to optimize grep on Solaris10, what option to use to speed up searches, what tweaks can I do on Solaris? Maybe awk can help me on this one?
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Speeding up a Shell Script (find, grep and a for loop)

Hi all, I'm having some trouble with a shell script that I have put together to search our web pages for links to PDFs. The first thing I did was: ls -R | grep .pdf > /tmp/dave_pdfs.outWhich generates a list of all of the PDFs on the server. For the sake of arguement, say it looks like... (8 Replies)
Discussion started by: Dave Stockdale
8 Replies

2. Shell Programming and Scripting

grep'ing and sed'ing chunks in bash... need help on speeding up a log parser.

I have a file that is 20 - 80+ MB in size that is a certain type of log file. It logs one of our processes and this process is multi-threaded. Therefore the log file is kind of a mess. Here's an example: The logfile looks like: "DATE TIME - THREAD ID - Details", and a new file is created... (4 Replies)
Discussion started by: elinenbe
4 Replies

3. UNIX for Dummies Questions & Answers

Using grep to search within files

Hi, At my company, we have custom web sites that we create for different clients. The folder structure is something like: <project name>/html/web/custom/ The custom folder contains a file called "category.html" Every project has the same folder structure, and same file names but, the data... (2 Replies)
Discussion started by: miklo
2 Replies

4. UNIX for Dummies Questions & Answers

Reading compressed files during a grep search

All, The bottom line is that im reading a file, storing it as variables, recursively grep searching it, and then piping it to allow word counts as well. I am unsure on how to open any .zip .tar and .gzip, search for keywords and return results. Any help would be much appreciated! Thanks (6 Replies)
Discussion started by: ryan.lee
6 Replies

5. UNIX for Dummies Questions & Answers

pattern search using grep in specific range of files

Hi, I am trying to do the following: grep -l <pattern> <files to be searched for> In <files to be searched for> , all files should of some specific date like "Apr 8" not all files in current directory. I just to search within files Apr 8 files so that it won't search in entire list of... (2 Replies)
Discussion started by: apjneeraj
2 Replies

6. Shell Programming and Scripting

Perl search csv fileA where two strings exist on another csv fileB

Hi I have two csv files, with the following formats: FileA.log: Application, This occured blah Application, That occured blah Application, Also this AnotherLog, Bob did this AnotherLog, Dave did that FileB.log: Uk, London, Application, datetime, LaterDateTime, Today it had'nt... (8 Replies)
Discussion started by: PerlNewbRP
8 Replies

7. Shell Programming and Scripting

Speeding up search and replace in a for loop

Hello, I am using sed in a for loop to replace text in a 100MB file. I have about 55,000 entries to convert in a csv file with two entries per line. The following script works to search file.txt for the first field from conversion.csv and then replace it with the second field. While it works fine,... (15 Replies)
Discussion started by: pbluescript
15 Replies

8. Shell Programming and Scripting

awk read column csv and search in other csv

hi, someone to know how can i read a specific column of csv file and search the value in other csv columns if exist the value in the second csv copy entire row with all field in a new csv file. i suppose that its possible using awk but i m not expertise thanks in advance (8 Replies)
Discussion started by: giankan
8 Replies

9. Shell Programming and Scripting

Optimizing search using grep

I have a huge log file close to 3GB in size. My task is to generate some reporting based on # of times something is being logged. I need to find the number of time StringA , StringB , StringC is being called separately. What I am doing right now is: grep "StringA" server.log | wc -l... (4 Replies)
Discussion started by: Junaid Subhani
4 Replies

10. Shell Programming and Scripting

Speeding up shell script with grep

HI Guys hoping some one can help I have two files on both containing uk phone numbers master is a file which has been collated over a few years ad currently contains around 4 million numbers new is a file which also contains 4 million number i need to split new nto two separate files... (4 Replies)
Discussion started by: dunryc
4 Replies
CSV(3pm)						User Contributed Perl Documentation						  CSV(3pm)

NAME
Class::CSV - Class based CSV parser/writer SYNOPSIS
use Class::CSV; my $csv = Class::CSV->parse( filename => 'test.csv', fields => [qw/item qty sub_total/] ); foreach my $line (@{$csv->lines()}) { $line->sub_total('$'. sprintf("%0.2f", $line->sub_total())); print 'Item: '. $line->item(). " ". 'Qty: '. $line->qty(). " ". 'SubTotal: '. $line->sub_total(). " "; } my $cvs_as_string = $csv->string(); $csv->print(); my $csv = Class::CSV->new( fields => [qw/userid username/], line_separator => " "; ); $csv->add_line([2063, 'testuser']); $csv->add_line({ userid => 2064, username => 'testuser2' }); DESCRIPTION
This module can be used to create objects from CSV files, or to create CSV files from objects. Text::CSV_XS is used for parsing and creating CSV file lines, so any limitations in Text::CSV_XS will of course be inherant in this module. EXPORT None by default. METHOD
CONSTRUCTOR parse the parse constructor takes a hash as its paramater, the various options that can be in this hash are detailed below. Required Options o fields - an array ref containing the list of field names to use for each row. there are some reserved words that cannot be used as field names, there is no checking done for this at the moment but it is something to be aware of. the reserved field names are as follows: "string", "set", "get". also field names cannot contain whitespace or any characters that would not be allowed in a method name. Source Options (only one of these is needed) o filename - the path of the CSV file to be opened and parsed. o filehandle - the file handle of the CSV file to be parsed. o objects - an array ref of objects (e.g. Class::DBI objects). for this to work properly the field names provided in fields needs to correspond to the field names of the objects in the array ref. o classdbi_objects - depreciated use objects instead - using classdbi_objects will still work but its advisable to update your code. Optional Options o line_separator - the line seperator to be included at the end of every line. defaulting to " " (unix carriage return). new the new constructor takes a hash as its paramater, the same options detailed in parse apply to new however no Source Options can be used. this constructor creates a blank CSV object of which lines can be added via add_line. ACCESSING lines returns an array ref containing objects of each CSV line (made via Class::Accessor). the field names given upon construction are available as accessors and can be set or get. for more information please see the notes below or the perldoc for Class::Accessor. the lines accessor is also able to be updated/retrieved in the same way as individual lines fields (examples below). Example retrieving the lines: my @lines = @{$csv->lines()}; removing the first line: pop @lines; $csv->lines(@lines); sorting the lines: @lines = sort { $a->userid() <=> $b->userid() } @lines: $csv->lines(@lines); sorting the lines (all-in-one way): $csv->lines([ sort { $a->userid() <=> $b->userid() } @{$csv->lines()} ]); Retrieving a fields value there is two ways to retrieve a fields value (as documented in Class::Accessor). firstly you can call the field name on the object and secondly you can call "get" on the object with the field name as the argument (multiple field names can be specified to retrieve an array of values). examples are below. my $value = $line->test(); OR my $value = $line->get('test'); OR my @values = $line->get(qw/test test2 test3/); Setting a fields value setting a fields value is simmilar to getting a fields value. there are two ways to set a fields value (as documented in Class::Accessor). firstly you can simply call the field name on the object with the value as the argument or secondly you can call "set" on the object with a hash of fields and their values to set (this isn't standard in Class::Accessor, i have overloaded the "set" method to allow this). examples are below. $line->test('123'); OR $line->set( test => '123' ); OR $line->set( test => '123', test2 => '456' ); Retrieving a line as a string to retrieve a line as a string simply call "string" on the object. my $string = $line->string(); new_line returns a new line object, this can be useful for to "splice" a line into lines (see example below). you can pass the values of the line as an ARRAY ref or a HASH ref. Example my $line = $csv->new_line({ userid => 123, domainname => 'splicey.com' }); my @lines = $csv->lines(); splice(@lines, 1, 0, $line); OR splice(@{$csv->lines()}, 1, 0, $csv->new_line({ userid => 123, domainname => 'splicey.com' })); add_line adds a line to the lines stack. this is mainly useful when the new constructor is used but can of course be used with any constructor. it will add a new line to the end of the lines stack. you can pass the values of the line as an ARRAY ref or a HASH ref. examples of how to use this are below. Example $csv->add_line(['house', 100000, 4]); $csv->add_line({ item => 'house', cost => 100000, bedrooms => 4 }); OUTPUT string returns the object as a string (CSV file format). print calls "print" on string (prints the CSV to STDOUT). SEE ALSO
Text::CSV_XS, Class::Accessor AUTHOR
David Radunz, <david@boxen.net> COPYRIGHT AND LICENSE
Copyright 2004 by David Radunz This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. perl v5.10.0 2007-02-08 CSV(3pm)
All times are GMT -4. The time now is 02:19 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy