search a number in very very huge amount of data
Post 302578843 by Corona688 on Friday 2nd of December 2011 12:37:41 PM
Unless you've got a blindingly fast RAID setup, "hours" is to be expected for 10 terabytes no matter what you do.

That's a useless use of ls *. The shell expands * into the list of files by itself, so wrapping it in ls adds nothing.

You don't want 9 processes writing to the same file at the same time. They may interfere with each other, overwriting each other's lines, etc.

It's looking for the string 'variable' because you put it in //. Just give it the variable. You didn't even name it 'variable' though; you named it 'var'. Try $2 ~ var
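
To see the difference, here's a small sketch with made-up sample data (the three records and the value 12345 are assumptions for illustration only). Inside //, awk treats var as literal text; a bare var on the right of ~ uses the variable's value as the regex.

```shell
# Hypothetical sample records, pipe-delimited; field 2 is the one we search.
data='a|12345|x
b|var|y
c|99999|z'

# /var/ is a regex literal: it matches the text "var", not the variable's value.
literal=$(printf '%s\n' "$data" | awk -F"|" -v var="12345" '$2 ~ /var/')

# A bare var is a dynamic regex built from the variable's value.
dynamic=$(printf '%s\n' "$data" | awk -F"|" -v var="12345" '$2 ~ var')

echo "literal: $literal"
echo "dynamic: $dynamic"
```

Keep in mind that ~ is still a regex match, so 12345 would also match a field like 9123456. If you need the field to be exactly the number, $2 == var does that instead.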

In summary, I'd do this:

Code:
zcat * | awk -F"|" -v var="$1" '$2 ~ var' > output.txt

Depending on whether your disk's faster than your processor or vice versa, there may be ways to speed this up by running multiple gunzips at once. I'm not sure how to do that yet, but I'll think about it.
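
One possible sketch of that idea, not something tested against 10 terabytes: run several gunzip-and-awk pipelines at once with xargs -P, give each one its own output file so no two processes write to the same file, and concatenate the results afterwards. The -0 and -P options are GNU/BSD extensions rather than POSIX, and the search function, the data.*.gz sample files, the .out suffix and the job count of 4 are all assumptions made up for this example.

```shell
# Demo setup (hypothetical data): two small gzipped, pipe-delimited files.
dir=$(mktemp -d)
printf 'a|12345|x\nb|777|y\n' | gzip > "$dir/data.1.gz"
printf 'c|999|z\nd|12345|w\n' | gzip > "$dir/data.2.gz"

# search VALUE DIR: filter every .gz file in DIR, up to 4 at a time.
search() {
    # Each worker writes to its own FILE.out, so writers never share a file.
    # xargs -0 and -P are GNU/BSD extensions, not POSIX.
    printf '%s\0' "$2"/*.gz |
        xargs -0 -P 4 -I{} sh -c \
            'gunzip -c "$1" | awk -F"|" -v var="$2" "\$2 ~ var" > "$1.out"' \
            sh {} "$1"
    # Merge the per-file results once all workers have finished.
    cat "$2"/*.gz.out > "$2/output.txt"
}

search 12345 "$dir"
cat "$dir/output.txt"
```

Whether this helps at all depends on where the bottleneck is. If the disk is already the limiting factor, more decompressors running at once won't make it any faster.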
 
