awk with really big files


 
# 1  
Old 07-08-2009

Hi,

I have a text file of around 7 GB which is basically a matrix of numbers (FS is a space and RS is \n). I need the most efficient way of plucking out a number from a specified row and column in the file.

For example, for the value at row 15983, col 26332, I'm currently using:

Code:
x=26332
y=15983

awk -v row="$y" -v col="$x" 'NR==row {print $col; exit}' big_file_of_numbers.txt

Obviously, the larger the value of x and y (especially y) the longer it takes. A few seconds per value is normal. I need to do this thousands of times.

I'm not sure if I can use NR like this; I mean is one allowed to 'set' the value of NR? It seems to work, it's just not very quick. I'm using exit to prevent it reading the whole thing.

Cheers,

Jon
# 2  
Old 07-08-2009
Simple. You obviously have a file listing the rows and columns you want to look up. Call it rcfile.
Assume the format is:
Code:
 13   99
128  50

13 == row number; 99 == column number
Code:
awk 'FILENAME=="rcfile" {arr[$1]=$2}   # remember: row number -> column number
     FILENAME=="bigfile" && FNR in arr {print $(arr[FNR])}   # wanted row: print that column
    ' rcfile bigfile > newfile
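
A possible variation, offered only as a sketch: the one-liner above stores a single column per row, so a second request for the same row overwrites the first, and it keeps reading bigfile to the end even after the last wanted row. Assuming the same two-column rcfile layout (row number, then column number), the version below keeps every requested column for a row, labels each output line as "row col value", and exits once the highest requested row has been read. The function name lookup_cells is made up for illustration.

```shell
# lookup_cells RCFILE BIGFILE
# Prints "row col value" for every (row, col) pair listed in RCFILE.
# lookup_cells is a hypothetical helper name, not a standard tool.
lookup_cells() {
    awk '
        # First file: collect the requested columns for each row.
        FILENAME == ARGV[1] {
            if (NF < 2) next                      # skip blank or short lines
            want[$1] = ($1 in want) ? (want[$1] " " $2) : $2
            if ($1 + 0 > last) last = $1 + 0      # highest row requested
            next
        }
        # Second file: print the wanted fields of each wanted row.
        FNR in want {
            n = split(want[FNR], cols, " ")
            for (i = 1; i <= n; i++)
                print FNR, cols[i], $(cols[i])
        }
        # Nothing left to find after the highest requested row.
        FNR >= last { exit }
    ' "$1" "$2"
}
```

It would be called as `lookup_cells rcfile bigfile > newfile`. Output comes out in the order the rows appear in bigfile, not the order of rcfile, which is why each line carries its row and column. The big file is still read sequentially up to the highest requested row, so this only saves time by doing all lookups in one pass.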
