How to read and write a random row from a file?


 
# 1  
Old 05-08-2008
How to read and write a random row from a file?

Let's say I have a file abc.txt with about 35 million rows. I would like to take a sample of 100 random rows from that file for testing purposes and write them to a file, say test.txt.

How do I do this operation?

Thanks,
Sashank
# 2  
Old 05-12-2008
If you need it to be truly random, you'll need a suitable random number generator.

Once you have 100 random (or predefined, if that's all you need) numbers, normalise them against the number of lines in your file (I'd suggest you select 100 numbers between 0 and 1, multiply each by the number of lines in the file, and round up so the result falls between 1 and the line count). That will give you a list of 100 line numbers.
Code:
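#!/bin/sh
# Put a space-separated list of your line numbers in LINE_NUMBERS.
# (The name LINES is avoided: bash uses it for the terminal height.)
LINE_NUMBERS="$*"   # here: taken from the command-line arguments
INPUT="abc.txt"
OUTPUT="test.txt"
rm -f "$OUTPUT"
for line in $LINE_NUMBERS
do
  # print the first $line lines, then keep only the last of them
  head -n "$line" "$INPUT" | tail -n 1 >> "$OUTPUT"
done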

It's clumsy and won't be very quick (especially for lines further into the file) but it'll do the job.
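
One way to get those line numbers, as a rough sketch rather than anything definitive: let awk pick distinct random integers between 1 and the line count. The function name random_line_numbers is made up for this example, and awk's rand() is only a pseudo-random generator seeded from the clock, so it is not "truly random" and two runs within the same second may return the same numbers.

```shell
#!/bin/sh
# random_line_numbers COUNT TOTAL   (hypothetical helper, not from the thread)
# Print COUNT distinct pseudo-random integers in 1..TOTAL, in increasing order.
random_line_numbers() {
    awk -v n="$1" -v total="$2" 'BEGIN {
        srand()                          # seed from the time of day
        if (n > total) n = total         # cannot pick more lines than exist
        while (picked < n) {
            k = int(rand() * total) + 1  # integer in 1..total
            if (!(k in seen)) {          # skip numbers already chosen
                seen[k] = 1
                picked++
                print k
            }
        }
    }' | sort -n
}

# Possible usage with the file from the question:
#   total=$(wc -l < abc.txt)
#   random_line_numbers 100 "$total"
```

The output is one number per line, which a `for` loop over the command substitution will split just like a space-separated list.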

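A better solution would be to seek the correct distance into the file and just dd out the required data, though that only works if you can work out the byte offset of each line (fixed-length records, for example).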
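
A minimal sketch of the dd idea, under a big assumption: every line in the file is exactly the same length in bytes, newline included. An ordinary text file with variable-length lines does not meet that, since the offset of line N cannot be computed without reading the lines before it. The function name fetch_record and its arguments are invented for illustration.

```shell
#!/bin/sh
# fetch_record FILE RECLEN N   (hypothetical helper)
# Print record number N (1-based) from FILE, assuming every line is exactly
# RECLEN bytes long including its trailing newline.
fetch_record() {
    # bs=RECLEN makes one dd block equal one record;
    # skip=N-1 jumps over the records before the one we want.
    dd if="$1" bs="$2" skip=$(( $3 - 1 )) count=1 2>/dev/null
}

# Possible usage, if abc.txt held fixed 80-byte records:
#   fetch_record abc.txt 80 12345 >> test.txt
```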
# 3  
Old 05-13-2008
Why not use sed in place of the head/tail?

sed -n "${line}p" filename >> "$OUTPUT"
# 4  
Old 05-13-2008
Didn't know sed could do that... Handy! I wonder which one uses more cycles, as they both have to parse and reparse the file... I suspect yours will be quicker.
# 5  
Old 05-13-2008
This was something I picked up from these forums a couple of days back. Didn't know I would be passing on this "knowledge" to others so soon.

About the cycles: yes, I think so.
# 6  
Old 05-13-2008
Much faster use of sed...
Code:
 sed -n ${line}'{p;q;}' filename

Upon reaching the desired line, sed will print it and then immediately exit. It will not continue uselessly reading the rest of the file.
# 7  
Old 05-13-2008
You can combine all of them into a single sed script, which will be a lot quicker.

Say, given a list of numbers in increasing order in file1, you can

Code:
sed -e 's/$/p/' file1 | sed  -n -f - bigfile >samples

to read the big file just once, and print the selected line numbers.

(Not all sed implementations understand "-f -" I have been stymied to learn; you need a temporary file then, obviously.)

If you are a bit clever you can also make it quit after printing the last one, to avoid needlessly reading the big input file through to the end. Implementing that is left as an exercise for the astute reader. (-:
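
For readers who would rather not do the exercise, here is one possible sketch. It writes the sed script to a temporary file (so it does not depend on "-f -"), sorts the line numbers so the largest comes last, and gives that last one a quit command. The function name sample_lines is made up, and mktemp is assumed to be available, which is common but not guaranteed on every system.

```shell
#!/bin/sh
# sample_lines NUMBERS_FILE BIGFILE   (hypothetical helper)
# Print the lines of BIGFILE whose numbers are listed, one per line, in
# NUMBERS_FILE. BIGFILE is read once, and only up to the highest number.
sample_lines() {
    script=$(mktemp) || return 1
    # Sort numerically and drop duplicates, then turn each number N into the
    # sed command "Np"; the last (largest) becomes "N{p;q;}" so sed quits there.
    sort -n -u "$1" | sed -e '$!s/$/p/' -e '$s/$/{p;q;}/' > "$script"
    sed -n -f "$script" "$2"
    status=$?
    rm -f "$script"
    return "$status"
}

# Possible usage with the names from this thread:
#   sample_lines file1 bigfile > samples
```

Note that the sampled lines come out in file order, whatever order the numbers were listed in.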

Last edited by era; 05-13-2008 at 06:14 AM.. Reason: Oops, forgot sed -n
 