"How to randomly select lines from a text file"

Tuesday 23rd of October 2012

For the fun of it, here's another way that does not use awk, although the awk version will be more efficient. This one has the overhead of creating a new head | tail pipeline for every line selected, which is best avoided as a matter of good practice. Also, the ksh RANDOM built-in tops out at 32767, which must be taken into account if the file has more lines than that.
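$ cat x
#!/bin/ksh
## x nbr_of_lines_wanted  filename

iterations=$1
file=$2

# Number of lines available to choose from.
lines_avail=$(wc -l < "$file")

while (( iterations > 0 )); do
  # RANDOM % lines_avail is 0..lines_avail-1; add 1 to get a valid line number.
  head -n $(( RANDOM % lines_avail + 1 )) "$file" | tail -n 1
  (( iterations = iterations - 1 ))
done

exit 0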
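
One possible way around the 32767 limit, sketched here only as an illustration, is to combine two RANDOM values into a single 30-bit number before taking the modulus. The function name rand_line_number is hypothetical and not part of the script above, and the result has a slight modulo bias that is ignored here.

```shell
# Hypothetical helper: print a random line number between 1 and $1.
# Two 15-bit RANDOM values are combined into one 30-bit value
# (0..1073741823), so files with more than 32767 lines can be covered.
rand_line_number() {
  echo $(( ( (RANDOM << 15) | RANDOM ) % $1 + 1 ))
}

# Example: pick a line number in a 100000-line file.
rand_line_number 100000
```

The result could replace the RANDOM expression passed to head, although the cost of running the pipeline once per selected line remains.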

This is actually a good example of how a seemingly simple solution that works for a small file can burn you on performance and system limits if you later need to run it on a much larger file, or on a system that may see increased load in the future.
Typically, when you see a long command line or pipeline like this being run a large number of times (especially a user-supplied number of times), it should be a red flag that there is most likely a more efficient way to structure the program.
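
As a rough sketch of what a more efficient structure could look like, the function below reads the file once with awk and uses reservoir sampling to keep the wanted number of lines. It is not necessarily the awk version mentioned earlier in the thread, and the name pick_random_lines is made up for this example. Unlike the loop above, it never returns the same line twice, and it does not depend on the shell's RANDOM at all.

```shell
# Hypothetical helper: print $1 distinct random lines from file $2,
# reading the file only once.
pick_random_lines() {
  awk -v n="$1" '
    BEGIN { srand() }                     # seed from the time of day
    NR <= n { keep[NR] = $0; next }       # fill the reservoir first
    {
      slot = int(rand() * NR) + 1         # uniform integer in 1..NR
      if (slot <= n) keep[slot] = $0      # replace with probability n/NR
    }
    END {
      count = (NR < n) ? NR : n           # file may be shorter than n
      for (i = 1; i <= count; i++) print keep[i]
    }
  ' "$2"
}
```

It would be called as "pick_random_lines 5 filename". Because srand() with no argument seeds from the clock in whole seconds, two runs within the same second may print the same lines.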

Last edited by gary_w; 10-23-2012 at 06:11 PM..
