Get 20% of lines in File randomly

# 1  
This is my code:
nb_lignes=$(wc -l < "$1")
for i in $(seq "$nb_lignes"); do
    m=$(head -n "$i" "$1" | tail -n 1)
done

How can I change it to pick 20% of the lines in the file at random and apply "command" to each selected line? The percentage (20%, 40%, 60%, ...) should be a parameter.

Thank you.

# 2  
[root@host dir]# head -$(( $(wc -l file | cut -d" " -f1) * 40 / 100 )) file

# 3  
If I'm right, your problem is to randomly generate N integers in the range [1;M], where M is the number of lines in the file, and N is a rounded percentage of M.

Before going on: are repetitions allowed? In other words, may a number (a line in the file) be randomly selected more than once?

And: what are your shell and OS?

EDIT: solution proposed

In any case, here is a possible solution that works with bash on Linux, with repetitions not allowed:
#!/bin/bash
range=$(wc -l "$1" | cut -d " " -f1)
(( $range > 32768 )) && exit              ### max number of lines for this script: 32768
percent=$2                                ### set this  as you like, [1-100]
(( 0 < $2 )) && (( $2 < 101 )) || exit
lim=$(( $range * $percent / 100 ))
for ((i=0;i<lim;i++)); do
        num=$(( $RANDOM % $range + 1 ));
        arr[$i]=$num
        for ((j=0;j<i;j++)); do
                (( ${arr[$j]} == $num )) && {
                        let i--
                        break; }
        done
done

for linenum in "${arr[@]}"; do
        line=$(sed -n "$linenum p" "$1")
        ### your stuff here, for example: ###
        echo "$linenum"$'\t'"### $line"
done

exit 0

lem@biggy:/tmp$ ./file2 file2 30
6	### lim=$(( $range * $percent / 100 ))
19	###         ### your stuff here, for example: ###
11	###                 (( ${arr[$j]} == $num )) && {
10	###         for ((j=0;j<i;j++)); do
5	### (( 0 < $2 )) && (( $2 < 101 )) || exit
4	### percent=$2                                ### set this  as you like, [1-100]
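
For what it's worth, on a system with GNU coreutils the same "N% of the lines, no repetitions" selection can be sketched with shuf instead of a hand-written loop. This is only a sketch: it assumes shuf is installed (it is not part of every Unix), and sample_lines is a made-up helper name, not something from the script above.

```shell
#!/bin/bash
# sample_lines FILE PERCENT
# Print PERCENT% of the lines of FILE, chosen at random without repetition.
# (sample_lines is a hypothetical helper name; shuf comes from GNU coreutils.)
sample_lines() {
    local file=$1 percent=$2 total count
    total=$(( $(wc -l < "$file") ))          # number of lines in the file
    count=$(( total * percent / 100 ))       # integer division rounds down
    shuf -n "$count" -- "$file"              # pick count distinct lines
}

# Usage: apply a command to each selected line.
# sample_lines myfile 20 | while IFS= read -r line; do
#     echo "command $line"
# done
```

Unlike the script above, this prints the selected lines in random order and is not limited to 32768 lines.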


# 4  
Here is a solution using awk:
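~/$ awk 'BEGIN{srand()} int(100*rand())%5<1' file

5 and 1 are the parameters to modify here: 1/5 = 20% in this example. The srand() call seeds the generator; without it, awk's rand() returns the same sequence on every run.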

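To match your requirements more closely:
~/$ awk 'BEGIN{srand()} int(100*rand())<value' value=20 file | while IFS= read -r line; do echo "command $line"; done

Set value=N to get roughly N% of the lines. Each line is kept independently with probability N/100, so the exact count varies from run to run.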
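
The awk filter above decides line by line, so the number of lines it prints only approximates the requested percentage. If an exact count matters, one option is selection sampling: with n lines still unread and k still to pick, keep the current line with probability k/n. The sketch below is an illustration, not the poster's method; sample_exact is a hypothetical helper name.

```shell
#!/bin/bash
# sample_exact FILE PERCENT
# Print exactly floor(lines * PERCENT / 100) lines of FILE, chosen at random,
# in their original order. (sample_exact is a hypothetical helper name.)
sample_exact() {
    local file=$1 percent=$2 total want
    total=$(( $(wc -l < "$file") ))
    want=$(( total * percent / 100 ))
    awk -v n="$total" -v k="$want" '
        BEGIN { srand() }    # seed from the clock
        {
            # n = lines not yet decided (including this one),
            # k = lines that still have to be picked.
            if (rand() * n < k) { print; k-- }
            n--
        }
    ' "$file"
}

# Usage:
# sample_exact myfile 40 | while IFS= read -r line; do
#     echo "command $line"
# done
```

Because srand() seeds from the time of day, two runs within the same second will usually select the same lines.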

# 5  
@Lem: thank you so much for the help. I tested your solution and it works.
# 6  
@Lem: could you please explain this loop to me with an example?

for ((i=0;i<lim;i++)); do
        num=$(( $RANDOM % $range + 1 ));
        arr[$i]=$num
        for ((j=0;j<i;j++)); do
                (( ${arr[$j]} == $num )) && {
                        let i--
                        break; }
        done
done

I'm sorry to bother you. Thank you so much.

# 7  
## Let's say that:
## the number of lines in your file is 200 (range=200);
## you want 30% of the lines in your file: percent=30.
## So we'll have that lim= 200 * 30 / 100 = 60.

for ((i=0;i<lim;i++)); do
## We start the first pass of our loop with i=0. We check that i<lim, that is 0<60. It is, so
## we run the loop body.
## At the end of each pass the value of i is increased by one: this is the meaning of i++.
## Let's say that now i=15.

num=$(( $RANDOM % $range + 1 ));
## $RANDOM looks like a parameter, but you'd better think of it as a special function.
## Every time you call it, you get a pseudo-random number between 0 and 32767.
## Let's say this time we get 8512. We calculate the remainder of the division
## 8512/200. So: 8512 = 200*42 + 112. 112 is the remainder. We add 1, and we get
## num=113. Note: num will always be between 1 and 200.

arr[$i]=$num
## We save this value as arr[15], the 16th element of our array. At the end our array will have
## lim elements, so 60 elements in this example, indexed from arr[0] to arr[59].

for ((j=0;j<i;j++)); do
## This is our control loop.
## Now we want to check if the 16th element we've just found is a repetition. We 
## already know that our first 15 elements are all different, because we've already run
## this same test for each of the past 15 elements.

(( ${arr[$j]} == $num )) && {
## So we compare arr[15] with arr[0], then with arr[1], then..., then with arr[14].
## If at any point we find it is indeed a repetition (that is: if we find that our
## 16th value is equal to one of the previous 15 values), we

let i--
## decrease the i value by 1, so now i=14 and

break; }
## we immediately exit from our control loop: no need to waste time. So now we're
## back in our main loop, where i is increased by one: it gets back to 15 again,
## and again we try to find a new 16th element.

done
## If instead we complete all 15 steps of our control loop without a break, we know that
## arr[15] is not a repetition, and we go back to our main loop. As we said, the value of i
## is incremented by 1, so now it is 16. And we go on to generate the 17th element.

done

## After we've found 60 different elements, we're done
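
To watch the retry step in action, here is a small sketch that replaces $RANDOM with a fixed list of draws, so the run is repeatable. The draws array, the lim=3 value and the trace_unique name are made up for the illustration. The third draw repeats the first one, so slot 2 gets filled twice.

```shell
#!/bin/bash
# trace_unique: same loop as above, but with fixed "random" draws and a
# message whenever a repetition is rejected. (Hypothetical helper name.)
trace_unique() {
    local draws=(113 42 113 7)   # stand-ins for $RANDOM % range + 1
    local lim=3 next=0 i j num
    local arr=()
    for ((i=0; i<lim; i++)); do
        num=${draws[next++]}     # take the next draw
        arr[i]=$num              # tentatively store it in slot i
        for ((j=0; j<i; j++)); do
            if (( arr[j] == num )); then
                echo "draw $num repeats arr[$j], retrying slot $i"
                (( i-- ))        # undo the coming i++, so slot i is redone
                break
            fi
        done
    done
    echo "${arr[@]}"
}

trace_unique
```

This prints:

```
draw 113 repeats arr[0], retrying slot 2
113 42 7
```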

Feel free to ask again if my explanation wasn't clear.