Get 20% of lines in File randomly


 
# 1  
Old 09-28-2012
Get 20% of lines in File randomly

Hello,

This is my code:
Code:
nb_lignes=$(wc -l < "$1")
for i in $(seq "$nb_lignes")
do
m=$(head -n "$i" "$1" | tail -n 1)
# command
done

How can I change it to pick 20% of the lines in the file at random and apply "command" to each of them? The percentage (20%, 40%, 60%, ...) should be a parameter.

Thank you.

Last edited by Franklin52; 09-28-2012 at 08:07 AM.. Reason: Please use code tags for data and code samples
# 2  
Old 09-28-2012
Code:
[root@host dir]# head -$(( $(wc -l file | cut -d" " -f1) * 40 / 100 )) file
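Note that head returns the first 40% of the file, not a random 40%. A minimal sketch of a random variant follows, assuming GNU coreutils shuf is installed; the function name sample_percent is hypothetical and used only for illustration.

```shell
#!/bin/bash
# sample_percent FILE PERCENT
# Print PERCENT% of the lines of FILE, chosen at random without repetition.
# Assumes GNU coreutils shuf; sample_percent is an illustrative name.
sample_percent() {
    local file=$1 percent=$2 total count
    total=$(wc -l < "$file")
    count=$(( total * percent / 100 ))   # integer division rounds down
    shuf -n "$count" -- "$file"
}
```

It could be used as: sample_percent file 40 | while IFS= read -r line; do command "$line"; done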

# 3  
Old 09-28-2012
If I'm right your problem is to randomly generate N integers in the range [1;M], where M is the number of lines in the file, and N is a rounded percentage of M.

Before going on: are repetitions allowed? In other words, may a number (a line in the file) be randomly selected more than once?

And: what are your shell and OS?

EDIT: solution proposed

In any case, here is a possible solution for Linux/bash, with repetitions not allowed:
Code:
#!/bin/bash
range=$(wc -l "$1" | cut -d " " -f1)
(( $range > 32768 )) && exit              ### max number of lines for this script: 32768
percent=$2                                ### set this  as you like, [1-100]
(( 0 < $2 )) && (( $2 < 101 )) || exit
lim=$(( $range * $percent / 100 ))
for ((i=0;i<lim;i++)); do
        num=$(( $RANDOM % $range + 1 ));
        arr[$i]=$num;
        for ((j=0;j<i;j++)); do
                (( ${arr[$j]} == $num )) && {
                        let i--
                        break; }
        done
done

for linenum in "${arr[@]}"; do
        line=$(sed -n "$linenum p" "$1")
        ### your stuff here, for example: ###
        echo "$linenum"$'\t'"### $line"
done

exit 0

Usage:
Code:
lem@biggy:/tmp$ ./file2 file2 30
6	### lim=$(( $range * $percent / 100 ))
19	###         ### your stuff here, for example: ###
11	###                 (( ${arr[$j]} == $num )) && {
10	###         for ((j=0;j<i;j++)); do
5	### (( 0 < $2 )) && (( $2 < 101 )) || exit
4	### percent=$2                                ### set this  as you like, [1-100]
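On larger files the duplicate check in the script above gets slow, because every new number is compared against all the previous ones. One possible alternative is a partial Fisher-Yates shuffle, sketched below. The function name pick_lines is hypothetical, and combining two $RANDOM draws into a 30-bit value is an assumption made here to go beyond 32768 lines; the small bias from the modulo is ignored.

```shell
#!/bin/bash
# pick_lines RANGE COUNT
# Print COUNT distinct line numbers in [1, RANGE], one per line.
# pick_lines is an illustrative name, not part of the original script.
pick_lines() {
    local range=$1 count=$2 i j tmp
    local -a pool
    # pool holds every candidate line number: 1, 2, ..., range
    for ((i = 0; i < range; i++)); do pool[i]=$((i + 1)); done
    for ((i = 0; i < count && i < range; i++)); do
        # choose a random position j in [i, range-1] from a 30-bit value
        # built out of two 15-bit $RANDOM draws
        j=$(( i + ((RANDOM << 15) | RANDOM) % (range - i) ))
        # swap it into position i; positions 0..i are now the sample so far
        tmp=${pool[i]}; pool[i]=${pool[j]}; pool[j]=$tmp
        echo "${pool[i]}"
    done
}
```

For example, pick_lines 200 60 prints 60 different line numbers between 1 and 200, and each one can then be fed to sed -n "${linenum}p" as in the script above.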

--
Bye

Last edited by Lem; 09-28-2012 at 11:36 AM.. Reason: Solution proposed
# 4  
Old 09-28-2012
Here is a solution using awk:
Code:
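~/unix.com$ awk 'BEGIN{srand()} int(100*rand())%5<1' file

5 and 1 are the parameters to modify here: 1/5 = 20% in this example. The srand() call seeds the generator; without it, most awk implementations select the same lines on every run.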

To be more specific in your requirements:
Code:
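~/unix.com$ awk 'BEGIN{srand()} int(100*rand())<value' value=20 file | while IFS= read -r line; do echo "command $line"; done

Set value=... to get approximately ...% of the lines. Each line is kept independently with that probability, so the exact count varies from run to run.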
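If an exact number of lines is needed rather than an approximate percentage, one option is selection sampling: each line is kept with probability (lines still wanted) / (lines still to read). The sketch below is only an illustration of that idea; the function name sample_exact and the variables n and k are hypothetical.

```shell
#!/bin/bash
# sample_exact FILE PERCENT
# Print exactly floor(total * PERCENT / 100) lines of FILE, in file order.
# sample_exact, n and k are illustrative names.
sample_exact() {
    local file=$1 percent=$2 total want
    total=$(wc -l < "$file")
    want=$(( total * percent / 100 ))
    awk -v n="$total" -v k="$want" '
        BEGIN { srand() }
        # n - NR + 1 lines are left to read, k lines are still wanted:
        # keep the current line with probability k / (n - NR + 1)
        k > 0 && rand() * (n - NR + 1) < k { print; k-- }
    ' "$file"
}
```

Usage: sample_exact file 20 | while IFS= read -r line; do echo "command $line"; done. Since srand() seeds from the current time in seconds, two runs within the same second can return the same sample.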

Last edited by tukuyomi; 09-28-2012 at 12:04 PM..
# 5  
Old 10-01-2012
@Lem: thank you so much for your help. I tested your solution and it works.
# 6  
Old 10-08-2012
@Lem: could you please explain this loop with an example?

Code:
for ((i=0;i<lim;i++)); do
        num=$(( $RANDOM % $range + 1 ));
        arr[$i]=$num;
        for ((j=0;j<i;j++)); do
                (( ${arr[$j]} == $num )) && {
                        let i--
                        break; }
        done
done

I'm sorry to bother you. Thank you so much.

Last edited by Scott; 10-20-2012 at 08:21 AM.. Reason: Code tags, not quote tags for code
# 7  
Old 10-08-2012
Code:
## Let's say that:
## the number of lines in your file is 200 (range=200);
## you want 30% of the lines in your file: percent=30.
## So we'll have that lim= 200 * 30 / 100 = 60.

for ((i=0;i<lim;i++)); do
## We start the first pass of our loop with i=0. We check that 0<60 (that is, i<lim). It is,
## so this time we run the loop body.
## At the end of each pass the value of i is increased by one: this is the meaning of i++.
## Let's say that now i=15.

num=$(( $RANDOM % $range + 1 ));
## $RANDOM looks like a parameter, but you'd better think of it as a special function.
## Every time you call it, you get a pseudo-random number between 0 and 32767.
## Let's say this time we get 8512. We calculate the remainder of the division
## 8512/200. So: 8512 = 200*42 + 112. 112 is the remainder. We add 1, and we get
## num=113. Note: num will always be between 1 and 200.

arr[$i]=$num;
## We save this value as arr[15], the 16th element of our array.  At the end our array will have
## lim elements, so 60 elements in this example, indexed from arr[0] to arr[59].

for ((j=0;j<i;j++)); do
## This is our control loop.
## Now we want to check if the 16th element we've just found is a repetition. We 
## already know that our first 15 elements are all different, because we've already run
## this same test for each of the past 15 elements.

(( ${arr[$j]} == $num )) && {
## So we compare arr[15] with arr[0], then with arr[1], then..., then with arr[14].
## If at any point we find it is indeed a repetition (that is: if we find that our
## 16th value is equal to one of the previous 15 values), we

let i--
## decrease the i value by 1, so now i=14 and

break; }
## we immediately exit from our control loop: no need to waste time. So now we're
## back to our main loop, where i is increased by one: it gets back to 15 again,
## and again we try to find a new 16th element.

done
## If instead we complete all 15 steps of our control loop, without a break, we know that
## arr[15] is not a repetition, and we go back to our main loop. As we said i value
## is incremented by 1, and  so now it is 16. And we go forth for the 17th element generation.

done
## After we've found 60 different elements, we're done
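The arithmetic in the walk-through can be checked directly in bash. This is a small sketch using the same figures (range=200, percent=30, and 8512 standing in for one $RANDOM draw); the variable name sample is hypothetical.

```shell
#!/bin/bash
range=200
percent=30

# Number of lines to select: 200 * 30 / 100
lim=$(( range * percent / 100 ))
echo "lim=$lim"      # prints lim=60

# sample stands in for a single $RANDOM draw
sample=8512
# remainder of 8512 / 200 is 112; adding 1 maps it into [1, 200]
num=$(( sample % range + 1 ))
echo "num=$num"      # prints num=113
```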

Feel free to ask again if my explanation wasn't clear.
--
Bye