Random selection of subset of sample from file Post: 302737549

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Total file size of a subset list

Hello! I'm trying to find the total file size of a subset of the files in a directory. I do not need the total size of all the files in the directory, only the total size of the files matched by, say, "ls -l *FEB08*". Is there an easy way of doing this? ...

2. Shell Programming and Scripting

Random lines selection from a file.

>cat data.dat
0001 Robbert
0002 Nick
0003 Mark
.......
1000 Jarek

3. Shell Programming and Scripting

Count the number of words in some subset of file and disregard others

Hi All, I have some 6000 text files in a directory. My files are named 1.txt, 2.txt, 3.txt and so on up to 6000.txt. I want to count the number of words in only the first 3000 of them. Any suggestions? I know wc -w can count the number of words in a text file. I am using Red Hat Linux.

4. Shell Programming and Scripting

Random File Selection and Moving

OK, I am stumped. I have a shell script that I want to randomly select a file with the extension .sct, then use a portion of its file name to select the six related .mot files, and then move them all to another folder. I also need a user input form for the number of .SCT files to randomly select...

5. UNIX for Dummies Questions & Answers

how to get a subset of such a file

Dear all, I have a file with 420 rows and 100000 letters in each row. There are no spaces between the letters. What I want is the 75000th through the 85000th letter of each row. How do I do that? Thanks a lot! ...

6. UNIX for Dummies Questions & Answers

Swapping the columns of a text file for a subset of rows

Hi, I'd like to swap the columns 1 and 2 of a space-delimited text file but only for the first 1000 rows. How do I go about doing that? Thanks!

7. Shell Programming and Scripting

Creating subset of a file based on specific columns

Hello Unix experts, I need help creating a subset file. I know that with the cut command it's very easy to select many different columns, or a threshold. But here I have a bit of a problem, as my data file is big, and I don't want to identify the column numbers or names manually. I am trying to find any...

8. Shell Programming and Scripting

Need to generate a file with random data. /dev/[u]random doesn't exist.

Need to use dd to generate a large file from a sample file of random data. This is because I don't have /dev/urandom. I create a named pipe, then run: dd if=mynamed.fifo of=myfile.fifo bs=1024 count=1024 but when I cat a file to the fifo that's 1024 random bytes: cat randomfile.txt >...

9. UNIX for Advanced & Expert Users

How to extract subset file from dataset?

Hello, I have a data set which looks like this:

progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 ...

10. Shell Programming and Scripting

awk to filter file using another working on smaller subset

In the below awk if I use the attached file as the input, I get no results for TCF4. However, if I just copy that line from the attached file and use that as input I get results for TCF4. Basically the gene file is a 1 column list that is used to filter $8 of the attached file. When there is a...

svm-subset

svm-subset(1)							   User Manuals 						     svm-subset(1)

NAME

       svm-subset - a subset selection tool for LIBSVM

SYNOPSIS

       svm-subset [ -s method ] dataset number [ output1 ] [ output2 ]

DESCRIPTION

       Training on a large data set is time-consuming, so it is often worth working on a smaller subset first. The Python script subset.py randomly
       selects a specified number of samples. For classification data, it offers stratified selection, which keeps the class distribution of the
       subset the same as that of the full data set.
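
The difference between the two selection methods can be illustrated with a minimal Python sketch. This is not the code of subset.py; it only assumes that each line of the dataset starts with a class label, as in the LIBSVM format. The function names random_subset and stratified_subset, and the largest-remainder rounding used to split the requested number across classes, are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def random_subset(lines, number, rng):
    """Pick `number` line indices uniformly at random, ignoring labels."""
    return sorted(rng.sample(range(len(lines)), number))


def stratified_subset(lines, number, rng):
    """Pick `number` line indices so each class keeps roughly its share.

    Assumes the first whitespace-separated field of a line is its label.
    """
    by_label = defaultdict(list)
    for i, line in enumerate(lines):
        by_label[line.split(None, 1)[0]].append(i)

    total = len(lines)
    # Ideal (fractional) share of each class in the subset.
    quotas = {lab: number * len(idx) / total for lab, idx in by_label.items()}
    counts = {lab: int(q) for lab, q in quotas.items()}

    # Hand out the samples lost to rounding, largest remainder first
    # (an illustrative choice, not necessarily what subset.py does).
    leftover = number - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda lab: quotas[lab] - counts[lab],
                          reverse=True)
    for lab in by_remainder[:leftover]:
        counts[lab] += 1

    chosen = []
    for lab, idx in by_label.items():
        chosen.extend(rng.sample(idx, counts[lab]))
    return sorted(chosen)
```

With a dataset of 60 lines labelled +1 and 40 lines labelled -1, asking stratified_subset for 10 samples always yields 6 of the first class and 4 of the second, whereas random_subset only matches that ratio on average.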

OPTIONS

       -s method

       0      -- stratified selection (classification only) (default)

       1      -- random selection

       output1
	      The subset. If output1 is omitted, the subset will be printed on the screen.

       output2
	      The rest of data.

FILES

       See svm-train(1) for the format of the dataset.

EXAMPLES

	      svm-subset heart_scale 100 file1 file2

       100 samples are randomly selected from heart_scale and stored in file1. All remaining instances are stored in file2.
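
What this example produces, a subset file plus a file holding the rest, can be sketched in a few lines of Python. This is a rough illustration using plain random selection (the behaviour described for -s 1), not the actual implementation; the function name split_dataset and its seed parameter are hypothetical.

```python
import random


def split_dataset(dataset_path, number, subset_path, rest_path, seed=None):
    """Write `number` randomly chosen lines to subset_path, the others to rest_path.

    Lines keep their original relative order in both output files.
    """
    rng = random.Random(seed)
    with open(dataset_path) as f:
        lines = f.readlines()

    # Indices of the lines that go into the subset.
    picked = set(rng.sample(range(len(lines)), number))

    with open(subset_path, "w") as subset, open(rest_path, "w") as rest:
        for i, line in enumerate(lines):
            (subset if i in picked else rest).write(line)
```

Calling split_dataset("heart_scale", 100, "file1", "file2") would mirror the command above, assuming heart_scale exists in the current directory and has at least 100 lines.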

BUGS

       Please report bugs to the Debian BTS.

AUTHOR

       Chih-Chung Chang, Chih-Jen Lin <cjlin@csie.ntu.edu.tw>, Chen-Tse Tsai <ctse.tsai@gmail.com> (packaging)

SEE ALSO

       svm-train(1), svm-predict(1)

Linux								     DEC 2009							     svm-subset(1)
