How to extract subset file from dataset? Post: 302850249

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Total file size of a subset list

Hello! I'm trying to find out the total file size of a subset list in a directory. For example, I do not need to know the total file size of all the files in a directory, but I need to know what the total size is of say, "ls -l *FEB08*" in a directory. Is there any easy way of doing this? ...

2. Shell Programming and Scripting

How to extract a subset from a huge dataset

Hi all, I have a huge file (450 GB). Its tab-delimited format is as below:

x1 A 50020 1
x1 B 50021 8
x1 C 50022 9
x1 A 50023 10
x2 D 50024 5
x2 C 50025 7
x2 F 50026 8
x2 N 50027 1
:
:

Now, I want to extract a subset from this file. In this subset, column 1 is x10, column 2 is...

3. Shell Programming and Scripting

Count the number of words in some subset of file and disregard others

Hi All, I have some 6000 text files in a directory. My files are named 1.txt, 2.txt, 3.txt and so on up to 6000.txt. I want to count the number of words in only the first 3000 of them. Any suggestions? I know wc -w can count the number of words in a text file. I am using Red Hat Linux.

4. Solaris

flarecreate for zfs root dataset and ignore multiple dataset

Hi All, I want to write a script to create flar images on multiple servers. On non-ZFS filesystems I use the -X option to point to a file listing the mounts to exclude on the different servers, but on ZFS the -X option does not work. I want multiple mounts to be ignored on ZFS-based systems during flarecreate. I...

5. Shell Programming and Scripting

How to remove a subset of data from a large dataset based on values on one line

Hello. I was wondering if anyone could help. I have a file containing a large table in the format:

marker1 marker2 marker3 marker4
position1 position2 position3 position4
genotype1 genotype2 genotype3 genotype4

with marker being a name, position a numeric...

6. UNIX for Dummies Questions & Answers

how to get a subset of such a file

Dear all, I have a file like the one below: number of rows = 420, number of letters in each row = 100000. There are no spaces between the letters. What I want is the 75000th to the 85000th letter in each row. How do I do that? Thanks a lot! ...

7. UNIX for Dummies Questions & Answers

Swapping the columns of a text file for a subset of rows

Hi, I'd like to swap the columns 1 and 2 of a space-delimited text file but only for the first 1000 rows. How do I go about doing that? Thanks!

8. UNIX for Dummies Questions & Answers

Random selection of subset of sample from file

Hello. Could you please help me find code that can randomly select 1224 lines from a file of 12240 and make ten outputs with 1224 lines each? My input is a txt file with 12240 lines like:

13474 999003507 0 0 2 -9
13475 999003508 0 0 2 -9
13476 999003509 0 0 1 -9
13477 999003510 0 0 1 -9
...

9. Shell Programming and Scripting

Creating subset of a file based on specific columns

Hello Unix experts, I need help creating a subset file. I know that with the cut command it's very easy to select many different columns, or threshold. But here I have a bit of a problem, as my data file is big and I don't want to identify the column numbers or names manually. I am trying to find any...

10. Shell Programming and Scripting

awk to filter file using another working on smaller subset

In the below awk if I use the attached file as the input, I get no results for TCF4. However, if I just copy that line from the attached file and use that as input I get results for TCF4. Basically the gene file is a 1 column list that is used to filter $8 of the attached file. When there is a...
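Two of the questions above (discussions 6 and 7) are specified fully enough to sketch. The following Python is only an illustration, not taken from any of the threads: the function names `slice_rows` and `swap_first_two_columns` are made up here, and the input is assumed to be an iterable of text lines such as an open file.

```python
def slice_rows(lines, start, end):
    """Yield characters start..end (1-based, inclusive) of each line."""
    for line in lines:
        row = line.rstrip("\n")
        # Python slices are 0-based and exclude the end index.
        yield row[start - 1:end]


def swap_first_two_columns(lines, limit):
    """Swap columns 1 and 2 in the first `limit` rows; pass later rows through.

    Assumes columns are separated by single spaces.
    """
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        fields = row.split(" ")
        if number <= limit and len(fields) >= 2:
            fields[0], fields[1] = fields[1], fields[0]
            row = " ".join(fields)
        yield row
```

For the cases described, one would call `slice_rows(f, 75000, 85000)` or `swap_first_two_columns(f, 1000)` on an open file `f` and write each yielded row to the output. Both functions read one line at a time, so they should also cope with large inputs.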

LEARN ABOUT DEBIAN

svm-subset

svm-subset(1)							   User Manuals 						     svm-subset(1)

NAME

       svm-subset - a subset selection tool for LIBSVM

SYNOPSIS

       svm-subset [ -s method ] dataset number [ output1 ] [ output2 ]

DESCRIPTION

       Training on large data sets is time-consuming, so it is often useful to work on a smaller subset first. The Python script subset.py randomly
       selects a specified number of samples. For classification data, a stratified selection is provided to keep the class distribution of the
       subset the same as in the full data set.

OPTIONS

       -s method

       0      -- stratified selection (classification only) (default)

       1      -- random selection

       output1
	      The subset. If output1 is omitted, the subset will be printed on the screen.

       output2
	      The rest of data.

FILES

       See svm-train(1) for the format of the dataset.

EXAMPLES

	      svm-subset heart_scale 100 file1 file2

       Selects 100 samples from heart_scale and stores them in file1; all remaining instances are stored in file2. Because -s is not given, the
       default stratified selection is used.
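To make the idea of stratified selection concrete, here is a minimal Python sketch. It is not the code of subset.py; the function name `stratified_subset` and the `seed` parameter are invented for this illustration. It assumes every line is non-empty and begins with its class label, as in LIBSVM data files, and it handles rounding in the simplest way, by topping up with random leftover lines.

```python
import random
from collections import defaultdict


def stratified_subset(lines, number, seed=None):
    """Split `lines` into (subset, rest), keeping class proportions roughly equal.

    Assumes each line is non-empty and starts with its class label.
    """
    total = len(lines)
    if number > total:
        raise ValueError("cannot select more samples than the data set holds")

    rng = random.Random(seed)

    # Group line indices by class label (the first whitespace-separated token).
    by_label = defaultdict(list)
    for index, line in enumerate(lines):
        by_label[line.split(None, 1)[0]].append(index)

    chosen = set()
    for indices in by_label.values():
        # Proportional share for this class, rounded down.
        quota = len(indices) * number // total
        chosen.update(rng.sample(indices, quota))

    # Rounding down can leave a shortfall; fill it with random leftover lines.
    leftover = [i for i in range(total) if i not in chosen]
    chosen.update(rng.sample(leftover, number - len(chosen)))

    # Preserve the original line order in both outputs.
    subset = [lines[i] for i in range(total) if i in chosen]
    rest = [lines[i] for i in range(total) if i not in chosen]
    return subset, rest
```

With a list of lines read from a data file, `stratified_subset(lines, 100)` would return the two lists that correspond to output1 and output2. Plain random selection (method 1) would amount to sampling `number` indices from the whole file without grouping by label.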

BUGS

       Please report bugs to the Debian BTS.

AUTHOR

       Chih-Chung Chang, Chih-Jen Lin <cjlin@csie.ntu.edu.tw>, Chen-Tse Tsai <ctse.tsai@gmail.com> (packaging)

SEE ALSO

       svm-train(1), svm-predict(1)

Linux								     DEC 2009							     svm-subset(1)

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Total file size of a subset list

Discussion started by: tekster757

2. Shell Programming and Scripting

How to extract a subset from a huge dataset

Discussion started by: cliffyiu

3. Shell Programming and Scripting

Count the number of words in some subset of file and disregard others

Discussion started by: shoaibjameel123

4. Solaris

flarecreate for zfs root dataset and ignore multiple dataset

Discussion started by: uxravi