How to extract subset file from dataset?


 
# 1  
Old 09-04-2013

Hello,
I have a data set that looks like this:

Code:
progeny   sire   dam   gender
12        1      3     M
13        2      4     F
14        2      5     F
15        6      5     M

I need to split this data by gender (M and F) into two separate files.
I want something like this:
File 1 output:
Code:
progeny   sire   dam   gender
13        2      4     F
14        2      5     F

File 2 output:
Code:
progeny   sire   dam   gender
12        1      3     M
15        6      5     M

Thanks

Moderator's Comments:
Use code tags!
# 2  
Old 09-04-2013
Code:
awk 'NR==1 { print > "M" ; print > "F"; next }
{ print > $4 }' inputfile
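
For readers following along, here is a minimal sketch of what this one-liner does, run against the sample data from the first post. The scratch directory and the `workdir` variable are assumptions added for the demonstration, and the parentheses around `$4` are only a defensive habit for awk implementations that are strict about unparenthesized redirection targets. The header line (`NR==1`) is written to both output files, and every later line is written to the file named by its fourth column, so the results end up in files called `M` and `F` in the current directory rather than on standard output.

```shell
# Work in a scratch directory so the output files M and F do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from the first post, saved under the name used in the reply.
cat > inputfile <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

# Header goes to both files; each data row goes to the file named by column 4.
awk 'NR==1 { print > "M"; print > "F"; next }
     { print > ($4) }' inputfile

# Show the two result files.
echo "== F =="; cat F
echo "== M =="; cat M
```

After this runs, `F` should hold the header plus the rows for progeny 13 and 14, and `M` the header plus the rows for progeny 12 and 15.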

# 3  
Old 09-04-2013
Quote:
Originally Posted by Corona688
Code:
awk 'NR==1 { print > "M" ; print > "F"; next }
{ print > $4 }' inputfile

@Corona
Thanks for your suggestion. However, this command does not solve my problem.
# 4  
Old 09-04-2013
In what way did it not solve your problem? Be specific or I won't know what problem to fix.
# 5  
Old 09-04-2013
Quote:
Originally Posted by Corona688
In what way did it not solve your problem? Be specific or I won't know what problem to fix.
@Corona
To clarify my problem, I have this data set:
Code:
progeny            sire          dam        gender 
12                             1                  3                     M 
13                             2                  4      F 
14                             2                   5                    F 
15                             6      5                   M

I want a subset selected by gender, which would look like this:
Code:
progeny            sire          dam        gender 
13                           2                   4                      F 
14                           2                    5                     F


Last edited by Scrutinizer; 09-08-2013 at 07:30 AM.. Reason: code tags
# 6  
Old 09-04-2013
That is what my suggestion does, yes.

In what way does it not work for you? Be specific. What exactly did you do, and what precisely happened?
# 7  
Old 09-04-2013
Quote:
Originally Posted by Corona688
That is what my suggestion does, yes.

In what way does it not work for you? Be specific. What exactly did you do, and what precisely happened?
@Corona:
When I run the command, it gives me an empty file.
Code:
awk 'NR==1 { print > "M" ; print > "F"; next }{ print > $4 }' aa > bb


Last edited by Scrutinizer; 09-08-2013 at 07:31 AM.. Reason: code tags
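
A likely explanation, assuming the command was run exactly as quoted: every `print` in that script is redirected to a file named `M` or `F`, so nothing reaches standard output and `bb` is created empty. The split data should be in the files `M` and `F` in the directory where the command was run. The sketch below reproduces this and then shows one possible alternative for when only the F rows are wanted in a single file chosen on the command line; the names `workdir` and `bb_female` are hypothetical.

```shell
# Scratch directory so the demonstration files do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data, saved under the file name used in the post above.
cat > aa <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

# The command as posted: awk writes to the files M and F itself,
# so the shell redirection "> bb" captures nothing.
awk 'NR==1 { print > "M" ; print > "F"; next }{ print > $4 }' aa > bb
ls -l bb M F

# Alternative: print the header and the F rows to standard output,
# and let the shell redirection choose the output file.
awk 'NR==1 || $4 == "F"' aa > bb_female
cat bb_female
```

Either approach gives the F subset: the first leaves it in `F`, the second in whatever file the shell redirection names.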