How to extract subset file from dataset?

09-04-2013

Registered User

23,310, 4,623

Join Date: Aug 2005

Last Activity: 7 July 2020, 11:47 AM EDT

Location: Saskatchewan

Posts: 23,310

Thanks Given: 1,331

Thanked 4,623 Times in 4,217 Posts

The output file was not included in my instructions because it would be empty; my code doesn't use it.

Check for the files 'M' and 'F' in the same directory; they will not be empty.

Corona688

09-04-2013

Registered User

44, 0

Join Date: Sep 2013

Last Activity: 16 November 2016, 12:10 PM EST

Posts: 44

Thanks Given: 12

Thanked 0 Times in 0 Posts

Quote:

Originally Posted by Corona688

The output file was not included in my instructions, for the reason that it would be empty. It doesn't use it.

Check for the files 'M' and 'F' in the same directory, they will not be empty.

When I ran the program I got the M and F files, but there is just one line.
My data set has more lines than the example: 2600 lines, each containing a gender value of M or F. What I want is to split the data set into two files, one with the gender M lines and one with the gender F lines.

sajmar

09-04-2013

That is what my example does, yes. It writes to different file names depending on what the value of the fourth column is.

If the fourth column isn't what you showed in your example data, it won't do what I expect. Check the contents of your folder with 'ls'; it may have created files with weird names.

Could you show a more complete example of your input data please?
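One quick way to see which file names a column-based split would produce is to count the distinct values in that column first. The following is only a minimal sketch, assuming whitespace-separated columns and a header row; the file sample.txt and its contents are made up for illustration and are not the poster's data.

```shell
# Hypothetical sample data: a header row plus three records.
workdir=$(mktemp -d)
printf '%s\n' \
  'id name age gender' \
  '1 ann 34 F' \
  '2 bob 41 M' \
  '3 cat 29 F' > "$workdir/sample.txt"

# Count the distinct values in column 4, skipping the header (NR > 1).
# Each distinct value listed here would become one output file name.
awk 'NR > 1 { count[$4]++ } END { for (v in count) print v, count[v] }' \
  "$workdir/sample.txt" | sort > "$workdir/values.txt"

cat "$workdir/values.txt"
```

If the list shows anything other than M and F, the split is keyed on the wrong column or the data has unexpected values in it.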

Corona688

09-04-2013

Attached is my data set, which I want to split by gender (M and F) into two separate files.

aa.txt (724.2 KB)

sajmar

09-04-2013

The data you posted clearly shows M/F in the fifth column, not the fourth.

Also, the data you posted has no header row, which your original data did. I can simplify my code a lot knowing it's not there.

Code:

awk '{ print > $5 }' inputfile
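For anyone wanting to try the idea on a small file first, here is a hedged sketch of the same column-5 split, restricted to the two expected values so that a stray value cannot create an oddly named file. The sample rows and the temporary directory are assumptions made for illustration only.

```shell
# Hypothetical sample data: gender in column 5, no header row.
workdir=$(mktemp -d)
printf '%s\n' \
  '1 ann smith 34 F' \
  '2 bob jones 41 M' \
  '3 cat brown 29 F' \
  '4 dan white 52 M' > "$workdir/sample.txt"

# Run in a subshell inside the temp directory so the output
# files M and F are created there and not in the current directory.
(
  cd "$workdir" || exit 1
  # Only lines whose fifth field is exactly M or F are written.
  # The parentheses around $5 keep the redirection target
  # unambiguous across awk implementations.
  awk '$5 == "M" || $5 == "F" { print > ($5) }' sample.txt
  wc -l M F
)
```

Lines with any other value in column 5 are silently skipped here; drop the condition to get the unrestricted behaviour of the one-liner above.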

Corona688

09-08-2013

Registered User

43, 2

Join Date: Jun 2008

Last Activity: 20 January 2017, 4:45 PM EST

Posts: 43

Thanks Given: 2

Thanked 2 Times in 1 Post

This is really bad, but it seems to work.
It assumes that M or F appears only once on each line
and is separated by white space.

Code:

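# Read the input line by line; IFS= and -r keep each line unmodified.
while IFS= read -r line
do
    # Append lines containing M to the file M.
    if [[ $line == *M* ]]; then
        echo "$line" >> M
    fi
    # Append lines containing F to the file F.
    if [[ $line == *F* ]]; then
        echo "$line" >> F
    fi
done < file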

briandanielz

09-10-2013

Registered User

45, 0

Join Date: Feb 2009

Last Activity: 19 June 2014, 2:08 PM EDT

Posts: 45

Thanks Given: 0

Thanked 0 Times in 0 Posts

The solution works

Code:

grep M aa.txt > M
grep F aa.txt > F

This will get you what you need
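One caveat worth checking: a plain grep matches the letter anywhere on the line, so a name such as "Mary" would put a female record into the M file. A minimal sketch of a field-exact alternative, assuming the gender sits in column 5 as noted earlier in the thread; the sample rows below are invented for illustration.

```shell
# Hypothetical sample data: note "Mary" (gender F) and "Fox" (gender M).
workdir=$(mktemp -d)
printf '%s\n' \
  '1 Mary Smith 34 F' \
  '2 Bob Fox 41 M' \
  '3 Cat Brown 29 F' > "$workdir/sample.txt"

# Substring match: counts the "Mary" line as well as the real M line.
grep -c M "$workdir/sample.txt"

# Exact comparison on column 5 only.
awk '$5 == "M"' "$workdir/sample.txt" > "$workdir/M"
awk '$5 == "F"' "$workdir/sample.txt" > "$workdir/F"

cat "$workdir/M"
```

If the data never contains M or F outside the gender column, the two grep commands give the same result as the awk comparisons.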

w020637
