subsetting data


 
# 1  
Old 04-02-2010
Hi,
Can you please show me how to subset data from a file?

file1 looks like this:
>chr1 strand:+ excise_beg:554293 excise_end:554402
TAATATATTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCT
>chr2 strand:+ excise_beg:554542 excise_end:554651
ACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAA
>chr3 strand:+ excise_beg:554497 excise_end:554606
GTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATA
>chr4 strand:+ excise_beg:554654 excise_end:554763
CCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATA

file2 looks like this:
chr1
chr3

I would like to use the information in file2 to subset data from file1 and get file3.

file3 should be:
>chr1 strand:+ excise_beg:554293 excise_end:554402
TAATATATTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCT
>chr3 strand:+ excise_beg:554497 excise_end:554606
GTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATA

Thank you,

Joseph
# 2  
Old 04-02-2010
Code:
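awk '{ # header line in file1: remember the record under its ID (the text after ">")
       if(FILENAME=="file1" && index($1,">")==1 ){ key=substr($1,2); arr[key]=$0; next}
       # sequence line in file1: append it to the current record on its own line
       if(FILENAME=="file1" && index($1,">")!=1 ){ arr[key]=arr[key] "\n" $0; next }
       # ID from file2: print the stored record, if there is one
       if(FILENAME=="file2" && ($1 in arr))  {print arr[$1]; next }
     }' file1 file2  > newfile2

Try something like this.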
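
A shorter variant of the same idea, offered as a sketch rather than a drop-in replacement: read the IDs from file2 first, then stream file1 and print every record whose header ID is in the list. The array name `want` and the flag `keep` are made up for this example, and the sample files are recreated here only so the snippet runs on its own.

```shell
# Recreate the sample input from the first post.
cat > file1 <<'EOF'
>chr1 strand:+ excise_beg:554293 excise_end:554402
TAATATATTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCT
>chr2 strand:+ excise_beg:554542 excise_end:554651
ACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAA
>chr3 strand:+ excise_beg:554497 excise_end:554606
GTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATA
>chr4 strand:+ excise_beg:554654 excise_end:554763
CCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATA
EOF
cat > file2 <<'EOF'
chr1
chr3
EOF

# First file (file2): NR==FNR holds, so store each ID as an array key.
# Second file (file1): on a ">" header, set keep to 1 if its ID was stored,
# otherwise 0; the bare pattern "keep" prints the line while the flag is set.
awk 'NR==FNR { want[$1]; next }
     /^>/    { keep = (substr($1, 2) in want) }
     keep' file2 file1 > file3
```

Records come out in file1 order, and a sequence that spans several lines stays intact. One caveat: if file2 is empty, NR==FNR is also true for file1, so nothing is printed.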

Last edited by jim mcnamara; 04-02-2010 at 09:39 PM..
# 3  
Old 04-02-2010
Thank you for your fast response.
Can you please check the error I got?


jdhahbi$ awk '{ if(FILENAME=="file1" && index($1,">")==1 ){ key=substr($1,2); arr[key]=$0; next}
> if(FILENAME=="file1" && index($1,">")!=1 { arr[key]=arr[key] " " $0) }
> if(FILENAME=="file2") {print arr[$1]; next }
> }' file1 file2 > newfile2
awk: syntax error at source line 2
context is
if(FILENAME=="file1" && index($1,">")!=1 >>> { <<<
awk: illegal statement at source line 2
awk: illegal statement at source line 2
# 4  
Old 04-02-2010
while read -r r1
do
grep "$r1" file1
done < file2
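
Note that a plain grep on the ID selects only the matching header lines, and an ID such as chr1 would also match chr10. A possible variant is sketched below. It assumes a grep that supports the `-A` option (GNU and BSD grep do; it is not in POSIX) and that every record is exactly one header line followed by one sequence line, as in the sample. The function name `subset_records` is made up for this example.

```shell
# Recreate the sample input from the first post.
cat > file1 <<'EOF'
>chr1 strand:+ excise_beg:554293 excise_end:554402
TAATATATTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCT
>chr2 strand:+ excise_beg:554542 excise_end:554651
ACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAA
>chr3 strand:+ excise_beg:554497 excise_end:554606
GTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATA
>chr4 strand:+ excise_beg:554654 excise_end:554763
CCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATA
EOF
cat > file2 <<'EOF'
chr1
chr3
EOF

# subset_records DATA_FILE ID_FILE (hypothetical helper)
# For each ID, match only a header that starts with ">ID " and
# print it together with the one line that follows it (-A 1).
subset_records() {
    while read -r id
    do
        grep -A 1 "^>$id " "$1"
    done < "$2"
}

subset_records file1 file2
```

The `^>` anchor and the trailing space keep chr1 from matching chr10 or a stray occurrence elsewhere on a line.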
# 5  
Old 04-02-2010
Thank you very much. It prints the output.
What would you add to it to save the output in file3?
# 6  
Old 04-02-2010
grep "$r1" file1 >> file3.txt

Use >> (append) inside the loop. A single > would overwrite the file on every pass, leaving only the last match.
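
An alternative to appending inside the loop, sketched here under the same file names: redirect the output of the whole loop once. The output file is then recreated on every run, so leftovers from an earlier run cannot pile up. The grep call is left as in the thread, so it still selects only the matching header lines; the stale line is written first purely to show that it disappears.

```shell
# Recreate the sample input from the first post.
cat > file1 <<'EOF'
>chr1 strand:+ excise_beg:554293 excise_end:554402
TAATATATTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCT
>chr2 strand:+ excise_beg:554542 excise_end:554651
ACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAA
>chr3 strand:+ excise_beg:554497 excise_end:554606
GTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATA
>chr4 strand:+ excise_beg:554654 excise_end:554763
CCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATA
EOF
cat > file2 <<'EOF'
chr1
chr3
EOF

# Simulate output left over from a previous run.
echo "stale line from an earlier run" > file3.txt

# The redirection after "done" applies to the loop as a whole:
# file3.txt is truncated once, then collects the output of every grep.
while read -r r1
do
    grep "$r1" file1
done < file2 > file3.txt
```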
# 7  
Old 04-02-2010
My bad.

I edited my first response to fix a typo (a misplaced parenthesis).
 