[Solved] Converting the data into matrix with 0's and 1's


# 1  
Old 08-20-2013

I have a file that contains two columns, tag and pos:
cat input_file

Code:
tag	pos
atg	10
ata	16
agt	15
agg	19
atg	17
agg	14

I used the following command to sort the file numerically on the second column:
Code:
sort -k2,2n input_file
tag	pos
atg	10
agg	14
agt	15
ata	16
atg	17
agg	19

I want to convert it into a matrix like the one below, using an interval of 5, with 1 and 0 representing the presence or absence of each tag within that interval.

output_file

Code:
	atg	agg	agt	ata
10-15	1	1	1	0
15-20	1	1	0	1

The first row of the output matrix is 10-15 because the sorted file starts at 10.
# 2  
Old 08-20-2013
First, extract a unique, sorted master list of tags from the input file.

Then pass the file through a transform so that values between 10 and 14 are converted to "10-14", and so on (15 must be a typo? As posted, the ranges are 5 and 6 wide). Sort the result by range and tag and run it through uniq -c; you now have your counts in "count tag range" form, in range-then-tag order. Using the master list of tags, build the line for each range present, in tag order. If you know all the tags and ranges ahead of time, you can also use associative arrays keyed on "tag+range" to accumulate values and then print the totals without sorting, using awk, bash, or ksh.
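A minimal sketch of the associative-array variant in awk, for illustration only; it is not the script the original poster ended up with. The function name tag_matrix and its width argument are made up here. It assumes whitespace-separated input with a one-line header, integer positions, and non-overlapping ranges of the form lo to lo+width-1 (the "10-14" reading above), anchored at the smallest position. Columns appear in order of first appearance in the position-sorted data.

```shell
#!/bin/sh
# tag_matrix FILE [WIDTH]   (hypothetical helper, WIDTH defaults to 5)
# Prints a tab-separated 0/1 matrix: one row per range, one column per tag.
tag_matrix() {
    # Drop the header, sort numerically on position, then bin in awk.
    tail -n +2 "$1" | sort -k2,2n | awk -v width="${2:-5}" '
    BEGIN { OFS = "\t" }
    NF < 2 { next }                          # ignore blank or short lines
    {
        if (n++ == 0) min = $2 + 0           # sorted input: first row holds the minimum
        if (!($1 in seen)) {                 # remember tags in first-seen order
            seen[$1] = 1
            tags[++ntags] = $1
        }
        b = int(($2 - min) / width)          # zero-based range index
        present[b, $1] = 1                   # key is "range + tag"
        if (b > maxb) maxb = b
    }
    END {
        if (n == 0) exit
        line = ""
        for (t = 1; t <= ntags; t++) line = line OFS tags[t]
        print line                           # header row, leading empty cell
        for (b = 0; b <= maxb; b++) {
            lo = min + b * width
            line = lo "-" (lo + width - 1)
            for (t = 1; t <= ntags; t++)
                line = line OFS (((b, tags[t]) in present) ? 1 : 0)
            print line
        }
    }'
}

# Usage with the sample data from the first post.
sample=$(mktemp)
printf 'tag\tpos\natg\t10\nata\t16\nagt\t15\nagg\t19\natg\t17\nagg\t14\n' > "$sample"
tag_matrix "$sample"
rm -f "$sample"
```

With the sample data this prints a 10-14 row of 1 1 0 0 and a 15-19 row of 1 1 1 1 (columns atg, agg, agt, ata). Note that agt at position 15 lands in the second row here, whereas the matrix in the first post counts it in the 10-15 row; if ranges should include their upper bound, the index calculation needs adjusting. Ranges that contain no tags at all are printed as all-zero rows.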
# 3  
Old 08-21-2013
Thank you for your suggestions. I got it working.