Post 302980574 by Kanja on Tuesday 30th of August 2016 11:17:13 AM
Old 08-30-2016
Parsing a subset of data from a large matrix

I have a large tab-delimited matrix in the following format:

Code:
            ch-ab1-20  ch-bb2-23  ch-ab1-34  ch-ab1-24  er-cc1-45  bv-cc1-78
ch-ab1-20   0          2          3          4          5          6
ch-bb2-23   3          0          5          6          9          10
ch-ab1-34   1          3          0          8          10         12
ch-ab1-24   56         6          9          0          12         450
er-cc1-45   67         0          10         12         0          100
bv-cc1-78   78         23         33         5          9          0

I would like to extract a subset of the above matrix by matching a regular expression against both the row headers and the column headers.

For example, I would like to extract all the values whose row and column headers match *-ab1-*. The desired output file is:

Code:
            ch-ab1-20  ch-ab1-34  ch-ab1-24
ch-ab1-20   0          3          4
ch-ab1-34   1          0          8
ch-ab1-24   56         9          0

Please let me know the best way to do this using awk or sed. The ab1 pattern is just an example. Sorry, the example data shown here are not tab-delimited.
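
One possible approach with awk, offered as a minimal sketch and not a definitive answer: read the header row once to remember which columns match, then print only those columns for each row whose label also matches. It assumes the real file is tab-delimited and that the header row starts with an empty corner cell (a leading tab), so column labels line up with the data columns. The function name subset_matrix is hypothetical, and the glob-style *-ab1-* is written as the regular expression -ab1-.

```shell
# subset_matrix PATTERN FILE  (hypothetical helper name)
# Prints the sub-matrix whose row AND column labels match PATTERN (an awk regex).
# Assumes a tab-delimited FILE whose header row begins with an empty corner cell.
subset_matrix() {
    awk -F'\t' -v OFS='\t' -v pat="$1" '
        NR == 1 {
            # Header row: remember the index of every matching column.
            n = 0
            for (i = 2; i <= NF; i++)
                if ($i ~ pat) keep[++n] = i
            line = ""                      # empty corner cell
            for (k = 1; k <= n; k++) line = line OFS $(keep[k])
            print line
            next
        }
        $1 ~ pat {
            # Data row with a matching label: emit only the kept columns.
            line = $1
            for (k = 1; k <= n; k++) line = line OFS $(keep[k])
            print line
        }
    ' "$2"
}
```

With the matrix saved as, say, matrix.tsv (an assumed file name), it could be called as: subset_matrix '-ab1-' matrix.tsv > subset.tsv. The file is read in a single pass and only one line is held in memory at a time, which should suit a large matrix.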
 
