Sponsored Content
Top Forums Shell Programming and Scripting Parsing a subset of data from a large matrix Post 302980578 by Don Cragun on Tuesday 30th of August 2016 11:54:30 AM
Old 08-30-2016
What operating system are you using?

What shell are you using?

What have you tried to solve this problem?

Note that *-ab1-* is a valid filename matching pattern, but it is not a valid regular expression. (A BRE or ERE to match a string contain the string -ab1- anywhere in the string would be .*-ab1-.* or just -ab1-.)

Since you have not shown us tab-delimited data, it is hard to guess at exactly what you mean. The data you have shown us seems to have room for multiple tabs between fields on the data lines (which should not happen in a tab-delimited file). Other than the heading line (where I assume there is exactly one tab at the start of the line and a single tab between field headings), will there ever be any field that is empty? Will there ever be any <space> characters in your data?
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Parsing a large log

I need to parse a large log say 300-400 mb The commands like awk and cat etc are taking time. Please help how to process. I need to process the log for certain values of current date. But I am unbale to do so. (17 Replies)
Discussion started by: asth
17 Replies

2. Shell Programming and Scripting

extract data from a data matrix with filter criteria

Here is what old matrix look like, IDs X1 X2 Y1 Y2 10914061 -0.364613333 -0.362922333 0.001691 -0.450094667 10855062 0.845956333 0.860396667 0.014440333 1.483899333... (7 Replies)
Discussion started by: ssshen
7 Replies

3. Shell Programming and Scripting

Helping in parsing subset of text from a big results file

Hi All, I need some help to effectively parse out a subset of results from a big results file. Below is an example of the text file. Each block that I need to parse starts with "reading sequence file 10.codon" (next block starts with another number) and ends with **p-Value(s)**. I have given... (1 Reply)
Discussion started by: Lucky Ali
1 Replies

4. Shell Programming and Scripting

grep/fgrep/egrep for a very large matrix

All, I have a problem with grep/fgrep/egrep. Basically I am building a 200 times 200 correlation matrix. The entries of this matrix need to be retrieved from another very large matrix (~100G). I tried to use the grep/fgrep/egrep to locate each entry and put them into one file. It looks very... (1 Reply)
Discussion started by: realwindfly
1 Replies

5. Shell Programming and Scripting

help printing two consecutive columns, every twenty in a large matrix

Hi, I'm having a problem printing two consecutive columns, as I iterate through a large matrix by twenty columns and I was looking for a solution. My input file looks something like this 1 id1 A1 A2 A3 A4 A5 A6....A20 A21 A22 A23....A4001 A4002 2 id2 B1 B2 B3 B4 B5 B6... 3 id3 ... 4 id4... (8 Replies)
Discussion started by: flotsam
8 Replies

6. Ubuntu

How to convert full data matrix to linearised left data matrix?

Hi all, Is there a way to convert full data matrix to linearised left data matrix? e.g full data matrix Bh1 Bh2 Bh3 Bh4 Bh5 Bh6 Bh7 Bh1 0 0.241058 0.236129 0.244397 0.237479 0.240767 0.245245 Bh2 0.241058 0 0.240594 0.241931 0.241975 ... (8 Replies)
Discussion started by: evoll
8 Replies

7. Shell Programming and Scripting

How to remove a subset of data from a large dataset based on values on one line

Hello. I was wondering if anyone could help. I have a file containing a large table in the format: marker1 marker2 marker3 marker4 position1 position2 position3 position4 genotype1 genotype2 genotype3 genotype4 with marker being a name, position a numeric... (2 Replies)
Discussion started by: davegen
2 Replies

8. Programming

Matrix parsing help !

Hello every body ! I'm a new in this forum and beginner in Perl scripting and I have some problems :(:(:(! I have a big file like that : ID1 ID2 Identity chromosome07_194379 chromosome01_168057 0.975 chromosome01_100293 chromosome01_168057 ... (23 Replies)
Discussion started by: mchimich
23 Replies

9. UNIX for Dummies Questions & Answers

How to subset data?

Hi. I have a large data file. the first column has unique identifiers. I have approximately 5 of these files and they have varying number of columns in their rows. I need to extract ~300 of the rows in to a separate file. I'm not looking for something that would do all 5 files at once, but... (7 Replies)
Discussion started by: kadm
7 Replies

10. Shell Programming and Scripting

Highest value matrix parsing

Hi All I do have a matrix in the following format a_2 a_3 s_4 t_6 b 0 0.9 0.004 0 c 0 0 1 0 d 0 0.98 0 0 e 0.0023 0.96 0 0.0034 I have thousands of rows I would like to parse the maximum value in each of the row and out put that highest value along the column header of... (2 Replies)
Discussion started by: Kanja
2 Replies
TABS(1) 						    BSD General Commands Manual 						   TABS(1)

NAME
tabs -- set terminal tabs SYNOPSIS
tabs [-n | -a | -a2 | -c | -c2 | -c3 | -f | -p | -s | -u] [+m[n]] [-T type] tabs [-T type] [+[n]] n1[,n2,...] DESCRIPTION
The tabs utility displays a series of characters that clear the hardware terminal tab settings then initialises tab stops at specified posi- tions, and optionally adjusts the margin. In the first synopsis form, the tab stops set depend on the command line options used, and may be one of the predefined formats or at regular intervals. In the second synopsis form, tab stops are set at positions n1, n2, etc. If a position is preceded by a '+', it is relative to the previous position set. No more than 20 positions may be specified. If no tab stops are specified, the ``standard'' UNIX tab width of 8 is used. The options are as follows: -n Set a tab stop every n columns. If n is 0, the tab stops are cleared but no new ones are set. -a Assembler format (columns 1, 10, 16, 36, 72). -a2 Assembler format (columns 1, 10, 16, 40, 72). -c COBOL normal format (columns 1, 8, 12, 16, 20, 55) -c2 COBOL compact format (columns 1, 6, 10, 14, 49) -c3 COBOL compact format (columns 1, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58, 62, 67). -f FORTRAN format (columns 1, 7, 11, 15, 19, 23). -p PL/1 format (columns 1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61). -s SNOBOL format (columns 1, 10, 55). -u Assembler format (columns 1, 12, 20, 44). +m[n], +[n] Set an n character left margin, or 10 if n is omitted. -T type Output escape sequence for the terminal type type. ENVIRONMENT
The LANG, LC_ALL, LC_CTYPE and TERM environment variables affect the execution of tabs as described in environ(7). The -T option overrides the setting of the TERM environment variable. If neither TERM nor the -T option are present, tabs will fail. EXIT STATUS
The tabs utility exits 0 on success, and >0 if an error occurs. SEE ALSO
expand(1), stty(1), tput(1), unexpand(1), termcap(5) STANDARDS
The tabs utility conforms to IEEE Std 1003.1-2001 (``POSIX.1''). HISTORY
A tabs utility appeared in PWB UNIX. This implementation was introduced in FreeBSD 5.0. BUGS
The current termcap(5) database does not define the 'ML' (set left soft margin) capability for any terminals. BSD
May 20, 2002 BSD
All times are GMT -4. The time now is 07:42 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy