Note that *-ab1-* is a valid filename matching pattern, but it is not a valid regular expression. (A BRE or ERE to match a string containing -ab1- anywhere in the string would be .*-ab1-.* or just -ab1-.)
Since you have not shown us tab-delimited data, it is hard to guess at exactly what you mean. The data you have shown us seems to have room for multiple tabs between fields on the data lines (which should not happen in a tab-delimited file). Other than the heading line (where I assume there is exactly one tab at the start of the line and a single tab between field headings), will there ever be any field that is empty? Will there ever be any <space> characters in your data?
And, tell us which columns to include and which to exclude.
Hi RudiC,
I believe that the intent is that the script will be invoked with an ERE and a pathname as operands. If a column header matches the ERE, that column will be included in the output, and only the rows whose field 1 value also matches the ERE will be output. For example, if the file contains the input shown in post #1 in this thread (with every sequence of one or more <space>s converted to a single <tab> character), and the script is named tester, the command:
would produce the output:
and the command:
would produce the output:
I'm just waiting for Kanja to show us what attempt(s) have been made to solve this problem and to confirm that I have guessed correctly at the input file format.
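In the meantime, the behaviour described above might be sketched like this. The real tester script isn't shown, so this is only a guess at it, and the demo file (/tmp/demo.tsv, with made-up headers and row labels) is hypothetical tab-delimited data whose header line starts with a single tab:

```shell
# A guess at the described behaviour: keep the columns whose header matches
# the ERE, and print the header row plus the rows whose field 1 also matches.
tester() {  # usage: tester ERE file
    awk -F'\t' -v ere="$1" '
    NR == 1 { for (i = 2; i <= NF; i++) if ($i ~ ere) keep[++n] = i }
    NR == 1 || $1 ~ ere {
        line = $1
        for (j = 1; j <= n; j++) line = line FS $(keep[j])
        print line
    }' "$2"
}

# Hypothetical demo data: header starts with one tab, fields are tab-separated.
printf '\ts_2\ts_4\tt_6\ns_1\t0\t0.9\t0.1\nt_3\t0.2\t0\t1\n' > /tmp/demo.tsv
tester 's' /tmp/demo.tsv
```

With the ERE s, this keeps columns s_2 and s_4 and the row labelled s_1, dropping column t_6 and row t_3.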
This is exactly what I am looking for. I tried awk with a for loop through the matrix, but the problem I am having is getting the regular expression incorporated into the script.
Hi All
I have a matrix in the following format:
a_2 a_3 s_4 t_6
b 0 0.9 0.004 0
c 0 0 1 0
d 0 0.98 0 0
e 0.0023 0.96 0 0.0034
I have thousands of rows
I would like to find the maximum value in each row and output that highest value along with the column header of... (2 Replies)
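A sketch for the matrix shown above: remember the header names in a first pass over line 1, then scan each data row for its largest value and print the row label, the maximum, and the matching column header (the first maximum wins on ties). The file name /tmp/matrix.txt is just a stand-in:

```shell
# Demo input: the matrix from the post above.
cat > /tmp/matrix.txt <<'EOF'
a_2 a_3 s_4 t_6
b 0 0.9 0.004 0
c 0 0 1 0
d 0 0.98 0 0
e 0.0023 0.96 0 0.0034
EOF

# hdr[i] maps value column i+1 back to its header name (the header row
# has one fewer field than the data rows); + 0 forces numeric comparison.
awk 'NR == 1 { for (i = 1; i <= NF; i++) hdr[i] = $i; next }
     {
         max = $2; col = 1
         for (i = 3; i <= NF; i++)
             if ($i + 0 > max + 0) { max = $i; col = i - 1 }
         print $1, max, hdr[col]
     }' /tmp/matrix.txt
```

For the sample data this prints, per row, e.g. b 0.9 a_3 and c 1 s_4.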
Hi. I have a large data file. The first column has unique identifiers. I have approximately 5 of these files, and they have varying numbers of columns in their rows. I need to extract ~300 of the rows into a separate file. I'm not looking for something that would do all 5 files at once, but... (7 Replies)
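A common single-pass approach for this, assuming the ~300 wanted identifiers are listed one per line in a file (the names ids.txt and data.txt here are hypothetical) and appear in column 1 of the data file:

```shell
# Hypothetical demo input.
cat > /tmp/ids.txt <<'EOF'
id2
id4
EOF
cat > /tmp/data.txt <<'EOF'
id1 3 4
id2 5 6 7
id3 8
id4 9 10
EOF

# NR == FNR is true only while reading the first file: load the wanted
# identifiers into an array, then print any data row whose $1 is in it.
awk 'NR == FNR { want[$1]; next } $1 in want' /tmp/ids.txt /tmp/data.txt
```

Because only field 1 is tested, the varying number of columns per row doesn't matter; repeat once per file for the 5 files.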
Hello everybody! I'm new to this forum and a beginner in Perl scripting, and I have some problems :(:(:(! I have a big file like this:
ID1 ID2 Identity
chromosome07_194379 chromosome01_168057 0.975
chromosome01_100293 chromosome01_168057 ... (23 Replies)
Hello. I was wondering if anyone could help. I have a file containing a large table in the format:
marker1 marker2 marker3 marker4
position1 position2 position3 position4
genotype1 genotype2 genotype3 genotype4
with marker being a name, position a numeric... (2 Replies)
Hi all,
Is there a way to convert a full data matrix to a linearised left data matrix?
e.g. full data matrix
Bh1 Bh2 Bh3 Bh4 Bh5 Bh6 Bh7
Bh1 0 0.241058 0.236129 0.244397 0.237479 0.240767 0.245245
Bh2 0.241058 0 0.240594 0.241931 0.241975 ... (8 Replies)
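One interpretation of "linearised left" (a sketch only, since the exact output format wanted isn't shown): drop the square header row and, for each data row, keep only the values to the left of the diagonal. The file name /tmp/full.txt is a stand-in, with a small made-up symmetric matrix:

```shell
# Hypothetical demo input: a 3x3 symmetric distance matrix.
cat > /tmp/full.txt <<'EOF'
Bh1 Bh2 Bh3
Bh1 0 1 2
Bh2 1 0 3
Bh3 2 3 0
EOF

# Data row n (NR = n + 1) keeps its label plus its first n - 1 values,
# i.e. the strictly lower-left triangle of the matrix.
awk 'NR == 1 { next }
     { line = $1
       for (i = 2; i <= NR - 1; i++) line = line OFS $i
       print line }' /tmp/full.txt
```

For the demo input this prints Bh1 alone, then Bh2 with one value, then Bh3 with two.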
Hi,
I'm having a problem printing two consecutive columns as I iterate through a large matrix twenty columns at a time, and I was looking for a solution.
My input file looks something like this:
1 id1 A1 A2 A3 A4 A5 A6....A20 A21 A22 A23....A4001 A4002
2 id2 B1 B2 B3 B4 B5 B6...
3 id3 ...
4 id4... (8 Replies)
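One reading of the request (a sketch; the exact pairs wanted aren't spelled out): keep the row number and id, then take the first two columns of every 20-column block, i.e. A1 A2, then A21 A22, and so on. The file /tmp/wide.txt is a stand-in:

```shell
# Hypothetical demo input: one row, 22 data columns after the row id.
cat > /tmp/wide.txt <<'EOF'
1 id1 A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15 A16 A17 A18 A19 A20 A21 A22
EOF

# Data columns start at field 3; step the index by 20 and emit each pair.
awk '{ line = $1 OFS $2
       for (i = 3; i + 1 <= NF; i += 20) line = line OFS $i OFS $(i + 1)
       print line }' /tmp/wide.txt
```

The i + 1 <= NF guard keeps a trailing block with only one column from printing a stray empty field.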
All,
I have a problem with grep/fgrep/egrep. Basically, I am building a 200 x 200 correlation matrix. The entries of this matrix need to be retrieved from another very large matrix (~100G). I tried to use grep/fgrep/egrep to locate each entry and put them into one file. It looks very... (1 Reply)
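Running grep once per entry means tens of thousands of passes over a ~100G file, which is why it looks slow. A single fixed-string pass is usually far faster: put every key you need in one file and scan the big matrix once. All file names and keys below are made up for illustration:

```shell
# Hypothetical demo input: the keys to look up, and the "big" matrix.
cat > /tmp/keys.txt <<'EOF'
geneA geneB
geneC geneD
EOF
cat > /tmp/bigmatrix.txt <<'EOF'
geneA geneB 0.91
geneA geneC 0.12
geneC geneD 0.77
EOF

# grep -F treats each pattern in keys.txt as a fixed string, so the whole
# lookup becomes one sequential read of the large file.
grep -F -f /tmp/keys.txt /tmp/bigmatrix.txt > /tmp/entries.txt
```

Note that grep -F matches the key anywhere on the line; if the keys could appear as substrings of other fields, an awk lookup keyed on exact fields (as in the NR == FNR idiom) would be safer.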
Hi All,
I need some help to effectively parse out a subset of results from a big results file.
Below is an example of the text file. Each block that I need to parse starts with "reading sequence file 10.codon" (next block starts with another number) and ends with **p-Value(s)**. I have given... (1 Reply)
I need to parse a large log, say 300-400 MB.
Commands like awk and cat are taking a long time.
Please advise on how to process it.
I need to process the log for certain values of the current date,
but I am unable to do so. (17 Replies)
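With a 300-400 MB log, a cheap filter usually helps most: select only the current date's lines once, then run the heavier processing on that much smaller subset. A sketch, assuming each log line starts with a YYYY-MM-DD date (adjust the date format string to match the real log; the file names are stand-ins):

```shell
# Today's date in the same format the log is assumed to use.
today=$(date +%Y-%m-%d)

# Demo stand-in for the real 300-400 MB log.
printf '%s service started\n2001-01-01 old entry\n' "$today" > /tmp/app.log

# ^ anchors the match to the start of the line, so only lines that
# begin with today's date survive into the working subset.
grep "^$today" /tmp/app.log > /tmp/today.log
```

Any awk processing then runs against /tmp/today.log instead of rescanning the full log for every query.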