Matrix parsing help!
Post 302586729 by mchimich on Tuesday 3rd of January 2012 05:42:52 AM

Hello everybody! I'm new to this forum and a beginner in Perl scripting, and I have a problem. I have a big file like this:
Code:
ID1                   ID2                       Identity 
chromosome07_194379   chromosome01_168057       0.975
chromosome01_100293   chromosome01_168057       0.969
chromosome01_100293   chromosome07_194379       0.969
chromosome01_29385    chromosome01_168057       0.856
chromosome01_29385    chromosome07_194379       0.856
chromosome01_29385    chromosome01_100293       0.861
chromosome08_116839   chromosome01_168057       0.78
chromosome08_116839   chromosome01_100293       0.786
chromosome08_116839   chromosome01_293853       0.946

The three columns are separated by tabs (\t).

I want to cluster the IDs that share an identity greater than 0.8 using Perl. Can someone help me?
Thanks a lot in advance for your help.

Moderator's Comments:
Please use code tags!

Last edited by zaxxon; 01-03-2012 at 06:53 AM.. Reason: code tags
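
One possible approach, offered as a sketch rather than a definitive answer: treat every pair whose identity is above 0.8 as a link and group the IDs by following chains of links (single-linkage clustering, implemented here with a small union-find). The post does not say which kind of clustering is wanted, so single linkage is an assumption, as is the strict "> 0.8" comparison. The sub names (find_root, union_ids, cluster_pairs) are made up for this example, and the sample rows from the post are embedded in the script so that it runs on its own. Fields are split on any whitespace, which also covers tab-separated input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Union-find forest: each ID maps to its parent ID; a root is its own parent.
my %parent;

# Return the representative (root) of the set that contains $id,
# registering $id as a new singleton set if it has not been seen yet.
sub find_root {
    my ($id) = @_;
    $parent{$id} = $id unless exists $parent{$id};
    while ($parent{$id} ne $id) {
        $parent{$id} = $parent{ $parent{$id} };   # shorten the path as we climb
        $id = $parent{$id};
    }
    return $id;
}

# Merge the sets that contain $x and $y.
sub union_ids {
    my ($x, $y) = @_;
    my ($rx, $ry) = (find_root($x), find_root($y));
    $parent{$rx} = $ry if $rx ne $ry;
}

# Read "ID1 ID2 Identity" lines from $fh and return the clusters as a list of
# array refs. Assumption: two IDs are linked when their identity is strictly
# greater than $threshold, and clusters are chains of such links.
sub cluster_pairs {
    my ($fh, $threshold) = @_;
    %parent = ();
    while (my $line = <$fh>) {
        chomp $line;
        my ($id1, $id2, $identity) = split ' ', $line;
        # Skip the header and any line without a numeric third column.
        next unless defined $identity && $identity =~ /^\d+(?:\.\d+)?$/;
        find_root($id1);    # register both IDs so unlinked ones
        find_root($id2);    # still show up as singleton clusters
        union_ids($id1, $id2) if $identity > $threshold;
    }
    my %members;
    push @{ $members{ find_root($_) } }, $_ for keys %parent;
    my @clusters = map { [ sort @{$_} ] } values %members;
    # Largest cluster first; ties are ordered by their first ID.
    @clusters = sort { @$b <=> @$a or $a->[0] cmp $b->[0] } @clusters;
    return @clusters;
}

# Sample rows copied from the post; a real run would open the big file instead.
my $sample = <<'END';
ID1 ID2 Identity
chromosome07_194379 chromosome01_168057 0.975
chromosome01_100293 chromosome01_168057 0.969
chromosome01_100293 chromosome07_194379 0.969
chromosome01_29385 chromosome01_168057 0.856
chromosome01_29385 chromosome07_194379 0.856
chromosome01_29385 chromosome01_100293 0.861
chromosome08_116839 chromosome01_168057 0.78
chromosome08_116839 chromosome01_100293 0.786
chromosome08_116839 chromosome01_293853 0.946
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my @clusters = cluster_pairs($fh, 0.8);
close $fh;

my $n = 0;
for my $cluster (@clusters) {
    $n++;
    print "Cluster $n: ", join(', ', @$cluster), "\n";
}
```

On the sample rows this should give two clusters: one with chromosome01_100293, chromosome01_168057, chromosome01_29385 and chromosome07_194379, and one with chromosome01_293853 and chromosome08_116839 (the 0.78 and 0.786 pairs fall below the cutoff). To run it on the real file, replace the in-memory open with an open on the file name. If every member of a cluster must be above 0.8 against every other member, single linkage is not enough and a stricter method (for example complete linkage) would be needed.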