Making an information matrix using Java
Hello all,
I'm doing some research in Biostatistics this summer studying chloroplast genomes. I have 19 text files that look exactly like this:

Name: Marchantia polymorpha
FileName: NC_001319
Bases: 121024
Genes:
rps12 <85..842
rps7 (892..1359)
ndhB (1514..3555)
psbM (4001..4105)
rpoB (5859..9056)
rpoC1 (9087..11737)
rpoC2 (11811..15971)
rps2 (16055..16762)
atpI (16890..17636)
atpH (18014..18259)
atpF (18468..19609)
atpA (19654..21177)
ycf12 -(22162..22263)
psbI -(22997..23107)
psbK -(23438..23605)
chlB (24053..25594)
psbA (28368..29429)
mbpX (37012..38124)
psbD (38855..39916)
psbC (39864..41285)
psbZ (41647..41835)
rps14 -(42333..42635)
psaB -(42724..44928)
psaA -(44955..47207)
rps4 -(49425..50033)
ndhJ -(51233..51709)
ndhK -(51793..52524)
ndhC -(52515..52877)
atpE -(53955..54362)
atpB -(54368..55846)
rbcL (56355..57782)
accD (58065..59015)
psaI (59193..59303)
ycf4 (59525..60079)
cemA (60151..61455)
petA (61641..62603)
psbJ -(62794..62916)
psbL -(63036..63152)
psbF -(63174..63293)
psbE -(63303..63554)
petG (64370..64483)
psaJ (65027..65155)
rpl33 (65273..65470)
rps18 (65498..65725)
rpl20 -(65807..66157)
clpP -(67130..68640)
psbB (69026..70552)
psbT (70669..70776)
psbN -(70863..70994)
psbH (71092..71316)
petB (71424..72566)
petD (72715..73690)
rpoA -(73802..74824)
rps11 -(74857..75249)
rpl36 -(75300..75413)
infA -(75450..75686)
rps8 -(75773..76171)
rpl14 -(76253..76621)
rpl16 -(76719..77685)
rps3 -(77743..78396)
rpl22 -(78445..78804)
rps19 -(78822..79100)
rpl2 -(79137..80514)
rpl23 -(80550..80825)
ndhF -(91101..93179)
rpl21 (93469..93819)
rpl32 (93886..94095)
cysT (94183..95049)
ccsA (95482..96444)
ndhD -(96665..98164)
psaC -(98289..98534)
ndhE -(98757..99059)
ndhG -(99113..99688)
ndhI -(99779..100330)
ndhA -(100382..102200)
ndhH -(102202..103380)
rps15 -(103433..103699)
chlL -(110104..110973)

Of course, each of the 19 files has a different Name, NC_ number, number of bases, and different genes in different numerical positions.
I have a slight knowledge of Java and would like to turn these files into an information matrix. The NC_ numbers of the 19 files would be listed across the top, and each gene would be listed down the side. If the file with a given NC_ number contains the gene for that row, the cell gets a 1; otherwise it gets a 0.

If I pass the 19 text files as command line args, is it possible to do this somehow? Maybe with a TreeMap or other data structure? If it would make it easier, I could also trim the files down to just the gene names, with the headings of the numerical positions next to each.

I don't really know what would be best, but a program that could build this matrix would really help my research. Thanks to anyone taking the time to read this; any ideas would help!

-akreibich07
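One possible way to build the matrix described above is sketched below. It is a minimal sketch under a few assumptions, not a definitive solution: every file contains a "FileName:" label followed by the NC_ number and a "Genes:" label followed by the gene list; gene names always start with a letter, while position tokens start with "(", "-", or "<"; and gene names are compared exactly, so differences in case or spelling count as different genes. The class name GeneMatrix and the helpers fieldAfter, parseGenes, and buildMatrix are made up for illustration. Because the parser works on whitespace-separated tokens, it should not matter whether the genes sit on one line or on many.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class GeneMatrix {

    // Hypothetical helper: returns the token that follows a label such as
    // "FileName:", or the fallback if the label is not present.
    static String fieldAfter(String text, String label, String fallback) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            if (tokens[i].equals(label)) {
                return tokens[i + 1];
            }
        }
        return fallback;
    }

    // Collects gene names: every token after "Genes:" that starts with a letter.
    // Position tokens such as (892..1359), -(22162..22263) or <85..842 are skipped.
    static Set<String> parseGenes(String text) {
        Set<String> genes = new TreeSet<>();
        int start = text.indexOf("Genes:");
        if (start < 0) {
            return genes; // assumption violated: no gene section in this file
        }
        String body = text.substring(start + "Genes:".length()).trim();
        for (String token : body.split("\\s+")) {
            if (!token.isEmpty() && Character.isLetter(token.charAt(0))) {
                genes.add(token);
            }
        }
        return genes;
    }

    // Builds a tab-separated 0/1 matrix: one column per file, one row per gene.
    static String buildMatrix(Map<String, Set<String>> genesByFile) {
        Set<String> allGenes = new TreeSet<>(); // sorted union of all gene names
        for (Set<String> genes : genesByFile.values()) {
            allGenes.addAll(genes);
        }
        StringBuilder sb = new StringBuilder("gene");
        for (String id : genesByFile.keySet()) {
            sb.append('\t').append(id);
        }
        sb.append('\n');
        for (String gene : allGenes) {
            sb.append(gene);
            for (Set<String> genes : genesByFile.values()) {
                sb.append('\t').append(genes.contains(gene) ? 1 : 0);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // LinkedHashMap keeps the columns in command-line order.
        Map<String, Set<String>> genesByFile = new LinkedHashMap<>();
        for (String arg : args) {
            String text = new String(Files.readAllBytes(Paths.get(arg)), StandardCharsets.UTF_8);
            genesByFile.put(fieldAfter(text, "FileName:", arg), parseGenes(text));
        }
        System.out.print(buildMatrix(genesByFile));
    }
}
```

With the 19 files passed as arguments, the tab-separated output could be redirected to a file and opened in a spreadsheet or read into R. A TreeSet keeps the gene rows in alphabetical order; a TreeMap keyed by gene name, with the set of NC_ numbers as the value, would work just as well if lookups by gene are needed later.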