Reformat text table: post 302486541 by yifangt, Sunday 9 January 2011, 09:17:10 AM
Thanks Scrutinizer!

This is amazing, but too complicated for me. Could you explain it to me? I can only follow part of your code.

Actually, my data is much bigger than the sample, and I left out the header row and some of the columns. I thought of using Perl to parse it and combine all rows with the same SNP name into one row.

1) Each row starts with the SNP name, which can be repeated up to 4 times (in neighbouring rows); some SNPs appear only once. The output is a single combined row for each SNP;
2) If the 1st column is the same, then the 2nd, 4th and 5th columns are the same too, because the rows describe the same SNP. This is the main difference from my first post;
3) There are 96 variants for each SNP. A variant that is not listed for a specific SNP means the SNP is missing for that variant, and it should be labeled - or NA to keep the output format consistent;

Sorry for not posting the raw data first; I was trying a Perl script using a hash. I am a geneticist who is fond of programming. Anyway, thank you if you can have a look at this again.
Code:
SNP-name    chromosome    polymorphic-sequence    Species-variants    Locus-(if-mapped-to-locus)    Chromosomal-map-location
BKN000000001    1    C    RRS-7;RRS-10;Knox-10;Knox-18;Rmx-A02;Rmx-A180;Pna-17;Pna-10;Eden-1;Eden-2;Lov-1;Lov-5;Fab-2;Fab-4;Bil-5;Bil-7;Var2-1;Var2-6;Spr1-2;Spr1-6;Omo2-1;Omo2-3;Ull2-5;Ull2-3;Zdr-1;Zdr-6;Bor-1;Bor-4;Pu2-7;Pu2-23;Lp2-2;Lp2-6;HR5;HR-10;NFA-8;NFA-10;Sq-1;Sq-8;CIBC5;CIBC17;Tamm-2;Tamm-27;KZ9;Goettingen-7;Goettingen-22;Rennes-1;Rennes-11;Uod-1;Uod-7;Cvi-0;Lz-0;Ei-2;Gu-0;Ler-1;Nd-1;C24;CS22491;Wei-0;Ws-0;Yo-0;Col-0;An-1;Br-0;Est-1;Ag-0;Gy-0;Ra-0;Bay-0;Ga-0;Mrk-0;Mz-0;Wt-5;Kas-1;Ct-1;Mr-0;Tsu-1;Mt-0;Nok-3;Wa-1;Fei-0;Se-0;Ts-1;Ts-5;Pro-0;Ll-0;Kondara;Shahdara;Sorbo;Kin-0;Ms-0;Bur-0;Edi-0;Oy-0;Ws-2    AT1G01280    112482
BKN000000001    1    T    KZ1    AT1G01280    112482
BKN000000002    1    G    RRS-7;RRS-10;Knox-10;Knox-18;Rmx-A02;Rmx-A180;Pna-17;Pna-10;Eden-1;Eden-2;Lov-1;Lov-5;Fab-2;Fab-4;Bil-5;Bil-7;Var2-1;Var2-6;Spr1-2;Spr1-6;Omo2-1;Omo2-3;Ull2-5;Ull2-3;Zdr-1;Zdr-6;Bor-1;Bor-4;Pu2-7;Pu2-23;Lp2-2;Lp2-6;HR5;HR-10;NFA-8;NFA-10;Sq-1;Sq-8;CIBC5;CIBC17;Tamm-2;Tamm-27;KZ1;KZ9;Goettingen-7;Goettingen-22;Rennes-1;Rennes-11;Uod-1;Uod-7;Cvi-0;Lz-0;Ei-2;Gu-0;Ler-1;Nd-1;C24;CS22491;Wei-0;Ws-0;Yo-0;Col-0;An-1;Br-0;Est-1;Ag-0;Gy-0;Ra-0;Bay-0;Ga-0;Mrk-0;Mz-0;Wt-5;Kas-1;Ct-1;Mr-0;Tsu-1;Mt-0;Nok-3;Wa-1;Fei-0;Se-0;Ts-1;Ts-5;Pro-0;Ll-0;Shahdara;Kin-0;Ms-0;Bur-0;Edi-0;Oy-0;Ws-2    AT1G01280    112561
BKN000000002    1    A    Kondara;Sorbo    AT1G01280    112561
BKN000000003    1    A    RRS-7;RRS-10;Knox-10;Knox-18;Rmx-A02;Rmx-A180;Pna-10;Eden-1;Eden-2;Lov-1;Lov-5;Fab-2;Fab-4;Bil-5;Bil-7;Var2-1;Var2-6;Spr1-2;Spr1-6;Omo2-1;Omo2-3;Ull2-5;Ull2-3;Zdr-1;Zdr-6;Bor-1;Bor-4;Pu2-7;Pu2-23;Lp2-2;Lp2-6;Sq-8;CIBC5;CIBC17;Tamm-2;Tamm-27;KZ1;KZ9;Goettingen-7;Goettingen-22;Uod-1;Uod-7;Cvi-0;Ei-2;Gu-0;Ler-1;Nd-1;C24;CS22491;Wei-0;Ws-0;Yo-0;Col-0;An-1;Est-1;Gy-0;Ra-0;Bay-0;Ga-0;Mrk-0;Wt-5;Kas-1;Ct-1;Mr-0;Tsu-1;Mt-0;Nok-3;Wa-1;Se-0;Ts-1;Ts-5;Pro-0;Ll-0;Kondara;Shahdara;Sorbo;Kin-0;Ms-0;Bur-0;Edi-0;Oy-0;Ws-2    AT1G01280    112771
BKN000000003    1    G    Pna-17;HR5;HR-10;NFA-8;NFA-10;Sq-1;Rennes-1;Rennes-11;Lz-0;Br-0;Ag-0;Mz-0;Fei-0    AT1G01280    112771
.
.
.
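
For reference, the merge described in points 1) to 3) could be sketched roughly as below. This is only an illustration in Python (not the awk solution from earlier in the thread, and not Perl), and it rests on several assumptions: every data row has the six whitespace-separated fields seen in the sample rows, no field contains a space, and the output layout (SNP name, chromosome, locus, position, then one allele column per variant) is just one possible choice. The name `merge_snp_rows`, the `variants` parameter and the `MISSING` placeholder are made up for this sketch.

```python
MISSING = "-"  # placeholder for a variant with no call; "NA" would work equally well


def merge_snp_rows(lines, variants=None):
    """Combine all rows that share a SNP name into one row per SNP.

    Assumes six whitespace-separated fields per data row:
    SNP name, chromosome, allele, ';'-joined variant list, locus, position.
    """
    merged = {}  # SNP name -> (chromosome, locus, position, {variant: allele})
    seen = {}    # variant names in first-seen order (dicts keep insertion order)
    for line in lines:
        fields = line.split()
        # Skip the header row, blank lines and anything that is not a data row.
        if len(fields) != 6 or fields[0] == "SNP-name":
            continue
        name, chrom, allele, variant_list, locus, pos = fields
        calls = merged.setdefault(name, (chrom, locus, pos, {}))[3]
        for variant in variant_list.split(";"):
            calls[variant] = allele
            seen.setdefault(variant, None)
    # Use the caller's variant list if given, else every variant seen in the input.
    columns = list(variants) if variants is not None else list(seen)
    table = [["SNP-name", "chromosome", "locus", "position"] + columns]
    for name, (chrom, locus, pos, calls) in merged.items():
        row = [name, chrom, locus, pos]
        row += [calls.get(variant, MISSING) for variant in columns]
        table.append(row)
    return table


# Rows shortened from the sample above to keep the demo readable, so any
# MISSING entries printed here come from the truncation, not from the data.
sample = [
    "BKN000000001    1    C    RRS-7;RRS-10;KZ9    AT1G01280    112482",
    "BKN000000001    1    T    KZ1    AT1G01280    112482",
    "BKN000000002    1    G    RRS-7;RRS-10;KZ1;KZ9    AT1G01280    112561",
    "BKN000000002    1    A    Kondara;Sorbo    AT1G01280    112561",
]

for row in merge_snp_rows(sample):
    print("\t".join(row))
```

Passing the full list of 96 variant names as `variants` would fix the column order and give every SNP the same number of columns, including variants that never appear in a particular file. If unmapped SNPs have an empty locus field in the real tab-separated data, splitting on tabs instead of on any whitespace would be the safer choice.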

Thanks again!

Yifangt

Last edited by yifangt; 01-09-2011 at 11:02 AM.. Reason: Code tags
 
