Reformat text table


 
Post #8, 01-09-2011
Quote:
Originally Posted by Scrutinizer
Hi yifangt, you are welcome. Here is an explanation:

awk -F'[ \t:;,]*'
    Use zero or more repetitions of the characters in square brackets as the field separator.
split($0,T)
    Split the record $0 into array T, using FS as the field separator. This effectively creates a copy of $1 to $NF, so that $1 to $NF can be reused for output.
for(i=NF;i>=2;i--)
    Read backwards from the last field number to the 2nd.
if (T[i]~/m[0-9]/)
    If the array copy of field number "i" contains "m" followed by a digit,
{sub(/m/,x,T[i])
    remove the letter m from that field (x is an unset variable, so it is the empty string).
$(T[i]+1)=c
    Store the character contained in variable c into the field whose number is T[i] + 1. For example, if T[i] contains 4, then store it in $5.
else c=T[i]
    If the array copy of field number "i" does not contain "m" followed by a digit, it must be a new value, which gets stored in variable c.
NF=11
    Cut off fields $12 to $NF, so that 11 fields remain.
1
    Print every record.
OFS="\t"
    Use tab as the output field separator.
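The steps above can be put back together roughly as follows. This is only a sketch: the complete command is not shown in this post, the input layout (marker tokens such as m1, m2 followed by the value they share) and the function name reformat_markers are hypothetical, and the separator uses "+" rather than "*" to keep the behaviour of the sketch easy to predict.

```shell
# Sketch only: reassembled from the step-by-step explanation.
# The input layout (markers m1, m2, ... followed by their value) is hypothetical.
reformat_markers() {
    # $1 = number of output columns to keep (the explanation uses 11)
    awk -v ncols="${1:-11}" '
    BEGIN { FS = "[ \t:;,]+"; OFS = "\t" }    # "+" instead of "*": a simplification
    {
        split($0, T)                          # T[] keeps a copy of the original fields
        for (i = NF; i >= 2; i--) {
            if (T[i] ~ /m[0-9]/) {
                sub(/m/, x, T[i])             # x is unset, so the "m" is deleted
                $(T[i] + 1) = c               # marker mN fills output field N+1
            } else {
                c = T[i]                      # a value for the markers to its left
            }
        }
        NF = ncols                            # drop the leftover fields
        print
    }'
}

printf 'snp1 m1;m3:A m2:C\n' | reformat_markers 4
```

With this made-up line, m1 and m3 pick up A and m2 picks up C, so the call should print snp1, A, C, A separated by tabs.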
With your actual raw data what is the required output?

S.
Thanks!

The output format is the same as in your first reply: the header is SNP, chromosome, the 96 species variants, Locus, position, for a total of 100 columns.

The body is:

column 1: SNP name = BKN00000xx
column 2: chromosome = 1, 2, 3, 4 or 5
columns 3-98: a single nucleotide (A/T/C/G/-) under each species
column 99: locus, e.g. At1g12300
column 100: position, e.g. 127861
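For a table this size, a quick check of the layout described above may be reassuring. The following is a minimal sketch, assuming the finished table is tab-separated with one header line; the function name check_snp_table is made up for illustration.

```shell
# Sketch only: checks the 100-column layout described in the post.
# Assumes tab-separated input with a single header line.
check_snp_table() {
    awk -F'\t' '
    NF != 100 { printf "line %d: %d columns\n", NR, NF; bad = 1; next }
    NR == 1   { next }                        # header: only the column count is checked
    {
        for (i = 3; i <= 98; i++)             # the 96 species columns
            if ($i !~ /^[ACGT-]$/) {
                printf "line %d, column %d: %s\n", NR, i, $i
                bad = 1
            }
    }
    END { exit (bad ? 1 : 0) }'
}

# Usage (file name is hypothetical):
#   check_snp_table < snp_table.txt && echo "layout looks fine"
```

It reports each offending line and exits with a non-zero status if anything is off, so it can be chained after the reformatting step.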

The awk script seems to be the right thing. Thanks again.

Yifangt

My reply got mixed up with my replies to Ludwig, who was trying the Perl script on this. Here is a copy of the reply that was wrongly sent to him. Below is an example of the output, with most of the middle columns omitted.
Your awk script is very impressive to me.

SNP chromosome Ag-0 An-1 Bay-0 Bil-5 Zdr-1 Zdr-6 Locus location
BKN000000001 1 C C C C C C AT1G01280 112482
BKN000000002 1 G G G G - - AT1G01280 112561
BKN000000003 1 G A A A A A AT1G01280 112771
.

Thank you again!
Yifang

---------- Post updated at 11:18 PM ---------- Previous update was at 11:00 PM ----------

Quote:
Originally Posted by m.d.ludwig
In your data sample:
There are six columns of data but four column headers.
Are the first three data columns the "SNP-Name"?
And the last two the "value", the 'A', 'B', 'C', 'D' in your example?

---------- Post updated at 10:51 AM ---------- Previous update was at 10:35 AM ----------

My initial implementation to generate a CSV file:
Code:
use strict;
use warnings;

$\ = "\n";
$, = '';

my %H;
my %D;

<>; # toss the header

while (<>) {
    chomp;
    my ($snpname, $snpidx, $acgt, $chomosomelist, $locus, $location) = split;

    unless (defined $location) {
        print STDERR $ARGV, '(', $., '): malformed entry - ', $_;
        next;
    }

#> adjust these as required to get a proper "label" and "value"

    $snpname .= '-' . $snpidx;
    $locus   .= '(' . $location . ')';

    foreach my $c (split /;/, $chomosomelist) {
        $H{$c}++;
        $D{$snpname}->{$c} = $locus;
    }
}

sub csv {
    local $, = ',';
    print map { defined $_ ? '"' . $_ . '"' : '"NA"' } @_;
}

my @H = sort keys %H;

csv '', @H;

foreach my $snpname (sort keys %D) {
    my $X = $D{$snpname};
    csv $snpname, map { $X->{$_} } @H;
}


Thanks!
I tried your code.
1) The output format is close to right: 97 columns, with the species as the header;
2) The cells are not right: I want A, C, T, G or - in each column, corresponding to the header;
3) With your code, each cell contains the Locus and Location (repeated 96 times!), not the A/T/C/G/-;
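The symptom in point 3 matches the line `$D{$snpname}->{$c} = $locus;` in the quoted Perl script, which stores the locus rather than `$acgt` in every cell. A minimal awk sketch of the same pivot with the nucleotide in the cells is given below. It assumes the raw data has six whitespace-separated columns in the order the Perl script splits them (name, chromosome, nucleotide, semicolon-separated species list, locus, location), preceded by one header line; that layout and the function name pivot_snp are assumptions, not confirmed by the thread.

```shell
# Sketch only: pivots "long" rows into one row per SNP.
# Assumed input columns: name chromosome nucleotide species1;species2;... locus location
pivot_snp() {
    awk '
    BEGIN { OFS = "\t" }
    NR == 1 { next }                          # toss the header
    NF < 6  { next }                          # skip malformed entries
    {
        snp = $1
        if (!(snp in seen)) {                 # remember SNPs in first-seen order
            seen[snp] = 1; order[++nsnp] = snp
            chrom[snp] = $2; locus[snp] = $5; pos[snp] = $6
        }
        n = split($4, names, ";")
        for (i = 1; i <= n; i++) {
            if (!(names[i] in col)) { col[names[i]] = 1; header[++ncol] = names[i] }
            cell[snp, names[i]] = $3          # the nucleotide, not the locus
        }
    }
    END {
        line = "SNP" OFS "chromosome"
        for (j = 1; j <= ncol; j++) line = line OFS header[j]
        print line OFS "Locus" OFS "location"
        for (k = 1; k <= nsnp; k++) {
            snp = order[k]
            line = snp OFS chrom[snp]
            for (j = 1; j <= ncol; j++)
                line = line OFS (((snp, header[j]) in cell) ? cell[snp, header[j]] : "NA")
            print line OFS locus[snp] OFS pos[snp]
        }
    }'
}
```

Species columns appear in the order they are first seen rather than sorted, and a species with no entry for a SNP gets "NA", as in the Perl version.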

I am not sure what the problem is. My output file should be pretty big (100 columns x 12281 rows), so I am quite nervous about it.

Thanks anyway! Yifang

Last edited by yifangt; 01-09-2011 at 10:47 PM..