Create table based on matched patterns


 
# 1  
Old 02-01-2014

Hi,

I need help creating a table from an input file like this:

Code:
DB|QZX3  140  165  RT_2   VgGIGvGVR
DB|QZX3  155  182  UT_1   rlgslqqLaIvlGiFT
DB|QZX3  345  362  RT_1   GRKpllligS
DB|ZXK6  174  199  RT_2  IstvtvptYlgEiatvkaR
DB|ZXK6  189  216  UT_1    algtiyqLfLviGiLF
DB|AZ264  15  17    RT_2  getapvYlaEmspasiR
DB|A1Z8N1  457  474  RT_1  GGPLIEYLGRRntilatA
DB|A1Z8N1  499  524 RT_2  LaGFCvGIaslsqpevR
DB|A1Z8N1  690  706  RT_1  GIVLIDKillyv.S
DB|A3M0N3  133  158  RT_2  LaGLGvGLiR
DB|A3M0N3  334  351  RT_1  GIGRRklllggS

The output file should be a table like this:
Code:
ID                  RT_1                                 RT_2                                        UT1
DB|QZX3        G R K p l l l i g S                    V g G I G v G V R                             r l g s l q q L a I v l G i F T
DB|ZXK6                                               I s t v t v p t Y l g E i a t v k a R         a l g t i y q L f L v i G i L F
DB|AZ264                                              g e t a p v Y l a E m s p a s i R
DB|A1Z8N1      G G P L I E Y L G R R n t i l a t A
DB|A1Z8N1      G I V L I D K i l l y v . S
DB|A3M0N3      G I G R R k l l l g g S                L a G L G v G L i R

As you can see above, there are 4 requirements:
1) the values in $4 should become the column headers after $1
2) an ID with no value for a column should have that cell left blank
3) the same ID with different values for the same column should be printed on separate rows
4) the characters of $5 in the input file need to be separated by spaces

I have thousands of lines like this that I need to arrange. I used "paste", but the result is not neat and does not display exactly what I want. Please help me do this in awk if possible. Thanks.

Last edited by redse171; 02-01-2014 at 02:24 PM.. Reason: typo
# 2  
Old 02-01-2014
Are RT_1, RT_2, UT1 the only values that can appear in column 4?
# 3  
Old 02-01-2014
Hi,
Yes, only these 3 will appear in the 4th column.
# 4  
Old 02-01-2014
I can see 5 distinct values in column 4 though:
Code:
RT1
RT2
RT_1
RT_2
UT1
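
One quick way to check which values actually occur in column 4 is to count them with awk. This is only a sketch: the function name `distinct_col4` is made up for illustration, and it assumes whitespace-separated fields as in the sample above.

```shell
# Print every distinct value found in column 4, preceded by its line count.
# distinct_col4 is a hypothetical helper name; pass the data file as $1.
distinct_col4() {
    awk '{ n[$4]++ } END { for (v in n) print n[v], v }' "$1" | sort -k2
}
```

Running `distinct_col4 input` (with `input` being the data file) prints one line per distinct value, so stray spellings such as a missing underscore show up immediately.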

# 5  
Old 02-01-2014
Hi,
Sorry, it was a typo. It should be RT_1, RT_2 and UT_1.
# 6  
Old 02-01-2014
Put this into "script.pl":
Code:
#!/usr/bin/perl
use strict;
use warnings;

open my $input, "<", $ARGV[0] or die "cannot open file: $ARGV[0]: $!";

# Start offset of each column inside the 120-character row buffer.
my %offset = (RT_1 => 0, RT_2 => 40, UT_1 => 80);

my %output;
while (my $line = <$input>) {
  chomp $line;
  my @F = split / +/, $line;
  next unless @F >= 5 && exists $offset{$F[3]};
  $output{$F[0]} //= " " x 120;
  # Space out the sequence and overwrite the 40-character slot for its type.
  # A later line with the same ID and type replaces the earlier one.
  substr($output{$F[0]}, $offset{$F[3]}, 40) = sprintf "%-40s", join " ", split //, $F[4];
}

# Header uses the same widths as the data rows so the columns line up.
printf "%-15s%-40s%-40s%s\n", "ID", "RT_1", "RT_2", "UT_1";
foreach my $id (sort keys %output) {
  printf "%-15s%s\n", $id, $output{$id};
}

Then make it executable and run:
Code:
chmod +x script.pl
./script.pl input

# 7  
Old 02-01-2014
Try also:
Code:
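awk '
        # Remember every ID and column type, space out the sequence, store it per (ID, type).
        {LN[$1]; HD[$4]; gsub (/./, "& ", $5); MX[$1,$4]=$5}
END     {FMT="%-35s"
         # Header row; "for (i in HD)" visits the types in an unspecified order.
         printf FMT, "ID"; for (i in HD) printf FMT, i; print ""
         # One row per ID; a repeated (ID, type) pair keeps only its last sequence.
         for (j in LN) {printf FMT, j; for (i in HD) printf FMT, MX[j,i]; print ""}
        }
' file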
ID                                 UT_1                               RT_1                               RT_2                               
DB|A1Z8N1                                                             G I V L I D K i l l y v . S        L a G F C v G I a s l s q p e v R  
DB|AZ264                                                                                                 g e t a p v Y l a E m s p a s i R  
DB|A3M0N3                                                             G I G R R k l l l g g S            L a G L G v G L i R                
DB|QZX3                            r l g s l q q L a I v l G i F T    G R K p l l l i g S                V g G I G v G V R                  
DB|ZXK6                            a l g t i y q L f L v i G i L F                                       I s t v t v p t Y l g E i a t v k a R
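
Neither script above prints a repeated ID/type pair on its own row (point 3 of the original request), and the awk column order depends on the awk implementation. The following is a minimal sketch of one way to cover both. It is not taken from either answer, and it rests on some assumptions: the column order is fixed as RT_1, RT_2, UT_1; 40 characters per column is wide enough; and the function name `make_table` is invented for illustration.

```shell
# make_table FILE: one row per ID, plus extra rows when an ID repeats a type.
make_table() {
    awk '
    BEGIN {
        # Assumed fixed column order and width.
        ncols = split("RT_1 RT_2 UT_1", hdr, " ")
        fmt = "%-40s"
    }
    NF >= 5 {
        id = $1; t = $4
        if (!(id in rows)) { order[++nids] = id; rows[id] = 0 }
        k = ++count[id, t]            # k-th sequence of this type for this ID
        if (k > rows[id]) rows[id] = k
        seq = $5
        gsub(/./, "& ", seq)          # put a space after every character
        sub(/ $/, "", seq)            # drop the trailing space
        cell[id, t, k] = seq
    }
    END {
        line = sprintf("%-15s", "ID")
        for (c = 1; c <= ncols; c++) line = line sprintf(fmt, hdr[c])
        sub(/ +$/, "", line); print line
        for (n = 1; n <= nids; n++) {           # IDs in first-seen order
            id = order[n]
            for (k = 1; k <= rows[id]; k++) {   # k-th row holds the k-th value of each type
                line = sprintf("%-15s", id)
                for (c = 1; c <= ncols; c++) line = line sprintf(fmt, cell[id, hdr[c], k])
                sub(/ +$/, "", line); print line
            }
        }
    }' "$1"
}
```

Called as `make_table input`, this prints DB|A1Z8N1 twice: the first row carries its first RT_1 sequence together with its RT_2 sequence, and the second row carries only the second RT_1 sequence.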
