awk to parse field and include the text of 1 pipe in field 4
Post 302959901 by cmccabe on Saturday 7th of November 2015 10:45:24 AM
That `awk` produces:

Code:
chr1    955543    955763    AGRN-6|pr=2|gc=75    +    
 
chr1    957571    957852    AGRN-7|pr=3|gc=61.2    +  
   
chr1    970621    970740    AGRN-8|pr=1|gc=57.1    +

The `|pr=2`, `|pr=3`, and `|pr=1` parts are not needed, and there appears to be a blank line after each row, which may be problematic for later analysis.

Thank you :).
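The cleanup asked for above (dropping the `|pr=N` part of field 4 and the blank line after each row) could be sketched as a post-processing step like the one below. This is only a sketch, not the command from the thread: the function name `clean_rows` and the file names in the usage comment are hypothetical, and it assumes the fields contain no embedded whitespace and that tab-separated output is wanted.

```shell
# Hypothetical helper (name is illustrative, not from the thread).
# Reads rows on stdin and writes cleaned, tab-separated rows on stdout.
clean_rows() {
    awk 'BEGIN { OFS = "\t" }                 # assumption: tab-separated output is wanted
         { sub(/\r$/, "") }                   # drop a trailing carriage return, if present
         !NF { next }                         # skip empty or whitespace-only lines
         { sub(/[|]pr=[0-9]+/, "", $4)        # remove the |pr=N part from field 4
           $1 = $1                            # rebuild the record so OFS is applied
           print }'
}

# Usage (file names are placeholders):
#   clean_rows < output.txt > cleaned.txt
```

With the three rows shown above as input, field 4 would become `AGRN-6|gc=75`, `AGRN-7|gc=61.2`, and `AGRN-8|gc=57.1`, with no blank lines in between. Filtering afterwards treats the symptom only; if the blank lines come from the original `awk` command or from the input's line endings, fixing them at the source would be cleaner.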
 
