When I use the awk below to count the unique values in $4 of the sample input, it seems to work: the answer is 3 because $4 takes only 3 distinct values across all the entries. However, when I run the same command on my actual data I get 56,536, and I know the answer should be 56,548. Is there a better way to count the unique values? Thank you.
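The awk command itself is not shown above, so what follows is only a sketch of one common way to count distinct values in a column, not a diagnosis of the 56,536 vs 56,548 gap. It assumes the file is tab-delimited and sets the separator explicitly, so that empty fields or fields containing spaces do not shift the column numbering. It also strips a trailing carriage return from the value. The function name `count_unique_field` and the file `sample.tsv` are made up for illustration.

```shell
# Hypothetical helper: count distinct values in one tab-separated column.
# Usage: count_unique_field FILE [FIELD]   (FIELD defaults to 4)
count_unique_field() {
    awk -F'\t' -v col="${2:-4}" '
        {
            val = $col
            sub(/\r$/, "", val)      # drop a trailing CR (matters for the last column of a DOS-format file)
            if (!(val in seen)) {    # first time this value appears
                seen[val] = 1
                n++
            }
        }
        END { print n + 0 }          # n + 0 prints 0 for empty input
    ' "$1"
}

# Made-up sample: column 4 holds A, B, A, C, so there are 3 distinct values
printf 'chr1\t10\t20\tA\nchr1\t30\t40\tB\nchr2\t10\t20\tA\nchr2\t50\t60\tC\n' > sample.tsv
count_unique_field sample.tsv 4
```

If the count still disagrees with the expected number, comparing it against `cut -f4 file | sort -u | wc -l` may help show which values are being merged or split.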
Hi.
I have a tab-separated file that has a couple of nearly identical lines. When doing:
sort file | uniq > file.new
It passes through the nearly identical lines because, well, they still are unique.
a)
I want to look only at field x for uniqueness and if the content in field x is the... (1 Reply)
I have this input file
tilenet_test:clar_r5_performance:server_2:4.80762:0%:APM00083103999-009E,APM00083103999-009F
tilenet_int:clar_r5_performance:server_2:4.80762:0%:APM00083103999-00C4... (3 Replies)
I would like to print unique lines without sort or uniq. Unfortunately, the server I am working on has neither sort nor uniq, and for several weeks I have not been able to reach the server's administrator to ask him to add them. (7 Replies)
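For a case like this, a widely used awk idiom prints each distinct line once without calling `sort` or `uniq`. This is a sketch that assumes awk is available on the server, which the excerpt does not confirm. The function name `unique_lines` is made up for illustration.

```shell
# Hypothetical helper: print each distinct line once, in first-seen order.
unique_lines() {
    # seen[$0]++ evaluates to 0 the first time a line appears,
    # so !seen[$0]++ is true only for the first occurrence.
    awk '!seen[$0]++' "$@"
}

# Made-up input: prints b, a, c (one per line)
printf 'b\na\nb\nc\na\n' | unique_lines
```

Unlike `sort | uniq`, this keeps the original order of the lines, at the cost of holding every distinct line in memory.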
I'm looking for an awk script that will take the unique values in column 5, then print and count the unique values in column 6.
CA001011500 11111 11111 -9999 201301 AAA
CA001012040 11111 11111 -9999 201301 AAA
CA001012573 11111 11111 -9999 201301 BBB
CA001012710 11111 11111 -9999 201301... (4 Replies)
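One possible reading of this request is to count how often each column 6 value occurs for each column 5 value. The sketch below follows that reading. The expected output is not shown in the excerpt, so the output format here is an assumption, and the function name `count_col6_per_col5` is made up. Lines with fewer than six fields are skipped.

```shell
# Hypothetical helper: for every (column 5, column 6) pair, print how often it occurs.
count_col6_per_col5() {
    awk '
        NF >= 6 { count[$5 " " $6]++ }              # skip lines without a sixth field
        END { for (k in count) print k, count[k] }
    ' "$@" | sort    # awk iteration order is unspecified, so sort for stable output
}

# The three complete rows from the sample above
printf '%s\n' \
    'CA001011500 11111 11111 -9999 201301 AAA' \
    'CA001012040 11111 11111 -9999 201301 AAA' \
    'CA001012573 11111 11111 -9999 201301 BBB' |
count_col6_per_col5
```

For these three rows the output is `201301 AAA 2` followed by `201301 BBB 1`.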
Hello Team,
I need your help on the following:
My input file a.txt is as below:
3330690|373846|108471
3330690|373846|108471
0640829|459725|100001
0640829|459725|100001
3330690|373847|108471
Here row 1 and row 2 of column 1 are identical but corresponding column 2 value are... (4 Replies)
Hi Folks,
I have a file with fields as follows, in which the last field is spread over multiple lines. I would like to combine a line that has three fields with a single-field line, as shown in the expected output. Please help.
INPUT
hname01 windows appnamec1eda_p1, ... (5 Replies)
I am trying to remove all the lines and spaces where the count in $4 or $5 is greater than 1 (more than 1 letter). The file and the output are tab-delimited. Thank you :).
file
X 5811530 . G C NLGN4X
17 10544696 . GA G MYH3
9 96439004 . C ... (1 Reply)
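A minimal sketch for this kind of filter, assuming the goal is to keep only the lines where both $4 and $5 are exactly one character long. The "and spaces" part of the request is not clear from the excerpt and is not handled here. The function name `keep_single_base` is made up.

```shell
# Hypothetical helper: keep tab-delimited lines whose fields 4 and 5 are one character each.
keep_single_base() {
    awk -F'\t' 'length($4) == 1 && length($5) == 1' "$@"
}

# The two complete rows from the sample above, with tabs restored;
# only the first row (G / C) is printed, the second (GA / G) is dropped.
printf 'X\t5811530\t.\tG\tC\tNLGN4X\n17\t10544696\t.\tGA\tG\tMYH3\n' | keep_single_base
```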
Hello,
I am trying to count unique rows in my file based on 4 columns (2-5) and to output the frequency of each in a sixth column. My file is tab-delimited.
My input file looks like this:
Colum1 Colum2 Colum3 Colum4 Coulmn5
1.1 100 100 a b
1.1 100 100 a c
1.2 200 205 a d
1.3 300 301 a y
1.3 300... (6 Replies)
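One way to sketch this, assuming every input row should be printed with its key frequency appended as a sixth column. The desired output is not shown in the excerpt, so that layout is an assumption. The function name `append_key_frequency` is made up, and a header line, if present, is treated like any other row.

```shell
# Hypothetical helper: append, to each tab-delimited row, how many rows share its columns 2-5.
append_key_frequency() {
    awk 'BEGIN { FS = OFS = "\t" }
        {
            key = $2 FS $3 FS $4 FS $5    # columns 2-5 form the key
            count[key]++
            line[NR] = $0                 # remember the row and its key for the second pass
            keyof[NR] = key
        }
        END { for (i = 1; i <= NR; i++) print line[i], count[keyof[i]] }
    ' "$@"
}

# Made-up rows: the first and third share columns 2-5, so both get a count of 2
printf '1.1\t100\t100\ta\tb\n1.1\t100\t100\ta\tc\n1.4\t100\t100\ta\tb\n' | append_key_frequency
```

This holds the whole file in memory. For very large inputs, reading the file twice (once to count, once to print) would avoid that.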
For some reason I am having difficulty performing what should be a fairly easy task. I would like to print lines of a file that have a unique value in the first field. For example, I have a large data-set with the following excerpt:
PS003,001 MZMWR/ L-DWD// *
PS003,001... (4 Replies)
Hi,
Sure it's an easy one, but it drives me insane.
input ("|" separated):
1|A,B,C,A
2|A,D,D
3|A,B,B
I would like to count the occurrences of each capital letter in $2 across the entire file, with duplicates within a record counting as 1.
I am trying to get this output... (5 Replies)
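A sketch of one approach: split $2 on commas, track which letters were already seen in the current record, and increment a file-wide total only on the first sighting. The desired output is cut off in the excerpt, so the `letter count` format below is an assumption, and the function name `count_letters_once_per_record` is made up.

```shell
# Hypothetical helper: count in how many records each comma-separated item of field 2 appears.
count_letters_once_per_record() {
    awk -F'|' '
        {
            n = split($2, items, ",")
            split("", seen)                     # clear the per-record set (portable idiom)
            for (i = 1; i <= n; i++)
                if (!(items[i] in seen)) {      # count each letter once per record
                    seen[items[i]] = 1
                    total[items[i]]++
                }
        }
        END { for (k in total) print k, total[k] }
    ' "$@" | sort    # awk iteration order is unspecified, so sort for stable output
}

# The three input records from the post
printf '1|A,B,C,A\n2|A,D,D\n3|A,B,B\n' | count_letters_once_per_record
```

For the three records shown, this prints `A 3`, `B 2`, `C 1` and `D 1`, one per line.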
Discussion started by: beca123456
Bio::Graphics::Glyph::wiggle_density(3pm) User Contributed Perl Documentation Bio::Graphics::Glyph::wiggle_density(3pm)
NAME
Bio::Graphics::Glyph::wiggle_density - A density plot compatible with dense "wig" data
SYNOPSIS
See <Bio::Graphics::Panel> and <Bio::Graphics::Glyph>.
DESCRIPTION
This glyph works like the regular density but takes value data in Bio::Graphics::Wiggle file format:
reference = chr1
ChipCHIP Feature1 1..10000 wigfile=./test.wig;wigstart=0
ChipCHIP Feature2 10001..20000 wigfile=./test.wig;wigstart=656
ChipCHIP Feature3 25001..35000 wigfile=./test.wig;wigstart=1312
The "wigfile" attribute gives a relative or absolute pathname to a Bio::Graphics::Wiggle format file. The optional "wigstart" option gives
the offset to the start of the data. If not specified, a linear search will be used to find the data. The data consist of a packed binary
representation of the values in the feature, using a constant step such as present in tiling array data.
OPTIONS
The same as the regular graded_segments glyph, except that the following options are recognized:
  Name              Value          Description
  ----              -----          -----------

  basedir           path           Path to be used to resolve "wigfile" and "densefile"
                                   tags giving relative paths. Default is to use the
                                   current working directory. Absolute wigfile &
                                   densefile paths will not be changed.

  autoscale         "local" or "global"
                                   If one or more of min_score and max_score options
                                   are absent, then these values will be calculated
                                   automatically. The "autoscale" option controls how
                                   the calculation is done. The "local" value will
                                   scale values according to the minimum and maximum
                                   values present in the window being graphed. "global"
                                   will use chromosome-wide statistics for the entire
                                   wiggle or dense file to find min and max values.

  smoothing         method name    Smoothing method: one of "mean", "max", "min" or "none"

  smoothing_window  integer        Number of values across which data should be smoothed.

  bicolor_pivot     name           Where to pivot the two colors when drawing bicolor plots.
                                   Options are "mean" and "zero". A numeric value can
                                   also be provided.

  pos_color         color          When drawing bicolor plots, the fill color to use for values
                                   that are above the pivot point.

  neg_color         color          When drawing bicolor plots, the fill color to use for values
                                   that are below the pivot point.
SPECIAL FEATURE TAGS
The glyph expects one or more of the following tags (attributes) in the feature it renders:

  Name              Value          Description
  ----              -----          -----------

  wigfile           path name      Path to the Bio::Graphics::Wiggle file for values.
                                   (required)

  densefile         path name      Path to a Bio::Graphics::DenseFeature object
                                   (deprecated)

  denseoffset       integer        Integer offset to where the data begins in the
                                   Bio::Graphics::DenseFeature file (deprecated)

  densesize         integer        Integer size of the data in the Bio::Graphics::DenseFeature
                                   file (deprecated)
BUGS
Please report them.
SEE ALSO
Bio::Graphics::Panel, Bio::Graphics::Glyph, Bio::Graphics::Glyph::arrow, Bio::Graphics::Glyph::cds, Bio::Graphics::Glyph::crossbox,
Bio::Graphics::Glyph::diamond, Bio::Graphics::Glyph::dna, Bio::Graphics::Glyph::dot, Bio::Graphics::Glyph::ellipse,
Bio::Graphics::Glyph::extending_arrow, Bio::Graphics::Glyph::generic, Bio::Graphics::Glyph::graded_segments,
Bio::Graphics::Glyph::heterogeneous_segments, Bio::Graphics::Glyph::line, Bio::Graphics::Glyph::pinsertion, Bio::Graphics::Glyph::primers,
Bio::Graphics::Glyph::rndrect, Bio::Graphics::Glyph::segments, Bio::Graphics::Glyph::ruler_arrow, Bio::Graphics::Glyph::toomany,
Bio::Graphics::Glyph::transcript, Bio::Graphics::Glyph::transcript2, Bio::Graphics::Glyph::translation, Bio::Graphics::Glyph::allele_tower,
Bio::DB::GFF, Bio::SeqI, Bio::SeqFeatureI, Bio::Das, GD
AUTHOR
Lincoln Stein <steinl@cshl.edu>.
Copyright (c) 2007 Cold Spring Harbor Laboratory
This package and its accompanying libraries is free software; you can redistribute it and/or modify it under the terms of the GPL (either
version 1, or at your option, any later version) or the Artistic License 2.0. Refer to LICENSE for the full license text. In addition,
please see DISCLAIMER.txt for disclaimers of warranty.
perl v5.14.2 2012-02-20 Bio::Graphics::Glyph::wiggle_density(3pm)