Post 303024136 by himanshu on Monday 1st of October 2018 08:34:18 AM
Count Repetitive Number in a column and renumbering using awk

I am unable to get the desired output. I need only the rows that have repeated values in column 5.

Input File <tab separated file>
Code:
chr1    3773797 3773797 CEP10   1
chr1    3773797 3773797 CEP104  2
chr1    3689350 3689350 SMIM1   2
chr1    3773797 3773797 CEP4    3
chr1    3773797 3773797 EP104   4
chr1    43283059    43283059    CCDC23  5
chr1    43282775    43282775    ERMAP   5

Rows with a repeated value in column 5 should be extracted, and an extra column 6 should be added that numbers each group of repeated values from 1 to n (all rows sharing the first repeated value get 1, the next repeated value gets 2, and so on).

Output File <tab separated file>
Code:
chr1    3773797 3773797 CEP104  2   1
chr1    3689350 3689350 SMIM1   2   1
chr1    43283059    43283059    CCDC23  5   2
chr1    43282775    43282775    ERMAP   5   2
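
One possible approach, offered as a minimal sketch rather than a definitive answer: read the file twice with awk, counting each column 5 value on the first pass and printing only the rows whose value occurs more than once on the second pass. The function name renumber_repeats and the temporary sample file are made up for illustration. The sketch assumes the fields are separated by real tab characters and that groups should be numbered in order of first appearance.

```shell
#!/bin/sh
# Hypothetical helper: print rows whose column 5 value is repeated,
# with an extra column 6 holding the group number (1 to n).
renumber_repeats() {
    awk 'BEGIN { FS = OFS = "\t" }
         # First pass over the file: count each column 5 value.
         NR == FNR { count[$5]++; next }
         # Second pass: keep rows whose column 5 value occurs more than once.
         count[$5] > 1 {
             # Assign the next group number the first time a value is seen.
             if (!($5 in id)) id[$5] = ++n
             print $0, id[$5]
         }' "$1" "$1"
}

# Demo with the sample data from the post (written with real tabs).
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\n' \
    chr1 3773797 3773797 CEP10 1 \
    chr1 3773797 3773797 CEP104 2 \
    chr1 3689350 3689350 SMIM1 2 \
    chr1 3773797 3773797 CEP4 3 \
    chr1 3773797 3773797 EP104 4 \
    chr1 43283059 43283059 CCDC23 5 \
    chr1 43282775 43282775 ERMAP 5 > "$sample"

renumber_repeats "$sample"
rm -f "$sample"
```

The file name is passed twice so that awk can tell the counting pass (NR == FNR) from the printing pass. Because the counts are keyed on the value itself, repeated values that are not on adjacent lines would still share one group number; whether that is wanted is not stated in the post.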

Moderator's Comments:
Please use CODE tags as required by forum rules!

Last edited by RudiC; 10-01-2018 at 09:41 AM. Reason: Added CODE tags.
 
