Print a row with the max number in a column
Post 303044100 by RudiC on Thursday 13th of February 2020 04:46:44 PM
Try as well
Code:
sort -k1,1 -k9nr  file | uniq -w16
chr1_16857_17742        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        885
chr1_16857_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        1204
chr1_17232_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        829
chr1_17914_24891        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        6977
chr1_18267_29570        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        11303
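
The pipeline above sorts by column 1, then by column 9 in descending numeric order, and lets uniq keep the first line of each run whose first 16 characters match (-w is a GNU uniq option). As a rough alternative for keys that are not exactly 16 characters wide, something like the following awk sketch might work. It assumes whitespace-separated fields with the key in column 1 and the number in column 9; the function name max_row_per_key is made up for illustration and is not part of the original post.

```shell
# Hypothetical helper (not from the original post): for each distinct value
# in column 1, print the row whose column 9 is numerically largest.
# Assumes whitespace-separated fields; on ties the first row seen is kept.
max_row_per_key() {
    awk '
        # First time this key is seen, or a strictly larger value in column 9
        !($1 in max) || $9 + 0 > max[$1] { max[$1] = $9 + 0; row[$1] = $0 }
        # awk iterates arrays in unspecified order, so sort by key afterwards
        END { for (k in row) print row[k] }
    ' "$@" | sort -k1,1
}
```

Call it as `max_row_per_key file`, or pipe data into it with no arguments.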

 

tabix(1)						       Bioinformatics tools							  tabix(1)

NAME
bgzip - Block compression/decompression utility
tabix - Generic indexer for TAB-delimited genome position files

SYNOPSIS
bgzip [-cdhB] [-b virtualOffset] [-s size] [file]
tabix [-0lf] [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol] [-S lineSkip] [-c metaChar] in.tab.bgz [region1 [region2 [...]]]

DESCRIPTION
Tabix indexes a TAB-delimited genome position file in.tab.bgz and creates an index file in.tab.bgz.tbi when region is absent from the command line. The input data file must be position sorted and compressed by bgzip, which has a gzip(1)-like interface. After indexing, tabix is able to quickly retrieve data lines overlapping regions specified in the format "chr:beginPos-endPos". Fast data retrieval also works over a network if a URI is given as the file name; in this case the index file will be downloaded if it is not present locally.

OPTIONS OF TABIX
-p STR    Input format for indexing. Valid values are: gff, bed, sam, vcf and psltab. This option should not be applied together with any of -s, -b, -e, -c and -0; it is not used for data retrieval because this setting is stored in the index file. [gff]
-s INT    Column of sequence name. Options -s, -b, -e, -S, -c and -0 are all stored in the index file and thus not used in data retrieval. [1]
-b INT    Column of start chromosomal position. [4]
-e INT    Column of end chromosomal position. The end column can be the same as the start column. [5]
-S INT    Skip first INT lines in the data file. [0]
-c CHAR   Skip lines starting with character CHAR. [#]
-0        Specify that the position in the data file is 0-based (e.g. UCSC files) rather than 1-based.
-h        Print the header/meta lines.
-B        The second argument is a BED file. When this option is in use, the input file may not be sorted or indexed. The entire input will be read sequentially. Nonetheless, with this option, the format of the input must be specified correctly on the command line.
-f        Force to overwrite the index file if it is present.
-l        List the sequence names stored in the index file.

EXAMPLE
(grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz;
tabix -p gff sorted.gff.gz;
tabix sorted.gff.gz chr1:10,000,000-20,000,000;

NOTES
It is straightforward to achieve overlap queries using the standard B-tree index (with or without binning) implemented in all SQL databases, or the R-tree index in PostgreSQL and Oracle. But there are still many reasons to use tabix. Firstly, tabix directly works with a lot of widely used TAB-delimited formats such as GFF/GTF and BED. We do not need to design database schema or specialized binary formats. Data do not need to be duplicated in different formats, either. Secondly, tabix works on compressed data files while most SQL databases do not. The GenCode annotation GTF can be compressed down to 4%. Thirdly, tabix is fast. The same indexing algorithm is known to work efficiently for an alignment with a few billion short reads. SQL databases probably cannot easily handle data at this scale. Last but not least, tabix supports remote data retrieval. One can put the data file and the index at an FTP or HTTP server, and other users or even web services will be able to get a slice without downloading the entire file.

AUTHOR
Tabix was written by Heng Li. The BGZF library was originally implemented by Bob Handsaker and modified by Heng Li for remote file access and in-memory caching.

SEE ALSO
samtools(1)

tabix-0.2.0							    11 May 2010 							  tabix(1)