tabix(1) [debian man page]

tabix(1)						       Bioinformatics tools							  tabix(1)

NAME
bgzip - Block compression/decompression utility
tabix - Generic indexer for TAB-delimited genome position files

SYNOPSIS
bgzip [-cdhB] [-b virtualOffset] [-s size] [file]

tabix [-0lf] [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol] [-S lineSkip] [-c metaChar] in.tab.bgz [region1 [region2 [...]]]

DESCRIPTION
Tabix indexes a TAB-delimited genome position file in.tab.bgz and creates an index file in.tab.bgz.tbi when no region is given on the command line. The input data file must be sorted by position and compressed with bgzip, which has a gzip(1)-like interface. After indexing, tabix can quickly retrieve data lines overlapping regions specified in the format "chr:beginPos-endPos". Fast data retrieval also works over a network if a URI is given as the file name; in this case the index file is downloaded if it is not present locally.

OPTIONS OF TABIX
-p STR
       Input format for indexing. Valid values are: gff, bed, sam, vcf and psltab. This option should not be combined with any of -s, -b, -e, -c and -0; it is not used for data retrieval because this setting is stored in the index file. [gff]

-s INT
       Column of sequence name. Options -s, -b, -e, -S, -c and -0 are all stored in the index file and thus not used in data retrieval. [1]

-b INT
       Column of start chromosomal position. [4]

-e INT
       Column of end chromosomal position. The end column can be the same as the start column. [5]

-S INT
       Skip the first INT lines in the data file. [0]

-c CHAR
       Skip lines starting with character CHAR. [#]

-0
       Specify that the position in the data file is 0-based (e.g. UCSC files) rather than 1-based.

-h
       Print the header/meta lines.

-B
       The second argument is a BED file. When this option is in use, the input file need not be sorted or indexed; the entire input is read sequentially. Nonetheless, the format of the input must still be specified correctly on the command line.

-f
       Force overwriting of the index file if it is present.

-l
       List the sequence names stored in the index file.

EXAMPLE
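As a companion to the options above, here is a minimal sketch of the sort, compress and index steps for a BED file. The file names (in.bed, sorted.bed, sorted.bed.gz), the sample records and the query region are hypothetical. The sort keys assume the usual BED layout (sequence name in column 1, start position in column 2), which this page does not itself spell out. The bgzip and tabix calls run only when both tools are found on the PATH.

```shell
# Create a tiny, unsorted BED file (hypothetical sample data).
printf 'chr2\t500\t600\nchr1\t3000\t3100\nchr1\t100\t200\n' > in.bed

# Position sort: by sequence name, then numerically by start position.
sort -k1,1 -k2,2n in.bed > sorted.bed

# Compress, index and query only if bgzip and tabix are installed.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip -c sorted.bed > sorted.bed.gz   # -c writes to standard output
    tabix -f -p bed sorted.bed.gz         # -f overwrites an existing index
    tabix sorted.bed.gz chr1:1-1000       # retrieve lines overlapping the region
fi
```

The GFF pipeline that follows works the same way, but additionally keeps the "#" header lines ahead of the sorted records.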
(grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz;

tabix -p gff sorted.gff.gz;

tabix sorted.gff.gz chr1:10,000,000-20,000,000;

NOTES
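The query in the example writes its region with thousands separators. As an illustration of the "chr:beginPos-endPos" format only, the sketch below splits such a string into its three parts in plain POSIX shell. The function name parse_region is hypothetical, and this is not tabix's actual parser; it also assumes the sequence name contains no colon.

```shell
# parse_region REGION: print sequence name, begin and end, TAB-separated.
# Hypothetical helper; it illustrates the region string format only.
parse_region() {
    region=$1
    chr=${region%%:*}                            # text before the first ':'
    range=${region#*:}                           # text after the first ':'
    range=$(printf '%s' "$range" | tr -d ',')    # drop thousands separators
    beg=${range%%-*}                             # text before the first '-'
    end=${range#*-}                              # text after the first '-'
    printf '%s\t%s\t%s\n' "$chr" "$beg" "$end"
}

parse_region chr1:10,000,000-20,000,000
```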
It is straightforward to achieve overlap queries using the standard B-tree index (with or without binning) implemented in all SQL databases, or the R-tree index in PostgreSQL and Oracle. But there are still many reasons to use tabix. Firstly, tabix works directly with many widely used TAB-delimited formats such as GFF/GTF and BED. We do not need to design a database schema or specialized binary formats, and data do not need to be duplicated in different formats. Secondly, tabix works on compressed data files while most SQL databases do not. The GenCode annotation GTF can be compressed down to 4% of its original size. Thirdly, tabix is fast. The same indexing algorithm is known to work efficiently for an alignment with a few billion short reads. SQL databases probably cannot easily handle data at this scale. Last but not least, tabix supports remote data retrieval. One can put the data file and the index on an FTP or HTTP server, and other users or even web services will be able to get a slice without downloading the entire file.

AUTHOR
Tabix was written by Heng Li. The BGZF library was originally implemented by Bob Handsaker and modified by Heng Li for remote file access and in-memory caching.

SEE ALSO
samtools(1)

tabix-0.2.0							  11 May 2010								  tabix(1)

Check Out this Related Man Page

Bio::Graphics::Glyph::vista_plot(3pm)			User Contributed Perl Documentation		     Bio::Graphics::Glyph::vista_plot(3pm)

NAME
Bio::Graphics::Glyph::vista_plot - The "vista_plot" glyph

SYNOPSIS
See Bio::Graphics::Glyph, Bio::Graphics::Glyph::wiggle_xyplot and Bio::Graphics::Glyph::heat_map.

DESCRIPTION
This glyph draws peak calls (features with discrete boundaries, i.e. putative transcription sites) over a signal graph (wiggle_xyplot). It requires a special load GFF file that uses the attributes 'wigfile' and 'peak_type'.

Example:

2L chip_seq vista 5407 23011573 . . . Name=ChipSeq Exp 1;wigfile=SomeWigFile.wigdb;peak_type=binding_site:exp1

The glyph draws the wiggle file first, then overlays the peaks (if there are any) on the signal graph. Elsewhere in the GFF3 file, there should be one or more features of type "binding_site:exp1", e.g.:

2L exp1 binding_site 91934 92005 . . .

Options like 'balloon hover' and 'link' are available to customize interaction with peaks in detail view.

BigWig support:

The supported BigWig format requires one more attribute in the load GFF file (fasta), which specifies the sequence index file for the organism in use. The data file should have the 'bw' extension, which vista_plot uses to detect the BigWig format.

3L chip_seq vista 1 24543530 . . . Name=ChipSeq Exp 2;wigfile=SomeBigWigFile.bw;peak_type=binding_site:exp2;fasta=YourOrganism.fasta

Note that all attributes should be present in the load GFF, as the code currently does not handle the situation where only some of the attributes are present. To omit peak or signal drawing, use "" (i.e. peak_type="").

In both cases, the stanza code will look the same (only essential parameters shown):

[VISTA_PLOT]
feature          = vista:chip_seq
glyph            = vista_plot
label            = 1
smoothing        = mean
smoothing_window = 10
bump density     = 250
autoscale        = local
variance_band    = 1
max_peak         = 255
min_peak         = 1
peakwidth        = 3
start_color      = lightgray
end_color        = black
pos_color        = blue
neg_color        = orange
bgcolor          = orange
alpha            = 80
fgcolor          = black
database         = database_with_load_gff_data
box_subparts     = 1
bicolor_pivot    = min
key              = VISTA plot

OPTIONS
Options are the same as for wiggle_xyplot and heat_map.

Additional parameters:

alpha
       Set transparency for peak area.

glyph_subtype
       Display only 'peaks', 'signal', 'density' or 'peaks+signal'. Aliases for 'peaks+signal' include "both" and "vista".

Recommended global settings: for proper peak drawing, transparency should be enabled by setting truecolors=1 in the GBrowse.conf file.

BUGS
Please report them.

SEE ALSO
Bio::Graphics::Panel
Bio::Graphics::Glyph
Bio::Graphics::Glyph::wiggle_xyplot
Bio::Graphics::Glyph::heat_map
GD

AUTHOR
Peter Ruzanov pruzanov@oicr.on.ca

Copyright (c) 2010 Ontario Institute for Cancer Research

This package and its accompanying libraries are free software; you can redistribute them and/or modify them under the terms of the GPL (either version 1, or at your option, any later version) or the Artistic License 2.0. Refer to LICENSE for the full license text. In addition, please see DISCLAIMER.txt for disclaimers of warranty.

perl v5.14.2							  2012-02-20				     Bio::Graphics::Glyph::vista_plot(3pm)