I find myself doing this task fairly often. It's a bit brain-bending to think about at first, but it is actually relatively straightforward:
So:
Not tested but should be at least pretty close.
Well, I hope this way you will respond to my inquiries.
I have 4 Unix servers with static IPs (though I don't think this is an issue). I can telnet and rlogin from one to the other. If I FTP from one to the other and try to execute:
cd /user return
/user : no such file or... (1 Reply)
Hi,
I'm looking for someone who can think in sed. Basically, I need the trailing characters on every line in a file to be deleted. These characters are all capitals and always follow a number, but how many of them there are varies.
For instance, on the line:
2006_10_9_p20_TALK
I'd want to... (4 Replies)
Hi All,
I am writing some data to a file from a C++ program. Each line I write has a fixed length, say 232 characters.
My C code is as follows:
... (0 Replies)
Hi
I deleted important files from my company's server :(
The server runs HP-UX,
and I don't know how to undo the rm command or how to restore the files.
I would appreciate any help.
Thanx ... (5 Replies)
Gurus,
I have a file with multiple columns, and the number of columns is not fixed; sometimes it contains 5 columns, sometimes 12. I need to find the differences within that file. Here is the sample file:
param1 | 10 | 20 | 30 |
param2 | 10 |... (6 Replies)
Hi Experts,
My requirement is to compare the second field (column) in two files. If the second column is the same in both files, then compare the first field. If the first field does not match, print the first and second fields of both files.
first file (a .txt)
< 1210018971FF0000,... (6 Replies)
Hello,
I am a PhD student (biomedical sciences) and very new to Linux. I need some help with the following task:
I have files whose names follow this format:
An_A1_nnn_R1.txt; An_A1_nnm_R1.txt; An_A1_nnoo_R1.txt
An_A2_nnn_R1.txt; An_A2_nnm_R1.txt; An_A2_nno_R1.txt
... (8 Replies)
Hi,
I want to compare two files and print out their differences,
e.g.:
t1.txt
a,b,c,d
t2.txt
a,b,c,d,e,f
Output
e,f
Currently I do this in a roundabout way:
tr ',' '\n' <t1.txt >t1.tmp
tr ',' '\n' <t2.txt >t2.tmp
diff t1.tmp t2.tmp > t12.tmp
I have to do this comparison for 100 files, so... (3 Replies)
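One way the tr/diff steps above could be folded into a single reusable function is sketched below. It is only an illustration: the name csv_extra_items is made up, and it assumes each file holds a single comma-separated line and that item order does not matter (the items are sorted before comparison). It swaps diff for comm, which prints only the items unique to the second file.

```shell
# Hypothetical helper (not from the thread): print the items that appear
# in the second comma-separated file but not in the first.
csv_extra_items() {
    csv_a=$(mktemp) && csv_b=$(mktemp) || return 1
    # One item per line, sorted, because comm requires sorted input.
    tr ',' '\n' < "$1" | sort -u > "$csv_a"
    tr ',' '\n' < "$2" | sort -u > "$csv_b"
    # -13 suppresses lines unique to file 1 and lines common to both.
    comm -13 "$csv_a" "$csv_b" | paste -sd, -
    rm -f "$csv_a" "$csv_b"
}

# Sample data taken from the post above.
printf 'a,b,c,d\n' > t1.txt
printf 'a,b,c,d,e,f\n' > t2.txt
csv_extra_items t1.txt t2.txt
```

With the sample files this prints e,f. For many files, the function could be called in a loop over file pairs instead of keeping separate .tmp files per comparison.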
I have two files, shown below, that list the ACL permissions of each file. I need to compare the source file with the target file and list the differences as specified in the required output below. Can someone help me with this?
Source File
*************
# file: /local/test_1
# owner: own
#... (4 Replies)
Discussion started by: sarathy_a35
tabix
tabix(1) Bioinformatics tools tabix(1)
NAME
bgzip - Block compression/decompression utility
tabix - Generic indexer for TAB-delimited genome position files
SYNOPSIS
bgzip [-cdhB] [-b virtualOffset] [-s size] [file]
tabix [-0lf] [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol] [-S lineSkip] [-c metaChar] in.tab.bgz [region1 [region2 [...]]]
DESCRIPTION
Tabix indexes a TAB-delimited genome position file in.tab.bgz and creates an index file in.tab.bgz.tbi when region is absent from the
command-line. The input data file must be position sorted and compressed by bgzip, which has a gzip(1)-like interface. After indexing,
tabix is able to quickly retrieve data lines overlapping regions specified in the format "chr:beginPos-endPos". Fast data retrieval also
works over a network if a URI is given as the file name; in this case the index file will be downloaded if it is not present locally.
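To make the "chr:beginPos-endPos" region format and the idea of overlapping lines concrete, here is a rough sketch of a naive linear scan over an uncompressed GFF file. It is not how tabix works internally (tabix uses its index on the compressed file); the function name overlap_scan and the demo data are made up, and the sketch assumes sequence names contain no ':' and that the sequence, start and end columns are 1, 4 and 5, the GFF defaults listed under OPTIONS.

```shell
# Naive stand-in for a region query; NOT tabix's indexed algorithm.
# Usage: overlap_scan REGION FILE   (FILE is an uncompressed GFF)
overlap_scan() {
    # Drop thousands separators: "chr1:1,000-2,000" becomes "chr1:1000-2000".
    region=$(printf '%s' "$1" | tr -d ',')
    chrom=${region%%:*}    # text before the first ':'
    range=${region#*:}     # text after the first ':'
    awk -F '\t' -v c="$chrom" -v b="${range%%-*}" -v e="${range#*-}" '
        /^#/ { next }                      # skip header/meta lines
        $1 == c && $4 <= e && $5 >= b      # print features overlapping [b, e]
    ' "$2"
}

# Made-up three-feature GFF, for illustration only.
printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n' > demo.gff
printf 'chr1\tsrc\tgene\t500\t900\t.\t+\t.\tID=g2\n' >> demo.gff
printf 'chr2\tsrc\tgene\t150\t250\t.\t+\t.\tID=g3\n' >> demo.gff
overlap_scan 'chr1:150-600' demo.gff
```

The call prints the g1 and g2 lines: both lie on chr1 and share at least one position with 150-600, while g3 is on a different sequence. A scan like this reads the whole file on every query, which is the cost the index is meant to avoid.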
OPTIONS OF TABIX
-p STR Input format for indexing. Valid values are: gff, bed, sam, vcf and psltab. This option should not be applied together with any
of -s, -b, -e, -c and -0; it is not used for data retrieval because this setting is stored in the index file. [gff]
-s INT Column of sequence name. Options -s, -b, -e, -S, -c and -0 are all stored in the index file and thus not used in data retrieval.
[1]
-b INT Column of start chromosomal position. [4]
-e INT Column of end chromosomal position. The end column can be the same as the start column. [5]
-S INT Skip first INT lines in the data file. [0]
-c CHAR Skip lines starting with character CHAR. [#]
-0 Specify that the position in the data file is 0-based (e.g. UCSC files) rather than 1-based.
-h Print the header/meta lines.
-B The second argument is a BED file. When this option is in use, the input file may not be sorted or indexed. The entire input will
be read sequentially. Nonetheless, with this option, the format of the input must be specified correctly on the command line.
-f Force overwriting of the index file if it is present.
-l List the sequence names stored in the index file.
EXAMPLE
(grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz;
tabix -p gff sorted.gff.gz;
tabix sorted.gff.gz chr1:10,000,000-20,000,000;
NOTES
It is straightforward to achieve overlap queries using the standard B-tree index (with or without binning) implemented in all SQL
databases, or the R-tree index in PostgreSQL and Oracle. But there are still many reasons to use tabix. Firstly, tabix directly works with
a lot of widely used TAB-delimited formats such as GFF/GTF and BED. We do not need to design database schema or specialized binary formats.
Data do not need to be duplicated in different formats, either. Secondly, tabix works on compressed data files while most SQL databases do
not. The GenCode annotation GTF can be compressed down to 4%. Thirdly, tabix is fast. The same indexing algorithm is known to work
efficiently for an alignment with a few billion short reads. SQL databases probably cannot easily handle data at this scale. Last but not
least, tabix supports remote data retrieval. One can put the data file and the index at an FTP or HTTP server, and other users or even web
services will be able to get a slice without downloading the entire file.
AUTHOR
Tabix was written by Heng Li. The BGZF library was originally implemented by Bob Handsaker and modified by Heng Li for remote file access
and in-memory caching.
SEE ALSO
samtools(1)
tabix-0.2.0 11 May 2010 tabix(1)