VCF-ISEC(1) User Commands VCF-ISEC(1)
NAME
vcf-isec - create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files
SYNOPSIS
vcf-isec [OPTIONS] file1.vcf file2.vcf ...
DESCRIPTION
About: Create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files.
Note that lines from all files can be intermixed together on the output, which can yield unexpected results.
OPTIONS
-C, --chromosomes <list|file>
Process the given chromosomes (comma-separated list or one chromosome per line in a file).
-c, --complement
Output positions present in the first file but missing from the other files.
-d, --debug
Debugging information.
-f, --force
Continue even if the script complains about differing columns.
-o, --one-file-only
Print only entries from the left-most file. Without -o, all unique positions will be printed.
-n, --nfiles [+-=]<int>
Output positions present in this many (=), this many or more (+), or this many or fewer (-) files.
-p, --prefix <path>
If present, multiple files will be created with all possible isec combinations. (Suitable for Venn Diagram analysis.)
-t, --tab <chr:pos:file>
Tab-delimited file with indexes of chromosome and position columns. (1-based indexes)
-w, --win <int>
In repetitive sequences, the same indel can be called at different positions. Consider records this far apart as matching (be it a SNP or an indel).
-h, -?, --help
This help message.
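The -n and -c selections can be pictured as counting how many input files contain each position. The following Python sketch illustrates that idea only; it is not how vcf-isec is implemented. The function names (parse_nfiles, select_positions, complement) and the in-memory sets of (chromosome, position) pairs are assumptions made for the example, and the sketch requires an explicit +, - or = prefix.

```python
from collections import Counter


def parse_nfiles(spec):
    """Split an -n style argument such as '+2', '-2' or '=2' into (op, count)."""
    if len(spec) < 2 or spec[0] not in "+-=" or not spec[1:].isdigit():
        raise ValueError(f"expected [+-=]<int>, got {spec!r}")
    return spec[0], int(spec[1:])


def select_positions(files, spec):
    """Return the positions whose file count satisfies spec, sorted.

    files: list of sets of (chrom, pos) tuples, one set per input file.
    """
    op, n = parse_nfiles(spec)
    # Each file contributes at most one count per position because it is a set.
    counts = Counter(pos for positions in files for pos in positions)
    keep = {
        "=": lambda c: c == n,  # present in exactly n files
        "+": lambda c: c >= n,  # present in n or more files
        "-": lambda c: c <= n,  # present in n or fewer files
    }[op]
    return sorted(pos for pos, c in counts.items() if keep(c))


def complement(files):
    """Positions in the first file that are missing from all other files (-c)."""
    first, others = files[0], files[1:]
    return sorted(first - set().union(*others))
```

With three input sets, select_positions(files, "=3") keeps only positions shared by all three, while complement(files) keeps positions unique to the first.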
EXAMPLES
bgzip file.vcf; tabix -p vcf file.vcf.gz
bgzip file.tab; tabix -s 1 -b 2 -e 2 file.tab.gz
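The tabix -s 1 -b 2 -e 2 call above indexes a tab-delimited file with the chromosome in column 1 and the position in column 2. The -t option describes the same columns in chr:pos:file form, so the matching argument would presumably be 1:2:file.tab.gz. The Python sketch below shows how such 1-based indexes map onto a line of the file; parse_tab_spec and extract_position are hypothetical helper names and are not part of vcf-isec.

```python
def parse_tab_spec(spec):
    """Split a 'chr:pos:file' argument into 1-based column indexes and a path."""
    # Split at most twice so that any colon inside the path is preserved.
    chr_col, pos_col, path = spec.split(":", 2)
    chr_col, pos_col = int(chr_col), int(pos_col)
    if chr_col < 1 or pos_col < 1:
        raise ValueError("column indexes are 1-based")
    return chr_col, pos_col, path


def extract_position(line, chr_col, pos_col):
    """Pull (chromosome, position) out of one tab-delimited line."""
    fields = line.rstrip("\n").split("\t")
    # Convert the 1-based indexes to Python's 0-based list indexes.
    return fields[chr_col - 1], int(fields[pos_col - 1])
```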
vcf-isec 0.1.5 July 2011 VCF-ISEC(1)
VCF-COMPARE(1) User Commands VCF-COMPARE(1)
NAME
vcf-compare - compare bgzipped and tabix indexed VCF files
SYNOPSIS
vcf-compare [OPTIONS] file1.vcf file2.vcf ...
DESCRIPTION
About: Compare bgzipped and tabix indexed VCF files. (E.g. bgzip file.vcf; tabix -p vcf file.vcf.gz)
OPTIONS
-c, --chromosomes <list|file>
Same as -r, left for backward compatibility. Please do not use as it will be dropped in the future.
-d, --debug
Debugging information. Giving the option multiple times increases verbosity.
-H, --cmp-haplotypes
Compare haplotypes, not only positions.
-m, --name-mapping <list|file>
Use with -H when comparing files with differing column names. The argument to this option is a comma-separated list or one mapping per line in a file. The names are colon-separated and must appear in the same order as the files on the command line.
-R, --refseq <file>
Compare the actual sequence, not just positions. Use with -w to compare indels.
-r, --regions <list|file>
Process the given regions (comma-separated list or one region per line in a file).
-s, --samples <list>
Process only the listed samples. Excluding unwanted samples may increase performance considerably.
-w, --win <int>
In repetitive sequences, the same indel can be called at different positions. Consider records this far apart as matching (be it a SNP or an indel).
-h, -?, --help
This help message.
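The -w window can be read as a tolerance on position when deciding whether two records refer to the same variant. The Python sketch below is one plausible reading, not the tool's actual algorithm: it assumes the window is inclusive, that records must share a chromosome, and that each record is matched at most once. The names within_window and match_records are made up for the illustration.

```python
def within_window(rec_a, rec_b, win):
    """True if two (chrom, pos) records share a chromosome and lie at most win bases apart."""
    chrom_a, pos_a = rec_a
    chrom_b, pos_b = rec_b
    return chrom_a == chrom_b and abs(pos_a - pos_b) <= win


def match_records(left, right, win):
    """Pair each record in left with the first unused record in right within win bases."""
    used = set()  # indexes into right that have already been paired
    pairs = []
    for rec in left:
        for i, cand in enumerate(right):
            if i not in used and within_window(rec, cand, win):
                used.add(i)
                pairs.append((rec, cand))
                break
    return pairs
```

With win set to 0 the sketch falls back to exact position matching.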
vcf-compare 0.1.5 July 2011 VCF-COMPARE(1)