Hi gang.
I'm using a Unix/Mac system and I'm trying to sort a file (more than 1,000,000 lines).
chr1 100000965 100001001 -
chr1 100002155 100002191 +
chr1 100002165 100002201 +
chr1 100002525 100002561 -
chr1 10000364 ... (2 Replies)
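The post is cut off before it says which order is wanted, so this is only a guess at the intent: order the rows by chromosome name, then by start and end position compared as numbers. A plain text sort compares the start column character by character, which is presumably why 10000364 shows up after 100002525 in the excerpt. The function name `sort_by_position` is made up for this sketch.

```python
def sort_by_position(lines):
    """Sort whitespace-separated rows by column 1 (as text),
    then by columns 2 and 3 compared as integers."""
    def key(line):
        fields = line.split()
        return (fields[0], int(fields[1]), int(fields[2]))
    return sorted(lines, key=key)


# The four complete rows from the post, deliberately shuffled.
rows = [
    "chr1 100002525 100002561 -",
    "chr1 100000965 100001001 -",
    "chr1 100002165 100002201 +",
    "chr1 100002155 100002191 +",
]

for row in sort_by_position(rows):
    print(row)
```

Chromosome names are still compared as text here, so chr10 would come before chr2; that part would need its own key if natural ordering matters. A million lines of this size should fit in memory on most machines.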
Hi all,
In my CSV file the data looks like this, and of course it may have more columns:
US to UK;abc-hq-jcl;multimedia
UK to CN;def-ny-jkl;standard
DE to DM;abc-ab-klm;critical
FD to YM;la-yr-tym;standard
HY to MC;la-yr-ytm;multimedia
GT to KJ;def-ny-jrt;critical
I would like to group... (4 Replies)
Hi,
I have two files, one of which I would like to sort based on the order of the data in the second. I would like to do this using a simple unix statement.
My two files as follows:
File 1:
12345 1 2 2 2 0 0
12349 0 0 2 2 1 2
12350 1 2 1 2 2 2
.
.
.
File2:
12350... (3 Replies)
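One way to approach this, sketched in Python rather than as a single Unix statement: build a lookup from each ID in File2 to its position, then sort File1 by that position. File2 is truncated in the excerpt, so the ID order used below is invented purely for illustration, and `sort_by_reference` is a hypothetical name.

```python
def sort_by_reference(rows, ordered_ids):
    """Sort rows by the position of their first field in ordered_ids.
    Rows whose ID is not listed go last, keeping their original order."""
    rank = {id_: i for i, id_ in enumerate(ordered_ids)}
    return sorted(rows, key=lambda row: rank.get(row.split()[0], len(rank)))


file1 = [
    "12345 1 2 2 2 0 0",
    "12349 0 0 2 2 1 2",
    "12350 1 2 1 2 2 2",
]
# Assumed order: only the first ID (12350) is visible in the post.
file2_ids = ["12350", "12345", "12349"]

for row in sort_by_reference(file1, file2_ids):
    print(row)
```

The dictionary lookup keeps this at one pass over each file plus the sort, which matters if File1 is large.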
Hi
I have some files in a directory, and the file names look like this:
jnhld_15233_2010-11-23
jnhld_15233_2007-10-01
jnhld_15233_2001-05-04
jnhld_15233_2011-11-11
jnhld_15233_2005-06-07
jnhld_15233_2000-04-01
..etc
How can I sort these files based on the date in the file name so that ... (4 Replies)
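A minimal sketch, assuming the date is always the part after the last underscore and is always written as YYYY-MM-DD. The post is cut off before it says whether oldest or newest should come first, so this sorts oldest first. `date_in_name` and `sort_by_date` are names chosen for the example.

```python
from datetime import date


def date_in_name(name):
    """Parse the YYYY-MM-DD date that follows the last underscore."""
    return date.fromisoformat(name.rsplit("_", 1)[1])


def sort_by_date(names, newest_first=False):
    """Return the names ordered by the date embedded in each name."""
    return sorted(names, key=date_in_name, reverse=newest_first)


names = [
    "jnhld_15233_2010-11-23",
    "jnhld_15233_2007-10-01",
    "jnhld_15233_2001-05-04",
    "jnhld_15233_2011-11-11",
    "jnhld_15233_2005-06-07",
    "jnhld_15233_2000-04-01",
]

for name in sort_by_date(names):
    print(name)
```

Because the dates are zero-padded and year-first, sorting the date strings as plain text would give the same order; parsing them just makes a malformed name fail loudly instead of sorting silently into the wrong place.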
I have a tab delimited file with 5 columns
79 A B 20.2340 6.1488 8.5086 1.3838
87 A B 0.1310 0.0382 0.0054 0.1413
88 A B 46.1651 99.0000 21.8107 0.2203
89 A B 0.1400 0.1132 0.0151 0.1334
114 A B 0.1088 0.0522 0.0057 0.1083
115 A B... (2 Replies)
I would like to sort a tab delimited text file based on the absolute value of its second column. How do I go about doing that? Thanks!
Example input:
A -12
B 0
C -6
D 7
Output:
A -12
D 7
C -6
B 0 (4 Replies)
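The example output above is ordered from largest to smallest absolute value, so a sketch along these lines should reproduce it. It assumes every line has a numeric second field; `sort_by_abs` is a made-up name.

```python
def sort_by_abs(lines, sep="\t"):
    """Sort lines by the absolute value of the second field, largest first."""
    return sorted(
        lines,
        key=lambda line: abs(float(line.split(sep)[1])),
        reverse=True,
    )


rows = ["A\t-12", "B\t0", "C\t-6", "D\t7"]

for row in sort_by_abs(rows):
    print(row)
```

The original lines are returned untouched; the absolute value is only used as the sort key, so the minus signs survive in the output.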
Hi,
I use Ubuntu 12.04.
I have a file with this structure:
Name 2 1245787 A G 12 14 12 14 ....
Name 1 1245789 C T 13 12 12 12.....
I would like to sort my file based on the second column, to get this output for example:
Name 1 1245789 C T 13 12 12 12.....
Name 2 1245787 A G 12 14... (4 Replies)
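A small sketch of sorting on one column, assuming the fields are separated by whitespace and the chosen column holds whole numbers. The rows below are the two from the post with the trailing dots left off; `sort_by_column` is a hypothetical helper.

```python
def sort_by_column(lines, column=1):
    """Sort lines by the integer in the given zero-based column."""
    return sorted(lines, key=lambda line: int(line.split()[column]))


rows = [
    "Name 2 1245787 A G 12 14 12 14",
    "Name 1 1245789 C T 13 12 12 12",
]

for row in sort_by_column(rows):
    print(row)
```

Passing a different `column` value sorts on another field, for example `column=2` for the position in the third column.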
Hi,
I have two pipe separated files as below:
head -3 file1.txt
"HD"|"Nov 11 2016 4:08AM"|"0000000018"
"DT"|"240350264"|"56432"
"DT"|"240350264"|"56432"
head -3 file2.txt
"HD"|"Nov 15 2016 2:18AM"|"0000000019"
"DT"|"240350264"|"56432"
"DT"|"240350264"|"56432"
I want to list the... (6 Replies)
Hi,
I have a file that contains data based on tags. The output of the file should be in the order of the tags.
Below are the files :
Tags.txt
f12
f13
f23
f45
f56
Original data is like this :
Data.txt
2017/01/04|09:07:00:021|R|XYZ|38|9|1234|f12=CAT|f23=APPLE|f45=PENCIL|f13=CAR... (5 Replies)
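The expected output is not visible in the excerpt, so the following is only one reading of the request: leave the leading fields alone and rearrange the name=value fields to follow the order in Tags.txt. It assumes any field containing "=" is a tag field and that all tag fields come after the fixed ones. The sample line stops where the post is truncated, and `reorder_tags` is a name invented here.

```python
def reorder_tags(line, tag_order, sep="|"):
    """Keep fields without '=' in place at the front, then append the
    name=value fields sorted by the position of their name in tag_order.
    Tags missing from tag_order go last."""
    fields = line.split(sep)
    fixed = [f for f in fields if "=" not in f]
    tagged = [f for f in fields if "=" in f]
    rank = {tag: i for i, tag in enumerate(tag_order)}
    tagged.sort(key=lambda f: rank.get(f.split("=", 1)[0], len(rank)))
    return sep.join(fixed + tagged)


tags = ["f12", "f13", "f23", "f45", "f56"]
# Visible part of the data line only; the rest is cut off in the post.
line = "2017/01/04|09:07:00:021|R|XYZ|38|9|1234|f12=CAT|f23=APPLE|f45=PENCIL|f13=CAR"

print(reorder_tags(line, tags))
```

A tag that is listed in Tags.txt but absent from a line (f56 here) is simply skipped; if an empty placeholder is wanted instead, the function would need to emit one per missing tag.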
Hi All
I have a requirement to list all the files in chronological order based on the date value in the file name. For example, if I have three files as given below:
ABC_TEST_20160103_1012.txt
ABC_TEST_20160229_1112.txt
ABC_TEST_20160229_1112.txt
I have written code as given below to list out... (2 Replies)
Discussion started by: ginrkf
bp_mask_by_search
BP_MASK_BY_SEARCH(1p) User Contributed Perl Documentation BP_MASK_BY_SEARCH(1p)
NAME
mask_by_search - mask sequence(s) based on its alignment results
SYNOPSIS
mask_by_search.pl -f blast genomefile blastfile.bls > maskedgenome.fa
DESCRIPTION
Mask sequence based on significant alignments of another sequence. You need to provide the report file and the entire sequence data which
you want to mask. By default this will assume you have done a TBLASTN (or TFASTY) and try and mask the hit sequence assuming you've
provided the sequence file for the hit database. If you would like to do the reverse and mask the query sequence specify the -t/--type
query flag.
This is going to read the whole sequence file into memory, so for large genomes this may fall over. I'm using DB_File to prevent keeping
everything in memory; one solution is to split the genome into pieces (BEFORE you run the DB search though: you want to use the exact file
you BLASTed with as input to this program).
Below, the double-dash (--) options are of the form --format=fasta or --format fasta, or you can just say -f fasta.
By -f/--format I mean either is an acceptable option. The =s, =n, or =c suffix indicates the kind of value the option expects; =s, for example, expects a 'string'.
Options:
-f/--format=s            Search report format (fasta,blast,axt,hmmer,etc)
-sf/--sformat=s          Sequence format (fasta,genbank,embl,swissprot)
--hardmask               (boolean) Hard mask the sequence
                         with the maskchar [default is lowercase mask]
--maskchar=c             Character to mask with [default is N], change
                         to 'X' for protein sequences
-e/--evalue=n            Evalue cutoff for HSPs and Hits, only
                         mask sequence if alignment has specified evalue
                         or better
-o/--out/--outfile=file  Output file to save the masked sequence to.
-t/--type=s              Alignment seq type you want to mask, the
                         'hit' or the 'query' sequence. [default is 'hit']
--minlen=n               Minimum length of an HSP for it to be used
                         in masking [default 0]
-h/--help                See this help information
AUTHOR - Jason Stajich
Jason Stajich, jason-at-bioperl-dot-org.
perl v5.14.2 2012-03-02 BP_MASK_BY_SEARCH(1p)