My files look like this
I need to cut the sequences at the last "A" found in the following pattern (highlighted here for easier identification; the pattern in the actual file is not highlighted).
The expected result should look like this
Thus, all the sequences would end with AGCCCTA... (2 Replies)
This is what I would like to accomplish: I have an input file (file A) that consists of thousands of sequence elements with the same number of characters (length), each headed by a free-text header starting with the chevron '>' character followed by the ID (all different IDs with different lengths)... (9 Replies)
My file looks something like this
What I need is to look for the reference sequence (">Reference1") and, based on the length of that sequence, trim all the entries in that file. So, the resulting file will contain all sequences with the same length, like this
Thus, all sequences will keep... (5 Replies)
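The trimming described in this excerpt can be sketched with a two-pass awk script. This is only a sketch under stated assumptions, since the sample data is not shown: the function name `trim_to_reference` is hypothetical, the reference header is assumed to be exactly `>Reference1` with no trailing text, and each sequence is assumed to sit on a single line.

```shell
# trim_to_reference FILE [REF_ID]
# Hypothetical helper: trims every sequence in FILE to the length of the
# sequence whose header line is exactly ">REF_ID" (default: Reference1).
# Assumes each record is one header line followed by one sequence line.
trim_to_reference() {
    awk -v ref=">${2:-Reference1}" '
        # Pass 1 (first read of the file): remember the length of the
        # line that follows the reference header.
        NR == FNR {
            if (grab) { len = length($0); grab = 0 }
            if ($0 == ref) grab = 1
            next
        }
        # Pass 2: headers are printed unchanged.
        /^>/ { print; next }
        # Sequence lines are cut to the reference length; if no reference
        # was found (len is 0), they are passed through untouched.
        {
            if (len > 0) print substr($0, 1, len)
            else print
        }
    ' "$1" "$1"
}

# Usage: trim_to_reference input.fasta Reference1 > trimmed.fasta
```

Reading the file twice keeps the logic simple and works even when the reference is not the first record. Sequences shorter than the reference are left at their own length; padding them is outside the scope of this sketch.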
I have a FASTQ file from small RNA sequencing with sequence lengths between 15 and 30. I want to keep only the sequences with lengths between 21 and 25 and write them to another FASTQ file. How can I do that? (4 Replies)
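One possible approach to the length filter asked about here is to buffer each four-line FASTQ record in awk and print it only when the sequence line falls inside the wanted range. The function name `filter_fastq_by_length` is hypothetical, and the sketch assumes unwrapped FASTQ (exactly four lines per record: header, sequence, separator, quality).

```shell
# filter_fastq_by_length FILE [MIN] [MAX]
# Hypothetical helper: prints only the records whose sequence length is
# between MIN and MAX inclusive (defaults: 21 and 25).
# Assumes every record occupies exactly four lines.
filter_fastq_by_length() {
    awk -v min="${2:-21}" -v max="${3:-25}" '
        # Store the current line in its slot (0..3) of the record buffer.
        { rec[(NR - 1) % 4] = $0 }
        # On every fourth line the record is complete: rec[1] is the sequence.
        NR % 4 == 0 {
            n = length(rec[1])
            if (n >= min && n <= max)
                print rec[0] "\n" rec[1] "\n" rec[2] "\n" rec[3]
        }
    ' "$1"
}

# Usage: filter_fastq_by_length reads.fastq 21 25 > filtered.fastq
```

The whole record is kept or dropped together, so the quality line always stays paired with its sequence.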
I have two files with thousands of sequences of different lengths. infile1 contains the actual sequences and infile2 the scores for each A, T, G and C in infile1. Something like this:
infile1:
>HZVJKYI01ECH5R
TTGATGTGCCAGCTGCCGTTGGTGTGCCAA
>HZVJKYI01AQWJ8
GGATATGATGATGAACTGGTTTGGCACACC... (4 Replies)
I have to remove sequences from a file based on the distance value. I am attaching the file containing the distances (Distance.xls)
The second file looks something like this:
Sequences.txt
>Sample1 Freq 59
ggatatgatgatgaactggt
>Sample1 Freq 54
ggatatgatgttgaactggt
>Sample1 Freq 44... (2 Replies)
Hello to all,
I would like to search sequences of bytes inside big binary file.
The bin file contains blocks of information; each block is structured as follows:
1- Each block begins with the hex byte 32 (1 byte) and ends with FF. After the FF of the last block comes 33.
2- Next... (59 Replies)
I have a text file, input.fasta contains some protein sequences. input.fasta is shown below.
>P02649
MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQT
LSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQA
RLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVY... (8 Replies)
I have two files. File1 is shown below.
>153L:B|PDBID|CHAIN|SEQUENCE
RTDCYGNVNRIDTTGASCKTAKPEGLSYCGVSASKKIAERDLQAMDRYKTIIKKVGEKLCVEPAVIAGIISRESHAGKVL
KNGWGDRGNGFGLMQVDKRSHKPQGTWNGEVHITQGTTILINFIKTIQKKFPSWTKDQQLKGGISAYNAGAGNVRSYARM
DIGTTHDDYANDVVARAQYYKQHGY
>16VP:A|PDBID|CHAIN|SEQUENCE... (7 Replies)
I have this file:
>ID1
AA
>ID2
TTTTTT
>ID-3
AAAAAAAAA
>ID4
TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC
And I am trying to use this script to output sequences longer than 15 characters:
sed -r '/^>/N;{/^.{,15}$/d}'
The desired output would be this:
>ID4... (8 Replies)
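As an alternative to the sed attempt quoted in this excerpt, the filter can be sketched in awk, where the header and the sequence are handled as separate lines, so the header length never enters the comparison. The function name `keep_long_sequences` is hypothetical, and the sketch assumes each sequence sits on a single line; gap characters such as "-" are counted like any other character.

```shell
# keep_long_sequences FILE [MIN]
# Hypothetical helper: prints only the FASTA records whose sequence is
# longer than MIN characters (default: 15).
# Assumes one header line followed by one sequence line per record.
keep_long_sequences() {
    awk -v min="${2:-15}" '
        # Remember the most recent header; do not print it yet.
        /^>/ { header = $0; next }
        # Print header and sequence only when the sequence is long enough.
        length($0) > min { print header; print }
    ' "$1"
}

# Usage: keep_long_sequences input.fasta 15
```

Run against the four records shown in the excerpt, only the ID4 record passes, because the other three sequences are 2, 6 and 9 characters long.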
Discussion started by: Xterra
h5diff
h5diff(1) General Commands Manual h5diff(1)
NAME
h5diff - Compares two HDF5 files and reports the differences.
SYNOPSIS
h5diff file1 file2 [OPTIONS] [object1 [object2 ] ]
DESCRIPTION
h5diff is a command line tool that compares two HDF5 files, file1 and file2, and reports the differences between them.
Optionally, h5diff will compare two objects within these files. If only one object, object1, is specified, h5diff will compare object1 in
file1 with object1 in file2. If two objects, object1 and object2, are specified, h5diff will compare object1 in file1 with object2 in
file2. These objects must be HDF5 datasets.
object1 and object2 must be expressed as absolute paths from the respective file's root group.
Additional information, with several sample cases, can be found in the document H5diff Examples.
OPTIONS
file1 file2
The HDF5 files to be compared.
-h Print all differences.
-r Print only the names of objects that differ; do not print the differences. These objects may be HDF5 datasets, groups, or named
datatypes.
-n count
Print difference up to count differences, then stop. count must be a positive integer.
-d delta
Print only differences that are greater than the limit delta. delta must be a positive number. The comparison criterion is whether
the absolute value of the difference of two corresponding values is greater than delta (e.g., |a-b| > delta, where a is a value in
file1 and b is a value in file2).
-p relative
Print only differences that are greater than a relative error. relative must be a positive number. The comparison criterion is
whether the absolute value of the difference between 1 and the ratio of two corresponding values is greater than relative (e.g., |1-(b/a)| >
relative where a is a value in file1 and b is a value in file2).
object1 object2
Specific object(s) within the files to be compared.
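The two comparison criteria documented for -d and -p above can be illustrated with plain awk arithmetic. This is only a sketch of the formulas as stated in this page, not h5diff's implementation: the function names `exceeds_delta` and `exceeds_relative` are hypothetical, and returning status 2 when a is zero is an assumption of the sketch (the page does not say how h5diff treats that case).

```shell
# exceeds_delta A B DELTA
# Succeeds (exit status 0) when |A-B| > DELTA, the criterion given for -d.
exceeds_delta() {
    awk -v a="$1" -v b="$2" -v d="$3" 'BEGIN {
        diff = a - b
        if (diff < 0) diff = -diff      # absolute value
        exit !(diff > d)
    }'
}

# exceeds_relative A B RELATIVE
# Succeeds (exit status 0) when |1-(B/A)| > RELATIVE, the criterion for -p.
# Exits with status 2 when A is zero (an assumption of this sketch).
exceeds_relative() {
    awk -v a="$1" -v b="$2" -v r="$3" 'BEGIN {
        if (a == 0) exit 2
        err = 1 - b / a
        if (err < 0) err = -err         # absolute value
        exit !(err > r)
    }'
}

# Usage: exceeds_delta 1.0 1.5 0.1 && echo "difference would be reported"
```

For a = 100 and b = 110, the absolute difference is 10 and the relative error is 0.1, so the pair passes a delta of 5 or a relative limit of 0.05, but not a relative limit of 0.2.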
EXAMPLES
The following h5diff call compares the object /a/b in file1 with the object /a/c in file2:
h5diff file1 file2 /a/b /a/c
This h5diff call compares the object /a/b in file1 with the same object in file2:
h5diff file1 file2 /a/b
And this h5diff call compares all objects in both files:
h5diff file1 file2
SEE ALSO
h5dump(1), h5ls(1), h5repart(1), h5import(1), gif2h5(1), h52gif(1), h5perf(1)
h5diff(1)