Shell Programming and Scripting: Extract sequences of bytes from binary for differents blocks
Post 302843499 by Ophiuchus on Tuesday 13th of August 2013 02:02:48 PM
Hello to all,

If there is a way to extract the sequences directly from the binary file, I think that would be better and faster.
I'm not sure whether this is possible with bash or Perl, or with some other tool you can suggest.

Hello Jotne,

Thanks for your help. Your script detects the positions of the sequences, but I would like to
extract those sequences to a new file, so that the output file has one line for each block
of the binary file.
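
A minimal sketch of that kind of extraction, written in Python rather than bash or Perl. It assumes every block starts with a fixed marker; the marker bytes `AA BB`, the function names, and the file names are all hypothetical placeholders, since the real block layout is not described in this post.

```python
from typing import List


def split_blocks(data: bytes, marker: bytes) -> List[bytes]:
    """Split data into blocks, each beginning at an occurrence of marker.

    The marker is a hypothetical block delimiter; bytes before the first
    marker are ignored.
    """
    if not marker:
        raise ValueError("marker must not be empty")
    starts = []
    pos = data.find(marker)
    while pos != -1:
        starts.append(pos)
        pos = data.find(marker, pos + len(marker))
    # Each block runs from its marker up to the next marker (or end of data).
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]


def blocks_to_hex_lines(data: bytes, marker: bytes) -> List[str]:
    """Return one uppercase hex string per block."""
    return [block.hex().upper() for block in split_blocks(data, marker)]


def extract(in_path: str, out_path: str, marker: bytes) -> int:
    """Read the binary file directly and write one hex line per block."""
    with open(in_path, "rb") as src:
        data = src.read()
    lines = blocks_to_hex_lines(data, marker)
    with open(out_path, "w") as dst:
        for line in lines:
            dst.write(line + "\n")
    return len(lines)
```

Calling `extract("input.bin", "output.txt", b"\xAA\xBB")` would write one line of hex digits per block. The whole file is read into memory at once, which keeps the sketch short but may not suit very large files.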

Hello wisecracker,
I'll check the option you mention, but is it possible to extract the complete byte sequence with your script?

Thanks again for the help.
 
