Extract sequences of bytes from binary for different blocks


 
# 22  
Old 08-18-2013
Hello ahamed,

I'll test your new code asap. Thanks so much.

I'm attaching the image. You are correct: the sequence that begins with 0x83 is followed by 17 bytes, while the other sequences are followed by 16 bytes.
That is why there is a 0x01 before the 0x86 sequence.

Best regards
[Attached image: sub-block-sequences.jpg]

Last edited by Ophiuchus; 08-18-2013 at 05:58 PM..
# 23  
Old 08-19-2013
Hello ahamed,

Thanks for your help. I've tried the code and it extracts the sequences.

Some other things are still pending on my side, but I'll investigate how to do them myself (printing some bytes as decimal, etc.).

I was trying to print all the sequences of each block on the same line, but only the first sequences print correctly; the sub-block sequences still appear on separate lines.

I tried modifying the print routine as below, but it still doesn't do what I want.
Code:
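#include <stdio.h>

/* Print bytes 0..19 of ptr as four hex groups of 1, 3, 8 and 8 bytes. */
void print_data(const unsigned char *ptr, int len)
{
        int i;
        (void)len;                      /* len is not used yet */
        printf("%02x ", ptr[0]);        /* byte 0 */
        for(i=1;i<=3;i++)               /* bytes 1..3 */
                printf("%02x", ptr[i]);
        printf(" ");
        for(i=4;i<=11;i++)              /* bytes 4..11 */
                printf("%02x", ptr[i]);
        printf(" ");
        for(i=12;i<=19;i++)             /* bytes 12..19 */
                printf("%02x", ptr[i]);
        printf("\n");                   /* this newline ends the line after every call */
}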

The desired output is:
Code:
32 000001 991145278934550f 73494549232fffff 
32 000002 991145278934551f 73494554768fffff 80 0f 01 02 00000030 7349526905ffffff 00 81 .... 83 ... 87 ...
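
For illustration only, the grouping shown in the first desired line can be sketched in Python (not the C used in this thread). The function name `format_record` and the `widths` parameter are made up for this sketch; the field widths 1, 3, 8 and 8 are simply read off the print routine above. It returns the text without a newline, so the caller decides where each line ends.

```python
def format_record(record, widths=(1, 3, 8, 8)):
    """Split a record into consecutive fields of the given byte widths and
    return them as space-separated hex groups (no trailing newline).

    The default widths are an assumption taken from the print routine above.
    """
    groups = []
    offset = 0
    for width in widths:
        field = record[offset:offset + width]
        # Iterating over bytes yields integers in Python 3.
        groups.append("".join("%02x" % byte for byte in field))
        offset += width
    return " ".join(groups)


# The 20 bytes behind the first line of the desired output.
record = bytes.fromhex("32000001991145278934550f73494549232fffff")
print(format_record(record))
```

Because nothing here prints a newline on its own, several formatted records can be joined on one line before the line is ended.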

Thanks again.

Regards
# 24  
Old 08-20-2013
Here is some starter Python 3.3.x code that extracts the blocks with a user-defined function...

This should be more than enough to get you going with your binary manipulation.

Once each block is placed into its respective _variable_ it is seriously easy to extract sub-blocks of data from each major block extracted from the binary file...
Code:
# BinarySearch.py
# DEMO to extract binary data...
# Written in such a way as to be easily understood.
# NOTE:- There are more Pythonic ways to do this...

import os
import sys

# The _variable_ used to test with...
block_one=b"?"
block_two=b"?"
subscript_begin=0

# Set to a _temporary_ directory. NOTE:- 4 different platforms shown...
# Change to suit your platform and path
# NOTE:- Python 3.3+ reports "linux", older versions "linux2"; startswith() covers both.
if sys.platform.startswith("linux"): os.chdir("/tmp/")
if sys.platform=="darwin": os.chdir("/Users/barrywalker/")
if sys.platform=="win32": os.chdir("C:\\Windows\\Temp\\")
if sys.platform=="amiga": os.chdir("T:")

# A basic "bin_extract()" function to extract binary data from a binary file...
def bin_extract(subscript_start=0, subscript_end=0, filename="?", char_start=b"\x00\xFF", char_end=b"\xFF\x00"):
	bin_var=b""
	toggle=0
	binfile=open(filename, "rb")
	for position in range(subscript_start, subscript_end, 1):
		binfile.seek(position)
		if binfile.read(2)==char_start:
			bin_var=bin_var+char_start[1:2]
			toggle=1
		if toggle>=1: bin_var=bin_var+binfile.read(1)
		if binfile.read(2)==char_end:
			bin_var=bin_var+char_end[0:1]
			toggle=0
			break
	binfile.close()
	return(bin_var, position)

# Test...
# ******** Do the first iteration and find the first block... *******
block_one, subscript_begin=bin_extract(0, 816, "BinaryFile", b"\xFF\x32", b"\xFF\x34")
print(block_one)
print(subscript_begin)
# When successful..........
# Just extract the required data from the _variable_ "new_binary" using standard simple Python methods
# and when finished with set new values to the function for the next block run using "subscript_begin"
# + or - 2 as the next "subscript_start" in the function, something like below...
block_two, subscript_begin=bin_extract((subscript_begin+2), 816, "BinaryFile", b"\xFF\x32", b"\xFF\x34")
print(block_two)
print(subscript_begin)
# DEMO end...
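
The demo notes that there are more Pythonic ways to do this. One possible sketch, assuming the whole file fits in memory, uses `bytes.find()` instead of seeking byte by byte. The name `extract_block` and its return convention are assumptions made for this sketch, not part of the demo; it tries to mirror what the demo appears to return, namely a block running from the last byte of the start marker to the first byte of the end marker.

```python
def extract_block(data, start_marker, end_marker, offset=0):
    """Return (block, end_index) for the first block at or after offset.

    Assumed convention (read off the demo above): the block starts at the
    last byte of start_marker and ends with the first byte of end_marker.
    Returns (b"", len(data)) when no complete block is found.
    """
    start = data.find(start_marker, offset)
    if start == -1:
        return b"", len(data)
    end = data.find(end_marker, start + len(start_marker))
    if end == -1:
        return b"", len(data)
    block = data[start + len(start_marker) - 1:end + 1]
    return block, end


# Made-up sample bytes containing two FF 32 ... FF 34 blocks.
sample = b"\x00\xff\x32\x01\x02\x03\xff\x34\x00\xff\x32\x0a\x0b\xff\x34"
first, pos = extract_block(sample, b"\xff\x32", b"\xff\x34")
second, pos = extract_block(sample, b"\xff\x32", b"\xff\x34", pos + 1)
print(first, second, pos)
```

For a real file, `data` would come from `open(filename, "rb").read()`.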

Hope this helps.
CYA.
# 25  
Old 08-20-2013
Hello wisecracker,

Thanks for your help.

I'm trying, but I don't know how to run it.

I saved your code, without the lines after "# Test...", as script.py. Then I entered
Code:
>>> import script

and it was accepted without errors.
But when I enter the line "block_one,...." I get the error below:
Code:
>>> block_one, subscript_begin=bin_extract(0, 816, "binfile", b"\xFF\x32", b"\xFF\x34")
Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
    block_one, subscript_begin=bin_extract(0, 816, "binfile", b"\xFF\x32", b"\xFF\x34")
NameError: name 'bin_extract' is not defined

What am I doing wrong? How should it be executed?

Thanks again.
# 26  
Old 08-20-2013
You do not know how to run Python code? ;oO
It is a DEMO so why are you importing it?

I assume you are using Windows...

I also assume that the pseudo-text file you uploaded here and this code are both inside the C:\Windows\Temp directory...

I have tested on a Macbook Pro and Windows Vista/7 and both give exactly the same results.

This file is called BinarySearch.py; I named your binary file BinaryFile.

So from a Python prompt...
Code:
>>> exec(open("C:\\Windows\\Temp\\BinarySearch.py").read())<CR>

Where <CR> is the ENTER key...
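
As a side note on the NameError from the previous post: a plain `import script` binds only the module name, so the function has to be called as `script.bin_extract(...)` or imported by name. The sketch below builds a stand-in module in memory so that it needs no file on disk; the dummy `bin_extract` body is a placeholder for illustration, not the real function.

```python
import types

# Hypothetical in-memory stand-in for script.py (placeholder function body).
script = types.ModuleType("script")
exec("def bin_extract():\n    return 'called'", script.__dict__)

# After a plain "import script", the bare name is not in the caller's namespace.
try:
    bin_extract()
    bare_call = "worked"
except NameError:
    bare_call = "NameError"

# Qualifying the name with the module works.
qualified_call = script.bin_extract()

# "from script import bin_extract" binds the bare name; done by hand here.
bin_extract = script.bin_extract
imported_call = bin_extract()

print(bare_call, qualified_call, imported_call)
```

The `exec(open(...).read())` approach works for the same reason: it runs the file's code in the current namespace, so `bin_extract` is defined there directly.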

You should now be able to do the rest for yourself.

I am now rigged for silent running...
# 27  
Old 08-20-2013
Hello wisecracker,

Thanks for all your help. I don't want to be annoying. The script now seems to run
the way you described, but when I run this:
Code:
>>> exec(open("C:\\Windows\\Temp\\BinarySearch.py").read())
b''
815

Then, when you say "Just extract the required data from the _variable_ "new_binary" using standard simple Python methods", which variable contains the "binary"
block? Is it block_one or bin_var?

Why is it printing only b''?

I'm using this:
Code:
block_one, subscript_begin=bin_extract(0, 816, "binfile", b"\xFF\x32", b"\xFF\x34")
print(block_one)
print(subscript_begin)
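
One possible reading of the b'' and 815 result, based only on the demo code: bytes are appended to bin_var only after char_start has matched, so an empty result with the position at the last index of the range suggests that the byte pair FF 32 was never found. This would happen, for example, if the file is a text hex dump rather than raw bytes. A quick check, sketched with a made-up helper `marker_offsets` and made-up sample data, is to list where the marker occurs.

```python
def marker_offsets(data, marker):
    """Return every offset at which marker occurs in data (hypothetical helper)."""
    offsets = []
    index = data.find(marker)
    while index != -1:
        offsets.append(index)
        index = data.find(marker, index + 1)
    return offsets


# Made-up samples: raw bytes versus the same values written out as text.
binary_sample = b"\x00\xff\x32\x01\xff\x34"
text_sample = b"00 FF 32 01 FF 34"

print(marker_offsets(binary_sample, b"\xff\x32"))
print(marker_offsets(text_sample, b"\xff\x32"))
```

An empty list for the real file, read with `open(filename, "rb").read()`, would mean the start marker is not present as raw bytes.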

# 28  
Old 08-21-2013
Remove the printf("\n"); from the print_data function and place it in the main function, here:

Code:
                if(buf[0] == 0x32){
                        if(18 != fread(ptr+2, sizeof(char), 18, fp))
                                err("Insufficient data");
                        if(memcmp(buf+4, pat1, 4) && memcmp(buf+12, pat2, 2)){
                                fseek(fp, pos, SEEK_SET);
                        }else{
                                found = 1; //found the starting of the block with data
                                printf("\n");
                                print_data(ptr, 19);
                        }
                        continue;
                }
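
The idea behind this change is that a new output line starts only when a new block begins, and everything else is appended to the current line. A rough Python sketch of that grouping (not the C code of this thread) follows; the name `group_by_block` and the `(starts_block, text)` record shape are assumptions made for illustration, and the sample values are shortened.

```python
def group_by_block(records):
    """Join formatted sequences so that each block ends up on one line.

    records: iterable of (starts_block, text) pairs, a hypothetical shape
    where starts_block is True for the sequence that opens a block.
    """
    lines = []
    for starts_block, text in records:
        if starts_block or not lines:
            lines.append(text)          # a new block opens a new line
        else:
            lines[-1] += " " + text     # sub-block data stays on the same line
    return lines


# Shortened sample values for illustration.
records = [
    (True, "32 000001"),
    (True, "32 000002"),
    (False, "80 0f"),
    (False, "01 02"),
]
for line in group_by_block(records):
    print(line)
```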

HTH
--ahamed