Extract sequences of bytes from binary for different blocks

Post 302845007 by wisecracker on Tuesday 20th of August 2013 02:36:12 PM

Here is some starter Python 3.3.x code that extracts the blocks using a user-defined function...

This should be more than enough to get you going with your binary manipulation.

Once each block is stored in its own _variable_, it is easy to extract sub-blocks of data from each major block taken from the binary file...
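As a rough illustration of that sub-block step, here is a minimal sketch using plain slicing and the standard `struct` module. The contents of `block` and the field layout (a two-byte header, a 16-bit length, a four-byte payload) are invented for the example and are not taken from any real file format.

```python
import struct

# Hypothetical block, standing in for whatever an extraction step returned.
# The bytes and the field layout are assumptions made for this sketch.
block = b"\x32\x01\x02\x00\x10ABCD\xFF"

# Slicing a bytes object gives another bytes object.
header = block[1:3]     # the two bytes at offsets 1 and 2
payload = block[5:9]    # four bytes starting at offset 5

# Indexing a bytes object gives an int in Python 3.
first_byte = block[0]

# struct.unpack decodes fixed-size fields; ">H" means big-endian unsigned 16-bit.
(length,) = struct.unpack(">H", block[3:5])

print(header, payload, first_byte, length)
```

Slices never raise on short data, they just come back shorter, so it may be worth checking `len(block)` before trusting a field.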

Code:

# BinarySearch.py
# DEMO to extract binary data...
# Written in such a way as to be easily understood.
# NOTE:- There are more Pythonic ways to do this...

import os
import sys

# The _variable_ used to test with...
block_one=b"?"
block_two=b"?"
subscript_begin=0

# Set to a _temporary_ directory. NOTE:- 4 different platforms shown...
# Change to suit your platform and path
if sys.platform.startswith("linux"): os.chdir("/tmp/")
if sys.platform=="darwin": os.chdir("/Users/barrywalker/")
if sys.platform=="win32": os.chdir("C:\\Windows\\Temp\\")
if sys.platform=="amiga": os.chdir("T:")

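# A basic "bin_extract()" function to extract binary data from a binary file...
# It searches the bytes from "subscript_start" up to (not including) "subscript_end"
# for "char_start", then for the next "char_end" after it.
# It returns the block and the file offset of the end marker, or (b"", subscript_end)
# if no complete block lies inside that range.
# The block keeps the last byte of the start marker and the first byte of the end marker.
def bin_extract(subscript_start=0, subscript_end=0, filename="?", char_start=b"\x00\xFF", char_end=b"\xFF\x00"):
    # Read the search range once; "with" closes the file even if an error occurs.
    with open(filename, "rb") as binfile:
        binfile.seek(subscript_start)
        window=binfile.read(max(0, subscript_end-subscript_start))
    # Find the start marker, then the first end marker that follows it.
    start=window.find(char_start)
    if start<0: return(b"", subscript_end)
    end=window.find(char_end, start+len(char_start))
    if end<0: return(b"", subscript_end)
    bin_var=char_start[-1:]+window[start+len(char_start):end]+char_end[0:1]
    position=subscript_start+end
    return(bin_var, position)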

# Test...
# ******** Do the first iteration and find the first block... *******
block_one, subscript_begin=bin_extract(0, 816, "BinaryFile", b"\xFF\x32", b"\xFF\x34")
print(block_one)
print(subscript_begin)
# When successful..........
# Just extract the required data from the _variable_ "block_one" using standard simple Python methods
# and, when finished, call the function again for the next block, using "subscript_begin"
# + or - 2 as the next "subscript_start", something like below...
block_two, subscript_begin=bin_extract((subscript_begin+2), 816, "BinaryFile", b"\xFF\x32", b"\xFF\x34")
print(block_two)
print(subscript_begin)
# DEMO end...
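
The repeated calls above could also be folded into a loop. The sketch below is a different technique from the function in the post: it assumes the whole file fits in memory and uses `bytes.find()` on the data. The name `iter_blocks` and the sample bytes are made up for illustration.

```python
def iter_blocks(data, start_marker, end_marker):
    """Yield (offset, payload) for each start_marker ... end_marker pair in data."""
    position = 0
    while True:
        start = data.find(start_marker, position)
        if start < 0:
            return
        payload_start = start + len(start_marker)
        end = data.find(end_marker, payload_start)
        if end < 0:
            return
        # The payload excludes both markers; offset is where the start marker sits.
        yield start, data[payload_start:end]
        # Resume the search just past the end marker.
        position = end + len(end_marker)

# Invented sample data; a real run would read the bytes from a file opened in "rb" mode.
sample = b"junk\xFF\x32first\xFF\x34more\xFF\x32second\xFF\x34tail"
blocks = list(iter_blocks(sample, b"\xFF\x32", b"\xFF\x34"))
for offset, payload in blocks:
    print(offset, payload)
```

For a file much larger than memory, scanning it in windows (as the function in the post does) is probably the safer route.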

Hope this helps.
CYA.

wisecracker
