Extract sequences of bytes from binary for different blocks


 
# 1  
Old 08-13-2013

Hello to all,

I would like to search for sequences of bytes inside a big binary file.

The binary file contains blocks of information. Each block is structured as follows:

1- Each block begins with the hex byte 32 (1 byte) and ends with FF. The FF of the last block is followed by 33.
2- The next sequence to extract is the sequence number (3 bytes) --> I mean 1, 2, 3...N
3- The next sequence to extract is the Product Series (8 bytes) --> The first 4 bytes are always "99 11 45 27"
4- The next sequence to extract is the Product Model (8 bytes) --> The first 2 bytes are always "73 49"
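
The layout in the list above can be sketched in Python as a small header parser. This is only an illustration under assumptions: the names parse_header, HEADER_LEN, SERIES_PREFIX and MODEL_PREFIX are made up for this sketch, and the 3-byte sequence number is assumed to be big-endian (00 00 01 read as 1).

```python
# Layout assumed from the description (offsets relative to the 0x32 marker):
#   0      : block start marker 0x32
#   1..3   : sequence number, 3 bytes (assumed big-endian)
#   4..11  : Product Series, 8 bytes, starts with 99 11 45 27
#   12..19 : Product Model, 8 bytes, starts with 73 49
HEADER_LEN = 20
SERIES_PREFIX = bytes.fromhex("99114527")
MODEL_PREFIX = bytes.fromhex("7349")


def parse_header(buf, offset=0):
    """Return (sequence, series_hex, model_hex) for the header at offset, or None."""
    header = buf[offset:offset + HEADER_LEN]
    if len(header) < HEADER_LEN or header[0] != 0x32:
        return None
    series = header[4:12]
    model = header[12:20]
    if not series.startswith(SERIES_PREFIX) or not model.startswith(MODEL_PREFIX):
        return None
    sequence = int.from_bytes(header[1:4], "big")
    return sequence, series.hex(), model.hex()


# The first 20 bytes of the first block in the sample dump of this thread.
sample = bytes.fromhex("32000001991145278934550f73494549232fffff")
print(parse_header(sample))  # (1, '991145278934550f', '73494549232fffff')
```

Checking both fixed prefixes, not just the 32 byte, matters because 32 also occurs inside the block payload.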

There are some other sequences of bytes I need to extract for each block, but first I need somebody to tell me
how to begin doing this for the 4 items mentioned above.

Is it possible to do this in a shell script, combining awk etc., or what do you suggest?

PS: It would be better not to save a hexdump to a text file first, because the binary could be 2 GB. A way
to extract the sequences directly from the binary would be better.
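
One way to search the binary directly, without an intermediate hexdump, is to memory-map it and run a bytes regular expression over the mapping. The sketch below is Python, assumes a 64-bit Python build (so that a 2 GB file can be mapped) and a non-empty file; find_block_offsets and HEADER_RE are names invented for this example.

```python
import mmap
import re

# One candidate header: 0x32, any 3 bytes, 99 11 45 27, any 4 bytes, 73 49.
# re.DOTALL lets "." match every byte value, including 0x0a.
HEADER_RE = re.compile(rb"\x32.{3}\x99\x11\x45\x27.{4}\x73\x49", re.DOTALL)


def find_block_offsets(path):
    """Return the byte offset of every candidate block header in the file."""
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; the OS loads pages only when touched.
        # Note: mmap raises ValueError for an empty file.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [match.start() for match in HEADER_RE.finditer(mm)]
```

The offsets are only candidates: the pattern could in principle also occur by chance inside a block, so a stricter check (for example, requiring a preceding FF) may be needed on real data.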

Thanks in advance

A hexdump -C of the sample binary file is below:
Code:
31 45 4a 58 58 59 57 31 5f 44 31 32 31 31 33 30
38 30 37 31 33 34 34 06 99 11 45 27 89 34 55 ff
32 00 00 01 99 11 45 27 89 34 55 0f 73 49 45 49
23 2f ff ff 00 15 00 0a 48 00 01 5a 00 02 42 00
01 60 00 01 33 00 01 36 00 01 37 00 01 5b 00 01
7e 00 01 69 00 00 6a 00 00 79 00 00 93 00 01 22
00 00 21 00 01 09 00 01 0a 00 01 26 00 01 02 00
01 04 00 01 05 00 01 06 00 01 10 00 01 08 00 01
2b 00 00 2c 00 01 2d 00 01 2e 00 01 55 00 01 56
00 07 2a 00 00 2f 00 00 30 00 00 31 00 00 ff 34
00 80 09 32 c9 06 88 88 80 00 a0 00 80 09 35 c9
06 00 00 80 00 00 00 80 09 3c c9 06 88 88 80 00
80 00 80 09 43 c9 06 88 88 80 00 80 00 05 82 00
37 06 01 00 00 01 00 65 00 00 00 02 00 00 02 00
18 00 00 00 03 00 00 03 00 17 00 00 00 04 00 00
04 00 01 00 00 00 05 00 00 05 00 15 00 00 00 0a
00 ff ff 00 65 00 00 00 07 80 2e c9 18 05 91 73
49 52 69 53 1f ff ff ff 00 91 73 49 52 69 53 1f
ff ff 00 01 03 ca 03 08 08 fe cb 0a 00 00 00 00
00 00 00 00 00 00 cc 01 01 81 1b c9 0b 00 91 73
49 52 69 56 7f ff ff ff ca 06 00 00 00 00 00 00
cb 01 03 cc 01 01 ff 32 00 00 02 99 11 45 27 89
34 55 1f 73 49 45 54 76 8f ff ff 00 15 00 0a 48
00 01 5a 00 02 42 00 01 60 00 01 33 00 01 36 00
01 37 00 01 5b 00 01 66 00 01 65 00 01 77 00 01
78 00 01 7e 00 01 69 00 00 6a 00 00 79 00 00 93
00 01 22 00 00 21 00 01 09 00 01 0a 00 01 26 00
01 02 00 01 04 00 01 05 00 01 06 00 01 10 00 01
08 00 01 2b 00 00 2c 00 01 2d 00 01 2e 00 01 55
00 01 56 00 07 2a 00 00 2f 00 00 30 00 00 31 00
00 ff 34 00 80 09 32 c9 06 88 88 80 00 a0 00 80
09 35 c9 06 00 00 80 00 00 00 80 09 3c c9 06 88
88 80 00 80 00 80 09 43 c9 06 88 88 80 00 80 00
03 80 0f 01 02 00 00 00 30 73 49 52 69 05 ff ff
ff 00 81 0f 01 02 00 00 01 3a 73 49 52 69 55 9f
ff ff 00 83 10 01 0c 00 00 00 9f 73 49 52 69 05
ff ff ff 01 01 86 0f 01 0e 00 00 00 eb 73 49 52
69 59 6f ff ff 00 87 0f 01 01 00 06 f6 99 73 49
52 69 56 3f ff ff 00 84 0e 00 01 00 00 01 00 01
00 ff ff 00 00 01 01 85 06 00 03 79 00 01 ea 05
82 00 37 06 01 00 00 01 00 65 00 00 00 02 00 00
02 00 18 00 00 00 03 00 00 03 00 17 00 00 00 04
00 00 04 00 01 00 00 00 05 00 00 05 00 15 00 00
00 0a 00 ff ff 00 65 00 00 00 07 80 2e c9 18 00
91 73 49 52 69 53 9f ff ff ff 00 91 73 49 52 69
53 9f ff ff 00 01 03 ca 03 08 08 fe cb 0a 00 00
00 00 00 00 00 00 00 00 cc 01 01 81 1b c9 0b 00
91 73 49 52 69 56 7f ff ff ff ca 06 00 00 00 00
00 00 cb 01 03 cc 01 01 ff 33 31 33 30 38 30 37
31 33 34 34 30 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

# 2  
Old 08-13-2013
Here is something that works on the hexdump; I have no idea how to do it on a binary file.
Code:
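awk '{$1=$1} {for (i=1;i<=NF;i++) {if (($i" "$(i+1)" "$(i+5)" "$(i+6)" "$(i+7)" "$(i+8)" "$(i+13)" "$(i+14))=="ff 32 99 11 45 27 73 49") {f=i+1;print "field="f,"row="(int((f-1)/16)+1),"column="((f-1)%16+1)} }}' RS="" hexdump
field=33 row=3 column=1
field=344 row=22 column=8

The same script, formatted for readability:
Code:
awk '
	# RS="" reads the whole dump as one record, so every hex byte is a field.
	# A block start is "ff 32", then "99 11 45 27" after the 3-byte sequence
	# number, then "73 49" after the 8-byte Product Series.
	# f is the field that holds the 32; row and column assume 16 bytes per line.
	{$1=$1}
	{for (i=1;i<=NF;i++)
		{if (($i" "$(i+1)" "$(i+5)" "$(i+6)" "$(i+7)" "$(i+8)" "$(i+13)" "$(i+14))=="ff 32 99 11 45 27 73 49")
			{f=i+1;print "field="f,"row="(int((f-1)/16)+1),"column="((f-1)%16+1)}
		}
	}' RS="" hexdump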
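
For comparison, roughly the same search can be sketched in Python by turning the posted hex text back into bytes first, which is handy for testing against the sample when the original binary is not available. The function names hexdump_to_bytes and locate are invented for this sketch, and unlike the awk version it does not require a preceding ff.

```python
import re

# 0x32, any 3 bytes, 99 11 45 27, any 4 bytes, 73 49.
HEADER_RE = re.compile(rb"\x32.{3}\x99\x11\x45\x27.{4}\x73\x49", re.DOTALL)


def hexdump_to_bytes(text):
    """Convert whitespace-separated hex pairs (as posted in the dump) to bytes."""
    return bytes.fromhex("".join(text.split()))


def locate(data, width=16):
    """Return (offset, row, column) per header; row and column are 1-based."""
    return [(m.start(), m.start() // width + 1, m.start() % width + 1)
            for m in HEADER_RE.finditer(data)]


# Two consecutive lines taken from the sample dump.
text = """38 30 37 31 33 34 34 06 99 11 45 27 89 34 55 ff
32 00 00 01 99 11 45 27 89 34 55 0f 73 49 45 49"""
print(locate(hexdump_to_bytes(text)))  # [(16, 2, 1)]
```

Here the offset is 0-based and counts bytes, so offset 16 is the first byte of the second line of the snippet.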

# 3  
Old 08-13-2013
Depending on the size of the file, see if this idea helps you:

https://www.unix.com/shell-programmin...ipulation.html

If you intend to put the binary values into a _string_variable_, note that the value 0 (zero, the NUL byte) cannot be stored directly in a bash variable; all other values can. You will have to detect the 0s and slot in "\0" instead.

So of the 256 bytes of DEMO data in the link above, only 255 (values 1 to 255) can be placed into a _variable_. It is easy to add 2 more bytes to represent a 0 as mentioned above, but that makes the DEMO string 257 bytes in size...
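
For contrast, and only as a side note in Python rather than bash, a bytes object has no such restriction. The small sketch below uses made-up variable names and shows all 256 values, including 0, held in one object, with an escaped text form produced only for display.

```python
# All 256 byte values, including 0, fit in one bytes object.
data = bytes(range(256))
print(len(data))            # 256
print(data[0])              # 0
print(data.count(b"\x00"))  # 1

# An escaped text form is only needed for display or for passing to a shell.
escaped = "".join("\\x%02x" % b for b in data[:4])
print(escaped)              # \x00\x01\x02\x03
```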

However transferring to another binary file is easy as shown in the DEMO...

Hope this will help you...
# 4  
Old 08-13-2013
Hello to all,

If there is an option to extract the sequences directly from the binary file, I think that would be better and faster.
I'm not sure whether it is possible with bash, Perl or some other tool you can suggest.

Hello Jotne,

Thanks for your help. Your script detects the positions of the sequences, but I would like to
extract those sequences from the binary to a new file, with one line in the output file
for the information of each block.

Hello wisecracker,
I'll check the option you mention, but is it possible with your script to extract the complete byte sequence?

Thanks again for the help.
# 5  
Old 08-13-2013
Python 3.x.x can easily handle binary data and is fast at manipulating huge files.
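
A minimal sketch of that idea, not a tested solution for the real data: read the binary in chunks, look for the block header described in post #1, and write one text line per block. The function name extract, the 1 MiB chunk size, the big-endian sequence number and the hex output format are all assumptions made for this example.

```python
import re

# Groups: 3-byte sequence number, 8-byte Product Series, 8-byte Product Model.
HEADER_RE = re.compile(
    rb"\x32(.{3})(\x99\x11\x45\x27.{4})(\x73\x49.{6})", re.DOTALL)
HEADER_LEN = 20  # 1 + 3 + 8 + 8 bytes


def extract(src_path, dst_path, chunk_size=1 << 20):
    """Write 'sequence series model' (hex) per block header; return the count."""
    count = 0
    tail = b""
    with open(src_path, "rb") as src, open(dst_path, "w") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            for m in HEADER_RE.finditer(buf):
                seq = int.from_bytes(m.group(1), "big")
                dst.write("%d %s %s\n"
                          % (seq, m.group(2).hex(), m.group(3).hex()))
                count += 1
            # Keep the last 19 bytes so a header split across two chunks is
            # still found; a full 20-byte header cannot fit in the tail,
            # so nothing is reported twice.
            tail = buf[-(HEADER_LEN - 1):]
    return count
```

Called as extract("blocks.bin", "blocks.txt") (both file names are hypothetical), it keeps only about one chunk in memory at a time, so the 2 GB size should not be a problem.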
# 6  
Old 08-14-2013
Thanks wisecracker.

Do you know some Python, and could you show me an example of how to extract those byte sequences with it?

Thanks in advance.
# 7  
Old 08-14-2013
Can you upload a reduced version of this binary with at least one block?

--ahamed