Sponsored Content
Top Forums Shell Programming and Scripting Extract sequences of bytes from binary for differents blocks Post 302844259 by ahamed101 on Sunday 18th of August 2013 04:39:08 PM
Old 08-18-2013
Can you please show me the bytes you need from the attached image?
In between I see a 0x01 which disrupts the 16 bytes.

--ahamed

---------- Post updated 08-18-13 at 01:39 PM ---------- Previous update was 08-17-13 at 08:39 PM ----------

Ok, I just figured out, the byte after the 0x8Z is actually the size of data.
Code:
#include <stdio.h>
#include <stdlib.h>

#define err(x) {printf("\nError: %s... Exiting...\n", x); exit(1);}

static unsigned char pat1[] = {0x99, 0x11, 0x45, 0x27};
static unsigned char pat2[] = {0x73, 0x49};
static unsigned char pat3[] = {0xff, 0x34};
static unsigned char intrim_pat1[][2] = { {0x03, 0x80}, {0x03, 0x81}, {0x03, 0x83}, {0x03, 0x86}, {0x03, 0x87} };
static unsigned char end[] = {0xff, 0x33};

void print_data(const unsigned char *ptr, int len)
{
        int i;
        for(i=0;i<=len;i++)
                printf("%02x ", ptr[i]);
        printf("\n");
        return;
}

void get_len_and_print(FILE *fp, unsigned char *ptr)
{
        int len = 0;

        //only buf[0] is populated at this stage
        if(1 != fread(ptr+1, sizeof(char), 1, fp))
                err("Insufficient data");
        len = *(ptr+1);
        if(len != fread(ptr+2, sizeof(char), len, fp))
                err("Insufficient data");
        print_data(ptr, len+1);
        return;
}

int main(int argc, char **argv)
{

        if(argc < 2)
                err("File name missing");

        char found = 0, more = 0, again = 0;
        unsigned char buf[32];
        unsigned char *ptr = buf;
        int pos = 0, i, len;
        int arr_size = (sizeof(intrim_pat1)/2);

        FILE *fp = fopen(argv[1], "rb");
        if(!fp) err("Unable to open the file");

        while(2 == fread(ptr, sizeof(char), 2, fp)){
                pos = ftell(fp);

                //check for end of file pattern
                if(found && !memcmp(buf, end, 2)){
                        found=0; //start over or stop??
                        continue;
                }

                //check for 0xff 0x34
                if(found && !(memcmp(buf, pat3, 2))){
                        more = 1;
                        continue;
                }

                if(found && more){
                        for(i=0; i < arr_size; i++){
                                // We got the intrim pattern.
                                if(!memcmp(buf, intrim_pat1[i], 2)){
                                        again=1;
                                        more=0;
                                        break;
                                }
                        }
                        if(again) {
                                // Now read the next 1 byte which is actually the size
                                if(1 != fread(ptr+2, sizeof(char), 1, fp))
                                        err("Insufficient data");
                                len = buf[2];
                                if(len != fread(ptr+3, sizeof(char), len, fp))
                                        err("Insufficient data");
                                print_data(ptr, len+2);

                                // Now check for the remaining pattern
                                // Assuming it will be in ascending order
                                // Break even if we dont find the very next pattern
                                // Also, reset the fp. We will flush all the data in buffer
                                // till now, since we have printed them already
                                pos = ftell(fp);

                                if(1 != fread(ptr, sizeof(char), 1, fp))
                                        err("Insufficient data");

                                while(i < arr_size){
                                        if( buf[0] == intrim_pat1[i][1]){
                                                get_len_and_print(fp, ptr);

                                                if(1 != fread(ptr, sizeof(char), 1, fp))
                                                        err("Insufficient data");
                                                pos = ftell(fp);
                                        }
                                        i++;
                                }
                                again=0;
                        }
                }
                if(buf[0] == 0x32){
                        if(18 != fread(ptr+2, sizeof(char), 18, fp))
                                err("Insufficient data");
                        if(memcmp(buf+4, pat1, 4) && memcmp(buf+12, pat2, 2)){
                                fseek(fp, pos, SEEK_SET);
                        }else{
                                found = 1; //found the starting of the block with data
                                print_data(ptr, 19);
                        }
                        continue;
                }
                pos--;
                if(fseek(fp, pos, SEEK_SET))
                        err("Error in seeking");
        }
        return 0;
}

You may want to test this extensively!
And I think I have given you enough information to proceed further.

--ahamed

Last edited by ahamed101; 08-18-2013 at 05:46 PM..
This User Gave Thanks to ahamed101 For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Remove first N bytes and last N bytes from a binary file on AIX.

Hi all, Does anybody know or guide me on how to remove the first N bytes and the last N bytes from a binary file? Is there any AWK or SED or any command that I can use to achieve this? Your help is greatly appreciated!! Best Regards, Naveen. (1 Reply)
Discussion started by: naveendronavall
1 Replies

2. UNIX for Advanced & Expert Users

Deal with binary sequences

Hello, I have come across the necessity for me to deal with binary sequences and I had a few questions. 1- Does any UNIX scripting language provide any tool or command for converting text data to binary sequences? Example of binary sequence: "0x97 0x93 0x85 0x40 0xd5 0xd6 0xd7" 2- If I want... (2 Replies)
Discussion started by: Indalecio
2 Replies

3. Shell Programming and Scripting

Extract sequence blocks

Hi, I have an one-line file consisting of a sequence of 660 letters. I would like to extract 9-letter blocks iteratively: ASDFGHJKLQWERTYUIOPZXCVBNM first block: ASDFGHJKL 1nd block: SDFGHJKLQ What I have so far only gives me the first block, can anyone please explain why? cat... (7 Replies)
Discussion started by: solli
7 Replies

4. Shell Programming and Scripting

extract blocks of text from a file

Hi, This is part of a large text file I need to separate out. I'd like some help to build a shell script that will extract the text between sets of dashed lines, write that to a new file using the whole or part of the first text string as the new file name, then move on to the next one and... (7 Replies)
Discussion started by: cajunfries
7 Replies

5. Linux

Why does ext3 allocate 8 blocks for files that are few bytes long

The title is clear: why does ext3 allocate 8 blocks for files that are few bytes long? If I create a file named "test", put a few chars in it, and then I run: stat test I get that "Blocks: 8" I searched in the web and found that ext does that, it allocates 8 blocks even if It doesn't need... (4 Replies)
Discussion started by: Tavo
4 Replies

6. UNIX for Dummies Questions & Answers

X bytes of 0, Y bytes of random data, Z bytes of 5, T bytes of 1. ??

Hello guys. I really hope someone will help me with this one.. So, I have to write this script who: - creates a file home/student/vmdisk of 10 mb - formats that file to ext3 - mounts that partition to /mnt/partition - creates a file /mnt/partition/data. In this file, there will... (1 Reply)
Discussion started by: razolo13
1 Replies

7. Shell Programming and Scripting

Extract sequences based on the list

Hi, I have a file with more than 28000 records and it looks like below.. >mm10_refflat_ABCD range=chr1:1234567-2345678 tgtgcacactacacatgactagtacatgactagac....so on >mm10_refflat_BCD range=chr1:3234567-4545678... tgtgcacactacacatgactagtatgtgcacactacacatgactagta . . . . . so on ... (2 Replies)
Discussion started by: Diya123
2 Replies

8. Shell Programming and Scripting

Extract length wise sequences from fastq file

I have a fastq file from small RNA sequencing with sequence lengths between 15 - 30. I wanted to filter sequence lengths between 21-25 and write to another fastq file. how can i do that? (4 Replies)
Discussion started by: empyrean
4 Replies

9. Shell Programming and Scripting

Extract the part of sequences from a file

I have a text file, input.fasta contains some protein sequences. input.fasta is shown below. >P02649 MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQT LSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQA RLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVY... (8 Replies)
Discussion started by: rahim42
8 Replies

10. Shell Programming and Scripting

Blocks of text in a file - extract when matches...

I sat down yesterday to write this script and have just realised that my methodology is broken........ In essense I have..... ----------------------------------------------------------------- (This line really is in the file) Service ID: 12345 ... (7 Replies)
Discussion started by: Bashingaway
7 Replies
All times are GMT -4. The time now is 07:43 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy