Extracting non zero records from Binary File


 
# 1  
Old 07-18-2011

Dear Experts,


I have a binary file that contains multiple fixed-size records of 31744 bytes each.
I need to extract only those records that contain non-zero data.

Sample file could be:
Code:
   a6 82 (+31742 bytes)
   a6 00 12 00 (+31740 bytes)
   00 00 (00 31742 times)
   a6 00 12 34 (+31740 bytes)
   00 00 (00 31742 times)
   00 00 (00 31742 times)
   00 00 00 00 00 00

Required output:
Code:
   a6 82 (+31742 bytes)
   a6 00 12 00 (+31740 bytes)
   a6 00 12 34 (+31740 bytes)

Thanks,
Dhiraj

Last edited by Scott; 07-18-2011 at 02:11 PM.. Reason: Please start using code tags
# 2  
Old 07-18-2011
Follow these steps:
1) Extract one record with all zeros from the main file and save it as a new file (e.g. Zeros_File).
2) Run the following:
Code:
egrep -v -f Zeros_File Main_File

# 3  
Old 07-19-2011
Hi Shell_Life,

My file is not a standard text file, so I guess this would not work.
I need some way to operate on binary data.

Thanks,
# 4  
Old 07-19-2011
If you pass the binary file through od, you can get an integer value (0-255) for each byte of the file:

Code:
od -v -tu1 -An -w1 your_binary_file
 166
 130
 172
  87
  98
 228
  58
 100
 145
  40
...

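It should then be pretty simple to process this output with awk, and use printf("%c", $1) within awk to convert the integer value back to a binary character. Note that od needs -v for this; without it, runs of identical output lines (such as long stretches of zero bytes) are collapsed into a single "*".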
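
A minimal sketch of the od-plus-awk idea, with one swap: instead of rebuilding bytes with printf("%c"), which can be unreliable for NUL bytes in some awk implementations, awk only reports which records contain a non-zero byte and dd copies those records out. The function name extract_nonzero and its arguments are made up for illustration, and it assumes the input is a regular file. Only POSIX od options are used here (no -w).

```shell
# Usage: extract_nonzero FILE RECSIZE > OUTFILE
# (hypothetical helper; FILE must be a regular file)
extract_nonzero() {
    src=$1
    recsize=$2
    # od prints every byte as a decimal number; -v keeps repeated lines.
    od -An -v -tu1 "$src" |
    awk -v n="$recsize" '
        BEGIN { rec = 0 }
        {
            for (i = 1; i <= NF; i++) {
                if ($i != 0) nz = 1          # saw a non-zero byte
                if (++c == n) {              # reached the end of a record
                    if (nz) print rec        # emit the 0-based record index
                    rec++; c = 0; nz = 0
                }
            }
        }' |
    while read -r idx; do
        # Copy record number idx verbatim from the original file.
        dd if="$src" bs="$recsize" skip="$idx" count=1 2>/dev/null
    done
}
```

For the file in this thread that would be something like extract_nonzero datain 31744 > dataout. A trailing partial record is never reported by awk, so it is dropped.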
# 5  
Old 07-19-2011
Try using dd to convert the file, e.g....
Code:
$ printf "abc\000\000\000de\000" >file1

$ dd if=file1 of=file2 cbs=3 conv=unblock
0+1 records in
0+1 records out
12 bytes (12 B) copied, 0 s, Infinity B/s

... you will now see newlines every 3 bytes...
Code:
$ od -hc file1; od -hc file2
0000000    6261    0063    0000    6564    0000
          a   b   c  \0  \0  \0   d   e  \0
0000011
0000000    6261    0a63    0000    0a00    6564    0a00
          a   b   c  \n  \0  \0  \0  \n   d   e  \0  \n
0000014

Use awk to remove null records and dd to convert back again...
Code:
$ awk '/[^\000]/' file2 > file3

$ dd if=file3 of=file4 cbs=3 conv=block
0+1 records in
0+1 records out
6 bytes (6 B) copied, 0 s, Infinity B/s

You should see that null records are removed...
Code:
$ od -hc file3; od -hc file4
0000000    6261    0a63    6564    0a00
          a   b   c  \n   d   e  \0  \n
0000010
0000000    6261    6463    0065
          a   b   c   d   e  \0
0000006

That's just an example with a block size of 3; for your file you would use cbs=31744.
# 6  
Old 07-19-2011
@Ygor, don't forget it's a binary file, so there could be EOF, CR, or NUL characters dotted through the blocks of data. This will cause trouble for awk if the file is processed as-is.
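
One related case that is easy to check is a newline byte inside a record. The sketch below round-trips a file through dd conv=unblock and conv=block and reports whether the bytes survive. The helper name roundtrip_ok is made up for illustration, and mktemp -d is assumed to be available.

```shell
# Usage: roundtrip_ok FILE CBS
# Returns 0 if FILE is unchanged after unblock followed by block.
# (hypothetical helper)
roundtrip_ok() {
    rt_src=$1
    rt_cbs=$2
    rt_dir=$(mktemp -d) || return 2
    # Fixed-size records -> newline-terminated lines
    dd if="$rt_src" of="$rt_dir/text" cbs="$rt_cbs" conv=unblock 2>/dev/null
    # Newline-terminated lines -> fixed-size records
    dd if="$rt_dir/text" of="$rt_dir/back" cbs="$rt_cbs" conv=block 2>/dev/null
    cmp -s "$rt_src" "$rt_dir/back"
    rt_rc=$?
    rm -rf "$rt_dir"
    return "$rt_rc"
}
```

With cbs=3, the sample data from the previous post (abc, three NULs, de plus NUL) survives the round trip. A record such as a, newline, c does not, because conv=block treats the embedded newline as the end of a record and pads the pieces with spaces.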
# 7  
Old 07-19-2011
Read in chunks with dd, test against a file of the same size full of binary zeroes, print if nonzero.
Code:
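# Reference record: one block of 31744 zero bytes
dd if=/dev/zero of=zero bs=31744 count=1 2>/dev/null

# dd exits 0 even at end of file, so stop once it returns an empty chunk
while dd count=1 bs=31744 2>/dev/null > test && [ -s test ]
do
        # cmp -s is silent and succeeds only if the chunk is all zeroes
        cmp -s test zero || cat test
done < datain > dataout

rm -f test zero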

Or if you can use a solution in C:

Code:
#include <unistd.h>
#include <string.h>

int main(void)
{
        char buf[31744], zero[31744];
        ssize_t bpos=0;

        memset(zero, 0, sizeof(zero));

        while(1)
        {      // Read in entire chunk
                bpos=0;
                while(bpos < 31744)
                {
                        ssize_t r=read(STDIN_FILENO, buf+bpos, 31744-bpos);
                        if(r <= 0) // End of file
                                return(0);

                        bpos += r;
                }

                // check if zero
                if(memcmp(zero, buf, 31744) == 0) continue;

                bpos=0;
                // write out entire chunk
                while(bpos < 31744)
                {
                        ssize_t w=write(STDOUT_FILENO, buf+bpos, 31744-bpos);
                        if(w <= 0)
                                return(1); // write error

                        bpos += w;
                }
        }
}

Use it like this: ./program < infile > outfile
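
To sanity-check the result of either approach, something like the following could be used. It is only a sketch: count_records is a made-up helper that prints the total number of complete records in a file and how many of them contain a non-zero byte. It starts one dd per record, so it is slow on large files.

```shell
# Usage: count_records FILE RECSIZE
# Prints "<total> <nonzero>" for the complete records in FILE.
# (hypothetical helper)
count_records() {
    cr_file=$1
    cr_size=$2
    cr_bytes=$(wc -c < "$cr_file")
    cr_total=$((cr_bytes / cr_size))
    cr_nonzero=0
    cr_i=0
    while [ "$cr_i" -lt "$cr_total" ]; do
        # Delete the NUL bytes from record cr_i; anything left is non-zero data.
        cr_left=$(dd if="$cr_file" bs="$cr_size" skip="$cr_i" count=1 2>/dev/null |
                  tr -d '\000' | wc -c)
        if [ "$((cr_left))" -gt 0 ]; then
            cr_nonzero=$((cr_nonzero + 1))
        fi
        cr_i=$((cr_i + 1))
    done
    echo "$cr_total $cr_nonzero"
}
```

Running count_records infile 31744 and count_records outfile 31744 should then show that both numbers for outfile equal the non-zero count of infile.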

Last edited by Corona688; 07-20-2011 at 12:57 PM.. Reason: typo fixed