Split binary file every occurrence of a group of characters


 
# 15  
Old 03-30-2013
The default allocation size of your filesystem is probably 4K.

Regards,
Alister
# 16  
Old 03-31-2013
Hmm, is that what would cause 2 KB files to double in size? Interesting, I have something to try.

---------- Post updated at 12:25 AM ---------- Previous update was at 12:14 AM ----------

Wow, that worked. Splitting them at 4K stops the doubling in size. Well, I have now learned that a file that is too small can take up unnecessary space. Awesome.
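
The doubling comes from each file being rounded up to whole allocation blocks. A minimal sketch of that arithmetic, assuming the 4K block size suggested above (the real value depends on the filesystem, and the function name `allocated_bytes` is made up for illustration):

```shell
# allocated_bytes SIZE [BLOCK]
# Round SIZE up to a whole number of allocation blocks.
# BLOCK defaults to 4096, which is an assumption taken from the reply above.
allocated_bytes() {
    size=$1
    block=${2:-4096}
    echo $(( (size + block - 1) / block * block ))
}

allocated_bytes 2048   # prints 4096: a 2K piece still occupies a full 4K block
allocated_bytes 4096   # prints 4096: a 4K piece fills its block exactly
```

Comparing the byte count from `wc -c` with the figure from `du -k` on one of the pieces should show the same gap on a real disk.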

---------- Post updated at 02:13 AM ---------- Previous update was at 12:25 AM ----------

Okay, hmm, here's another way of doing it that I could try. It would take longer, but it would help in the end.

If I split them into 2K pieces, number them via my method so they're in order, and then take the 18th byte (I think that's right) and add it to the start of the file name, I end up with:

Code:
BB-44 A7 AC 76 1C 31.mpg
E0-44 A7 AC 7A AC DD.mpg

Several Files later
Code:
E0-44 A7 B6 60 BC B1.mpg
BB-44 A7 B6 65 4D 5D.mpg

Which I have converted to

Code:
Start-0404 1007 1012 0706 0112 0301.mpg
Next-0404 1007 1012 0710 1012 1313.mpg

Several Files later
Code:
Next-0404 1007 1106 0600 1112 1101.mpg
Start-0404 1007 1106 0605 0413 0513.mpg

Converting each hex digit to its two-digit decimal number (A becomes 10, C becomes 12, and so on) resolves the weird ordering problems.
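
That conversion can be sketched in the shell. The helper name `hex_key_to_decimal` is hypothetical, and it assumes the key contains nothing but hex digits and spaces:

```shell
# hex_key_to_decimal "44 A7 AC 76 1C 31"
# Hypothetical helper: rewrite every hex digit as a two-digit decimal number.
# Assumes the input holds only hex digits and spaces.
hex_key_to_decimal() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (f = 1; f <= NF; f++) {
            word = ""
            for (c = 1; c <= length($f); c++) {
                digit = toupper(substr($f, c, 1))
                word = word sprintf("%02d", index("0123456789ABCDEF", digit) - 1)
            }
            out = out (f > 1 ? " " : "") word
        }
        print out
    }'
}

hex_key_to_decimal "44 A7 AC 76 1C 31"   # prints 0404 1007 1012 0706 0112 0301
```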

Then have a script take each Start file and the Next files that follow it and move them to a new folder, starting a new folder at every occurrence of Start.

Then I use cat to recombine the files in each folder. There may be a way to script that as a batch process.

I would end up with the correct files, each one starting with the code I'm looking to match.
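
The grouping and the cat step could be batched along these lines. This is only a sketch: `group_pieces` is a made-up name, and it assumes every piece is called `Start-<key>.mpg` or `Next-<key>.mpg` and that sorting on `<key>` puts the pieces back in their original order:

```shell
# group_pieces SRC_DIR DEST_DIR
# Hypothetical helper. Assumes every piece is named "Start-<key>.mpg" or
# "Next-<key>.mpg", that sorting on <key> restores the original order, that
# names contain no newlines, and that DEST_DIR is new or empty.
group_pieces() {
    src=$1
    dest=$2
    n=0
    # List the names, then sort on the text after the first "-" so the
    # Start/Next prefix does not disturb the order.
    ( cd "$src" && printf '%s\n' Start-* Next-* ) | LC_ALL=C sort -t- -k2 |
    while IFS= read -r name; do
        [ -f "$src/$name" ] || continue        # unmatched glob, skip it
        case $name in
            Start-*) n=$((n + 1)) ;;           # every Start opens a new group
        esac
        [ "$n" -gt 0 ] || continue             # ignore pieces before the first Start
        dir=$(printf '%s/%03d' "$dest" "$n")
        mkdir -p "$dir"
        cp "$src/$name" "$dir/"                # copy, so the originals stay put
        cat "$src/$name" >> "$dir.mpg"         # recombine in the same pass
    done
}
```

Called as `group_pieces pieces grouped` (both directory names are placeholders), it should leave `grouped/001/`, `grouped/002/`, ... holding the copied pieces, with `grouped/001.mpg`, `grouped/002.mpg`, ... as the recombined files.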

---------- Post updated at 02:15 AM ---------- Previous update was at 02:13 AM ----------

Or add, say, 001 to Start and to each Next until the next occurrence of Start, then change to 002, and so on.
I've used an AppleScript before to move several photos starting with the same number to a new folder.

---------- Post updated at 02:24 AM ---------- Previous update was at 02:15 AM ----------

Well, perhaps Start and Next would have to go near the end of the name. I don't know yet.

---------- Post updated at 02:51 PM ---------- Previous update was at 02:24 AM ----------

I discovered an application called Replace Pioneer, which can separate files based on content and do other renaming actions. I open the file as hex and then I can split based on content. The first problem is that it dumps the hex as plain text, which is okay: xxd, or something like it that I used before, can convert it back to binary.
The second problem is that I can't figure out how to make it allow for the bytes that change. I was able to get it to correctly separate the files based on
Code:
00 00 01 BA

which created many 2K files, the same result as just doing
Code:
split -b 2k file

---------- Post updated Mar 31st, 2013 at 10:25 AM ---------- Previous update was Mar 30th, 2013 at 02:51 PM ----------

Well, Replace Pioneer doesn't seem to be helping much either.

Why is it so complicated for software to split a file on every occurrence of a word or number and keep the first 12 letters before that in the same file? LOL

---------- Post updated at 10:28 AM ---------- Previous update was at 10:25 AM ----------

So far I have found a few code snippets I thought might help me, but they don't search binary, they search text.
Code:
sed 's/3d3d/\n&/g;s/^\n\(3d3d\)/\1/' temp |csplit -zf temp - '/^3d3d/' {*}

And
Code:
$ awk '/START/{x="F"++i;}{print > x;}' file2

I just get errors.

---------- Post updated at 10:40 AM ---------- Previous update was at 10:28 AM ----------

If I have to, I can convert the file to a hex dump and then replace the occurrence I want the new files to split on with, say, "Start" instead of
Code:
89 C3 F8 00 00 01 BB

All I need is for it to include the words or letters before that within a specified range, so that it includes the
Code:
000001BARandom

and stops each new file at that occurrence
Code:
89 C3 F8 00 00 01 BB

possibly renamed Start, then removing the characters from the end of the file so it ends right before
Code:
000001BARandom89C3F8000001BB

or
Code:
000001BARandomStart
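
The hex-dump idea can be done without leaving the shell by using od to find where the pattern occurs and tail and head to carve the pieces. This is a rough sketch under several assumptions: the function names are invented, BACK (how many bytes before the pattern a piece should begin) has to be checked against a hex dump of the real file, and `head -c` is not strictly POSIX although GNU and BSD both provide it. It reads the file one byte per line, so it will be slow on large files.

```shell
# pattern_offsets FILE "00 00 01 bb"
# Print the 0-based offset of every occurrence of the byte pattern in FILE.
pattern_offsets() {
    # od prints every byte as two hex digits; tr puts one byte on each line.
    od -An -v -t x1 "$1" | tr -s ' \n' '\n\n' | awk -v pat="$2" '
        BEGIN { n = split(tolower(pat), want, " ") }
        NF == 0 { next }                 # blank line left over from od padding
        {
            buf[count % n] = $1          # ring buffer holding the last n bytes
            count++
            if (count < n) next
            for (i = 0; i < n; i++)
                if (buf[(count - n + i) % n] != want[i + 1]) next
            print count - n
        }'
}

# split_before FILE PATTERN BACK PREFIX
# Cut FILE so that a new piece begins BACK bytes before each PATTERN.
# Any bytes ahead of the first cut point become the first piece.
split_before() {
    file=$1 pattern=$2 back=$3 prefix=$4
    size=$(wc -c < "$file" | tr -d ' ')
    prev=0
    n=0
    {
        pattern_offsets "$file" "$pattern" |
            awk -v b="$back" '$1 >= b { print $1 - b }'
        echo "$size"                     # the end of the file closes the last piece
    } |
    while read -r cut; do
        if [ "$cut" -gt "$prev" ]; then
            tail -c +"$((prev + 1))" "$file" | head -c "$((cut - prev))" \
                > "$(printf '%s%04d.mpg' "$prefix" "$n")"
            n=$((n + 1))
            prev=$cut
        fi
    done
}
```

For example, `split_before movie.mpg "00 00 01 bb" 14 piece-` (with `movie.mpg`, `14` and `piece-` as placeholders) would write `piece-0000.mpg`, `piece-0001.mpg`, and so on, each new piece starting 14 bytes ahead of a `00 00 01 BB`. Concatenating the pieces in order with cat gives back the original file.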


Last edited by Scrutinizer; 03-30-2013 at 05:47 AM.. Reason: code tags
# 17  
Old 03-31-2013
Hi.

Observations:
Quote:
Originally Posted by PatrickE
Why is it so complicated for software to split a file on every occurrence of a word or number and keep the first 12 letters before that in the same file? LOL
...
So far I have found a few code snippets I thought might help me, but they don't search binary, they search text. ...
Because, in part:
Quote:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
-- Unix philosophy - Wikipedia, the free encyclopedia

The perl language has facilities for reading byte-streams (read a block of data, use function unpack after reading the file). I have used it to read mixed-mode files -- ASCII intertwined with "binary" floating-point and integer internal values.
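
For a quick look at a single fixed position, od can do a small part of that job from the shell. A minimal sketch (the name `byte_at` is made up) that prints one byte of a file, such as the 18th byte mentioned earlier in the thread:

```shell
# byte_at FILE N
# Print byte number N of FILE (counting from 1) as two hex digits.
byte_at() {
    # -j skips N-1 bytes, -N 1 reads a single byte, -t x1 prints it as hex.
    od -An -v -t x1 -j "$(($2 - 1))" -N 1 "$1" | tr -d ' \n'
    echo
}
```

`byte_at piece.mpg 18` (with `piece.mpg` as a placeholder name) would then print something like `bb` or `e0`.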

Now that I think about it, COBOL probably can do that as well, at least for some well-defined formats. I recall some folks in the Physics department where I worked using COBOL to process satellite data because of the superior record-handling characteristics.

However, in general, I try to stay as far away from such files as I can.

Best wishes ... cheers, drl
# 18  
Old 03-31-2013
Thanks, I will look into those.
 