Binary write POSIX-ly.


 
# 1  
Old 08-27-2016

Hi guys and gals...

I am now beyond the limits of my POSIX knowledge here.

Below is a piece of code that runs perfectly well on short strings: byte counts up to around 1KB (3KB of octal text).

It generates byte values from 0x00 to 0xFF.

It passes ShellCheck's requirements and runs perfectly well under 'dash'.

The problem is that as the octal string becomes large, say the equivalent of 16KB of pure binary data in octal format, this is disastrously slow.

Because I have such bizarre requirements and intend to use builtins only, I am stuck for a MUCH quicker way.

The 256 byte code below takes about 0.1 seconds to do the 'binary' part...

Any ideas?
Code:
#!/usr/local/bin/dash
# o2b.sh
# This passes ShellCheck and runs under dash!

# 256 bytes of octal values from 000 to 377.
octal="000001002003004005006007010011012013014015016017020021022023024025026027030031032033034035036037040041042043044045046047050051052053054055056057060061\
062063064065066067070071072073074075076077100101102103104105106107110111112113114115116117120121122123124125126127130131132133134135136137140141142143144145\
146147150151152153154155156157160161162163164165166167170171172173174175176177200201202203204205206207210211212213214215216217220221222223224225226227230231232\
233234235236237240241242243244245246247250251252253254255256257260261262263264265266267270271272273274275276277300301302303304305306307310311312313314315316317\
320321322323324325326327330331332333334335336337340341342343344345346347350351352353354355356357360361362363364365366367370371372373374375376377"

binary()
{
	# Obtain octal string length.
	length="${#octal}"
	# Subscript position starts at 1 NOT 0.
	position=1
	# 3 character octal value to be read.
	ooo="???"
	while [ "$position" -lt "$length" ]
	do
		# These two lines obtain the octal value.
		subtext1=${octal%"${octal#${ooo}}"}
		subtext2=${subtext1#"${subtext1%???}"}
		# Convert to pure binary.
		printf '%b' \\"$subtext2"
		# Increment values needed.
		position=$(( position + 3 ))
		ooo=$ooo'???'
	done
}

binary "$octal" > /tmp/binary

# The line below is just for checking...
hexdump -C /tmp/binary

exit 0

Result:-
Code:
Last login: Sat Aug 27 22:02:46 on ttys000
AMIGA:barrywalker~> cd Desktop/Code/Shell
AMIGA:barrywalker~/Desktop/Code/Shell> ./o2b.sh
00000000  00 01 02 03 04 05 06 07  08 09 0a 0b 0c 0d 0e 0f  |................|
00000010  10 11 12 13 14 15 16 17  18 19 1a 1b 1c 1d 1e 1f  |................|
00000020  20 21 22 23 24 25 26 27  28 29 2a 2b 2c 2d 2e 2f  | !"#$%&'()*+,-./|
00000030  30 31 32 33 34 35 36 37  38 39 3a 3b 3c 3d 3e 3f  |0123456789:;<=>?|
00000040  40 41 42 43 44 45 46 47  48 49 4a 4b 4c 4d 4e 4f  |@ABCDEFGHIJKLMNO|
00000050  50 51 52 53 54 55 56 57  58 59 5a 5b 5c 5d 5e 5f  |PQRSTUVWXYZ[\]^_|
00000060  60 61 62 63 64 65 66 67  68 69 6a 6b 6c 6d 6e 6f  |`abcdefghijklmno|
00000070  70 71 72 73 74 75 76 77  78 79 7a 7b 7c 7d 7e 7f  |pqrstuvwxyz{|}~.|
00000080  80 81 82 83 84 85 86 87  88 89 8a 8b 8c 8d 8e 8f  |................|
00000090  90 91 92 93 94 95 96 97  98 99 9a 9b 9c 9d 9e 9f  |................|
000000a0  a0 a1 a2 a3 a4 a5 a6 a7  a8 a9 aa ab ac ad ae af  |................|
000000b0  b0 b1 b2 b3 b4 b5 b6 b7  b8 b9 ba bb bc bd be bf  |................|
000000c0  c0 c1 c2 c3 c4 c5 c6 c7  c8 c9 ca cb cc cd ce cf  |................|
000000d0  d0 d1 d2 d3 d4 d5 d6 d7  d8 d9 da db dc dd de df  |................|
000000e0  e0 e1 e2 e3 e4 e5 e6 e7  e8 e9 ea eb ec ed ee ef  |................|
000000f0  f0 f1 f2 f3 f4 f5 f6 f7  f8 f9 fa fb fc fd fe ff  |................|
00000100
AMIGA:barrywalker~/Desktop/Code/Shell> _

OSX 10.7.5, default terminal but calling dash in the script.
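For anyone following along, here is a small sketch of what the two extraction lines in the loop do. The function name nth_triplet and the sample string are made up for illustration; only the two subtext expansions are taken from the script above. Each step matches the growing ??? pattern against the string from its start again, which is probably why the run time climbs so steeply as the string gets longer.

```shell
#!/bin/sh
# nth_triplet STRING N: print the Nth 3-digit group of STRING (N starts at 1).
# Hypothetical helper, used here only to trace the expansions from o2b.sh.
nth_triplet() {
    octal=$1
    ooo=
    i=0
    # Build a pattern of N groups of '???', as the loop in o2b.sh does step by step.
    while [ "$i" -lt "$2" ]; do
        ooo=$ooo'???'
        i=$((i + 1))
    done
    # Keep the first N*3 characters by removing everything after them.
    subtext1=${octal%"${octal#${ooo}}"}
    # Keep only the last 3 characters of that prefix.
    subtext2=${subtext1#"${subtext1%???}"}
    printf '%s\n' "$subtext2"
}

nth_triplet 101102103 2   # prints 102
```

For N=2 the intermediate value subtext1 is 101102, and subtext2 is its last three characters.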
# 2  
Old 08-28-2016
Even though string operations on shell variables are relatively fast, stripping three characters off a huge string takes a while. Consider splitting your huge string into shorter strings and feeding them into your existing code in a loop. Note also that the standards say that octal values passed to the printf %b format specifier need to be in the format:
Quote:
"\0ddd", where ddd is a zero, one, two, or three-digit octal number that shall be
converted to a byte with the numeric value specified by the octal number
and your code is not supplying the leading 0 for octal values larger than 077.
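As a minimal sketch of the two spellings: an octal escape in a %b operand takes the leading 0, while an octal escape written directly in the printf format string is the plain \ddd form. The sample values (101, 102, 103) are just illustrative.

```shell
#!/bin/sh
# Octal escapes in a %b operand are written \0ddd.
printf '%b\n' '\0101\0102\0103'   # prints ABC

# Octal escapes in the format string itself are written \ddd.
printf '\101\102\103\n'           # prints ABC
```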
I don't have dash installed on my system, but sh, bash, and ksh on OS X El Capitan Version 10.11.6 all produce the output you specified when running the following modified version of your script:
Code:
#!/bin/ksh
# alt_o2b.sh
# This has not been tested by ShellCheck and runs under sh, bash, and ksh!

# 256 bytes of octal values from 000 to 377.
split_octal="000001002003004005006007010011012013014015016017
020021022023024025026027030031032033034035036037040041042043044045046047
050051052053054055056057060061062063064065066067070071072073074075076077
100101102103104105106107110111112113114115116117120121122123124125126127
130131132133134135136137140141142143144145146147150151152153154155156157
160161162163164165166167170171172173174175176177200201202203204205206207
210211212213214215216217220221222223224225226227230231232233234235236237
240241242243244245246247250251252253254255256257260261262263264265266267
270271272273274275276277300301302303304305306307310311312313314315316317
320321322323324325326327330331332333334335336337340341342343344345346347
350351352353354355356357360361362363364365366367370371372373374375376377"

binary()
{
	printf '%s\n' $split_octal | while read octal
	do
		# Obtain octal string length.
		length="${#octal}"
		# Subscript position starts at 1 NOT 0.
		position=1
		# 3 character octal value to be read.
		ooo="???"
		while [ "$position" -lt "$length" ]
		do
			# These two lines obtain the octal value.
			subtext1=${octal%"${octal#${ooo}}"}
			subtext2=${subtext1#"${subtext1%???}"}
			# Convert to pure binary.
			printf '%b' "\0$subtext2"
			# Increment values needed.
			position=$(( position + 3 ))
			ooo=$ooo'???'
		done
	done
}

binary > /tmp/binary2

# The line below is just for checking...
hexdump -C /tmp/binary2

exit 0

Note that I changed the output file name from /tmp/binary to /tmp/binary2, so you can compare the run times and the output files of the two scripts directly if you'd like.

When testing your script (using ksh instead of dash, and with the 2nd operand to printf '%b' modified as shown in the script above), I get the same output as you got with both scripts. But the script above ran in about 10% of the time needed to run your script. I would expect a considerably greater run-time improvement for considerably longer input data.
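A rough way to see where that factor comes from, as a back-of-the-envelope model rather than a measurement: assume each pass through the inner loop scans about one whole line of the string it is working on. The function name model_cost is made up for this sketch.

```shell
#!/bin/sh
# model_cost TRIPLETS PER_LINE
# Assumed model: every inner-loop iteration scans one full line, so
# cost = lines * iterations per line * characters per line.
model_cost() {
    total=$1
    per_line=$2
    lines=$(( (total + per_line - 1) / per_line ))
    printf '%s\n' "$(( lines * per_line * per_line * 3 ))"
}

model_cost 256 256   # one long line:        prints 196608
model_cost 256 24    # 24 triplets per line: prints 19008
```

Under that assumption the split version does roughly a tenth of the work for 256 bytes, and the gap widens in proportion to the length of the unsplit string.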

Note also that although you were passing an argument to the binary function, the function you have defined does not use any positional parameters. Therefore, I have removed that operand from the function invocation.
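The script above relies on split_octal already containing newlines. If the data arrives as one long line, as in post #1, something has to split it first. Here is a builtin-only sketch; the name chunk and the default of 24 triplets per line are assumptions for illustration, not part of either script.

```shell
#!/bin/sh
# chunk STRING [TRIPLETS_PER_LINE]
# Hypothetical helper: print STRING as lines of at most N 3-digit groups.
chunk() {
    rest=$1
    n=${2:-24}
    # Build a pattern that matches exactly N*3 characters.
    pat=
    i=0
    while [ "$i" -lt "$n" ]; do
        pat=$pat'???'
        i=$((i + 1))
    done
    width=$((n * 3))
    while [ "${#rest}" -gt "$width" ]; do
        remainder=${rest#${pat}}              # everything after the first line
        printf '%s\n' "${rest%"$remainder"}"  # the first line itself
        rest=$remainder
    done
    printf '%s\n' "$rest"
}

chunk 000001002003 2   # prints 000001 and 002003 on separate lines
```

The split still copies the remaining string on every step, but it takes one step per output line instead of one step per byte.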

Last edited by Don Cragun; 08-28-2016 at 08:16 AM.. Reason: Fix typo: s/iexdump/hexdump/ as noted in post #3.
# 3  
Old 08-28-2016
Hi Don...

Where would I be without you...

This is at least an order of magnitude faster than my test code.

Works perfectly in dash, except for your typo for hexdump near the end of the script.

Off to try some big binary files now. I will keep you informed over the next few days.

ShellCheck raises minor warnings on line 20...

printf '%s\n' $split_octal | while read octal

...to be:-

printf '%s\n' "$split_octal" | while read -r octal

But I don't see why it needs changing, as the file is always going to be multiple 3 digit octal values only, along with newlines of course, so I have left it as is...

Thanks a lot I have something to get my teeth into now.

If only arrays were allowed...
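On the -r part of that warning: for digits-only input it makes no difference. The sketch below shows the case ShellCheck is guarding against, which is input containing a backslash. The function names are made up for illustration.

```shell
#!/bin/sh
# Without -r, read treats a backslash as an escape character and drops it.
read_plain() { read line; printf '%s\n' "$line"; }
# With -r, the line is stored exactly as it was read.
read_raw()   { read -r line; printf '%s\n' "$line"; }

printf '%s\n' 'a\b' | read_plain   # prints ab
printf '%s\n' 'a\b' | read_raw     # prints a\b
```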

Last edited by Don Cragun; 08-28-2016 at 08:18 AM..
# 4  
Old 08-28-2016
Isn't that approach a bit overcomplicated? Why not save some lines of code, some variables, and reduce the amount of data shoved to and fro?
Code:
binary () 
{ 
    printf '%s\n' $1 | while read octal; do
        while [ "${#octal}" -gt 0 ]; do
            subtx=${octal%"${octal#???}"};
            octal=${octal#"${subtx}"};
            printf '%b' "\0$subtx";
        done;
    done
}

Call it like binary "$split_octal".
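To make the peeling step concrete, here is a sketch of a close variation that reads from stdin. The name binary_peel is made up, and the -ge 3 test is an added assumption so that a line whose length is not a multiple of 3 cannot loop forever; it is not part of the function above.

```shell
#!/bin/sh
# binary_peel: read lines of 3-digit octal groups on stdin, write bytes.
binary_peel() {
    while read -r line; do
        while [ "${#line}" -ge 3 ]; do
            rest=${line#???}        # line without its first 3 characters
            oct=${line%"$rest"}     # the first 3 characters themselves
            printf '%b' "\0$oct"
            line=$rest
        done
    done
}

printf '%s\n' 101102103 | binary_peel   # prints ABC (no trailing newline)
echo
```

For the line 101102103 the first pass gives rest=102103 and oct=101, the second gives rest=103 and oct=102, and the third leaves rest empty with oct=103.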
# 5  
Old 08-28-2016
Hi wisecracker,
Thank you for pointing out the iexdump (which has now been fixed in my post). I am not sure how that happened; I have hexdump in the code I copied and tried to paste.

It is good that you left the expansion $split_octal as is. With quotes around that expansion, the function won't work.

Quote:
Originally Posted by RudiC
Isn't that approach a bit overcomplicated? Why not save some lines of code, some variables, and reduce the amount of data shoved to and fro?
Code:
binary () 
{ 
    printf '%s\n' $1 | while read octal; do
        while [ "${#octal}" -gt 0 ]; do
            subtx=${octal%"${octal#???}"};
            octal=${octal#"${subtx}"};
            printf '%b' "\0$subtx";
        done;
    done
}

Call it like binary "$split_octal".
Yes. This is better. I hadn't noticed that the function was being called with an operand until after I had posted the script. One might also consider changing it to:
Code:
binary() { 
    while read octal
    do
        while [ "${#octal}" -gt 0 ]
        do
            subtx=${octal%"${octal#???}"}
            octal=${octal#"${subtx}"}
            printf '%b' "\0$subtx"
        done
    done
}

and invoke it with:
Code:
printf '%s\n' $split_octal | binary > output_file

or with:
Code:
binary < input_file > output_file

where input_file contains text similar to the string assigned to split_octal without the quotes (which would be handy if the data being processed is sometimes in a separate file and sometimes in a shell variable).
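For testing, one way to produce such an input_file from an existing binary is od, and the result can be checked with a round trip. This is only a sketch: to_octal is a made-up helper name, od and tr are external utilities (so this is for building test data, not for the builtin-only conversion itself), and binary is the function above with -r added to read.

```shell
#!/bin/sh
# The stdin-driven function from this post, with -r added to read.
binary() {
    while read -r octal
    do
        while [ "${#octal}" -gt 0 ]
        do
            subtx=${octal%"${octal#???}"}
            octal=${octal#"${subtx}"}
            printf '%b' "\0$subtx"
        done
    done
}

# to_octal FILE: hypothetical helper; dump FILE as lines of 3-digit
# octal values with the separating blanks removed.
to_octal() {
    od -A n -v -t o1 "$1" | tr -d ' \t'
}
```

A round trip would then look like to_octal some_file | binary > copy, followed by cmp some_file copy, where some_file and copy are placeholder names.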
# 6  
Old 08-29-2016
Hi Don and RudiC...

WOW!

Both of your versions work infinitely faster than mine.

I must apologise about the operand, I thought I had removed that.

Thanks all...

Consider this part solved, next question coming soon...
# 7  
Old 08-29-2016
Do you have the xxd tool available? It provides a
Quote:
-r | -revert
reverse operation: convert (or patch) hexdump into binary.
So, if you have your input as hexadecimal - not octal - like
Code:
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
10
11
12
13
14
15
16
17
18
19
1A
1B
1C
1D
1E
1F

Code:
xxd -r -p file | hd
00000000  00 01 02 03 04 05 06 07  08 09 0a 0b 0c 0d 0e 0f  |................|
00000010  10 11 12 13 14 15 16 17  18 19 1a 1b 1c 1d 1e 1f  |................|
00000020

will convert it to a perfect binary file. And, pretty FAST!
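As a small sketch of the same idea fed from a pipe instead of a file: the wrapper name hex2bin is made up, and the check for xxd is there only because the tool is not installed everywhere.

```shell
#!/bin/sh
# hex2bin: hypothetical wrapper; plain hex digits on stdin, bytes on stdout.
hex2bin() {
    xxd -r -p
}

if command -v xxd >/dev/null 2>&1; then
    printf '%s\n' 41 42 43 | hex2bin   # prints ABC (no trailing newline)
    echo
fi
```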



EDIT: uudecode will also create binary files if fed with the data it understands.

Last edited by RudiC; 08-29-2016 at 09:13 AM..