Well, in a null-terminated string application, a null is always '' (the empty string), because it re-terminates the string: a strcpy of a 5-byte-plus-null string into a 20-byte buffer initialized with 19 spaces and a null leaves a string of length 5.
Now, is it the only character that is equal to '', the empty string? Does read differentiate that from EOF in a while read loop?
There is possibly one _bug_ I can see in both my attempt and yours: character 92, "\",
causes a pseudo escape on the next binary character it sees when a
"somevar=echo -e......" is run over it more than once.
It also seems your 0x09 and 0x0A become NULL too.
I can understand why 0x0A _might_ do that, but not 0x09 (Tab).
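The backslash effect can be reproduced in isolation. The sketch below is only an illustration: it uses printf '%b' as a stand-in for "echo -e" (an assumption made for portability; both expand backslash escapes), and the sample string is made up.

```shell
#!/bin/sh
# Each pass of escape expansion consumes one level of backslashes.
s='C:\\new'                     # 7 characters: C : \ \ n e w
once=$(printf '%b' "$s")        # first pass: \\ collapses to \, giving C:\new
twice=$(printf '%b' "$once")    # second pass: \n is now read as a real newline
printf '%s\n' "$once"
printf '%s\n' "$twice"
```

After the second pass the literal "\n" has turned into byte 0x0A, which is exactly the kind of pseudo escape described above.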
Quote:
Does read differentiate that from EOF in a while read loop?
On a binary file with single zero byte(s), using the default bash "read", yes, but on a
binary_string_variable, hmmm, hard to judge, as placing byte zero into a variable is
not possible at this point.
Ignoring "read" FTTB, I am going to try to force byte zero into the middle of
a "Hello World." string, i.e. the space replaced with 0x00...
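One way to see what is really in such a file, without passing the data through a shell variable at all, is to let tr and wc work on the byte stream. This is a minimal sketch; the function name count_nul_bytes and the temporary file are made up for the illustration.

```shell
#!/bin/sh
# count_nul_bytes FILE: print how many 0x00 bytes FILE contains.
# tr -cd deletes every byte except NUL, wc -c counts what is left.
count_nul_bytes() {
    tr -cd '\000' < "$1" | wc -c
}

demo=$(mktemp)
printf 'Hello\000World.' > "$demo"   # "Hello World." with the space replaced by 0x00
wc -c < "$demo"                      # size on disk: 12 bytes, no trailing newline
count_nul_bytes "$demo"              # number of zero bytes: 1
rm -f "$demo"
```

Because the zero byte never enters a variable, nothing is lost or re-terminated along the way.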
---------- Post updated 07-09-13 at 11:21 AM ---------- Previous update was 06-09-13 at 10:58 PM ----------
Hi DGPickett...
I tried this out from the command line and......
"read" has brought in the correct length of "Hello World.", 12, with the space shown as
character 0, but its file length is actually 13 characters in size...
So to get back to 13 characters using a pseudo-zero, two backslashes will be needed
inside the file, making the file length 14.
This is a real killer for binary_string to file and back again manipulation.
I think the only way to account for binary zero is to use "\0", two characters, and detect
those two characters inside the binary_string, something like this...
Code:
if [ "${binary_string:$some_position:2}" == "\0" ]
then
    echo "\0 at position $some_position in binary string."
    # Do other binary string manipulation as required.
fi
The problem is that there may be two genuine independent "\" and "0" characters side
by side and not related to binary zero at all, just independent characters...
So, you can see that the shell is not a binary-capable tool. Some bytes it can handle, but nulls, linefeeds and white space are problematic. But you can write trivial C to support it, like a tool that just does a tr of null to something else, trnull:
The problem is the same as mine, in that character 0x21 (octal 041) is also a "!", so is this a binary zero or a real "!"?
There is no way of knowing, so I thought of using "@@" to substitute for binary zero, as "\@" would present a similar problem to my "\0" but would import into "read" easily.
But again, "@@" may actually exist as real characters and not as a substitution, but at least it is a starter approach to emulating binary zero...
All the 256 holes are always taken, so if your milieu does not have a limited character set, then you are right, the only escape is into longer strings for short characters, but then only context tells you that is the sort of file you have. That's why Java and XML use Unicode encodings such as UTF-8 and label the content type. It started as a 'code page' discussion, a.k.a. "what drum is on the printer".
There are apps for UNIX around that let you deal with image files like jpg and gif in text files of text numbers, like a decimal dump of a bmp file. I suppose if you run binary through the right dump you can then process it in shell back to a binary file.
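As a rough sketch of that dump-and-rebuild idea, od can turn any file into decimal byte values, which the shell handles comfortably, and printf can turn the numbers back into bytes. The function names bytes_to_decimal and decimal_to_bytes are made up here; od, awk and printf are the standard tools.

```shell
#!/bin/sh
# bytes_to_decimal FILE: print one decimal byte value (0-255) per line.
bytes_to_decimal() {
    od -An -v -t u1 "$1" | awk '{ for (i = 1; i <= NF; i++) print $i }'
}

# decimal_to_bytes: read decimal byte values on stdin, write raw bytes.
# The inner printf builds a three-digit octal number; the outer printf
# expands the resulting \ooo escape, so value 0 becomes a real NUL.
decimal_to_bytes() {
    while read -r n; do
        printf "\\$(printf '%03o' "$n")"
    done
}
```

Something like "bytes_to_decimal in.dat | decimal_to_bytes > out.dat" should give back an identical file, and any per-byte processing can be slotted into the middle of the pipeline as ordinary text handling. It is slow for large files, since it runs one printf per byte.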
Last edited by DGPickett; 09-18-2013 at 05:37 PM..
Here is an idea using extended characters for 0x00, 0x0A and the escape character (which comes out as a backslash, 0x5C, in the results below) that can easily be imported by the "read" command...
Code:
#!/bin/bash --posix
# Using OSX 10.7.5, bash default terminal...
# Alt-8 = • - zero...
bin_zero="•"
# Alt-7 = ¶ - newline...
new_line="¶"
# Alt-6 = § - escape...
esc_char="§"
# A simple string using •, ¶, and § as real binary substitutes.
#
echo $bin_zero"This is a binary "$bin_zero$new_line$esc_char" zero, newline and escape character cop-out."$esc_char$bin_zero$new_line > /tmp/bin.dat
# Check it and echo has included a newline character for added fun...
hexdump -C /tmp/bin.dat
#
read bin_text < /tmp/bin.dat
echo "$bin_text"
# Newline is stripped as expected...
#
# Now swap extended characters into escape characters.
text=$(echo "$bin_text" | sed 's/•/\\0/g;s/¶/\\n/g;s/§/\\\\/g')
# Done!
# Put escape character version of string into a file.
echo -n "$text" > /tmp/newbin.dat
# Done!
# Check that escape characters for _newline_ and zero exist.
hexdump -C /tmp/newbin.dat
echo "$text"
# Done
# Now resave to the real file with binary 0x00s and 0x0A.
echo -e -n "$text" > /tmp/newbin.dat
# Done
# Prove it exists.
hexdump -C /tmp/newbin.dat
echo "Now converted to binary using extended characters..."
# Yup, it does...
The results:-
Code:
Now converted to binary using extended characters...
AMIGA:barrywalker~> ./bin_test.sh
00000000 e2 80 a2 54 68 69 73 20 69 73 20 61 20 62 69 6e |...This is a bin|
00000010 61 72 79 20 e2 80 a2 c2 b6 c2 a7 20 7a 65 72 6f |ary ....... zero|
00000020 2c 20 6e 65 77 6c 69 6e 65 20 61 6e 64 20 65 73 |, newline and es|
00000030 63 61 70 65 20 63 68 61 72 61 63 74 65 72 20 63 |cape character c|
00000040 6f 70 2d 6f 75 74 2e c2 a7 e2 80 a2 c2 b6 0a |op-out.........|
0000004f
•This is a binary •¶§ zero, newline and escape character cop-out.§•¶
00000000 5c 30 54 68 69 73 20 69 73 20 61 20 62 69 6e 61 |\0This is a bina|
00000010 72 79 20 5c 30 5c 6e 5c 5c 20 7a 65 72 6f 2c 20 |ry \0\n\\ zero, |
00000020 6e 65 77 6c 69 6e 65 20 61 6e 64 20 65 73 63 61 |newline and esca|
00000030 70 65 20 63 68 61 72 61 63 74 65 72 20 63 6f 70 |pe character cop|
00000040 2d 6f 75 74 2e 5c 5c 5c 30 5c 6e |-out.\\\0\n|
0000004b
\0This is a binary \0\n\\ zero, newline and escape character cop-out.\\\0\n
00000000 00 54 68 69 73 20 69 73 20 61 20 62 69 6e 61 72 |.This is a binar|
00000010 79 20 00 0a 5c 20 7a 65 72 6f 2c 20 6e 65 77 6c |y ..\ zero, newl|
00000020 69 6e 65 20 61 6e 64 20 65 73 63 61 70 65 20 63 |ine and escape c|
00000030 68 61 72 61 63 74 65 72 20 63 6f 70 2d 6f 75 74 |haracter cop-out|
00000040 2e 5c 00 0a |.\..|
00000044
Now converted to binary using extended characters...
AMIGA:barrywalker~>
EDIT:
I didn't want to use "sed" but longhand became a large program and defeated the object...
Noticed a minor bug and squashed it...
Last edited by wisecracker; 09-19-2013 at 06:26 PM..
Reason: See above...
Whenever you escape null to \0 you must, at a minimum, also escape \ to \\. IFS has to have one character in it, so if you put in newline only, \n, then a utility that escapes just those three (null, backslash and newline) would make data shell friendly.
Code:
#include <stdio.h>
#include <stdlib.h>   /* exit() */

/* putchar() that exits on failure: status 1 on a write error, 0 on plain EOF. */
void p_putchar( int c ){
    if ( EOF == putchar( c )){
        if ( ferror( stdout )){
            perror( "stdout" );
            exit( 1 );
        }
        exit( 0 );
    }
}

int main ( void ){
    int c ;

    while ( EOF != ( c = getchar() )){
        switch ( c ){
        case '\\':                /* backslash becomes \\ */
            p_putchar( c );
            break ;
        case '\0':                /* null becomes \0 */
            p_putchar( '\\' );
            c = '0' ;
            break ;
        case '\n':                /* newline becomes \n */
            p_putchar( '\\' );
            c = 'n' ;
            break ;
        default:
            break ;
        }
        p_putchar( c );
    }

    if ( ferror( stdin )){
        perror( "stdin" );
        exit( 2 );
    }
    return 0 ;
}
$ all256|bin2txt|od -bc
0000000 \ 0 001 002 003 004 005 006 007 \b \t \ n 013 \f \r
134 060 001 002 003 004 005 006 007 010 011 134 156 013 014 015
0000020 016 017 020 021 022 023 024 025 026 027 030 031 032 033 034 035
016 017 020 021 022 023 024 025 026 027 030 031 032 033 034 035
0000040 036 037 ! " # $ % & ' ( ) * + , -
036 037 040 041 042 043 044 045 046 047 050 051 052 053 054 055
0000060 . / 0 1 2 3 4 5 6 7 8 9 : ; < =
056 057 060 061 062 063 064 065 066 067 070 071 072 073 074 075
0000100 > ? @ A B C D E F G H I J K L M
076 077 100 101 102 103 104 105 106 107 110 111 112 113 114 115
0000120 N O P Q R S T U V W X Y Z [ \ \
116 117 120 121 122 123 124 125 126 127 130 131 132 133 134 134
0000140 ] ^ _ ` a b c d e f g h i j k l
135 136 137 140 141 142 143 144 145 146 147 150 151 152 153 154
0000160 m n o p q r s t u v w x y z { |
155 156 157 160 161 162 163 164 165 166 167 170 171 172 173 174
0000200 } ~ 177 200 201 202 203 204 205 206 207 210 211 212 213 214
175 176 177 200 201 202 203 204 205 206 207 210 211 212 213 214
0000220 215 216 217 220 221 222 223 224 225 226 227 230 231 232 233 234
215 216 217 220 221 222 223 224 225 226 227 230 231 232 233 234
0000240 235 236 237 240 241 242 243 244 245 246 247 250 251 252 253 254
235 236 237 240 241 242 243 244 245 246 247 250 251 252 253 254
0000260 255 256 257 260 261 262 263 264 265 266 267 270 271 272 273 274
255 256 257 260 261 262 263 264 265 266 267 270 271 272 273 274
0000300 275 276 277 300 301 302 303 304 305 306 307 310 311 312 313 314
275 276 277 300 301 302 303 304 305 306 307 310 311 312 313 314
0000320 315 316 317 320 321 322 323 324 325 326 327 330 331 332 333 334
315 316 317 320 321 322 323 324 325 326 327 330 331 332 333 334
0000340 335 336 337 340 341 342 343 344 345 346 347 350 351 352 353 354
335 336 337 340 341 342 343 344 345 346 347 350 351 352 353 354
0000360 355 356 357 360 361 362 363 364 365 366 367 370 371 372 373 374
355 356 357 360 361 362 363 364 365 366 367 370 371 372 373 374
0000400 375 376 377
375 376 377
0000403
$
Forgot the exit(0) for no error stdout EOF last time!
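Going back the other way needs a left-to-right scan, so that \\0 (an escaped backslash followed by a real "0") is not mistaken for \0. Here is a minimal sketch in plain shell, assuming the input is exactly what bin2txt produces (so no raw newlines or nulls, and every backslash starts a two-character escape). The name txt2bin is made up, and this is far slower than doing the same in C.

```shell
#!/bin/sh
# txt2bin: undo the three escapes on stdin: \\ -> \, \0 -> NUL, \n -> newline.
# The body runs in a subshell ( ... ) so its variables stay private.
txt2bin() (
    LC_ALL=C                      # treat the data as bytes, not multibyte text
    IFS= read -r s || true        # encoded text is one line, maybe unterminated
    while [ -n "$s" ]; do
        c=${s%"${s#?}"}           # first character of what is left
        s=${s#?}
        if [ "$c" = "\\" ]; then
            e=${s%"${s#?}"}       # the character after the backslash
            s=${s#?}
            case $e in
                0) printf '\000' ;;
                n) printf '\n' ;;
                *) printf '\\' ;; # \\ (or a malformed escape) gives one backslash
            esac
        else
            printf '%s' "$c"
        fi
    done
)
```

Because each backslash consumes the character after it, the sequence \\0 decodes to a backslash and a plain "0", while \0 decodes to a zero byte, which is why escaping \ to \\ removes the ambiguity.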
Last edited by DGPickett; 09-19-2013 at 05:44 PM..