Quote: Originally Posted by hanshot1stx
Alright Don, so I have spent the last few days playing with this now and have run into a couple quirks. First off are some things about the file. We will call the original EBCDIC file with all of the data data.ebc. I go ahead and do the simple conversion using dd to get a new file, data.ascii. Running the wc command gives me
0 lines in data.ebc, with 64454170 bytes
5948 lines in data.ascii with 64454170 bytes
OK. This is good! You translated the EBCDIC bytes to the corresponding ASCII bytes, and no bytes were added or lost. But even though this is an ASCII file, it is not a text file: the <newline> characters are just binary data in your file, not line terminators.
Quote:
Then I use the tr command to get rid of newlines so that I have one line in my new.ascii file. Then I go through new.ascii and cut the first two bytes, get 01, and write that to a file, increment and repeat. This works perfectly until I get to bytes 16880, in which the program then gets thrown off. Interestingly in data.ascii, there are 16507 bytes in the first line. So somehow I need to make it so that I have either a file that has only one line (since using tr to delete '\n' seems to be causing issues) or I need a file that has 422 bytes on each line, so that the first two bytes of each line correspond to either 01,02,03,...,12,13.
Ouch. No! Don't remove ANY bytes from data.ascii. Those <newline> characters you're seeing in that file are probably the ASCII byte values corresponding to some of the binary packed decimal data bytes in your input.
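To make that concrete, here is a small Python sketch of how a packed decimal field can turn into a <newline> after translation. It assumes the usual IBM packed decimal layout (one digit per nibble, sign nibble last) and uses Python's cp037 codec as a stand-in for the EBCDIC-to-ASCII table; the table your dd uses may differ in detail, and pack_decimal is just a name made up for this illustration.

```python
def pack_decimal(value: int) -> bytes:
    """Encode an integer as packed decimal: one digit per nibble, sign nibble last."""
    sign = 0xC if value >= 0 else 0xD          # C = positive, D = negative
    nibbles = [int(d) for d in str(abs(value))] + [sign]
    if len(nibbles) % 2:                       # pad to a whole number of bytes
        nibbles.insert(0, 0)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


packed = pack_decimal(251)                     # nibbles 2, 5, 1, C -> bytes 0x25 0x1C

# Translate byte for byte, the way a character-set conversion would
# (cp037 is an assumption; it stands in for the conversion table here).
translated = packed.decode("cp037").encode("latin-1")

print(packed.hex(), "->", translated.hex())
print("newline bytes after translation:", translated.count(b"\n"))
```

The byte 0x25 is just the digits 2 and 5 of a number, but the translation table turns it into 0x0A, which wc then counts as a line. Deleting it destroys the field and shifts every record after it.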
The data in data.ascii is just a stream of bytes containing the records in your data; there are no record separators in data.ebc nor in data.ascii. In addition to <newline> characters, there are probably also <nul> (all bits 0) bytes that should not appear in a text file. But, we aren't going to treat data.ascii (or data.ebc) as a text file.
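As a rough sketch of what treating the file as a byte stream could look like, the Python below walks a buffer record by record, using the first two bytes of each record to look up its length. The lengths in the example are invented, and the sketch assumes each length includes the two type bytes; the real values have to come from your record layouts. split_records is a hypothetical helper, not something from your existing tools.

```python
from collections import defaultdict


def split_records(data: bytes, lengths: dict[bytes, int]) -> dict[bytes, list[bytes]]:
    """Split a byte stream into records, grouped by their two-byte record type.

    lengths maps each record type to its total size in bytes
    (assumed here to include the two type bytes).
    """
    groups = defaultdict(list)
    pos = 0
    while pos < len(data):
        rtype = data[pos:pos + 2]
        if rtype not in lengths:
            raise ValueError(f"unknown record type {rtype!r} at offset {pos}")
        size = lengths[rtype]
        record = data[pos:pos + size]
        if len(record) < size:
            raise ValueError(f"truncated record of type {rtype!r} at offset {pos}")
        groups[rtype].append(record)
        pos += size                      # no separators: the next record starts right here
    return dict(groups)


# Tiny made-up stream: note the <newline> and <nul> bytes inside the records.
sample = b"01a\nbc" + b"02\x00z" + b"01wxyz"
by_type = split_records(sample, {b"01": 6, b"02": 4})
for rtype, records in by_type.items():
    print(rtype, records)
```

For the real file you would read the bytes with open("data.ascii", "rb") so that nothing is interpreted as text, and append each group to its own output file. The unknown-type check is useful in practice: if the program gets "thrown off", it reports the offset where the record lengths stopped matching the data.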
Can you show us a table where the 1st column gives us the 1st two characters of your records (the two bytes that specify the record type), the 2nd column gives us the length in bytes of records of that type (either with or without the two bytes specifying the record type, but tell us whether or not the record size given includes those bytes), and the 3rd column gives us the name of the file to which records of this type should be appended? (Are these output files supposed to be ASCII or EBCDIC? On first read of your requirements, I thought you wanted to feed ASCII data to your C++ converter and then take the output from your C++ converter and translate that back to EBCDIC. Reading your first post again, it isn't clear to me whether the C++ converter wants EBCDIC input or ASCII input.)