Sponsored Content
Top Forums Shell Programming and Scripting EBCDIC File Split Based On Record Key Post 302963159 by Don Cragun on Thursday 24th of December 2015 02:29:31 PM
Old 12-24-2015
Quote:
Originally Posted by hanshot1stx
Alright Don, so I have spent the last few days playing with this now and have run into a couple quirks. First off are some things about the file. We will call the original EBCDIC file with all of the data data.ebc. I go ahead and do the simple conversion using dd to get a new file, data.ascii. Running the wc command gives me

0 lines in data.ebc, with 64454170 bytes
5948 lines in data.ascii with 64454170 bytes
OK. This is good! You translated EBCDIC bytes to the corresponding ASCII bytes and no bytes were added or lost. But, even though this is an ASCII file, it is not a text file; the <newline> characters are just binary data in your file; not line terminators.
Quote:
Then I use the tr command to get rid of newlines so that I have one line in my new.ascii file. Then I go through new.ascii and cut the first two bytes, get 01, and write that to a file, increment and repeat. This works perfectly until I get to bytes 16880, in which the program then gets thrown off. Interestingly in data.ascii, there are 16507 bytes in the first line. So somehow I need to make it so that I have either a file that has only one line (since using tr to delete '\n' seems to be causing issues) or I need a file that has 422 bytes on each line, so that the first two bytes of each line correspond to either 01,02,03,...,12,13.
Ouch. No! Don't remove ANY bytes from data.ascii. Those <newline> characters you're seeing in that file are probably the ASCII byte values corresponding to some of the binary packed decimal data bytes in your input.

The data in data.ascii is just a stream of bytes containing the records in your data; there are no record separators in data.ebc nor in data.ascii. In addition to <newline> characters, there are probably also <nul> (all bits 0) bytes that should not appear in a text file. But, we aren't going to treat data.ascii (or data.ebc) as a text file.

Can you show us a table where the 1st column gives us the 1st two characters of your records (the two bytes that specify the record type), the 2nd column gives us the length in bytes of records of that type (either with or without the two bytes specifying the record type, but tell us whether or not the record size given includes those bytes), and the 3rd column gives us the name of the file that to which records of this type should be appended? (Are these output files supposed to be ASCII or EBCDIC? On first read of your requirements, I thought wanted to feed ASCII data to your C++ converter and then take the output from your C++ converter and translate that back to EBCDIC. Reading your first post again, it isn't clear to me whether the C++ converter wants EBCDIC input or ASCII input.)
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How to count the record count in an EBCDIC file.

How do I get the record count in an EBCDIC file on a Linux Box. :confused: (1 Reply)
Discussion started by: oracle8
1 Replies

2. Shell Programming and Scripting

How to split a file record

-Hi, I have a problem with parcing/spliting a file record into two parts and assigning the split parts to two viriables. The record is as follows: ftrn facc ttrd feed xref fsdb fcp ruldb csdb omom fordr ftxn fodb fsdc texc oxox reng ttrn ttxn fqdb ... (5 Replies)
Discussion started by: aoussenko
5 Replies

3. Shell Programming and Scripting

Split long record into csv file

Hi I receive a mainframe file which has very long records (1100 chars) with no field delimiters. I need to parse each record and output a comma delimited (csv) file. The record layout is fixed. If there weren't so many fields and records I would read the file into Excel, as a "fixed width"... (10 Replies)
Discussion started by: wvdeijk
10 Replies

4. Shell Programming and Scripting

Split a record based on particular match

Hi , I have a requirement to split the record based on particular match using UNIX. Case1: Input Record : 10.44.48.63;"Personals/Dating;sports";1441 Output Records : 10.44.48.63;Personals/Dating;1441;Original 10.44.48.63;sports;1441;Dummy Case2: Input Record : ... (5 Replies)
Discussion started by: mksuneel
5 Replies

5. Shell Programming and Scripting

split record based on delimiter

Hi, My inputfile contains field separaer is ^. 12^inms^ 13^fakdks^ssk^s3 23^avsd^ 13^fakdks^ssk^a4 I wanted to print only 2 delimiter occurence i.e 12^inms^ 23^avsd^ (4 Replies)
Discussion started by: Jairaj
4 Replies

6. Shell Programming and Scripting

Split file when the key field change !

Hello, I have the following example data file: Rv.Global_Sk,1077.160523,D,16/09/2011 Rv.Global_Sk,1077.08098,D,17/09/2011 Rv.Global_Sk,1077.001445,D,18/09/2011 Rv.Global_Sk,1072.660733,D,19/09/2011 Rv.Global_Sk,1070.381557,D,20/09/2011 Rv.Global_Sk,1071.971747,D,21/09/2011... (4 Replies)
Discussion started by: csierra
4 Replies

7. Shell Programming and Scripting

Fetching record based on Uniq Key from huge file.

Hi i want to fetch 100k record from a file which is looking like as below. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ... (17 Replies)
Discussion started by: lathigara
17 Replies

8. Shell Programming and Scripting

Split a large file in n records and skip a particular record

Hello All, I have a large file, more than 50,000 lines, and I want to split it in even 5000 records. Which I can do using sed '1d;$d;' <filename> | awk 'NR%5000==1{x="F"++i;}{print > x}'Now I need to add one more condition that is not to break the file at 5000th record if the 5000th record... (20 Replies)
Discussion started by: ibmtech
20 Replies

9. UNIX for Advanced & Expert Users

Removing Header and Trailer record of a EBCDIC file

I have a EBCDIC multi layout file which has a header record which is 21 bytes, The Detail records are 2427 bytes long and the trailer record is 9 bytes long. Is there a command to remove the header as well as trailer record and read only the detail records while at the same time not altering... (1 Reply)
Discussion started by: abhilashnair
1 Replies

10. UNIX for Advanced & Expert Users

How to split large file with different record delimiter?

Hi, I have received a file which is 20 GB. We would like to split the file into 4 equal parts and process it to avoid memory issues. If the record delimiter is unix new line, I could use split command either with option l or b. The problem is that the line terminator is |##| How to use... (5 Replies)
Discussion started by: Ravi.K
5 Replies
SPLIT(1)						    BSD General Commands Manual 						  SPLIT(1)

NAME
split -- split a file into pieces SYNOPSIS
split [-a suffix_length] [-b byte_count[k|m] | -l line_count -n chunk_count] [file [name]] DESCRIPTION
The split utility reads the given file and breaks it up into files of 1000 lines each. If file is a single dash or absent, split reads from the standard input. file itself is not altered. The options are as follows: -a Use suffix_length letters to form the suffix of the file name. -b Create smaller files byte_count bytes in length. If 'k' is appended to the number, the file is split into byte_count kilobyte pieces. If 'm' is appended to the number, the file is split into byte_count megabyte pieces. -l Create smaller files line_count lines in length. -n Split file into chunk_count smaller files. If additional arguments are specified, the first is used as the name of the input file which is to be split. If a second additional argument is specified, it is used as a prefix for the names of the files into which the file is split. In this case, each file into which the file is split is named by the prefix followed by a lexically ordered suffix using suffix_length characters in the range ``a-z''. If -a is not speci- fied, two letters are used as the suffix. If the name argument is not specified, 'x' is used. STANDARDS
The split utility conforms to IEEE Std 1003.1-2001 (``POSIX.1''). HISTORY
A split command appeared in Version 6 AT&T UNIX. The -a option was introduced in NetBSD 2.0. Before that, if name was not specified, split would vary the first letter of the filename to increase the number of possible output files. The -a option makes this unnecessary. BSD
May 28, 2007 BSD
All times are GMT -4. The time now is 11:43 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy