Continuous Flat file parsing


 
# 1  
Old 04-01-2010

Hi all,
I'm looking for tips on an ideal method of parsing a huge fixed-length flat file (~500 GB) into a delimited text file. We have to do this because our data warehouse platform only accepts delimited file loads. In the past, we've done this with SAS (only on smaller ~40 GB files) by importing into a SAS dataset using an INPUT statement, then dumping to a tab-delimited text file using a simple PROC EXPORT. I want to make this process more efficient and get SAS out of the process. I know this can all be done with Perl, Python, Java, etc., but I don't have any experience with those tools. Any suggestions or thoughts would be much appreciated.

One other item I forgot to mention: the file contains 5 different record layouts, each identified by the first 2 bytes of the row (each row is 276 bytes wide). I've provided a piece of my SAS code that shows 2 of the layouts. Thanks in advance.

SAS Code snippet:
Code:
 
INFILE MYFILE LRECL=276 TRUNCOVER;
INPUT @1 RECTYPE $CHAR2. @;
SELECT (RECTYPE);
WHEN('CO') DO; 
INPUT
@ 16 a $6.
@ 22 b $1.
@ 23 c $7.
@ 30 d $6.
@ 36 e $2.
@ 38 f $6.
@ 44 g $1.
@ 45 h $38.
@263 x $14.;
OUTPUT CO_DATA;
END;
WHEN('HD') DO; 
INPUT
@ 16 aa $6.
@ 22 bb $2.
@ 24 cc $2.
@263 x $14.;
OUTPUT HD_DATA;
END;


Last edited by pludi; 04-01-2010 at 10:38 AM..
# 2  
Old 04-01-2010
I doubt you'll receive any useful help because of the dearth of information. You want help processing a file that contains 5 types of records. You should provide at least one sample of each type from the file to be processed. You should also provide, for each type, the desired output. If it's too wide and it breaks the forum's layout, attach them in a small text file.

From the info you gave, even someone who is familiar with SAS has no clue about three of the five record types.

I'm willing to help if I can, but I don't have enough information.

Regards,
Alister
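
For what it's worth, the two layouts that are shown in the SAS snippet can be sketched in Python as a starting point. This is only a rough sketch under several assumptions that the thread does not confirm: rows are newline-terminated, the data is single-byte text (latin-1 is a guess), one tab-delimited output file per record type is wanted, and rows with any of the three unposted record types are simply counted and skipped. The names `LAYOUTS`, `parse_record` and `convert` are made up for illustration.

```python
import csv

RECORD_WIDTH = 276  # from the post: each row is 276 bytes wide

# (field name, 1-based start column, width), copied from the SAS INPUT
# statements. Only the two layouts posted in the thread are known.
LAYOUTS = {
    "CO": [("a", 16, 6), ("b", 22, 1), ("c", 23, 7), ("d", 30, 6),
           ("e", 36, 2), ("f", 38, 6), ("g", 44, 1), ("h", 45, 38),
           ("x", 263, 14)],
    "HD": [("aa", 16, 6), ("bb", 22, 2), ("cc", 24, 2), ("x", 263, 14)],
}


def parse_record(line):
    """Return (rectype, fields); fields is None when the layout is unknown."""
    # Pad short rows with blanks so missing columns come back empty,
    # which is roughly what TRUNCOVER does in the SAS version.
    line = line.rstrip("\r\n").ljust(RECORD_WIDTH)
    rectype = line[:2]
    layout = LAYOUTS.get(rectype)
    if layout is None:
        return rectype, None
    # Assumption: surrounding blanks are not significant, so strip them.
    fields = [line[start - 1:start - 1 + width].strip()
              for _name, start, width in layout]
    return rectype, fields


def convert(in_path, out_prefix):
    """Stream the input once; write one tab-delimited file per record type."""
    files, writers, counts = {}, {}, {}
    try:
        # latin-1 is an assumed encoding: one byte per character, so the
        # column positions line up with byte positions.
        with open(in_path, encoding="latin-1") as src:
            for line in src:
                rectype, fields = parse_record(line)
                if fields is None:
                    counts["skipped"] = counts.get("skipped", 0) + 1
                    continue
                if rectype not in writers:
                    out = open(f"{out_prefix}_{rectype}.txt", "w",
                               encoding="latin-1", newline="")
                    files[rectype] = out
                    writers[rectype] = csv.writer(
                        out, delimiter="\t", lineterminator="\n")
                    # Header row with the field names (drop if not wanted).
                    writers[rectype].writerow(
                        [name for name, _s, _w in LAYOUTS[rectype]])
                writers[rectype].writerow(fields)
                counts[rectype] = counts.get(rectype, 0) + 1
    finally:
        for out in files.values():
            out.close()
    return counts
```

Calling something like `convert("myfile.dat", "out")` (file names are placeholders) would write `out_CO.txt` and `out_HD.txt` and return the row counts per record type. The file is read one line at a time, so memory use stays flat regardless of input size; whether this is fast enough for ~500 GB would have to be measured.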