Post 302409472 by cg2 on Thursday 1st of April 2010 09:30:08 AM
Continuous Flat file parsing

Hi all,
I'm looking for tips on an efficient way to parse a huge fixed-length flat file (~500 GB) into a delimited text file. We have to do this because our data warehouse platform only accepts delimited file loads. In the past, we've done this with SAS (only on smaller ~40 GB files) by importing into a SAS dataset with an INPUT statement, then dumping to a tab-delimited text file with a simple PROC EXPORT. I want to make this process more efficient and get SAS out of it. I know this can all be done with Perl, Python, Java, etc., but I don't have any experience with those tools. Any suggestions or thoughts would be much appreciated.

One other item I forgot to mention: the file contains 5 different record layouts, each identified by the first 2 bytes of the row (each row is 276 bytes wide). Below is a piece of my SAS code that shows 2 of the layouts. Thanks in advance.

SAS Code snippet:
Code:
 
INFILE MYFILE LRECL=276 TRUNCOVER;
/* Read the 2-byte record type and hold the line with the trailing @ */
INPUT @1 RECTYPE $CHAR2. @;
SELECT (RECTYPE);
  WHEN('CO') DO;
    INPUT
      @ 16 a $6.
      @ 22 b $1.
      @ 23 c $7.
      @ 30 d $6.
      @ 36 e $2.
      @ 38 f $6.
      @ 44 g $1.
      @ 45 h $38.
      @263 x $14.;
    OUTPUT CO_DATA;
  END;
  WHEN('HD') DO;
    INPUT
      @ 16 aa $6.
      @ 22 bb $2.
      @ 24 cc $2.
      @263 x $14.;
    OUTPUT HD_DATA;
  END;
  /* The other three layouts are not shown in this snippet */
  OTHERWISE;
END;
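
For comparison, here is a rough sketch of how the same two layouts might be handled in Python without SAS. It is only an illustration under several assumptions: records are newline-terminated, the data is in a single-byte encoding, and each record type goes to its own tab-delimited output. The names `LAYOUTS`, `parse_record`, `convert` and `make_record` are made up for this example, and the field names, start columns and widths are copied from the INPUT statements above. The input is read one line at a time, so memory use should not grow with file size.

```python
import io

# (name, start column, width) per field, transcribed from the SAS INPUT
# statements. SAS columns are 1-based. Only the two posted layouts are
# covered; the other three would need their own entries.
LAYOUTS = {
    "CO": [("a", 16, 6), ("b", 22, 1), ("c", 23, 7), ("d", 30, 6),
           ("e", 36, 2), ("f", 38, 6), ("g", 44, 1), ("h", 45, 38),
           ("x", 263, 14)],
    "HD": [("aa", 16, 6), ("bb", 22, 2), ("cc", 24, 2), ("x", 263, 14)],
}


def parse_record(line):
    """Return (rectype, field values), or None if the type has no layout."""
    rectype = line[:2]
    layout = LAYOUTS.get(rectype)
    if layout is None:
        return None
    # Convert each 1-based start column to a 0-based slice, then drop the
    # blank padding around the value.
    fields = [line[start - 1:start - 1 + width].strip()
              for _name, start, width in layout]
    return rectype, fields


def convert(src, outputs, delimiter="\t"):
    """Stream lines from src and write delimited rows to outputs[rectype].

    Returns (rows written per record type, number of skipped lines).
    """
    counts = {}
    skipped = 0
    for line in src:
        parsed = parse_record(line.rstrip("\r\n"))
        if parsed is None:
            skipped += 1  # unknown record type or empty line
            continue
        rectype, fields = parsed
        outputs[rectype].write(delimiter.join(fields) + "\n")
        counts[rectype] = counts.get(rectype, 0) + 1
    return counts, skipped


def make_record(rectype, values):
    """Build a 276-character test record (demo helper, not part of the job)."""
    buf = [" "] * 276
    buf[0:2] = rectype
    for (_name, start, width), value in zip(LAYOUTS.get(rectype, []), values):
        buf[start - 1:start - 1 + width] = value.ljust(width)
    return "".join(buf)


# Small in-memory demo: one HD record and one record of an unknown type.
sample = io.StringIO(
    make_record("HD", ["ABC123", "01", "02", "TRAILER0000001"]) + "\n"
    + make_record("ZZ", []) + "\n"
)
outputs = {"CO": io.StringIO(), "HD": io.StringIO()}
counts, skipped = convert(sample, outputs)
print(counts, skipped)
print(outputs["HD"].getvalue(), end="")
```

For a real run, `src` and the `outputs` values would be file objects, for example opened with `open(path, encoding="latin-1")` so that one byte maps to one character. If a field can contain a tab, a different delimiter or some escaping would be needed.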


Last edited by pludi; 04-01-2010 at 10:38 AM.
 
