Post 302409472 by cg2 on Thursday 1st of April 2010 09:30:08 AM
Continuous Flat file parsing

Hi all,
I'm looking for tips on a good way to parse a huge fixed-length flat file (~500 GB) into a delimited text file. We have to do this because our data warehouse platform only accepts delimited file loads. In the past we've done this with SAS (only on smaller ~40 GB files) by importing into a SAS dataset with an INPUT statement and then dumping to a tab-delimited text file with a simple PROC EXPORT. I want to make this process more efficient and take SAS out of it. I know this can all be done with Perl, Python, Java, etc., but I don't have any experience with those tools. Any suggestions or thoughts would be much appreciated.

One other item I forgot to mention: the file contains 5 different record layouts, which are identified by the first 2 bytes of each row (each row is 276 bytes wide). Below is a piece of my SAS code that shows 2 of the layouts. Thanks in advance.

SAS Code snippet:
Code:
 
INFILE MYFILE LRECL=276 TRUNCOVER;
/* The trailing @ holds the row so the layout-specific INPUT can reread it */
INPUT @1 RECTYPE $CHAR2. @;
SELECT (RECTYPE);
  WHEN ('CO') DO;
    INPUT
      @16  a $6.
      @22  b $1.
      @23  c $7.
      @30  d $6.
      @36  e $2.
      @38  f $6.
      @44  g $1.
      @45  h $38.
      @263 x $14.;
    OUTPUT CO_DATA;
  END;
  WHEN ('HD') DO;
    INPUT
      @16  aa $6.
      @22  bb $2.
      @24  cc $2.
      @263 x $14.;
    OUTPUT HD_DATA;
  END;
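
For comparison, here is a minimal sketch of the same record-type dispatch in Python, streaming one row at a time so the whole file never has to fit in memory. It is only an illustration under some assumptions not stated in the post: rows are newline-terminated, the data is in a single-byte encoding (so column positions equal byte positions), and surrounding blanks in each field can be stripped. The names `LAYOUTS`, `split_record` and `convert` are made up for this example; the column offsets are copied from the SAS INPUT statements above.

```python
# Hypothetical sketch: fixed-width rows -> tab-delimited rows, one output per layout.

# (start column, length) pairs taken from the SAS snippet; columns are 1-based.
# Only the two layouts shown in the post are listed here.
LAYOUTS = {
    "CO": [(16, 6), (22, 1), (23, 7), (30, 6), (36, 2),
           (38, 6), (44, 1), (45, 38), (263, 14)],
    "HD": [(16, 6), (22, 2), (24, 2), (263, 14)],
}


def split_record(line, fields):
    """Slice one fixed-width row into field values with blanks stripped."""
    return [line[start - 1:start - 1 + length].strip()
            for start, length in fields]


def convert(src, outputs, delimiter="\t"):
    """Stream rows from src and write delimited rows to the matching output.

    src is any iterable of text lines (an open file works); outputs maps a
    record type to a writable text file. Rows whose first 2 characters are
    not a known record type are skipped. Returns rows written per type.
    """
    counts = {rectype: 0 for rectype in LAYOUTS}
    for line in src:
        line = line.rstrip("\r\n")
        rectype = line[:2]
        fields = LAYOUTS.get(rectype)
        if fields is None:
            continue  # layout not defined in this sketch
        outputs[rectype].write(delimiter.join(split_record(line, fields)) + "\n")
        counts[rectype] += 1
    return counts
```

To run it against a real file, something like `convert(open("input.dat", encoding="latin-1"), {"CO": open("co.txt", "w"), "HD": open("hd.txt", "w")})` would do, where the file names are placeholders. Opening with `latin-1` maps every byte to exactly one character, which keeps the slice offsets aligned with the byte positions in the layout. If any field can contain a tab, a different delimiter would need to be chosen.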


Last edited by pludi; 04-01-2010 at 10:38 AM..
 
