The UNIX and Linux Forums  
  #1 (permalink)  
frans, 2 Weeks Ago
I can try if I can get a sample of the data, but I believe there are powerful (and free) tools to manage such tables. I don't know what kind of system and distribution you are using, but I'm sure you can find an appropriate one.
Did you get that XML by exporting from DB2?
  #2 (permalink)  
asandy1234, 2 Weeks Ago
Hi Frans,
Let me explain the scenario.
We are using SAS DI Studio to do the ETL.
The source DB2 table has 6 columns, and one of them is a CLOB column containing the XML. When we imported it into SAS DI, the CLOB column used a lot of memory. Since we only need part of the XML, we thought we would process the XML through shell scripting to get the string we need.
Instead of exporting the DB2 table and processing the XML, it would be easier if we could read the XML column from the DB2 table, process it, and save the result in a new column in the same DB2 source table.

Hope I'm clear.

Thanks,
  #3 (permalink)  
frans, 2 Weeks Ago
Doesn't the ETL module of SAS DI provide any way to pre-process XML data?
Don't you have any tool (SQL...) to extract data from DB2? It could then be processed and stored in a format the ETL understands.
It would be a pity if the ETL couldn't do that with a couple of lines of configuration code.
Bash, awk and sed can do a lot, and Perl could be more appropriate; they can all process any kind of CSV with a script.
  #4 (permalink)  
asandy1234, 2 Weeks Ago
Hi Frans,
OK, let me explain the complexity of the process.
First of all, SAS DI processes XML well; the issue here is that the XML documents don't all share one XSD. There are more than 100 XSDs for these XML documents, so we would have to group the documents by XSD before any XML parser could read them. That is a big process by itself, and SAS doesn't recognize an XSD that is more than three levels deep, so we would need to create an equivalent .map file. Secondly, reading all the XML through SAS DI would translate into hundreds of tables.
We can reduce the number of tables by selecting only the nodes we need, but then it is complex to associate the records from the XML tables with their original records in the DB2 table, as there is no link inside the XML.
Next, there is not much time for this project (1 month), and finally there are the memory and load window constraints. Taking all this into consideration, we thought it would be better to process the XML as a string.

Thanks,
  #5 (permalink)  
frans, 2 Weeks Ago
OK.
P.S. I'm a total beginner with sed, but I found a one-line command that works with your sample:
Code:
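# Keep lines with a <text> element, strip the <text> tags, then delete whatever remains between the first '<' and the last '>'
sed '/\(<text>.*<\/text>\)/!d;s/<text>\(.*\)<\/text>/\1/;s/\(<.*>\)//' inputfile > outputfile
Or, if the <text>... ... </text> element may span multiple lines, proceed like this: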
Code:
# Join all lines into one first, so an element split across lines can still match
tr "\n" " " < inputfile | sed '/\(<text>.*<\/text>\)/!d;s/<text>\(.*\)<\/text>/\1/;s/\(<.*>\)//' > outputfile
These commands look weird, but sed is really powerful for processing data streams.
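As a rough illustration of the single-line case, here is a minimal sketch using a slightly different sed form (`-n` with the `p` flag) instead of the commands above. The function name `extract_text` and the sample input are made up for the example; the real DB2 export is not shown in this thread, and the sketch assumes at most one <text> element per line.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the content of
# <text>...</text> for every input line that holds a complete element.
# Assumes at most one <text> element per line; .* is greedy, so several
# elements on one line would not be separated correctly.
extract_text() {
    sed -n 's/.*<text>\(.*\)<\/text>.*/\1/p' "$@"
}

# Made-up sample input, NOT the real data.
printf '%s\n' \
    '<note id="1"><text>first value</text></note>' \
    '<note id="2"><other>skip me</other></note>' \
    '<note id="3"><text>second value</text></note>' |
    extract_text
```

On this sample it prints `first value` and `second value`, one per line. Called with a file name, as in `extract_text inputfile > outputfile`, it reads that file instead of standard input.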
The output file doesn't contain any reference to any record and/or field. I think we should also extract something from the XML to get an index for further processing. Am I right?
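One possible way to keep such an index, sketched under the assumption that each record sits on its own input line, is to print the line number next to the extracted value. The function name `extract_text_indexed` and the sample data are hypothetical; a real key would have to come from the DB2 row itself.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print "<line number><TAB><content>"
# for every line containing a <text>...</text> element.
# Assumes one record per line, so the line number can act as the index.
extract_text_indexed() {
    awk '
        match($0, /<text>.*<\/text>/) {
            # RSTART and RLENGTH cover the whole element; skip the 6-character
            # opening tag and drop the 7-character closing tag.
            print NR "\t" substr($0, RSTART + 6, RLENGTH - 13)
        }
    ' "$@"
}

# Made-up sample input, NOT the real data.
printf '%s\n' \
    '<note><text>first value</text></note>' \
    '<note><other>skip me</other></note>' \
    '<note><text>second value</text></note>' |
    extract_text_indexed
```

On this sample the output lines start with 1 and 3, so each value can be traced back to its input line.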
OK, I don't know where you are, but here it's 1:00 AM. See you tomorrow.
  #6 (permalink)  
asandy1234, 2 Weeks Ago
I'm in Canada, so my time is GMT -7:00. We decided to treat this as a separate project and assigned the task to our Perl developers, who are working on it right now.
At least for the time being, I think we can relax. I really appreciate your effort.
Thank you very much.