Perl or Awk script to copy a part of text file.


 
# 22  
Old 11-05-2009
I can try if I can get a sample of the data, but I believe there are powerful (and free) tools for managing such tables. I don't know which system and distribution you are using, but I'm sure you can find an appropriate one.
Did you get that XML code by exporting from DB2?
# 23  
Old 11-05-2009
Hi Frans,
Let me explain the scenario.
We are using SAS DI Studio to do the ETL.
The source DB2 table has 6 columns, and one of them is a CLOB column containing the XML. When we imported it into SAS DI, the CLOB column used a lot of memory. Since we need just part of the XML, we thought we would process the XML through shell scripting to get the string we need.
Instead of exporting the DB2 table and processing the XML, it would be easier if we could read the XML column from the DB2 table, process it, and save the result in a new column in the same DB2 source table.

Hope I'm clear.

Thanks,
# 24  
Old 11-05-2009
Doesn't the ETL module of SAS DI provide any way to pre-process XML data?
Don't you have any tool (SQL...) to extract data from DB2? It could then be processed and stored in a format the ETL understands.
It would be a pity if the ETL couldn't do that with a couple of lines of configuration code.
Bash, Awk and Sed can do a lot, and Perl could be more appropriate; they can all process any kind of CSV with a script.
# 25  
Old 11-05-2009
Hi Frans,
OK, let me explain the complexity of the process.
First of all, SAS DI processes XML well; the issue is that the XML documents do not all share one XSD. There are more than 100 XSDs, so we would have to group the XML documents by XSD before any XML parser could read them. That is a big process in itself, and SAS doesn't recognize an XSD that is more than three levels deep, so we would need to create an equivalent .map file. Secondly, reading all the XML through SAS DI would translate into hundreds of tables.
We could reduce the number of tables by selecting only the nodes we need, but then it becomes difficult to associate the records in the XML tables with their original records in the DB2 table, because there is no link inside the XML.
Next, there is not much time for this project (1 month), and finally there are the memory and load-window constraints. Taking all this into consideration, we thought it would be better to process the XML as a string.

Thanks,
# 26  
Old 11-05-2009
OK.
P.S. I'm a total beginner with Sed, but I found a one-line command that works with your sample:
Code:
sed '/\(<text>.*<\/text>\)/!d;s/<text>\(.*\)<\/text>/\1/;s/\(<.*>\)//' inputfile > outputfile

Or, if the <text>... ... </text> may span multiple lines, proceed like this:
Code:
cat inputfile | tr "\n" " " | sed '/\(<text>.*<\/text>\)/!d;s/<text>\(.*\)<\/text>/\1/;s/\(<.*>\)//' > outputfile

These commands look weird, but Sed is really powerful for processing data streams.
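For reference, a minimal sketch of a variant that keeps only what sits between <text> and </text>, whatever surrounds it on the line. It assumes the text itself contains no nested tags and that `grep -o` is available (GNU and BSD grep have it, but it is not POSIX). The function name `extract_text` and the sample record are made up for illustration.

```shell
# Print the content of every <text>...</text> element, one per line.
# Assumption: the element content contains no '<' (no nested tags).
extract_text() {
  tr '\n' ' ' |                        # join lines so an element may span several
    grep -o '<text>[^<]*</text>' |     # keep only the <text> elements
    sed 's/<[^>]*>//g'                 # strip the surrounding tags
}

# Hypothetical sample record, not taken from the real data.
printf '<rec><id>7</id><text>hello\nworld</text></rec>\n' | extract_text
```

On real files it would be called as `extract_text < inputfile > outputfile`. Because the last step of the original one-liners deletes everything from the first `<` to the last `>` on a line, this variant may be the safer choice when other elements come before <text>.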
The output file doesn't contain any reference to a record and/or field. I think we should also extract something from the XML to serve as an index for further processing. Am I right?
OK, I don't know where you are, but here it's 1:00 AM; see you tomorrow.
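On the index question, a minimal sketch of keeping a key next to the extracted text. It assumes each record sits on one line and carries an <id> element before its <text> element. The <id> element, the `|` separator, and the function name `extract_id_and_text` are hypothetical, since the real XML layout is not shown here.

```shell
# Emit "id|text" for every line that holds both elements, in that order.
# Assumptions: one record per line, <id> comes before <text>,
# and neither element contains nested tags.
extract_id_and_text() {
  sed -n 's/.*<id>\([^<]*\)<\/id>.*<text>\([^<]*\)<\/text>.*/\1|\2/p'
}

# Hypothetical sample: the second record has no <text> and is skipped.
printf '<rec><id>7</id><text>hello</text></rec>\n<rec><id>8</id><note>x</note></rec>\n' |
  extract_id_and_text
```

Lines that lack either element produce no output, so the result can be joined back to the source table on the key without blank rows.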
# 27  
Old 11-06-2009
I'm in Canada, so my time is GMT -7:00. We decided to treat this as a separate project and assigned the task to our Perl developers, who are working on it right now.
At least for the time being, I think we can relax. I really appreciate your effort.
Thank you very much.