UNIX shell script required to read two columns from xml


 
# 1  
Old 03-08-2019

Hello All,

I am new to UNIX scripting. I need to read FROMINSTANCE and FROMFIELD from the XML below and write a script that generates a statement like this:
Code:
insert into SQ_EPIC values( "DID", "PROJECT_NAME")
Select DID, PROJECT_NAME from EPIC

Code:
<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="PROJECT_NAME" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="PROJECT_NAME"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="PROJECT_NAME"FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="PROJECT_NAME"/>

Could someone kindly help me with how to proceed on this requirement?

Thanks in Advance!!!

Regards,
Sekhar Lekkala.

Last edited by RudiC; 03-08-2019 at 05:23 PM..
# 2  
Old 03-08-2019
Handling XML is not trivial, but we get asked for it a lot, and I have a script for the purpose:

Code:
# yanx.awk v0.0.9, Tyler Montbriand, 2019.  Yet another noncompliant XML parser
###############################################################################
# XML is a pain to process in the shell, but people need it all the time.
# I've been using and improving this kludge since 2014 or so.  It parses and
# stacks tags and digests parameters, allowing simple XML processing and
# extraction to be managed with a handful of lines addendum.
#
# I've restricted my use of GNU features enough that this script will run on
# busybox's awk.  I think it works with mawk except -e is unsupported.
# You can work around that by running multiple files, i.e.
# mawk -f yanx.awk -f mystuff.awk inputfile
###############################################################################
# Basic use:
#
# Fed this XML, <body><html a="b">Your Web Browser Hates This</html></body>
# yanx will read it token-by-token as so:
#     Line 1:  Empty, skipped
#     Line 2:  $1="body"
#     Line 3:  $1="html a="b"", $2="Your Web Browser Hates This"
#     Line 4:  $1="/html"
#     Line 5:  $1="/body", $2="\n"
#
# The script sets a few new "special" variables along the way.
# TAG           The name of the current tag, uppercased.
# CTAG          If close-tag, name in uppercase.
# TAGS          List of nested tags, like HTML%BODY%, including current tag
# LTAGS         List of nested tags, not including current tag
# ARGS          Array of tag parameters, uppercased.  i.e. ARGS["HREF"]
# DEP           How many tags deep it's nested, including current tag.
#
###############################################################################
# Examples:
# # Rewrite cdata of all divs
# awk -f yanx.awk -e 'TAGS ~ /^DIV%/ { $2="quux froob" } 1' input
# # Extract href's from every link
# awk -f yanx.awk -e 'TAGS~/^A%/ && ("HREF" in ARGS) {
#       print ARGS["HREF"] }' ORS="\n" input
###############################################################################
# Known Bugs:
# A short XML script can't possibly handle DTDs, etc.  Entities a la &lt;
# are not translated either.
#
# I've done my best to make it swallow <!--, <? ?> and other such fancy
# XML syntax without choking, but that doesn't mean it handles them
# properly either.
#
# It's an XML parser, not an HTML parser.  It probably won't swallow a
# wild-from-the internet HTML web page without some cleanup first:
# javascript, tags inside comments, etc will be mangled instead of ignored.
#
# Last: Because of its design, when printing raw HTML, yanx adds an extra <
# to the end of the file.  This is because < belongs at the beginning of
# a token but awk is told it's printed at the end.  There is no equivalent
# "line prefix" variable that I know of, if you want it to print smarter
# you'll have to print the <'s yourself, by setting ORS=" and
# printing lines like print "<" $0
###############################################################################
BEGIN {
        FS=">"; OFS=">";
        RS="<"; ORS="<"
}

# After match("qwertyuiop", /rty/)
#       rbefore("qwertyuiop") is "qwe",
#       rmid("qwertyuiop")    is "r"
#       rall("qwertyuiop")    is "rty"
#       rafter("qwertyuiop")  is "uiop"

# awk strings are indexed from 1.  A start of 0 is handled differently across
# awk implementations (some return one character less), so start at 1.
function rbefore(STR)   { return(substr(STR, 1, RSTART-1)); }# before match
function rmid(STR)      { return(substr(STR, RSTART, 1)); }  # First char match
function rall(STR)      { return(substr(STR, RSTART, RLENGTH)); }# Entire match
function rafter(STR)    { return(substr(STR, RSTART+RLENGTH)); }# after match

function aquote(OUT, A, PFIX, TA) { # Turns Q SUBSEP R into A[PFIX":"Q]=R
        if(OUT)
        {
                if(PFIX) PFIX=PFIX":"
                split(OUT, TA, SUBSEP);
                A[toupper(PFIX) toupper(TA[1])]=TA[2];
        }

        return("");
}

# Intended to be less stupid about quoted text in XML/HTML.
# Splits a='b' c='d' e='f' into A[PFIX":"a]=b, A[PFIX":"c]=d, etc.
function qsplit(STR, A, PFIX, X, OUT) {
        sub(/\/$/, "", STR);    # Self-closing tags, mumblegrumble
        while(STR && match(STR, /([ \n\t]+)|[\x27\x22=]/))
        {
                OUT = OUT rbefore(STR);
                RMID=rmid(STR);

                if((RMID == "'") || (RMID == "\""))     # Quote characters
                {
                        if(!Q)          Q=RMID;         # Begin quote section
                        else if(Q == RMID)      Q="";   # End quote section
                        else                    OUT = OUT RMID; # Quoted quote
                } else if(RMID == "=") {
                        if(Q)   OUT=OUT RMID; else OUT=OUT SUBSEP;
                } else if((RMID=="\r")||(RMID=="\n")||(RMID=="\t")||(RMID==" ")) {
                        if(Q)   OUT = OUT rall(STR); # Literal quoted whitespace
                        else    OUT = aquote(OUT, A, PFIX); # Unquoted WS, next block
                }
                STR=rafter(STR); # Strip off the text we've processed already.
        }

        aquote(OUT STR, A, PFIX); # Process any text we haven't already.
}


{ SPEC=0 ; TAG="" }

NR==1 {
        if(ORS == RS) print;
        next } # The first "line" is blank when RS=<

/^[!?]/ { SPEC=1    }   # XML specification junk

# Handle open-tags
(!SPEC) && match($1, /^[^\/ \r\n\t>]+/) {
        CTAG=""
        TAG=substr(toupper($1), RSTART, RLENGTH);
        if((!SPEC) && !($1 ~ /\/$/))
        {
                TAGS=TAG "%" TAGS;
                DEP++;
                LTAGS=TAGS
        }

        for(X in ARGS) delete ARGS[X];

        qsplit(rafter($1), ARGS, "", "", "");
}

# Handle close-tags
(!SPEC) && /^[\/]/ {
        sub(/^\//, "", $1);
        LTAGS=TAGS
        CTAG=toupper($1)
        TAG=""
#        sub("^.*" toupper($1) "%", "", TAGS);
        sub("^" toupper($1) "%", "", TAGS);
        $1="/"$1
        DEP=split(TAGS, TA, "%")-1;
        # Update TAG with tag on top of stack, if any
#       if(DEP < 0) {   DEP=0;  TAG=""  }
#       else { TAG=TA[DEP]; }
}

Case in point: Congratulations, this specific XML found a bug in yanx, hence v 0.0.9.

Code:
$ awk -f yanx-0.0.9.awk -e 'TAG=="CONNECTOR" { print ARGS["FROMINSTANCE"], ARGS["FROMFIELD"] }' ORS="\n" OFS="\t" input.xml
EPIC    DID
EPIC    PROJECT_NAME
SQ_EPIC DID
SQ_EPIC PROJECT_NAME

$
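
The token-by-token reading described in the script header can be seen in isolation with a few lines of plain awk. This is only a minimal sketch of the RS="<" / FS=">" idea, not yanx itself; the function name tokenize is made up for the illustration.

```shell
# Sketch: show the records awk produces with RS="<" and FS=">".
# "tokenize" is a hypothetical helper name, not part of yanx.awk.
tokenize() {
    awk 'BEGIN { RS = "<"; FS = ">" }
         # Record 1 is the empty text before the first "<", so skip it.
         NR > 1 { printf "%d: tag=[%s] text=[%s]\n", NR, $1, $2 }'
}

printf '%s' '<body><html a="b">Hello</html></body>' | tokenize
```

Each record starts right after a "<", so $1 holds the tag with its parameters and $2 holds whatever text follows the closing ">". For the sample above, record 3 comes out as tag=[html a="b"] text=[Hello].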

These 2 Users Gave Thanks to Corona688 For This Post:
# 3  
Old 03-08-2019
how to insert

Thank you so much for your quick response!
I need to pass the output below into an insert query dynamically.
Code:
$ awk -f yanx-0.0.9.awk -e 'TAG=="CONNECTOR" { print ARGS["FROMINSTANCE"], ARGS["FROMFIELD"] }' ORS="\n" OFS="\t" input.xml
EPIC    DID
EPIC    PROJECT_NAME
SQ_EPIC DID
SQ_EPIC PROJECT_NAME

Code:
insert into SQ_EPIC values( "DID", "PROJECT_NAME")
Select DID, PROJECT_NAME from EPIC

Please help me
Regards,
Sekhar


Moderator's Comments:
Mod Comment Seriously: Please use CODE tags as required by forum rules!

Last edited by RudiC; 03-08-2019 at 05:19 PM.. Reason: Added CODE tags.
# 4  
Old 03-08-2019
From your output into the SQL query: what goes where?
# 5  
Old 03-09-2019
Hello,

Here is my question:
In the XML code below I need to fetch TOINSTANCETYPE, FROMINSTANCE and FROMFIELD, and then create an insert statement.
For example, if TOINSTANCETYPE="Source Qualifier" then the corresponding FROMINSTANCE is a source; if TOINSTANCETYPE="Target Definition" then the corresponding FROMINSTANCE is a target.

My insert query looks like
Code:
insert into SQ_EPIC values( "DID", "PROJECT_NAME")
Select DID, PROJECT_NAME from EPIC

The table and column names need to be passed in dynamically.

Below is the XML code I am using:
Code:
<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="PROJECT_NAME" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="PROJECT_NAME"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="PROJECT_NAME"FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="PROJECT_NAME"/>

Thanks in advance!!!

Regards,
Sekhar Lekkala.




Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules! You have received a warning infraction. Stick to the rules.

Last edited by RudiC; 03-09-2019 at 05:24 AM.. Reason: Added CODE tags again (and again...).
# 6  
Old 03-09-2019
You didn't answer my question but just repeated (and extended) the original request. The TOINSTANCETYPE is an extension to your original request. How and where does it fit?





To slightly narrow down my question: from your four-line, eight-field XML extraction result, which goes where in a three-"variable" (table, column1, column2) insert statement? Is the order of the fields and results constant, or how are the fields targeted?
# 7  
Old 03-09-2019
Hi Rudi,

Greetings!!!
I am sorry for not responding to your question.

Yes, you are right: there are three "variables" (table, column1, column2).
Example: for the source, [EPIC (table name), DID (column1), Project_name (column2)];
for the target, [SQ_EPIC (table name), DID (column1), Project_name (column2)].
I need a script that generates the insert statement dynamically.

Thanks in Advance!!!
Regards,
Sekhar Lekkala.
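
One possible way to turn that mapping into a statement, as a rough sketch in plain awk rather than yanx. It assumes one CONNECTOR element per line with double-quoted values, a single source and a single target table per file, and that the intended statement is the usual INSERT INTO target (columns) SELECT columns FROM source form. The names make_insert and attr are made up for the example.

```shell
# Sketch: build one INSERT ... SELECT from CONNECTOR elements.
# "make_insert" and "attr" are hypothetical names; assumes one CONNECTOR
# element per line, double-quoted values, one source and one target table.
make_insert() {
    awk '
    # Return the value of attribute "name" in string s, or "" if absent.
    function attr(s, name,    re) {
        re = name "=\"[^\"]*\""
        if (!match(s, re)) return ""
        return substr(s, RSTART + length(name) + 2, RLENGTH - length(name) - 3)
    }
    /<CONNECTOR[ \t]/ {
        type = attr($0, "TOINSTANCETYPE")
        tab  = attr($0, "FROMINSTANCE")
        col  = attr($0, "FROMFIELD")
        if (type == "Source Qualifier") {          # FROMINSTANCE is the source
            src = tab
            scols = (scols == "" ? col : scols ", " col)
        } else if (type == "Target Definition") {  # FROMINSTANCE is the target
            tgt = tab
            tcols = (tcols == "" ? col : tcols ", " col)
        }
    }
    END {
        if (src != "" && tgt != "")
            printf "INSERT INTO %s (%s)\nSELECT %s FROM %s;\n", tgt, tcols, scols, src
    }' "$@"
}

make_insert <<'EOF'
<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="DID"/>
<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="PROJECT_NAME" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="PROJECT_NAME"/>
<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="DID"/>
<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="PROJECT_NAME" FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="PROJECT_NAME"/>
EOF
```

For the sample this prints INSERT INTO SQ_EPIC (DID, PROJECT_NAME) on one line and SELECT DID, PROJECT_NAME FROM EPIC; on the next. With a real file it could be called as make_insert input.xml. If the mapping holds several source or target tables, the columns would have to be grouped per table first, which this sketch does not attempt.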