UNIX shell script required to read two columns from xml


Login or Register to Reply

 
Thread Tools Search this Thread
# 1  
Old 2 Weeks Ago
UNIX shell script required to read two columns from xml

Hello All,

I am new to unix scripting. I need to read FROMINSTANCE, FROMFIELD from below XML code and need to write a script for
Code:
insert into SQ_EPIC values( "DID", "PROJECT_NAME")
Select DID, PROJECT_NAME from EPIC

Code:
<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="PROJECT_NAME" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="PROJECT_NAME"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="PROJECT_NAME"FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="PROJECT_NAME"/>

Kindly some help me how to proceed for this requirement.

Thanks in Advance!!!

Regards,
Sekhar Lekkala.

Last edited by RudiC; 2 Weeks Ago at 04:23 PM..
# 2  
Old 2 Weeks Ago
Handling XML is not trivial, but we get asked for it a lot, and I have a script for the purpose:

Code:
# yanx.awk v0.0.9, Tyler Montbriand, 2019.  Yet another noncompliant XML parser
###############################################################################
# XML is a pain to process in the shell, but people need it all the time.
# I've been using and improving this kludge since 2014 or so.  It parses and
# stacks tags and digests parameters, allowing simple XML processing and
# extraction to be managed with a handful of lines addendum.
#
# I've restricted my use of GNU features enough that this script will run on
# busybox's awk.  I think it works with mawk except -e is unsupported.
# You can work around that by running multiple files, i.e.
# mawk -f yanx.awk -f mystuff.awk inputfile
###############################################################################
# Basic use:
#
# Fed this XML, <body><html a="b">Your Web Browser Hates This</html></body>
# yanx will read it token-by-token as so:
#     Line 1:  Empty, skipped
#     Line 2:  $1="body"
#     Line 3:  $1="html a="b"", $2="Your web browser hates this"
#     Line 4:  $1="/html"
#     Line 5:  $1="/body", $2="\n"
#
# The script sets a few new "special" variables along the way.
# TAG           The name of the current tag, uppercased.
# CTAG          If close-tag, name in uppercase.
# TAGS          List of nested tags, like HTML%BODY%, including current tag
# LTAGS         List of nested tags, not including current tag
# ARGS          Array of tag parameters, uppercased.  i.e. ARGS["HREF"]
# DEP           How many tags deep it's nested, including current tag.
#
###############################################################################
# Examples:
# # Rewrite cdata of all divs
# awk -f yanx.awk -e 'TAGS ~ /^DIV%/ { $2="quux froob" } 1' input
# # Extract href's from every link
# awk -f yanx.awk -e 'TAGS~/^A%/ && ("HREF" in ARGS) {
#       print ARGS["HREF"] }' ORS="\n" input
###############################################################################
# Known Bugs:
# A short XML script can't possibly handle DOD, etc.  Entities a la &lt;
# are not translated either.
#
# I've done my best to make it swallow <!--, <? ?> and other such fancy
# XML syntax without choking, but that doesn't mean it handles them
# properly either.
#
# It's an XML parser, not an HTML parser.  It probably won't swallow a
# wild-from-the internet HTML web page without some cleanup first:
# javascript, tags inside comments, etc will be mangled instead of ignored.
#
# Last: Because of its design, when printing raw HTML, yanx adds an extra <
# to the end of the file.  This is because < belongs at the beginning of
# a token but awk is told it's printed at the end.  There is no equivalent
# "line prefix" variable that I know of, if you want it to print smarter
# you'll have to print the <'s yourself, by setting ORS=" and
# printing lines like print "<" $0
###############################################################################
BEGIN {
        FS=">"; OFS=">";
        RS="<"; ORS="<"
}

# After match("qwertyuiop", /rty/)
#       rbefore("qwertyuiop") is "qwe",
#       rmid("qwertyuipo")    is "r"
#       rall("qwertyuiop")    is "rty"
#       rafter("qwertyuiop")  is "uiop"

# !?!?!
# function rbefore(STR)   { return(substr(STR, N, RSTART-1)); }# before match
function rbefore(STR)   { return(substr(STR, 0, RSTART-1)); }# before match
function rmid(STR)      { return(substr(STR, RSTART, 1)); }  # First char match
function rall(STR)      { return(substr(STR, RSTART, RLENGTH)); }# Entire match
function rafter(STR)    { return(substr(STR, RSTART+RLENGTH)); }# after match

function aquote(OUT, A, PFIX, TA) { # Turns Q SUBSEP R into A[PFIX":"Q]=R
        if(OUT)
        {
                if(PFIX) PFIX=PFIX":"
                split(OUT, TA, SUBSEP);
                A[toupper(PFIX) toupper(TA[1])]=TA[2];
        }

        return("");
}

# Intended to be less stupid about quoted text in XML/HTML.
# Splits a='b' c='d' e='f' into A[PFIX":"a]=b, A[PFIX":"c]=d, etc.
function qsplit(STR, A, PFIX, X, OUT) {
        sub(/\/$/, "", STR);    # Self-closing tags, mumblegrumble
        while(STR && match(STR, /([ \n\t]+)|[\x27\x22=]/))
        {
                OUT = OUT rbefore(STR);
                RMID=rmid(STR);

                if((RMID == "'") || (RMID == "\""))     # Quote characters
                {
                        if(!Q)          Q=RMID;         # Begin quote section
                        else if(Q == RMID)      Q="";   # End quote section
                        else                    OUT = OUT RMID; # Quoted quote
                } else if(RMID == "=") {
                        if(Q)   OUT=OUT RMID; else OUT=OUT SUBSEP;
                } else if((RMID=="\r")||(RMID=="\n")||(RMID=="\t")||(RMID==" ")) {
                        if(Q)   OUT = OUT rall(STR); # Literal quoted whitespace
                        else    OUT = aquote(OUT, A, PFIX); # Unquoted WS, next block
                }
                STR=rafter(STR); # Strip off the text we've processed already.
        }

        aquote(OUT STR, A, PFIX); # Process any text we haven't already.
}


{ SPEC=0 ; TAG="" }

NR==1 {
        if(ORS == RS) print;
        next } # The first "line" is blank when RS=<

/^[!?]/ { SPEC=1    }   # XML specification junk

# Handle open-tags
(!SPEC) && match($1, /^[^\/ \r\n\t>]+/) {
        CTAG=""
        TAG=substr(toupper($1), RSTART, RLENGTH);
        if((!SPEC) && !($1 ~ /\/$/))
        {
                TAGS=TAG "%" TAGS;
                DEP++;
                LTAGS=TAGS
        }

        for(X in ARGS) delete ARGS[X];

        qsplit(rafter($1), ARGS, "", "", "");
}

# Handle close-tags
(!SPEC) && /^[\/]/ {
        sub(/^\//, "", $1);
        LTAGS=TAGS
        CTAG=toupper($1)
        TAG=""
#        sub("^.*" toupper($1) "%", "", TAGS);
        sub("^" toupper($1) "%", "", TAGS);
        $1="/"$1
        DEP=split(TAGS, TA, "%")-1;
        # Update TAG with tag on top of stack, if any
#       if(DEP < 0) {   DEP=0;  TAG=""  }
#       else { TAG=TA[DEP]; }
}

Case in point: Congratulations, this specific XML found a bug in yanx, hence v 0.0.9.

Code:
$ awk -f yanx-0.0.9.awk -e 'TAG=="CONNECTOR" { print ARGS["FROMINSTANCE"], ARGS["FROMFIELD"] }' ORS="\n" OFS="\t" input.xml
EPIC    DID
EPIC    PROJECT_NAME
SQ_EPIC DID
SQ_EPIC PROJECT_NAME

$

These 2 Users Gave Thanks to Corona688 For This Post:
rbatte1 (2 Weeks Ago) sekhar.lsb (2 Weeks Ago)
# 3  
Old 2 Weeks Ago
how to insert

Thank you so much for your quick response !!!!
I need to pass below output into insert query dynamically.
Code:
$ awk -f yanx-0.0.9.awk -e 'TAG=="CONNECTOR" { print ARGS["FROMINSTANCE"], ARGS["FROMFIELD"] }' ORS="\n" OFS="\t" input.xml
EPIC    DID
EPIC    PROJECT_NAME
SQ_EPIC DID
SQ_EPIC PROJECT_NAME

Code:
insert into SQ_EPIC values( "DID", "PROJECT_NAME")
Select DID, PROJECT_NAME from EPIC

Please help me
Regards,
Sekhar


Moderator's Comments:
Mod Comment Seriously: Please use CODE tags as required by forum rules!

Last edited by RudiC; 2 Weeks Ago at 04:19 PM.. Reason: Added CODE tags.
# 4  
Old 2 Weeks Ago
From your output into the SQL query: what goes where?
# 5  
Old 2 Weeks Ago
Hello,

Here is my question
In my XML code below I need to fetch TOINSTANCETYPE, FROMINSTANCE, FROMFIELD then need create a Insert statement.
For example if TOINSTANCETYPE="Source Qualifier" then corresponding FROMINSTANCE is a source else if TOINSTANCETYPE="Target Definition" then corresponding FROMINSTANCE is a target.

My insert query looks like
Code:
insert into SQ_EPIC values( "DID", "PROJECT_NAME")
Select DID, PROJECT_NAME from EPIC

Here table / column names I need to pass it as dynamically.

Below is XML code I am using
Code:
<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Source Qualifier" TOINSTANCE="SQ_EPIC" TOFIELD="PROJECT_NAME" FROMINSTANCETYPE="Source Definition" FROMINSTANCE="EPIC" FROMFIELD="PROJECT_NAME"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="DID" FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="DID"/>

<CONNECTOR TOINSTANCETYPE="Target Definition" TOINSTANCE="STAGE_EPIC" TOFIELD="PROJECT_NAME"FROMINSTANCETYPE="Source Qualifier" FROMINSTANCE="SQ_EPIC" FROMFIELD="PROJECT_NAME"/>

Thanks in advance!!!

Regards,
Sekhar Lekkala.




Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules! You have received an warning infraction. Stick to the rules.

Last edited by RudiC; 2 Weeks Ago at 04:24 AM.. Reason: Added CODE tags again (and again...).
# 6  
Old 2 Weeks Ago
You didn't answer my question but just repeated (and extended) the original request. The TOINSTANCETYPE is an extension to your original request. How and where does it fit?





To slightly narrow down my question: from your four line, eight field xml extraction result, whích goes where in a three "variable" (table, column1, colum2) insert statement? Is the order of the fields and results constant, or how are the fields targeted?
# 7  
Old 2 Weeks Ago
Hi Rudi,

Greetings!!!
I am sorry for not responding to your question.

Yes, you are true the three "variable" (table, column1, colum2)
example: for source [EPIC (table name), DID (coulum1), Project_name (column2)]
for Target [SQ_EPIC (table name), DID (coulum1), Project_name (column2)]
Need a script to generate insert script dynamically.

Thanks in Advance!!!
Regards,
Sekhar Lekkala.
Login or Register to Reply

|
Thread Tools Search this Thread
Search this Thread:
Advanced Search

More UNIX and Linux Forum Topics You Might Find Helpful
Read xml tags and then remove the tag using shell script RJG Shell Programming and Scripting 3 01-18-2017 06:30 AM
How to pass the parameter in xml file in UNIX shell script? Debalina Roy Shell Programming and Scripting 3 05-05-2015 11:58 AM
How to pass the parameter in xml file in UNIX shell script? Debalina Roy Shell Programming and Scripting 2 05-05-2015 08:12 AM
UNIX Shell script to work with .xml file waiting4u Shell Programming and Scripting 3 05-18-2014 02:17 AM
How to read an xml file through shell script? ramsavi Red Hat 4 05-08-2013 06:16 AM
Shell Script to read XML file SmilePlease Shell Programming and Scripting 5 04-03-2013 07:55 AM
Shell Script to read XML tags and the data within that tag SmilePlease UNIX for Advanced & Expert Users 2 04-03-2013 07:16 AM
Howto compare the columns of 2 diff tables of 2 different schemas in UNIX shell script Rajkumar Gopal Shell Programming and Scripting 2 03-27-2013 03:01 AM
Unix Script to read the XML file from Website phani333 Shell Programming and Scripting 8 02-01-2011 08:23 PM
How to send email using shell script in UNIX, Is any environment setup required in Mac OS X ? Afreen Shell Programming and Scripting 2 05-05-2010 02:13 PM
Help required on basic Unix Bourne Shell Script methopoth Shell Programming and Scripting 1 03-25-2010 05:44 AM
edit columns in unix shell script kingpeejay Shell Programming and Scripting 10 06-22-2009 08:01 PM
How to remove xml namespace from xml file using shell script? Gary1978 Shell Programming and Scripting 10 10-26-2008 09:32 PM
Shell script failing to read large Xml record-urgent critical help aixjadoo UNIX for Advanced & Expert Users 10 06-16-2008 06:22 AM
shell script required to convert rows to columns suresh3566 Shell Programming and Scripting 2 05-07-2008 05:25 AM