awk and or sed command to sum the value in repeating tags in a XML | Unix Linux Forums | Shell Programming and Scripting

Tags
aix, awk, shellscript, unix

    #1  
Old 12-26-2012
bk_12345 bk_12345 is offline

I have an XML document in which the <Amt Ccy="EUR">3.1</Amt> tag repeats. It appears under another tag, <Main>. I need to sum all the values of <Amt Ccy=""> (Ccy may vary) that occur under <Main>, using awk and/or sed.

Can someone help?

A sample looks like this:


Code:
<root>
    <Main>
            <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
                .
                .
                .
                some other tags
    <Main>
          <someothertag>..</someothertag>
             <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
      <Amt Ccy="EUR">10</Amt>
     </another>
</root>


Last edited by bk_12345; 12-27-2012 at 03:05 AM..
    #2  
Old 12-26-2012
joeyg joeyg is offline Forum Staff  
Your example data has different tags than what you specified in your instructions.

Code:
<amount>3.1</amount>

vs.

Code:
<Amt Ccy="EUR">3.1</Amt>

    #3  
Old 12-27-2012
bk_12345 bk_12345 is offline
Edited the post
    #4  
Old 12-27-2012
RudiC RudiC is offline Forum Advisor  
This will do the task for the sample file you posted:
Code:
awk '/Amt Ccy/ {sum+=$3} END {print sum}' FS="[<>]" file

It relies on each Amt element sitting on a single line; there are solutions in these forums for the case where it spans several lines.

Also, be advised that summing different currencies together will almost certainly get you into trouble with the finance people. awk's associative arrays let you keep a separate total per currency; this is left as an exercise.
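As a rough sketch of the per-currency idea, the totals can be kept in an awk array indexed by the Ccy value. The function name `sum_by_currency` and the array name `total` are made up for illustration, and the sketch assumes each Amt element is on one line with Ccy as its only attribute, as in the sample.

```shell
# Sketch: per-currency totals of every <Amt Ccy="..."> element in a file.
# Assumes each Amt element sits on one line, as in the sample.
sum_by_currency() {
    awk -F'[<>]' '
        /<Amt Ccy=/ {
            # With FS="[<>]", $2 is the tag (Amt Ccy="EUR") and $3 the value.
            split($2, attr, "\"")      # attr[2] holds the currency code
            total[attr[2]] += $3
        }
        END { for (ccy in total) print ccy, total[ccy] }
    ' "$1" | sort
}
```

Like the one-liner above, this counts every Amt in the file, including the one under <another>, so on the posted sample `sum_by_currency file` prints `EUR 13.1` and `SGD 51`.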
    #5  
Old 12-27-2012
sathyaonnuix sathyaonnuix is offline
@RudiC: The user wants to sum only the values that appear between <Main> tags.
Your command also includes the Amt inside the <another> tag; could you please revise it?

---------- Post updated at 03:36 AM ---------- Previous update was at 03:23 AM ----------


Code:
<root>
    <Main>
            <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
                .
                .
                .
                some other tags
    <Main>
          <someothertag>..</someothertag>
             <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
      <Amt Ccy="EUR">10</Amt>
     </another>
</root>


Code:
# sed -n '/<Main>/,/<\/Main>/p' file | grep Amt | sed -e 's/<[^>]*>//g' -e 's/ //g' | awk '{ sum += $1 } END { print sum }'
54.1

    #6  
Old 12-27-2012
chidori chidori is offline
After going through RudiC's and sathyaonnuix's posts, I came up with this one-liner:


Code:
[chidori@test tmp]$ awk '/<Main>/,/<\/Main>/{if($0~/<Amt /){gsub(/[^0-9.]/,"");sum+=$0}}END{ print sum }' infile
54.1


Last edited by chidori; 12-27-2012 at 05:33 AM..
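The same restriction to <Main> can also be sketched with an explicit flag instead of a range pattern, taking the value from a field rather than stripping every non-digit from the line (which would go wrong if an attribute value ever contained a digit or a dot). The function name `sum_main_amounts` and the variable `in_main` are hypothetical, and the sketch assumes one tag per line as in the sample.

```shell
# Sketch: sum only the Amt values that appear between <Main> and </Main>.
# Assumes the opening tag, the Amt element and the closing tag are on
# separate lines, as in the sample.
sum_main_amounts() {
    awk -F'[<>]' '
        /<Main>/    { in_main = 1 }       # entering a Main block
        /<\/Main>/  { in_main = 0 }       # leaving it
        in_main && /<Amt / { sum += $3 }  # $3 is the text between the tags
        END { print sum + 0 }             # prints 0 when nothing matched
    ' "$1"
}
```

On the posted sample, `sum_main_amounts file` prints 54.1, the same figure as the commands above.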
    #7  
Old 12-27-2012
fpmurphy fpmurphy is offline Forum Staff  
Unfortunately, neither awk nor sed is an optimal tool for working with XML documents. You really need to use a stylesheet transformation language such as XSLT.

Suppose our input document is:

Code:
<root>
    <main>
        <amount currency="EUR">3.1</amount>
    </main>
    <main>
        <amount currency="SGD">51</amount>
    </main>
    <main>
        <amount currency="USD">72.50</amount>
    </main>
    <main>
        <amount currency="SGD">101</amount>
    </main>
    <main>
        <amount currency="EUR">10</amount>
    </main>
</root>

The following stylesheet

Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output indent="yes"/>
    <xsl:variable name="currencies" select="//amount"/>

    <xsl:template match="/">
        <currency-sums>
            <xsl:for-each select="$currencies">
                <xsl:if test="generate-id(.)=
                              generate-id( $currencies[ @currency =
                              current()/@currency ] )">
                    <sum currency="{@currency}">
                         <xsl:value-of select="sum( $currencies[@currency=
                                               current()/@currency] )"/>
                    </sum>
                </xsl:if>
             </xsl:for-each>
         </currency-sums>
     </xsl:template>

</xsl:stylesheet>

will produce

Code:
<?xml version="1.0"?>
<currency-sums>
  <sum currency="EUR">13.1</sum>
  <sum currency="SGD">152</sum>
  <sum currency="USD">72.5</sum>
</currency-sums>

The stylesheet can easily be modified to produce two columns of plain text instead of an XML document:

Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output method="text"/>
    <xsl:variable name="currencies" select="//amount"/>

    <!-- With method="text" only text nodes are written, so no wrapper
         elements are needed around the values. -->
    <xsl:template match="/">
        <xsl:for-each select="$currencies">
            <xsl:if test="generate-id(.)=
                          generate-id( $currencies[ @currency =
                          current()/@currency ] )">
                <xsl:value-of select="@currency"/><xsl:text>  </xsl:text>
                <xsl:value-of select="sum( $currencies[@currency=
                                       current()/@currency] )"/>
                <xsl:text>
</xsl:text>
            </xsl:if>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

will produce:

Code:
EUR  13.1
SGD  152
USD  72.5
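
The post does not say how the stylesheets were run. One possible way, assuming the xsltproc tool from libxslt is installed, is sketched below. It is adapted to the original poster's element names (Main and Amt) rather than the main/amount names used above, and sums only the Amt children of Main. The function name `sum_main_xslt` is made up for illustration.

```shell
# Sketch: sum Amt under Main with XSLT 1.0, run through xsltproc.
# Assumption: xsltproc (libxslt) is installed; the thread names no processor.
sum_main_xslt() {
    xsl=$(mktemp) || return 1
    cat > "$xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- sum() adds the numeric value of every Amt child of a Main -->
        <xsl:value-of select="sum(//Main/Amt)"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
EOF
    xsltproc "$xsl" "$1"
    status=$?
    rm -f "$xsl"
    return $status
}
```

Called as `sum_main_xslt file`, this writes a single total to standard output. Because the XPath expression selects only Amt elements whose parent is Main, it does not depend on how the document is broken into lines.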
