awk and/or sed command to sum the values in repeating tags in an XML file

Tags
aix, awk, shellscript, unix

    #1  
12-26-2012
bk_12345

I have an XML file in which the <Amt Ccy="EUR">3.1</Amt> tag repeats. It sits under another tag, <Main>. I need to sum all the values of <Amt Ccy=""> (Ccy may vary) that appear under <Main>, using awk and/or sed.

Can someone help?

The sample looks like this:


Code:
<root>
    <Main>
        <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
    .
    .
    .
    some other tags
    <Main>
        <someothertag>..</someothertag>
        <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
        <Amt Ccy="EUR">10</Amt>
    </another>
</root>


Last edited by bk_12345; 12-27-2012 at 03:05 AM..
    #2  
12-26-2012
joeyg, Forum Staff (moderator)
 
Your example data has different tags than what you specified in your instructions.

Code:
<amount>3.1</amount>

vs.

Code:
<Amt Ccy="EUR">3.1</Amt>

    #3  
12-27-2012
bk_12345
I have edited the post.
    #4  
12-27-2012
RudiC, Forum Advisor
This will do the task for the sample file you posted:
Code:
awk '/Amt Ccy/ {sum+=$3} END {print sum}' FS="[<>]" file

It relies on the amount line NOT spanning several lines; there are solutions for that case in these fora.
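
A minimal sketch of one way to drop the single-line restriction, assuming a POSIX awk: setting RS to "<" makes every record start at a tag, so line breaks inside or around the element no longer matter. The function name sum_amt_any_layout is hypothetical, and like the command above it counts every Amt element, not only those under <Main>.

```shell
# Sketch only: sum the text of every <Amt ...> element, however the XML
# happens to be wrapped across lines.
sum_amt_any_layout() {
    awk 'BEGIN { RS = "<"; FS = ">" }
         # Each record starts right after a "<":
         # $1 is the tag name plus attributes, $2 is the text that follows.
         $1 == "Amt" || $1 ~ /^Amt[ \t\n]/ { sum += $2 }
         END { print sum + 0 }' "$1"
}

# Usage (file name is a placeholder):
#   sum_amt_any_layout file
```

This still treats the input as text: comments, CDATA sections, or a ">" inside an attribute value would confuse it.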

And be advised that summing different currencies together will almost certainly get you into trouble with the finance people. awk can accumulate the amounts into a separate array element per currency; this is left as an exercise.
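
The per-currency idea could look roughly like this. It is a sketch, not a definitive solution: it assumes that each <Main>, </Main> and <Amt ...> tag sits on its own line as in the sample, and the function name sum_main_by_ccy is made up for illustration. Totals are kept in an array indexed by the Ccy value, and only amounts between <Main> and </Main> are counted.

```shell
# Sketch only: per-currency totals of <Amt> values found inside <Main> blocks.
sum_main_by_ccy() {
    # Splitting on <, > and " turns   <Amt Ccy="EUR">3.1</Amt>   into
    # $2 = Amt Ccy=   $3 = EUR   $5 = 3.1
    awk -F'[<>"]' '
        /<Main>/   { inside = 1 }          # entering a Main block
        /<\/Main>/ { inside = 0 }          # leaving it again
        inside && /<Amt Ccy=/ { total[$3] += $5 }
        END { for (ccy in total) print ccy, total[ccy] }
    ' "$1" | sort
}

# Usage (file name is a placeholder):
#   sum_main_by_ccy file
```

For the sample file in post #1 this prints "EUR 3.1" and "SGD 51"; the amount under <another> is skipped.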
    #5  
12-27-2012
sathyaonnuix
@RudiC: The user wants to sum only the values that appear between the <Main> tags.
Your command also includes the Amt inside the <another> tag. Could you please revise it?

---------- Post updated at 03:36 AM ---------- Previous update was at 03:23 AM ----------



Code:
# sed -n '/<Main>/,/<\/Main>/p' file | grep Amt | sed -e 's/<[^>]*>//g' -e 's/ //g' | awk '{ sum += $1 } END { print sum }'
54.1

    #6  
12-27-2012
chidori
I got an idea after going through RudiC's and sathyaonnuix's posts, and came up with this one-liner:


Code:
[chidori@test tmp]$ awk '/<Main>/,/<\/Main>/{if($0~/Amt/){gsub(/[^0-9.]/,"");sum+=$0}}END{ print sum }' infile
54.1


Last edited by chidori; 12-27-2012 at 05:33 AM..
    #7  
12-27-2012
fpmurphy, Forum Staff
 
Unfortunately, neither awk nor sed is an optimal tool for working with XML documents. You really need a stylesheet transformation language such as XSLT.

Suppose our input document is:

Code:
<root>
    <main>
        <amount currency="EUR">3.1</amount>
    </main>
    <main>
        <amount currency="SGD">51</amount>
    </main>
    <main>
        <amount currency="USD">72.50</amount>
    </main>
    <main>
        <amount currency="SGD">101</amount>
    </main>
    <main>
        <amount currency="EUR">10</amount>
    </main>
</root>

The following stylesheet

Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output indent="yes"/>
    <xsl:variable name="currencies" select="//amount"/>

    <xsl:template match="/">
        <currency-sums>
            <xsl:for-each select="$currencies">
                <xsl:if test="generate-id(.)=
                              generate-id( $currencies[ @currency =
                              current()/@currency ] )">
                    <sum currency="{@currency}">
                         <xsl:value-of select="sum( $currencies[@currency=
                                               current()/@currency] )"/>
                    </sum>
                </xsl:if>
             </xsl:for-each>
         </currency-sums>
     </xsl:template>

</xsl:stylesheet>

will produce

Code:
<?xml version="1.0"?>
<currency-sums>
  <sum currency="EUR">13.1</sum>
  <sum currency="SGD">152</sum>
  <sum currency="USD">72.5</sum>
</currency-sums>

The stylesheet can easily be modified to produce just two columns instead of an XML document. The modified stylesheet

Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output method="text"/>
    <xsl:variable name="currencies" select="//amount"/>

    <xsl:template match="/">
        <xsl:for-each select="$currencies">
            <xsl:if test="generate-id(.)=
                          generate-id( $currencies[ @currency =
                          current()/@currency ] )">
                <xsl:value-of select="@currency"/><xsl:text>  </xsl:text>
                <xsl:value-of select="sum( $currencies[@currency=
                                      current()/@currency] )"/>
                <xsl:text>&#10;</xsl:text>
            </xsl:if>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

will produce:

Code:
EUR  13.1
SGD  152
USD  72.5
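
The post does not say how the stylesheets are applied. One possibility, sketched here under the assumption that xsltproc (the command-line tool shipped with libxslt) is installed, is a small wrapper like the following. The function name apply_stylesheet and the file names in the usage comment are placeholders, not something from the thread; any other XSLT 1.0 processor should work as well.

```shell
# Hypothetical helper: apply an XSLT 1.0 stylesheet to an XML document.
# Assumes xsltproc is on PATH; fails with status 127 if it is not.
apply_stylesheet() {
    stylesheet=$1
    input=$2
    if ! command -v xsltproc >/dev/null 2>&1; then
        echo "xsltproc not found" >&2
        return 127
    fi
    # xsltproc takes the stylesheet first, then the document to transform.
    xsltproc "$stylesheet" "$input"
}

# Usage (file names are placeholders):
#   apply_stylesheet currency-sums.xsl amounts.xml
```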

All times are GMT -4.