awk and or sed command to sum the value in repeating tags in a XML


 
# 1  
Old 12-26-2012

I have an XML file in which the tag <Amt Ccy="EUR">3.1</Amt> repeats. It sits under another tag, <Main>. I need to sum all the values of <Amt Ccy=""> (Ccy may vary) that appear under <Main>, using awk and/or sed.

Can someone help?

The sample looks like this:

Code:
<root>
    <Main>
            <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
                .
                .
                .
                some other tags
    <Main>
          <someothertag>..</someothertag>
             <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
      <Amt Ccy="EUR">10</Amt>
     </another>
</root>


Last edited by bk_12345; 12-27-2012 at 04:05 AM..
# 2  
Old 12-26-2012
Your example data has different tags than what you specified in your instructions.
Code:
<amount>3.1</amount>

vs.
Code:
<Amt Ccy="EUR">3.1</Amt>

# 3  
Old 12-27-2012
Edited the post
# 4  
Old 12-27-2012
This will do the task for the sample file you posted:
Code:
awk '/Amt Ccy/ {sum+=$3} END {print sum}' FS="[<>]" file

It relies on each <Amt> element sitting on a single line; if an element spans several lines, other threads in these forums have solutions for that case.

Also, be advised that summing different currencies together will almost certainly get you into trouble with the finance people. awk can sum into separate per-currency arrays; that is left as an exercise.
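
For anyone who wants a starting point on the per-currency idea, here is a minimal sketch. It assumes the sample from post #1 is saved as sample.xml (a file name picked for illustration, with the filler lines left out) and that every <Amt> element sits on one line. Like the one-liner above, it does not yet restrict itself to <Main>, so the amount under <another> is counted too.

```shell
# Recreate the sample from post #1 (sample.xml is an assumed file name).
cat > sample.xml <<'EOF'
<root>
    <Main>
        <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
    <Main>
        <someothertag>..</someothertag>
        <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
        <Amt Ccy="EUR">10</Amt>
    </another>
</root>
EOF

# Splitting on < and > makes $2 the tag text (Amt Ccy="EUR") and $3 the amount.
# Splitting $2 on double quotes then leaves the currency code in a[2].
per_ccy=$(awk -F'[<>]' '
    /<Amt Ccy=/ { split($2, a, "\""); sum[a[2]] += $3 }
    END         { for (c in sum) print c, sum[c] }
' sample.xml | sort)

printf '%s\n' "$per_ccy"
```

The order of "for (c in sum)" is unspecified in awk, hence the sort. On the sample this prints EUR 13.1 and SGD 51.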
# 5  
Old 12-27-2012
@RudiC: The user wants to sum only the values that come between <Main> tags.
Yours will also include the Amt inside the <another> tag. Can you please re-edit it?

---------- Post updated at 03:36 AM ---------- Previous update was at 03:23 AM ----------

Code:
<root>
    <Main>
            <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
                .
                .
                .
                some other tags
    <Main>
          <someothertag>..</someothertag>
             <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
      <Amt Ccy="EUR">10</Amt>
     </another>
</root>

Code:
# sed -n "/\<Main\>/,/\/Main>/p" file | grep Amt |  sed -e "s/<[^>]*>//g" -e "s/ //g" | awk '{ sum+=$1; } END { print sum }'
54.1

# 6  
Old 12-27-2012
I got some ideas after going through RudiC's and Satyaonunix's posts, and came up with this one-liner:

Code:
[chidori@test tmp]$ awk '/\<Main\>/,/\/Main>/{if($0~/Amt/){gsub(/[^0-9.]/,"");sum+=$0}}END{ print sum }' infile
54.1


Last edited by chidori; 12-27-2012 at 06:33 AM..
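
A variation on the same idea, offered as a sketch rather than a drop-in replacement: the \< and \> in the range pattern are, as far as I know, GNU word-boundary extensions, so other awk implementations may treat them differently. Tracking the <Main> block with a plain flag avoids them. The flag name in_main and the file name sample.xml are made up for this example, and it assumes the opening tag, the closing tag and each <Amt> element are each on their own line, as in the posted sample.

```shell
# Recreate the sample from post #1 (sample.xml is an assumed file name).
cat > sample.xml <<'EOF'
<root>
    <Main>
        <someothertag>..</someothertag>
        <Amt Ccy="EUR">3.1</Amt>
    </Main>
    <Main>
        <someothertag>..</someothertag>
        <Amt Ccy="SGD">51</Amt>
    </Main>
    <another>
        <Amt Ccy="EUR">10</Amt>
    </another>
</root>
EOF

total=$(awk -F'[<>]' '
    /<Main>/   { in_main = 1 }          # opening tag seen: start counting
    /<\/Main>/ { in_main = 0 }          # closing tag seen: stop counting
    in_main && /<Amt / { sum += $3 }    # $3 is the text between > and <
    END { print sum }
' sample.xml)

printf '%s\n' "$total"
```

On the sample this prints 54.1, and the 10 under <another> is left out because the flag is off there.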
# 7  
Old 12-27-2012
Unfortunately, neither awk nor sed is an optimal tool for working with XML documents. You really need a stylesheet transformation language such as XSLT.

Suppose our input document is:
Code:
<root>
    <main>
        <amount currency="EUR">3.1</amount>
    </main>
    <main>
        <amount currency="SGD">51</amount>
    </main>
    <main>
        <amount currency="USD">72.50</amount>
    </main>
    <main>
        <amount currency="SGD">101</amount>
    </main>
    <main>
        <amount currency="EUR">10</amount>
    </main>
</root>

The following stylesheet
Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output indent="yes"/>
    <xsl:variable name="currencies" select="//amount"/>

    <xsl:template match="/">
        <currency-sums>
            <xsl:for-each select="$currencies">
                <xsl:if test="generate-id(.)=
                              generate-id( $currencies[ @currency =
                              current()/@currency ] )">
                    <sum currency="{@currency}">
                         <xsl:value-of select="sum( $currencies[@currency=
                                               current()/@currency] )"/>
                    </sum>
                </xsl:if>
             </xsl:for-each>
         </currency-sums>
     </xsl:template>

</xsl:stylesheet>

will produce
Code:
<?xml version="1.0"?>
<currency-sums>
  <sum currency="EUR">13.1</sum>
  <sum currency="SGD">152</sum>
  <sum currency="USD">72.5</sum>
</currency-sums>

The stylesheet can easily be modified to produce two columns of plain text instead of an XML document. The modified version
Code:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output method="text" indent="yes"/>
    <xsl:variable name="currencies" select="//amount"/>

    <xsl:template match="/">
        <currency-sums>
            <xsl:for-each select="$currencies">
                <xsl:if test="generate-id(.)=
                              generate-id( $currencies[ @currency =
                              current()/@currency ] )">
                    <xsl:value-of select="@currency"/><xsl:text>  </xsl:text>
                    <sum currency="{@currency}">
                        <xsl:value-of select="sum( $currencies[@currency=
                                               current()/@currency] )"/>
                    </sum>
                         <xsl:text>
</xsl:text>
                </xsl:if>
             </xsl:for-each>
         </currency-sums>
     </xsl:template>

</xsl:stylesheet>

will produce:
Code:
EUR  13.1
SGD  152
USD  72.5
