How to trim the zeros after the decimal?


 
# 1  
Old 09-03-2014

Hello all,
I have an XML file from which I need to remove the trailing zeros after the decimal point: 123.00 should become 123, and 123.01200 should become 123.012. Below is a sample excerpt from the XML file. My input file could be approximately 5 GB or less.

CURRENT:
Code:
<ACCRUED_INTEREST>0.00</ACCRUED_INTEREST>  
  <BOOK_VALUE>0.00</BOOK_VALUE>  
  <COMMIT_CURRENT>29250.00</COMMIT_CURRENT>  
  <COMMIT_UNDISBURSED>29250.00</COMMIT_UNDISBURSED>  
  <COST_OF_FUNDS>0.0049000000</COST_OF_FUNDS>  
  <DATE_ORIGINATED>2013-05-15</DATE_ORIGINATED>  
  <GOVT_GUARANTOR>0</GOVT_GUARANTOR>  
  <INT_RATE>0.0518000000</INT_RATE>

EXPECTED:
Code:
<ACCRUED_INTEREST>0</ACCRUED_INTEREST>  
  <BOOK_VALUE>0</BOOK_VALUE>  
  <COMMIT_CURRENT>29250</COMMIT_CURRENT>  
  <COMMIT_UNDISBURSED>29250</COMMIT_UNDISBURSED>  
  <COST_OF_FUNDS>0.0049</COST_OF_FUNDS>  
  <DATE_ORIGINATED>2013-05-15</DATE_ORIGINATED>  
  <GOVT_GUARANTOR>0</GOVT_GUARANTOR>  
  <INT_RATE>0.0518</INT_RATE>

Thank you.
# 2  
Old 09-03-2014
Here's my mini-xml parser again:

Code:
$ cat minixml.awk

BEGIN {
        FS=">"; OFS=">";
        RS="<"; ORS="<"
}

{ SPEC=0 ; TAG="" }

NR==1 {
        if(ORS == RS) print;
        next
} # The first "line" is blank when RS=<

/^[!?]/ { SPEC=1    }   # XML specification junk

# Handle open-tags
match($0, /^[^\/ \r\n\t>]+/) {
        TAG=substr(toupper($0), RSTART, RLENGTH);
        if(!SPEC)
        {
                TAGS=TAG "%" TAGS;      DEP++;
                LTAGS=TAGS
        }
}

# Handle close-tags
(!SPEC) && /^[\/]/ {
        sub(/^\//, "", $1);
        LTAGS=TAGS
        sub("^.*" toupper($1) "%", "", TAGS);
        $1="/"$1
        DEP=split(TAGS, TA, "%")-1;
        if(DEP < 0) DEP=0;
}

$ awk -f minixml.awk -e '$2 ~ /^[0-9]*[.][0-9]*$/ { sub(/[.]?0*$/ , "", $2) } 1' input.xml

<ACCRUED_INTEREST>0</ACCRUED_INTEREST>
  <BOOK_VALUE>0</BOOK_VALUE>
  <COMMIT_CURRENT>29250</COMMIT_CURRENT>
  <COMMIT_UNDISBURSED>29250</COMMIT_UNDISBURSED>
  <COST_OF_FUNDS>0.0049</COST_OF_FUNDS>
  <DATE_ORIGINATED>2013-05-15</DATE_ORIGINATED>
  <GOVT_GUARANTOR></GOVT_GUARANTOR>
  <INT_RATE>0.0518</INT_RATE>

$

# 3  
Old 09-03-2014
Thank you. I am receiving the error message below.

Code:
awk -f minixml.awk -e '$2 ~ /^[0-9]*[.][0-9]*$/ { sub(/[.]?0*$/ , "", $2) } 1' input.xml
awk: minixml.awk:3: fatal: cannot open file `-e' for reading (No such file or directory

# 4  
Old 09-03-2014
I thought all awk had -e. Oh well. Stripped it down to a faster program which doesn't need -e:

Code:
$ ls -lh filter.xml
-rw-r--r-- 1 user user 451M Sep  3 14:10 filter.xml

$ time awk 'NR==1 { next } NF==2 { sub(/[.]?0*$/ , "", $2) } { print RS $0 }' FS=">" RS="<" OFS=">" ORS="" filter.xml > out.xml

real    1m1.632s
user    0m59.813s
sys     0m1.787s

$ head -n 20 out.xml

<ACCRUED_INTEREST>0</ACCRUED_INTEREST>
  <BOOK_VALUE>0</BOOK_VALUE>
  <COMMIT_CURRENT>29250</COMMIT_CURRENT>
  <COMMIT_UNDISBURSED>29250</COMMIT_UNDISBURSED>
  <COST_OF_FUNDS>0.0049</COST_OF_FUNDS>
  <DATE_ORIGINATED>2013-05-15</DATE_ORIGINATED>
  <GOVT_GUARANTOR></GOVT_GUARANTOR>
  <INT_RATE>0.0518</INT_RATE>
<ACCRUED_INTEREST>0</ACCRUED_INTEREST>
  <BOOK_VALUE>0</BOOK_VALUE>
  <COMMIT_CURRENT>29250</COMMIT_CURRENT>
  <COMMIT_UNDISBURSED>29250</COMMIT_UNDISBURSED>
  <COST_OF_FUNDS>0.0049</COST_OF_FUNDS>
  <DATE_ORIGINATED>2013-05-15</DATE_ORIGINATED>
  <GOVT_GUARANTOR></GOVT_GUARANTOR>
  <INT_RATE>0.0518</INT_RATE>
<ACCRUED_INTEREST>0</ACCRUED_INTEREST>
  <BOOK_VALUE>0</BOOK_VALUE>
  <COMMIT_CURRENT>29250</COMMIT_CURRENT>
  <COMMIT_UNDISBURSED>29250</COMMIT_UNDISBURSED>

$
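
A note on the one-liner above: the substitution runs on every two-field record, so integer values that end in zeros are changed as well. The sample output shows <GOVT_GUARANTOR></GOVT_GUARANTOR> where the input had 0, and a value such as 100 would become 1. Below is a minimal sketch of a guarded variant, not a definitive fix. It assumes each value is a plain unsigned decimal sitting directly between > and < with no surrounding whitespace; the function name trim_zeros is only for illustration.

```shell
# Sketch: strip trailing zeros only from values that contain a decimal point.
# Assumption: each value is an unsigned decimal directly between > and <.
trim_zeros() {
    awk -v FS='>' -v RS='<' -v OFS='>' -v ORS='' '
        NR == 1 { next }                      # text before the first "<"
        NF == 2 && $2 ~ /^[0-9]+[.][0-9]+$/ {
            sub(/0+$/, "", $2)                # 123.01200 -> 123.012, 0.00 -> 0.
            sub(/[.]$/, "", $2)               # 0. -> 0
        }
        { print RS $0 }                       # put the "<" back in front
    ' "$@"
}
```

It would be called as trim_zeros input.xml > out.xml. Values without a decimal point, such as 0, 100 or 2013-05-15, pass through unchanged.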

This User Gave Thanks to Corona688 For This Post:
# 5  
Old 09-05-2014
Thanks very much. It works for the XML sample I gave you, but I see an issue with the excerpt below, which contains the complete set of tags and attributes that can appear in my XML file: when a tag has multiple attributes, your awk script does not trim the attribute values. Could you please help me tweak it?

Please check these attributes: ACCEPTABLE_VOL_COUNT="357.000" ACCEPTABLE_VOL_DOLLARS="71829447.08000".

Excerpt:
Code:
<Provider>
<Institution ACCEPTABLE_VOL_COUNT="357.000" ACCEPTABLE_VOL_DOLLARS="71829447.08000" ACCRUED_INTEREST_COUNT="344" ACCRUED_INTEREST_DOLLARS="299979.26" BEGINNING_FARMER_FLAG_COUNT="244" BOOK_VALUE_COUNT="370" BOOK_VALUE_DOLLARS="75554816.98" CUSTOMER_ROW_COUNT="330" DOUBTFUL_VOL_COUNT="0" DOUBTFUL_VOL_DOLLARS="0.00" EXTRACT_DATE="2014-06-30" LOAN_ROW_COUNT="389" OAEM_VOL_COUNT="3" OAEM_VOL_DOLLARS="267446.68" PAST_DUE_AMOUNT_COUNT="12" PAST_DUE_AMOUNT_DOLLARS="2625411.64" PD_RATING_COUNT="389" PD_RATING_VALUES="2508" PRINCIPAL_BALANCE_COUNT="369" PRINCIPAL_BALANCE_DOLLARS="75254837.72" SMALL_FARMER_FLAG_COUNT="286" SUBSTANDARD_VOL_COUNT="10" SUBSTANDARD_VOL_DOLLARS="3457923.22" UNINUM="xxxxxx" YOUNG_FARMER_FLAG_COUNT="31">
        <Customer CIF="xxxxx">
        <BORROWER_NAME>xxxxx</BORROWER_NAME>
        <FIPS_CODE>15003</FIPS_CODE>
        <RELATED_PARTY_LOAN_CODE>0</RELATED_PARTY_LOAN_CODE>
        <DEBT_REPAYMENT_COVERAGE_RATIO>3.0000000000</DEBT_REPAYMENT_COVERAGE_RATIO>
        <CURRENT_ASSETS>1112799.00</CURRENT_ASSETS>
        <CURRENT_LIABILITIES>121482.00</CURRENT_LIABILITIES>
        <FARM_OPS_EXP>563390.00</FARM_OPS_EXP>
        <GROSS_AG_INC>593480.00</GROSS_AG_INC>
        <INT_EXP>17590.00</INT_EXP>
        <NON_CURR_ASSET>3285500.00</NON_CURR_ASSET>
        <NON_CURR_LIABILITIES>529347.00</NON_CURR_LIABILITIES>
        <NET_AG_INC>30090.00</NET_AG_INC>
        <NET_INC>194677.00</NET_INC>
        <NET_WORTH>3747470.00</NET_WORTH>
        <NONFARM_INC>164587.00</NONFARM_INC>
        <TOTAL_ASSETS>4398299.00</TOTAL_ASSETS>
        <TOTAL_LIABILITIES>650829.00</TOTAL_LIABILITIES>
        <DEBT_SERVICE_REQUIREMENT>46617.00</DEBT_SERVICE_REQUIREMENT>
        <REPAYMENT_SOURCE>1</REPAYMENT_SOURCE>
        <COST_OF_FUNDS>0.0049000000</COST_OF_FUNDS>
        </customer>
</Institution>
</provider>

Output:

Code:
<Provider>
<Institution ACCEPTABLE_VOL_COUNT="357.000" ACCEPTABLE_VOL_DOLLARS="71829447.08000" ACCRUED_INTEREST_COUNT="344" ACCRUED_INTEREST_DOLLARS="299979.26" BEGINNING_FARMER_FLAG_COUNT="244" BOOK_VALUE_COUNT="370" BOOK_VALUE_DOLLARS="75554816.98" CUSTOMER_ROW_COUNT="330" DOUBTFUL_VOL_COUNT="0" DOUBTFUL_VOL_DOLLARS="0.00" EXTRACT_DATE="2014-06-30" LOAN_ROW_COUNT="389" OAEM_VOL_COUNT="3" OAEM_VOL_DOLLARS="267446.68" PAST_DUE_AMOUNT_COUNT="12" PAST_DUE_AMOUNT_DOLLARS="2625411.64" PD_RATING_COUNT="389" PD_RATING_VALUES="2508" PRINCIPAL_BALANCE_COUNT="369" PRINCIPAL_BALANCE_DOLLARS="75254837.72" SMALL_FARMER_FLAG_COUNT="286" SUBSTANDARD_VOL_COUNT="10" SUBSTANDARD_VOL_DOLLARS="3457923.22" UNINUM="xxxxxx" YOUNG_FARMER_FLAG_COUNT="31">
        <Customer CIF="xxxxx">
        <BORROWER_NAME>xxxxx</BORROWER_NAME>
        <FIPS_CODE>15003</FIPS_CODE>
        <RELATED_PARTY_LOAN_CODE></RELATED_PARTY_LOAN_CODE>
        <DEBT_REPAYMENT_COVERAGE_RATIO>3</DEBT_REPAYMENT_COVERAGE_RATIO>
        <CURRENT_ASSETS>1112799</CURRENT_ASSETS>
        <CURRENT_LIABILITIES>121482</CURRENT_LIABILITIES>
        <FARM_OPS_EXP>563390</FARM_OPS_EXP>
        <GROSS_AG_INC>593480</GROSS_AG_INC>
        <INT_EXP>17590</INT_EXP>
        <NON_CURR_ASSET>3285500</NON_CURR_ASSET>
        <NON_CURR_LIABILITIES>529347</NON_CURR_LIABILITIES>
        <NET_AG_INC>30090</NET_AG_INC>
        <NET_INC>194677</NET_INC>
        <NET_WORTH>3747470</NET_WORTH>
        <NONFARM_INC>164587</NONFARM_INC>
        <TOTAL_ASSETS>4398299</TOTAL_ASSETS>
        <TOTAL_LIABILITIES>650829</TOTAL_LIABILITIES>
        <DEBT_SERVICE_REQUIREMENT>46617</DEBT_SERVICE_REQUIREMENT>
        <REPAYMENT_SOURCE>1</REPAYMENT_SOURCE>
        <COST_OF_FUNDS>0.0049</COST_OF_FUNDS>
        </customer>
</Institution>
</provider>


Last edited by Ariean; 09-05-2014 at 03:41 PM..
# 6  
Old 09-05-2014
You didn't ask for that, so I didn't think to do so.

This requires a more complicated expression which won't work in awk. Perl can do it though, and turns out to be faster:

Code:
$ ls -lh filter.xml
-rw-r--r-- 1 user user 905M Sep  5 13:53 filter.xml

$ time perl -074 -p -e "s/[.]?0*([<'\"])/\\1/g;" filter.xml > output.xml


real    1m57.446s
user    1m52.417s
sys     0m3.447s

$
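
One caveat with the substitution above: it removes zeros that sit directly before < or a quote whether or not a decimal point precedes them, so BOOK_VALUE_COUNT="370" would become "37" and DOUBTFUL_VOL_COUNT="0" would become "". For comparison, here is a hedged sketch that does the same job in plain awk with a match() loop instead of a capture group. It assumes a decimal is trimmed only when it is the whole attribute value or the whole element text, with no sign, exponent or surrounding whitespace; the names trim_decimals, trim, isquote and SQ are made up for this example.

```shell
# Sketch: trim trailing zeros in attribute values and element text.
# Assumption: a decimal is trimmed only when it is the entire value.
trim_decimals() {
    awk -v RS='<' -v ORS='' -v SQ="'" '
    # Return 1 when c is a single or double quote character.
    function isquote(c) { return c == "\"" || c == SQ }

    # Walk through s, copying text to out and trimming qualifying decimals.
    function trim(s,   out, num, before, after) {
        out = ""
        while (match(s, /[0-9]+[.][0-9]+/)) {
            out    = out substr(s, 1, RSTART - 1)
            num    = substr(s, RSTART, RLENGTH)
            s      = substr(s, RSTART + RLENGTH)
            before = substr(out, length(out), 1)   # character left of the number
            after  = substr(s, 1, 1)               # character right of it
            # Element text: ">" on the left, end of record on the right.
            # Attribute value: the same quote character on both sides.
            if ((before == ">" && after == "") ||
                (isquote(before) && after == before)) {
                sub(/0+$/, "", num)
                sub(/[.]$/, "", num)
            }
            out = out num
        }
        return out s
    }
    NR == 1 { printf "%s", $0; next }       # text before the first "<"
    { printf "%s%s", RS, trim($0) }' "$@"
}
```

It would be called as trim_decimals filter.xml > output.xml. No timing is claimed for this version; the per-record loop may well be slower than the perl one-liner on a multi-gigabyte file.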

# 7  
Old 09-05-2014
Another awk for the original question:
Code:
awk '$2+0==$2 {$2+=0} {$0=RS $0}NR>1' RS=\< ORS= FS=\> OFS=\> file
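
A caveat for the numeric approach above: $2+=0 turns the field into a number, and awk converts it back to text with CONVFMT, whose default is "%.6g". A non-integral value with more than six significant digits should therefore be rounded, for example 299979.26 would come out as 299979. The sketch below is the same idea with CONVFMT raised to "%.15g"; that setting and the function name trim_numeric are assumptions added here, not part of the original post. Values with more than 15 significant digits would still be rounded, and identifier-like values with leading zeros (a hypothetical 01003, say) would lose them.

```shell
# Sketch: numeric round-trip with a wider CONVFMT.
# Assumption: no value has more than 15 significant digits.
trim_numeric() {
    awk -v RS='<' -v ORS='' -v FS='>' -v OFS='>' -v CONVFMT='%.15g' '
        NF == 2 && $2 + 0 == $2 { $2 += 0 }   # numeric-looking text only
        NR > 1 { print RS $0 }                # skip text before the first "<"
    ' "$@"
}
```

It would be called as trim_numeric file. Dates such as 2013-05-15 are not numeric strings, so they are left untouched.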
