Removing spaces between XML tags <XX XX> -> <XXXX>
Post 302183873 by sharoff on Thursday 10th of April 2008 04:28:38 AM
Hey guys! Thanks for the replies!

era: Thanks for the solution! I won't wreck anything, because all the following process does is extract 17 arguments whose tags it knows. The tags with spaces are useless to me but are causing an error. I can't use an XML parser because I don't have the correct Perl package. The solution you proposed works great, but I am having a small problem now:

Here's part of my script:

Code:
for line in `perl -pe 's/<\s+/</g; s/\s+>/>\n/g; 1 while s/(<[^\s<>]+)\s+/$1/' $readyDir/$fileName`
do
    case "$line" in
        "</document>")
            if [ $currentNdocs -gt $maxDocs ]
            then
                strFile="<DMS><documents>"$strFile"</documents></DMS>";
                cd ${CLARIFY_DIR}/rulemanager
                ./cbbatch -f ../jobs/dms/DMS_Integration.cbs -r ParseXML -as ${strFile} >> ${CLARIFY_DIR}/jobs/dms/OUT
                strFile=""
                currentNdocs=0
            else
                strFile=$strFile$line;
            fi
            ;;

        *)
            strFile=$strFile$line;

Basically, when the script detects the end of an XML document, it sends whatever it has accumulated in strFile concatenated with 'line' (i.e. everything up to and including the </document>). My problem arises when I have spaces inside the text. When the "line" loop finds a space, it still concatenates the pieces into strFile, but the space itself gets eliminated (strFile=$strFile$line # the space is lost). Now, after solving the <ta g> spaces, I need the loop to leave the space in <tag>My Comments</tag> alone, so that the overall strFile contains that space. The batch that receives the document cannot handle spaces between one tag and the next, but it CAN handle spaces in the text within the tags.

Any suggestions as to what I can do? A replacement for line?
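A minimal sketch of one possible replacement, assuming the space is being lost to the shell's word splitting: `for line in` followed by a backquoted command splits the output on every space, tab and newline, so a two-word value arrives as two separate loop items. Reading with `while IFS= read -r line` keeps each output line whole instead. The sample string and the variable names joined_for and joined_read are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample input: one element whose text contains a space.
sample='<document><DMSObjectSubType>Audio Contract</DMSObjectSubType></document>'

# 1) for ... in `cmd`: the output is split on whitespace, so the space
#    disappears when the pieces are glued back together.
joined_for=""
for word in `printf '%s\n' "$sample"`
do
    joined_for=$joined_for$word
done

# 2) while read: IFS= keeps leading/trailing blanks, -r keeps backslashes,
#    and each whole line (spaces included) lands in $line.
joined_read=""
while IFS= read -r line
do
    joined_read=$joined_read$line
done <<EOF
$sample
EOF

printf 'for loop:   %s\n' "$joined_for"
printf 'while read: %s\n' "$joined_read"
```

In the real script the loop could read the perl output from a temporary file (`done < "$tmpFile"`, a hypothetical name). Piping perl straight into `while` also works, but in many shells the loop then runs in a subshell, so strFile would not be visible after `done`. Since `read` yields whole lines rather than words, the `case` patterns may need adjusting to match what a line actually contains.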

A typical XML doc I receive is:

Code:
<?xml  version = "1.0" encoding = "UTF-8"?><DMS xmlns:xsi="http://www.w3.com/XSD/DMSMessage.xsd"><documents>
<document><DMSObjectID>10011468999</DMSObjectID><DMSObjectCreateDateTime>2008-04-08T18:00:00</DMSObjectCreateDateTime>
<DMSObjectFileType>pdf</DMSObjectFileType><DMSObjectLink>http://aa.tie.ch.n1=10011468734&amp</DMSObjectLink><DMSObjectType>Contract</DMSObjectType>
<DMSObjectSubType>Audio Contract</DMSObjectSubType><ClarifyCustomerID>0703203</ClarifyCustomerID><ClarifyActionCode>0</ClarifyActionCode>
<WebOrderID>99933</WebOrderID><ClarifyPartRequestID/><POAPhoneNumber/><POAPartialPorting/><POAPortingWishDate></POAPortingWishDate><SignatureDate/>
<DMSObjectSubject></DMSObjectSubject><DMSObjectProductLine></DMSObjectProductLine><DMSObjectLanguage>de</DMSObjectLanguage>
<DMSAdditionalComment>My comments</DMSAdditionalComment></document></documents></DMS>

When the loop arrives at the DMSObjectSubType tag, "Audio Contract" turns into "AudioContract" in strFile.

Another problem is that the script should also be prepared to receive XML with spaces between the tags (</document> </documents>) and also with line terminators between them (</document>
</documents>)

Any suggestions will help!!

Last edited by sharoff; 04-10-2008 at 05:59 AM..
 
