Sponsored Content
Top Forums Shell Programming and Scripting 'for LINE in $(cat file)' breaking at spaces, not just newlines Post 302548595 by natedawg1013 on Thursday 18th of August 2011 09:23:31 PM
Old 08-18-2011
'for LINE in $(cat file)' breaking at spaces, not just newlines

Hello. I'm making a (hopefully) simple shell script xml parser that outputs a file I can grep for information. I am writing it because I have yet to find a command line utility that can do this. If you know of one, please just stop now and tell me about it. Even better would be one I can input the XML file and path to the data (below example would be "foo/bar/c") and get the value. I know this can easily be done in python, but I have no experience in that and I want to make it part of a bigger program.

My goal, if writing the full parser, is to have the input file, for example,
Code:
<foo a="1">
    <bar b="2" c="Some string">
    </bar>
</foo>

output something like
Code:
foo.a="1"
foo.bar.b="2"
foo.bar.c="Some string"

The way I thought about doing this was
Code:
for LINE in $(cat file.xml)
do
if [ -n "$(echo $LINE | grep '<.*>' | grep -ve '<\/.*>')"

followed by how to parse it if it's an open tag <tag> and an if statement for ending tags </tag>.

NOW HERE'S THE PROBLEM:
"cat file.xml" works perfectly in terminal, but when using it in a for loop, assigning it to an array, or anything else, it breaks at spaces and newlines, not just newlines thus if I echoed $LINE in the for loop, the output would be
Code:
<foo
a="1">
<bar
b="2"
c="some
string">
</bar>
<foo>

which works fine for numeric values, but some some of my values are strings with spaces in their data which breaks the code since the value is split in half.

How can I get $LINE to be the full line, not just what's between two spaces? Or what command line utility could extract a piece of information from an XML file?
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Cat'ing a multiple line file to one line

I am writing a script that is running a loop on one file to obtain records from another file. Using egrep, I am finding matching records in file b, then outputing feilds of both into another file. **************************** filea=this.txt fileb=that.txt cat $filea | while read line do... (1 Reply)
Discussion started by: djsal
1 Replies

2. UNIX for Dummies Questions & Answers

newb help! file name with spaces breaking up when trying to retrieve it

for file in `ls *.txt` do sed '/s/old/new/g' $file > /tmp/tempfile.tmp mv /tmp/tempfile.tmp $file done the txt files names look like "text file one.txt", "text file two.txt" but when I run it, all i get is: sed: 0602-419 Cannot find or open file text. sed: 0602-419 Cannot find or... (3 Replies)
Discussion started by: DeuceLee
3 Replies

3. Shell Programming and Scripting

Breaking line

My input file is like USER_WORK.ABC USER_WORK.DEF I want output file like ABC DEF (4 Replies)
Discussion started by: scorp_rahul23
4 Replies

4. Shell Programming and Scripting

cat in the command line doesn't match cat in the script

Hello, So I sorted my file as I was supposed to: sort -n -r -k 2 -k 1 file1 | uniq > file2 and when I wrote > cat file2 in the command line, I got what I was expecting, but in the script itself ... sort -n -r -k 2 -k 1 averages | uniq > temp cat file2 It wrote a whole... (21 Replies)
Discussion started by: shira
21 Replies

5. UNIX for Dummies Questions & Answers

how to append spaces(say 10 spaces) at the end of each line based on the length of th

Hi, I have a problem where I need to append few spaces(say 10 spaces) for each line in a file whose length is say(100 chars) and others leave as it is. I tried to find the length of each line and then if the length is say 100 chars then tried to write those lines into another file and use a sed... (17 Replies)
Discussion started by: prathima
17 Replies

6. Shell Programming and Scripting

split a line of a file and cat a file with another

Hi, I have two files one.txt laptop boy apple two.txt unix linux OS openS I want to split one.txt into one line each and concatenate it with the two.txt output files onea.txt laptop (4 Replies)
Discussion started by: avatar_007
4 Replies

7. Shell Programming and Scripting

sed remove newlines and spaces

Hi all, i am getting count from oracle 11g by spooling it to a file. Now there are some newline characters and blank spaces i need to remove these. pl provide me a awk/sed solution. the spooled file is attached. i tried this.. but not getting req o/p (6 Replies)
Discussion started by: rishav
6 Replies

8. AIX

Script to cat and dd last line!!! of each file

hi Guys, Am new to this awesome forum, and yea i need some help here asap thnx :) i have a directory with over 34000 text files, i need a script that will delete the last line of each of this file without me necessary opening the files. illustration:- file1 200 records file2 130 records... (5 Replies)
Discussion started by: eetang
5 Replies

9. UNIX for Dummies Questions & Answers

Sendmail with cat adding extra spaces in email body

when I try to read a file and send email using cat and sendmail: The email received having additional spaces.(Between the letters of words in the text) My code: export MAILTO="sa@y.com" export SUBJECT="mydomain PREPROD MONITOR AT ${DATE}" export... (5 Replies)
Discussion started by: visitsany
5 Replies

10. Shell Programming and Scripting

sed a multiple line pattern that has newlines followed by text and r

Here is the text that I was to run sed on. In this text I want to insert a semi colon ';' before 'select a13.STORE_TYPE STORE_TYPE,' and after 'from ZZMR00 pa11' Input text: insert into ZZMQ01 select pa11.STATE_NBR STATE_NBR, pa11.STORE_TYPE STORE_TYPE, ... (9 Replies)
Discussion started by: v_vineeta11
9 Replies
XML_PP(1p)						User Contributed Perl Documentation						XML_PP(1p)

NAME
xml_pp - xml pretty-printer SYNOPSYS
xml_pp [options] [<files>] DESCRIPTION
XML pretty printer using XML::Twig OPTIONS
-i[<extension>] edits the file(s) in place, if an extension is provided (no space between "-i" and the extension) then the original file is backed-up with that extension The rules for the extension are the same as Perl's (see perldoc perlrun): if the extension includes no "*" then it is appended to the original file name, If the extension does contain one or more "*" characters, then each "*" is replaced with the current filename. -s <style> the style to use for pretty printing: none, nsgmls, nice, indented, record, or record_c (see XML::Twig docs for the exact description of those styles), 'indented' by default -p <tag(s)> preserves white spaces in tags. You can use several "-p" options or quote the tags if you need more than one -e <encoding> use XML::Twig output_encoding (based on Text::Iconv or Unicode::Map8 and Unicode::String) to set the output encoding. By default the original encoding is preserved. If this option is used the XML declaration is updated (and created if there was none). Make sure that the encoding is supported by the parser you use if you want to be able to process the pretty_printed file (XML::Parser does not support 'latin1' for example, you have to use 'iso-8859-1') -l loads the documents in memory instead of outputing them as they are being parsed. This prevents a bug (see BUGS) but uses more memory -f <file> read the list of files to process from <file>, one per line -v verbose (list the current file being processed) -- stop argument processing (to process files that start with -) -h display help EXAMPLES
xml_pp foo.xml > foo_pp.xml # pretty print foo.xml xml_pp < foo.xml > foo_pp.xml # pretty print from standard input xml_pp -v -i.bak *.xml # pretty print .xml files, with backups xml_pp -v -i'orig_*' *.xml # backups are named orig_<filename> xml_pp -i -p pre foo.xhtml # preserve spaces in pre tags xml_pp -i.bak -p 'pre code' foo.xml # preserve spaces in pre and code tags xml_pp -i.bak -p pre -p code foo.xml # same xml_pp -i -s record mydb_export.xml # pretty print using the record style xml_pp -e utf8 -i foo.xml # output will be in utf8 xml_pp -e iso-8859-1 -i foo.xml # output will be in iso-8859-1 xml_pp -v -i.bak -f lof # pretty print in place files from lof xml_pp -- -i.xml # pretty print the -i.xml file xml_pp -l foo.xml # loads the entire file in memory # before pretty printing it xml_pp -h # display help BUGS
Elements with mixed content that start with an embedded element get an extra <elt><b>b</b>toto<b>bold</b></elt> will be output as <elt> <b>b</b>toto<b>bold</b></elt> Using the "-l" option solves this bug (but uses more memory) TODO
update XML::Twig to use Encode with perl 5.8.0 AUTHOR
Michel Rodriguez <mirod@xmltwig.com> perl v5.12.4 2011-05-18 XML_PP(1p)
All times are GMT -4. The time now is 07:14 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy