xml extract problem


 
# 1  
Old 10-02-2011

I have looked at other responses but was never able to modify them to work for my case.
The data is:
Code:
<?xml version="1.0"?>
<note version="0.3" xmlns:link="http://beatniksoftware.com/tomboy/link" xmlns:size="http://beatniksoftware.com/tomboy/size" xmlns="http://beatniksoftware.com/tomboy"><title>recoll</title><text xml:space="preserve"><note-content version="0.1" xmlns:link="http://beatniksoftware.com/tomboy/link" xmlns:size="http://beatniksoftware.com/tomboy/size">recoll

index scrapbook

</note-content>
</text><last-change-date>2011-09-30T22:56:13.222083Z</last-change-date><last-metadata-change-date>2011-09-30T22:56:13.222083Z</last-metadata-change-date><create-date>2011-09-30T22:55:55.073966Z</create-date><cursor-position>30</cursor-position><width>450</width><height>360</height><x>0</x><y>0</y><open-on-startup>False</open-on-startup></note>

I want to extract the text shown in bold, i.e. the note content between the note-content tags. I tried:
Code:
sed 's/xmlns:size="http://beatniksoftware.com/tomboy/size">/\n/note-content>
</text><last-change-date>/g' gnote.note |grep size">

I just got errors.
I think it may be because of special characters, but I'm not sure. I tried replacing the double quotes with single quotes and so on, but nothing worked.
Thanks for your help.

Last edited by Scott; 10-03-2011 at 10:06 AM.. Reason: Code tags
# 2  
Old 10-02-2011
Is "recoll index scrapbook" what you want?
And do the words span across lines as you have shown?

Paste the exact text from your input!

--ahamed
# 3  
Old 10-02-2011
Yes on both counts. The file is from gnote. The line breaks are carriage returns I typed to make the note easier to read. But now I understand what you're thinking.
Code:
sed 's/title/\nsize/g' gnote.note |grep title

This doesn't work either!!

Last edited by Scott; 10-03-2011 at 10:06 AM.. Reason: Code tags
# 4  
Old 10-02-2011
You didn't answer my question completely. Do the words span across lines?

i.e.
recoll
newline
index scrapbook
newline


Is it like this?

--ahamed

---------- Post updated at 11:13 AM ---------- Previous update was at 11:04 AM ----------

With the input you have given, try this:

Using AWK
Code:
awk -F">" '/oftware.com\/tomboy\/size/{x=$NF;getline;getline;x=x" "$0;}/note-content/{print x;x=""}' input_file

Using SED
Code:
sed -n '/oftware.com\/tomboy\/size/{s/.*>\(.*\)/\1 /;H;n;n;H;x;s/\n//g;x;n;n;/note-content/{x;p}}' input_file

--ahamed
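
A note on the errors from the first attempt in post #1: that command uses / as the sed delimiter, but the URL in the pattern contains slashes too, so sed treats the first slash after http: as the end of the pattern and fails on what follows. The grep at the end of the pipe also has an unbalanced double quote (size">). One way around the delimiter clash is to pick a different delimiter such as |. The following is only a minimal sketch of that idea; the one-line sample input is made up for illustration and is not the real gnote file.

```shell
# Sketch: use '|' as the sed delimiter so the slashes in the URL
# do not need escaping. The sample line below is hypothetical.
line='<note-content xmlns:size="http://beatniksoftware.com/tomboy/size">recoll'

# Delete everything up to and including the closing '>' of the opening tag.
result=$(printf '%s\n' "$line" |
    sed 's|.*xmlns:size="http://beatniksoftware.com/tomboy/size">||')

printf '%s\n' "$result"   # prints: recoll
```

Keeping the whole sed expression in single quotes also means the double quotes inside it reach sed untouched.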
# 5  
Old 10-02-2011
Yes, it's perfect!
Wow, this is a whole course right here. Thanks very much, especially for the examples in both sed and awk.

---------- Post updated at 01:47 PM ---------- Previous update was at 01:26 PM ----------

To answer your question more precisely: below I show only the beginning of each long line, with one line here for each actual line in the file, including the empty ones.
Code:
<?xml version="1.0"?>
<note version="0.3" xmlxmlns="http://beatniksoftware.co$...

index scrapbook

</note-content>
</text><last-change-date>2011-09-30T22:56:13.222083Z</last-....

So it doesn't work with other notes.

Last edited by Scott; 10-03-2011 at 10:07 AM.. Reason: Code tags
# 6  
Old 10-02-2011
So, what do your other notes look like? We can come up with a pattern for those as well!

--ahamed
# 7  
Old 10-02-2011
Code:
<?xml version="1.0"?>
#This is one line above. Not in actual file
<note version="0.3" xmlns:link="http://beatniksoftware.com/tomboy/link" xmlns:size="http://beatniksoftware.com/tomboy/size" xmlns="http://beatniksoftware.com/tomboy"><title>A installer au menu</title><text xml:space="preserve"><note-content version="0.1" xmlns:link="http://beatniksoftware.com/tomboy/link" xmlns:size="http://beatniksoftware.com/tomboy/size">To install in menu
#This is one line above. Not in actual file

#This is one line above. Not in actual file

#This is one line above. Not in actual file
audacity
#This is one line above. Not in actual file

#This is one line above. Not in actual file
uninstsall
#This is one line above. Not in actual file
</note-content>
#This is one line above. Not in actual file

And so on. The format is not exactly the same from note to note because of user input, but the beginning and end are similar apart from the dates.

Last edited by Scott; 10-03-2011 at 10:07 AM.. Reason: Code tags
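
Since the number of lines inside the note varies, one possible approach is to stop counting lines and instead collect everything between the opening and closing note-content tags. The sketch below is not from the posts above; the function name extract_note and the trimmed sample file are assumptions for illustration. It also assumes that the closing </note-content> tag always appears, that the note body contains no nested markup, and that joining the non-empty lines with single spaces is the wanted output.

```shell
# Sketch: print the text between <note-content ...> and </note-content>,
# skipping blank lines and joining the rest with single spaces.
# extract_note is a hypothetical helper name.
extract_note() {
    awk '
    # Opening tag: start collecting and drop everything up to its ">".
    /<note-content[ >]/ { inside = 1; sub(/.*<note-content[^>]*>/, "") }
    # Closing tag: drop it and whatever follows, then mark the last line.
    inside && /<\/note-content>/ { sub(/<\/note-content>.*/, ""); last = 1 }
    # Append any non-empty line seen while inside the element.
    inside && NF { out = (out == "" ? $0 : out " " $0) }
    # After the closing tag: print the collected text and reset.
    last { print out; out = ""; inside = 0; last = 0 }
    ' "$1"
}

# Trimmed sample modelled on the note in post #7 (namespace attributes omitted).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0"?>
<note version="0.3"><title>A installer au menu</title><text xml:space="preserve"><note-content version="0.1">To install in menu


audacity

uninstsall
</note-content>
</text></note>
EOF

result=$(extract_note "$sample")
printf '%s\n' "$result"   # prints: To install in menu audacity uninstsall
rm -f "$sample"
```

This is plain text matching, so XML entities such as &amp; are not decoded and any formatting tags inside the note would be left in the output. An XML-aware parser would be more robust if the notes contain such markup.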