Print Value between desired html tag
# 1  
Old 11-02-2013
Hi,

I have an HTML line as below:
Code:
<dataFilter><filterName>Customer.PromotionsProfile</filterName></dataFilter><dataFilter><filterName>Customer.Messages</filterName></dataFilter><dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><validateCustomerCompleteness <sessionToken>           <tokenValue>kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue></sessionToken>< dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters>

From the above, I would like to extract only the text between the tokenValue tags.

Code:
Expected output: kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB

P.S.: All of the tags above are on a single line in my text file, so grep returns the entire line again. I need some alternative solution.

Thanks in advance,
Satish.

Last edited by Scrutinizer; 11-02-2013 at 07:57 AM.. Reason: code tags
# 2  
Old 11-02-2013
Try:
Code:
perl -nle '/<tokenValue>(.*)<\/tokenValue>/ && print $1' file

# 3  
Old 11-02-2013
If there can be multiple tags per line, try:
Code:
awk '$1=="tokenValue"{print $2}' RS=\< FS=\> file

An additional advantage is that even with a very long line, awk's maximum record length is unlikely to be exceeded, because each record is only the text between two "<" characters.
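
To see how the RS/FS split works, here is a minimal sketch on a shortened, made-up sample (the variable names and the values abc123/def456 are assumptions, not the original data). With RS set to "<" every tag starts a new record, and with FS set to ">" the tag name ends up in $1 and the text after the tag in $2:

```shell
# Hypothetical sample: a shortened stand-in for the line in post #1.
sample='<sessionToken>   <tokenValue>abc123</tokenValue></sessionToken><tokenValue>def456</tokenValue>'

# RS='<' makes each tag its own record; FS='>' separates tag name ($1) from value ($2).
# Only records whose tag name is exactly "tokenValue" are printed.
tokens=$(printf '%s\n' "$sample" | awk -v RS='<' -v FS='>' '$1 == "tokenValue" { print $2 }')

printf '%s\n' "$tokens"
```

Closing tags arrive as records whose $1 is "/tokenValue", so the exact comparison skips them.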

Last edited by Scrutinizer; 11-02-2013 at 09:18 AM..
# 4  
Old 11-02-2013
If your grep supports -o you could do:
Code:
$ grep -Eo "<tokenValue>.*</tokenValue>" file
<tokenValue>kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue>

(although that still leaves you with the tags)

Or use sed:
Code:
$ sed 's#.*<tokenValue>\(.*\)</tokenValue>.*#\1#g' file
kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB

or awk:
Code:
$ awk '/tokenValue/ {print $2}' FS=">" RS="<|</" file
kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB

(all cygwin/GNU)
# 5  
Old 11-02-2013
Quote:
Originally Posted by CarloM
[..]
Or use sed:
Code:
$ sed 's#.*<tokenValue>\(.*\)</tokenValue>.*#\1#g' file
kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB

The "g"-flag has no meaning here, because this will only find one match per line, either the value (if there is only one tag), or the value and anything up until the last "tokenValue" tag on a line.

Quote:
or awk:
Code:
$ awk '/tokenValue/ {print $2}' FS=">" RS="<|</" file
kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB

(all cygwin/GNU)
This one is not as precise because it will match any string containing "tokenValue" (for example "tokenValue2") in both tag and value...
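
A small sketch of the difference, on a made-up sample (the tag name tokenValue2 and the values are hypothetical). For simplicity it uses the plain RS='<' from post #3 rather than the regex RS above, which is a gawk extension:

```shell
# Hypothetical sample with a similarly named tag in front of the real one.
sample='<tokenValue2>wrong</tokenValue2><tokenValue>right</tokenValue>'

# Loose: the regex matches any record that contains "tokenValue" anywhere.
loose=$(printf '%s\n' "$sample" | awk -v RS='<' -v FS='>' '/tokenValue/ { print $2 }')

# Strict: the tag name must be exactly "tokenValue".
strict=$(printf '%s\n' "$sample" | awk -v RS='<' -v FS='>' '$1 == "tokenValue" { print $2 }')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The loose version also picks up the value of tokenValue2 (and prints blank lines for the closing tags), while the strict version prints only the wanted value.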

Last edited by Scrutinizer; 11-02-2013 at 08:32 AM..
# 6  
Old 11-02-2013
Quote:
Originally Posted by Scrutinizer
The "g"-flag has no meaning here, because this will only find one match per line, either the value (if there is only one tag), or the value and anything up until the last "tokenValue" tag on a line.
True, it's not correct.

Possibly of interest though - what you actually get is the last tokenValue value, not the whole intervening text as you might think (and as with the perl solution) - the leading .* greedily eats everything else (at least with my sed).
Code:
$ cat file
<dataFilter><filterName>Customer.PromotionsProfile</filterName></dataFilter><dataFilter><filterName>Customer.Messages</filterName></dataFilter><dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><validateCustomerCompleteness  <sessionToken>            <tokenValue>1AkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue></sessionToken><   dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><dataFilter><filterName>Customer.PromotionsProfile</filterName></dataFilter><dataFilter><filterName>Customer.Messages</filterName></dataFilter><dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><validateCustomerCompleteness  <sessionToken>            <tokenValue>1BkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue></sessionToken><   dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters>
<dataFilter><filterName>Customer.PromotionsProfile</filterName></dataFilter><dataFilter><filterName>Customer.Messages</filterName></dataFilter><dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><validateCustomerCompleteness  <sessionToken>            <tokenValue>2kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue></sessionToken><   dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters>
$ awk '$1=="tokenValue"{print $2}' RS='<' FS='>' file
1AkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
1BkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
2kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
$ perl -nle '/<tokenValue>(.*)<\/tokenValue>/ && print $1' file
1AkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue></sessionToken>< dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><dataFilter><filterName>Customer.PromotionsProfile</filterName></dataFilter><dataFilter><filterName>Customer.Messages</filterName></dataFilter><dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><validateCustomerCompleteness <sessionToken>           <tokenValue>1BkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
2kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
$ sed 's#.*<tokenValue>\(.*\)</tokenValue>.*#\1#g' file
1BkfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
2kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB
$ sed --version | head -1
sed (GNU sed) 4.2.2

# 7  
Old 11-02-2013
Try

Code:
$ cat file
<dataFilter><filterName>Customer.PromotionsProfile</filterName></dataFilter><dataFilter><filterName>Customer.Messages</filterName></dataFilter><dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters><validateCustomerCompleteness <sessionToken>           <tokenValue>kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB</tokenValue></sessionToken>< dataFilter><filterName>Customer.PaymentProfileDetail</filterName></dataFilter></dataFilters>

Code:
$ awk '{gsub(".*<tokenValue>|</tokenValue>.*",x)}1' file
kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB


OR

Code:
$ grep -Po '(?<=<tokenValue>).*(?=</tokenValue>)' file
kfuYW9mcmjkasfkjsasvIR/hm/bb945chszG8zSIC89DBq9Q7NiB

@CarloM: my grep, as well as yours, fails if there are multiple entries on a line.
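
One possible way around that, as a sketch only: stop the match at the next "<" instead of using ".*". This assumes the token value itself never contains "<", and that grep supports -o (GNU, BSD and busybox grep do). The sample line and its values are made up:

```shell
# Hypothetical two-token sample (not the original data).
sample='<tokenValue>1A</tokenValue><x>y</x><tokenValue>1B</tokenValue>'

# [^<]* cannot run past the closing tag, so each token is a separate match;
# sed then strips the opening tag from every match.
values=$(printf '%s\n' "$sample" | grep -o '<tokenValue>[^<]*' | sed 's/^<tokenValue>//')

printf '%s\n' "$values"
```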

Last edited by Akshay Hegde; 11-03-2013 at 05:01 AM..