Extracting data between two tag pairs


 
# 1  
Old 12-07-2016

I have a large log file (43 MB, 43k lines) and I am trying to extract the data between two tag pairs that appear on the same line, then export it to a file that I can pull into Excel for a report.
The first pair is <Text>data I need</Text>
The second pair follows on the same line and is <TimeStamp>more data I need</TimeStamp>
The two data elements in the output need a delimiter other than a comma, colon, or semicolon, because my data might contain any of those. A pipe character (|) would probably be best.
I have tried sed, grep, cut, and xml_grep.

Here is a sample of the data from one line. As I said, I only care about what is between the two tag pairs, that is, the contents of <Text> and <TimeStamp>:
Code:
<Sensitivity>RED</Sensitivity><Text>Room 607 *** DESAT 82 &lt; 85</Text><TimeStamp>23:42:00</TimeStamp>

My OS is RHEL 7.2.
Any advice is appreciated.

Last edited by rbatte1; 12-07-2016 at 11:50 AM.. Reason: Added code tags & ICODE tags for clarity
# 2  
Old 12-07-2016
Something to start with:
Code:
  awk -F'(</*Text>|</*TimeStamp>)' 'NF>1{for(i=2;i<NF; i=i+2) printf("%s%s", $i, (i+1==NF)?ORS:OFS)}' OFS='|' myFile
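
A minimal sketch for checking the one-liner above against the sample line from post #1. The helper name `extract_pairs` and the temporary file are assumptions for illustration only. The first awk call just prints every field with its index, which shows why the loop starts at field 2 and steps by 2: the opening and closing tags act as field separators, so the wanted data lands in the even-numbered fields.

```shell
# Hypothetical wrapper around the one-liner from post #2 (name is illustrative).
extract_pairs() {
  awk -F'(</*Text>|</*TimeStamp>)' \
    'NF>1{for(i=2;i<NF; i=i+2) printf("%s%s", $i, (i+1==NF)?ORS:OFS)}' \
    OFS='|' "$@"
}

# Sample line from post #1, written to a temporary file for the demo.
sample=$(mktemp)
printf '%s\n' '<Sensitivity>RED</Sensitivity><Text>Room 607 *** DESAT 82 &lt; 85</Text><TimeStamp>23:42:00</TimeStamp>' > "$sample"

# Show how awk splits the line: one output line per field, with its index.
# The <Text> content ends up in $2 and the <TimeStamp> content in $4.
awk -F'(</*Text>|</*TimeStamp>)' \
  '{for(i=1;i<=NF;i++) printf("$%d=[%s]\n", i, $i)}' "$sample"

# Run the extraction itself; the two values are joined with a pipe.
extract_pairs "$sample"

# Remove the temporary file.
rm -f "$sample"
```

On the real log, something like `extract_pairs myFile > report.txt` would write the pipe-delimited result to a file (`report.txt` is a placeholder name). Entities such as `&lt;` are passed through unchanged, so they may still need decoding before the Excel import.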


Last edited by vgersh99; 12-07-2016 at 12:03 PM..
# 3  
Old 12-07-2016
It worked perfectly! I will use similar commands again. Thank you!