Extracting data between continuous non-empty XML tags
Hi,
I need help extracting only the phone numbers between the continuous non-empty XML tags in Unix. I searched through a lot of forums but did not find an exact answer to my query. Please help.
Given below is the sample pipe-delimited file. There are a lot of tags before and after the ...<phone>...</phone>... tags.
Sample file:
Expected Output:
Last edited by zen01234; 09-14-2015 at 06:16 PM..
Reason: Formatting
To keep the forums high quality for all users, please take the time to format your posts correctly.
First of all, use Code Tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type code tags [code] and [/code] by hand.)
Second, avoid adding color or different fonts and font size to your posts. Selective use of color to highlight a single word or phrase can be useful at times, but using color, in general, makes the forums harder to read, especially bright colors like red.
Third, be careful when you cut-and-paste: edit any odd characters and make sure all links are working properly.
If we take your two commands (modified to create file3 instead of append to it):
you get what you want in file3. If you change the 2nd command to:
you get the same output in file4. And, if you change the 1st command to:
you get the same output in file5 with one command instead of two. And, if you prefer to use awk instead of sed, with your sample input, the following:
produces the same output in file6. But, if there could be other tags between the phone tags, the following would be safer:
still producing the same output in file7.
Of course, the 1st two of the above work only if the number tags are on the same line as the starting and closing phone tags; the others work as long as the starting and closing number tags are on the same line as a starting or closing phone tag.
None of the above work if the starting and closing number tags are not on the same line. And, none of the above will find all of the numbers you want if there is more than one set of number tags on a single line.
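For input where the number tags may sit on a different line from the phone tags, or where several sets of number tags share one line, one possible approach (not one of the commands discussed above) is to have awk split its input on `<` instead of on newlines, so that each record is a single tag followed by the text after it. The sketch below assumes that phone tags are not nested, that tags carry no attributes, and that the data contains no literal `<`. The function name phone_numbers is made up for illustration.

```shell
# Hypothetical helper: print every non-empty <number> value found
# between <phone> and </phone>, wherever the line breaks fall.
phone_numbers() {
    awk -v RS='<' '
    {
        gt = index($0, ">")            # records without ">" hold no tag
        if (gt == 0) next
        tag  = substr($0, 1, gt - 1)   # tag name, e.g. "phone" or "/phone"
        text = substr($0, gt + 1)      # text that follows the tag

        if (tag == "phone")       inphone = 1
        else if (tag == "/phone") inphone = 0
        else if (tag == "number" && inphone) {
            # trim surrounding whitespace, then skip empty values
            gsub(/^[ \t\r\n]+|[ \t\r\n]+$/, "", text)
            if (text != "") print text
        }
    }' "$@"
}
```

Called as phone_numbers followed by a file name (or with data on standard input), it writes one number per line. Because awk handles one tag at a time, the file is never held in memory as a whole, which may matter for a huge file.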
Thank you, Don Cragun, for your code, but unfortunately it didn't work. My file is huge and it has the same fields repeating over and over under different tag headings, as shown below.
I tried your command
1st command
The command finds the first occurrence of the number tag instead of the number tag that follows the phone tag.
The 2nd command also finds the first occurrence of the number tag instead of the number tag that follows the phone tag.
I got the output from both commands as 1234 instead of 1234567890
Moderator's Comments:
Please use CODE tags (not ICODE tags) when showing full lines and especially when showing multi-line output.
With the three lines of sample input shown above in file1, both the sed command and the awk command shown above will only write:
into files file5 and file7, respectively, not:
Neither script explicitly checks that <number>string</number> appears after <phone>, but they both only look for number tags on lines that contain the string phone.
Please reread the last two paragraphs I wrote in post #4 in this thread. They clearly state the limitations of the scripts presented above (and the other suggested scripts in that post).
If these results don't match what you get on your system with the above data, please show us the exact output you are getting and tell us what operating system and shell you are using.
If the code does work for this example, but fails for some other data, show us an example of the exact input lines that are producing the wrong output, show the exact output produced, AND clearly describe the exact arrangement of tags on lines in the files you are trying to process. Everything you have shown us says that the lines you want to process have <phone> as the first tag on the line, the string you want to retrieve between <number> and </number> tags, and </phone> as the last tag on the line. And, all of the suggestions that have been provided assume this is an accurate description of the lines that need to be processed.
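The arrangement described above (a <number> element sitting between <phone> and </phone> on one line) can also be checked explicitly in a single sed expression. This is only a sketch under that assumption, not one of the commands posted earlier in the thread, and the function name phone_number_same_line is made up for illustration.

```shell
# Hypothetical helper: print the non-empty <number> value only when it
# appears after <phone> and before </phone> on the same line.
phone_number_same_line() {
    # \([^<][^<]*\) captures one or more characters up to the next "<",
    # so an empty <number></number> pair never matches.
    sed -n 's/.*<phone>.*<number>\([^<][^<]*\)<\/number>.*<\/phone>.*/\1/p' "$@"
}
```

A <number> element that appears only outside the phone tags is ignored because the pattern requires <phone> to come first. This sketch reports at most one number per line and does nothing for tags split across lines.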
Hi All
My input file is an XML and it has some tags and data rows at end.
Starting of data rows is <rs:data> and ending of data rows is </rs:data>.
Within sample data rows (2 rows) shown below, I want to extract data value after equal to sign (until space or "/" sign).
So if XML data... (7 Replies)
Hi All,
I'm trying to extract data from an xml file but without the codes. I've achieved it but i was wondering if there's a better way to do this.
sample data:
$ cat xmlfile
<code>
<to>tove</to>
<from>jani</from>
<heading>reminder</heading>
<body>dont forget me</body>
</code>
... (4 Replies)
Hello @all,
First, sorry for my bad English.
I am trying to use bash to extract text that is found between two tags inside an HTML page. There is only one such tag in this file. Here is an example:
Wert... (2 Replies)
Hello,
This is my first post in here, so excuse me if I sound too noob here!
I need to extract the path "/apps/mp/installedApps/V61/HRO/hrms_01698_A_qa.ear" from the below xml extract. The path will always appear with the key "binariesURL"
<deployedObject... (6 Replies)
Hello,
Please can someone assist.
I have the following xml file:
<?xml version="1.0" encoding="utf-8" ?>
- <PUTTRIGGER xmlns:xsd="http://www.test.org/2001/XMLSchema" xmlns:xsi="http://www.test.org/2001/XMLSchema-instance" APPLICATIONNUMBER="0501160" ACCOUNTNAME="Mrs S Test"... (15 Replies)
Input file is on Linux box and the input file has data in just one line with 1699741696 characters.
Sample Input:
<xxx><document coll="uspatfull" version="0"><CMSdoc>xxxantivirus</CMSdoc><tag1>1</tag1></document><document coll="uspatfull"... (5 Replies)
Is there a way to modify Non Null data between <host> and </host> tags to a new value ?- may be using sed/awk?
I tried this sed 's|.*<host>\(?*\)</host>.*|\<host>xxx</host>|' but it is updating the host which has a null value - I want the opposite of this - Thanks in advance for your help!!
For... (2 Replies)
i have a file like
<fruits>
<apple>redcolor<\apple>
<banana>yellow color and it is<\banana>
</fruits>
I need the text between the apple tags, the banana tags, and so on.
How can I read the text between tags when there are multiple tags with different names? (9 Replies)
Hi ppl out there...
Can anyone help me with the shell script to extract data from an xml file.
My xml file looks like :
- <servlet>
<servlet-name>FrontServlet</servlet-name>
<display-name>FrontServlet</display-name>
... (3 Replies)