How to pull multiple XML tags from the same XML file in Shell.?


 
# 1  
Old 01-30-2020

I'm searching for the names of TV shows in the XML file I've attached at the end of this post. What I'm trying to do now is pull out and list the data from each of the <SeriesName> tags throughout the document. Currently, I'm only able to get the data from the first instance of that XML field, using the following:

Code:
cat My.xml | awk -F'</?SeriesName>' ' { print $2 } '

I'm obviously missing something. I also tried and failed with sed and xmllint, although I'm not as fluent with those commands as I am with cat and awk. Open to suggestions. I'm on a Mac running Unix.

Now, here's a version of the XML file I'm using, not parsed, so it prints all nice-nice in the window:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<Data><Series><seriesid>71862</seriesid><language>en</language><SeriesName>Chappelle's Show</SeriesName><banner></banner><Overview>"Chappelle's Show" takes comedian Dave Chappelle's own personal joke book and brings it to life, with episodes consisting of sketches, man-on-the-street pieces, and pop culture parodies introduced by Dave in a stand-up format in front of a studio audience. Chappelle's unique point-of-view on the world provides a hilarious, defiant and sometimes dangerous look at American culture, including music, movies, television, advertising, current events, and everyday life situations.</Overview><FirstAired>2003-1-22</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>71862</id></Series><Series><seriesid>257136</seriesid><language>en</language><SeriesName>Dave Chappelle</SeriesName><banner>/banners/graphical/257136-g.jpg</banner><Overview>Dave Chappelle's career started while he was in high school at Duke Ellington School of the Arts in Washington, DC where he studied theatre arts. At the age of 14, he began performing stand-up comedy in nightclubs. Shortly after graduation, he moved to New York City where he quickly established himself as a major young talent. At the age of 19, Chappelle made his film debut in Robin Hood: Men in Tights (1993). Chappelle then starred in the short-lived sitcom, Buddies (1996) and had a featured role in The Nutty Professor (1996).</Overview><FirstAired>1998-1-9</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>257136</id></Series><Series><seriesid>76837</seriesid><language>en</language><SeriesName>Challenge of the SuperFriends</SeriesName><banner></banner><Overview>Banded together from remote galaxies, 13 of the most sinister villains of all time - the Legion of Doom! Dedicated to a single object, the conquest of the universe! Only one group dares to challenge this inter-galactic threat - The SuperFriends! 
Challenge is a sequel to the two earlier Super Friends shows. It drops Zan and Jayna from the previous incarnation and makes several "guest stars" full members of the Justice League, giving us a membership of Superman, Aquaman, Wonder Woman, Batman, Robin, Hawkman, Flash, Green Lantern, Black Vulcan, Samurai, and Apache Chief. Pitted against them are some real villains - the Legion of Doom: 13 of the most powerful supervillains from "remote galaxies" (well, 12 + Black Manta... :) ). Who are they? Lex Luthor, Brainiac, Solomon Grundy, the Riddler, the Scarecrow, Bizarro, Cheetah, Black Manta, Giganta, Gorilla Grodd, Sinestro, Captain Cold, and the Toyman. Okay, everyone but Brainiac, Sinestro, and Bizarro was from Earth (most of the ga</Overview><FirstAired>1978-9-9</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>76837</id></Series><Series><seriesid>76420</seriesid><language>en</language><SeriesName>Challenge of the GoBots</SeriesName><banner></banner><Overview>Transforming robots from the planet Gobotron wage war across the galaxy: the heroic Guardians and the evil Renegades. The Guardians, led by the heroic Leader-1, must battle the evil of Cy-Kill and his Renegades! Together with UNECOM's Matt, A.J. and Nick, they'll save the Earth and Gobotron!</Overview><FirstAired>1984-10-29</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>76420</id></Series><Series><seriesid>366596</seriesid><language>en</language><SeriesName>CHALLENGER</SeriesName><banner></banner><Overview></Overview><FirstAired></FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>366596</id></Series><Series><seriesid>364973</seriesid><language>en</language><SeriesName>Challenger Disaster: Lost Tapes</SeriesName><banner></banner><Overview>Challenger Disaster: Lost Tapes follows the story of the Space Shuttle Challenger and its crew, specifically Christa McAuliffe, the first civilian to be launched into space. 
McAuliffe was a teacher from Concord, N.H. She was chosen from thousands of applicants to expand the understanding of the nation's school children about space and the next generation of interplanetary travel. But her dreams - and those of NASA - were tragically cut short when the Challenger exploded just after liftoff in front of a live television audience. The events of the days leading up to the disaster are detailed in this unique film, which uses no narration and no interviews. Instead the story is told solely with reports of journalists covering the story, extensive recordings from the NASA team, and interviews with McAuliffe and others who were part of this one-of-a-kind mission. Using rarely seen images and audio recordings, this show takes viewers behind the scenes of this compelling and historic story in a way never before seen.</Overview><FirstAired>2016-1-25</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>364973</id></Series><Series><seriesid>351481</seriesid><language>en</language><SeriesName>Challenging Taboos</SeriesName><banner></banner><Overview>Smashing stereotypes, breaking taboos and challenging stigma - this playlist takes you on a journey through history, culture and geography.</Overview><FirstAired>2016-9-15</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames><id>351481</id></Series><Series><seriesid>350777</seriesid><language>en</language><SeriesName>Challenge Accepted </SeriesName><banner></banner><Overview></Overview><FirstAired>2018-4-30</FirstAired><IMDB_ID></IMDB_ID><zap2it_id></zap2it_id><AliasNames></AliasNames></Series></Data>


Moderator's Comments:
Mod Comment Please do wrap your samples/codes in CODE TAGS as per forum rules.

Last edited by RavinderSingh13; 01-31-2020 at 02:32 AM..
# 2  
Old 01-31-2020
Hi,
try this:
Code:
awk -F '>' '/^SeriesName/ {print $2}' RS='<' file

# 3  
Old 01-31-2020
Hello hungryd,

Could you please try the following?

Code:
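awk '
{
  # Find the next "<SeriesName>" together with its text (everything up to the next "<").
  while(match($0,/<SeriesName>[^<]*/)){
    # Skip the 12-character opening tag and print only the text.
    print substr($0,RSTART+12,RLENGTH-12)
    # Drop everything up to the end of this match and search the remainder.
    $0=substr($0,RSTART+RLENGTH)
  }
}
'   Input_file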

Thanks,
R. Singh
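
A note on why the command in the first post returns only one name: with -F'</?SeriesName>' every opening or closing tag acts as a field separator, so the names sit in the even-numbered fields ($2, $4, ...) and printing $2 alone gives just the first name on each input line. A minimal sketch of looping over those fields follows. The two-entry sample string and the variable names xml and names are made up for illustration, and it assumes each <SeriesName> element opens and closes on the same line.

```shell
# Hypothetical miniature of the posted file: two <Series> entries on one line.
xml="<Data><Series><SeriesName>Chappelle's Show</SeriesName><id>71862</id></Series><Series><SeriesName>Dave Chappelle</SeriesName><id>257136</id></Series></Data>"

# Each <SeriesName> or </SeriesName> tag is a field separator, so the
# names are the even-numbered fields: $2, $4, ...
names=$(printf '%s\n' "$xml" |
  awk -F'</?SeriesName>' '{ for (i = 2; i <= NF; i += 2) print $i }')

printf '%s\n' "$names"
```

This is only a sketch for flat, single-line elements like the ones in this file; for anything more irregular an XML-aware tool is the safer choice.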
# 4  
Old 01-31-2020
It's a simple XPath query with xmlstarlet:

Code:
xmlstarlet sel -t -v "//SeriesName" data.xml

Chappelle's Show
Dave Chappelle
Challenge of the SuperFriends
Challenge of the GoBots
CHALLENGER
Challenger Disaster: Lost Tapes
Challenging Taboos
Challenge Accepted

For more information, search the web or this forum for tutorials on XPath.
# 5  
Old 01-31-2020
Thank you Nezabudka. This is great AND includes the line breaks. I wasn't seeing those with some of the other XML parsing tool options I'd experimented with in shell including:

Code:
xmllint  --xpath "//SeriesName" file.xml

Thank you

--- Post updated at 11:09 AM ---

Can you please just explain to me the meaning/syntax of the
Code:
RS='<'

call you've placed in the larger line of code? That I've not seen before and would like to know more. Thank you again, Nezabudka.
# 6  
Old 01-31-2020
xmllint does not separate the results with newlines; xmlstarlet does.
# 7  
Old 01-31-2020
Code:
LESS=+/"^\s*RS\s" man awk
      RS          The input record separator, by default a newline.
...

By default RS is '\n'; here it has been changed to RS='<'.
You can picture this as if every '<' character in the text were replaced by the end-of-line character '\n',
so the character that follows each '<' becomes the start of the next record.
Glad that helped.

--- Post updated 02-01-20 at 00:08 ---

Quote:
-F '>'
RS='<'
<table>my story<item>not written</item></table>

table'FS'my story'RS'
item'FS'not written'RS'
/item'FS''RS'
/table'FS''RS'
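
The split shown in the quote can be reproduced directly. A small sketch, assuming the same one-line sample; the variable names sample and records and the output format are illustrative only:

```shell
# Hypothetical one-line sample taken from the quoted illustration.
sample='<table>my story<item>not written</item></table>'

# RS='<' starts a new record at every '<'; -F '>' splits each record at '>'.
# The NF test skips empty records, such as the one before the first '<'.
# The sample is printed without a trailing newline so that the newline
# does not end up inside the last record.
records=$(printf '%s' "$sample" |
  awk -F '>' 'NF { printf "%d: tag=[%s] text=[%s]\n", ++n, $1, $2 }' RS='<')

printf '%s\n' "$records"
```

This should print:

```
1: tag=[table] text=[my story]
2: tag=[item] text=[not written]
3: tag=[/item] text=[]
4: tag=[/table] text=[]
```

Each record begins with a tag name, which is why the pattern /^SeriesName/ in the one-liner matches only the opening <SeriesName> records and $2 holds the text that follows the tag.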