Help with converting XML to Flat file
Post 302626617 by mayursingru on Thursday 19th of April 2012 01:04:53 PM
Hi,
Try this to strip the tags, then continue processing the result with awk or gawk.

Code:
sed -e 's/<[^>]*>//g' test1.txt
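
As a rough sketch of the "continue with awk" step: the pipeline below strips the tags with the same sed expression, then uses awk to join the remaining non-empty values of each record into one comma-separated line. The sample file contents, the <record> layout, the @@END@@ marker, and the function name xml_to_flat are all assumptions made for illustration; the real test1.txt from the thread is not shown here. It only works when each element sits on its own line and no tag spans several lines.

```shell
#!/bin/sh
# Hypothetical sample input; the real test1.txt from the thread is not shown.
cat > test1.txt <<'EOF'
<records>
<record>
<name>John</name>
<city>Boston</city>
</record>
<record>
<name>Mary</name>
<city>Denver</city>
</record>
</records>
EOF

# xml_to_flat FILE: print one comma-separated line per <record> element.
# (Hypothetical helper, not part of the original post.)
xml_to_flat() {
    # Mark the end of each record before the tags disappear,
    # then strip every tag with the expression from the post.
    sed -e 's/<\/record>/@@END@@/' -e 's/<[^>]*>//g' "$1" |
    awk '
        # End-of-record marker: print the collected values and reset.
        /^@@END@@$/ { if (line != "") print line; line = ""; next }
        # Any other non-empty line is a value: append it to the record.
        NF { line = (line == "" ? $0 : line "," $0) }
    '
}

xml_to_flat test1.txt
```

For XML with attributes, nested records, or tags that wrap across lines, a real XML parser is likely a safer choice than this line-based approach.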

 

htmlstrip(3)                        EN Tools                        htmlstrip(3)

NAME
    htmlstrip - Strip HTML markup code

SYNOPSIS
    htmlstrip [-o outputfile] [-O level] [-b blocksize] [-v] [inputfile]

DESCRIPTION
    HTMLstrip reads from inputfile or from "stdin" and strips the contained HTML markup. Use this program to shrink and compact your HTML files in a safe way.

  Recognized Content Types
    There are three disjoint types of content which HTMLstrip recognizes while parsing:

    HTML Tag (tag)
        A single HTML tag, i.e. a string beginning with an opening angle bracket directly followed by an identifier, optionally followed by attributes, and ending with a closing angle bracket.

    Preformatted (pre)
        Any content enclosed in one of the following container tags:

        1. <nostrip>
        2. <pre>
        3. <xmp>

        The non-HTML-3.2-conforming "<nostrip>" tag is special here: it acts like "<pre>" as a protection container for HTMLstrip but is itself stripped from the output. Use it as a pseudo-block which preserves its body during HTMLstrip processing while the tag itself is removed from the output.

    Plain Text (txt)
        Anything not falling into one of the two other categories, i.e. any content both outside of preformatted areas and outside of HTML tags.

  Supported Stripping Levels
    The amount of stripping is controlled by an optimization level, specified via option -O (see below). Higher levels include all of the lower levels. The following stripping is done on each level:

    Level 0:
        No real stripping, just removal of the sharp/comment lines ("#...") [txt,tag]. Such lines are a standard feature of WML, so this is always done.

    Level 1:
        Minimal stripping: same as level 0 plus stripping of blank and empty lines [txt].

    Level 2:
        Good stripping: same as level 1 plus compression of multiple whitespaces (more than one in sequence) to a single whitespace [txt,tag] and stripping of trailing whitespaces at the end of a line [txt,tag,pre]. This level is the default because it provides good optimization while the HTML markup is not destroyed and remains human readable.

    Level 3:
        Best stripping: same as level 2 plus stripping of leading whitespaces on a line [txt]. This level can also be recommended when you still want to make sure that the HTML markup is not destroyed in any case, but the resulting code is a little bit ugly because of the removed whitespaces.

    Level 4:
        Expert stripping: same as level 3 plus stripping of HTML comment lines ("<!-- ... -->") and crunching of HTML tag ends [tag]. BE CAREFUL HERE: comment lines are widely used for hiding Java or JavaScript code from browsers which are not capable of ignoring it. When using this optimization level, make sure all your JavaScript code is hidden correctly by adding HTMLstrip's "<nostrip>" tags around the comment delimiters.

    Level 5:
        Crazy stripping: same as level 4 plus wrapping lines to fit in an 80 column view window. This saves some newlines, but it both leads to really unreadable markup code and opens the window for a lot of problems when this code is used to lay out the page in a browser. Use with care. This is only experimental!

    Additionally the following global strippings are done:

    "^\n":
        A leading newline is always stripped.

    "<suck>":
        The "<suck>" tag just absorbs itself and all whitespaces around it. This is like the backslash for line continuation, but it is done in Pass 8, i.e. really at the end. Use this inside HTML tag definitions to absorb whitespaces, for instance around %body when used inside "<table>" structures, which at some points are newline-sensitive in Netscape Navigator.

OPTIONS
    -o outputfile
        Redirects the output to outputfile. The output is sent to "stdout" if no such option is specified or if outputfile is "-".

    -O level
        Sets the optimization/stripping level, i.e. how much HTMLstrip should compress the contents.

    -b blocksize
        For efficiency reasons, input is divided into blocks of 16384 chars. If you have performance problems, you may try to change this value. Any value between 1024 and 32766 is allowed. With a value of 0, input is not divided into blocks.

    -v
        Sets verbose mode, where some processing information is given on the console.

AUTHORS
    Ralf S. Engelschall
    rse@engelschall.com
    www.engelschall.com

    Denis Barbier
    barbier@engelschall.com

EN Tools                           2014-04-16                       htmlstrip(3)
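
For readers without htmlstrip installed, the plain-text part of levels 0 to 3 described above can be roughly imitated with sed. This is only an approximation under stated assumptions: the function name strip_level is hypothetical, every line is treated as plain text (no <pre>, <xmp>, or <nostrip> protection and no tag awareness), and it makes no claim to reproduce how htmlstrip itself is implemented.

```shell
#!/bin/sh
# strip_level LEVEL: read text on stdin, write stripped text to stdout.
# Hypothetical helper imitating htmlstrip levels 0-3 for plain text only.
strip_level() {
    level=$1

    # Level 0: drop sharp/comment lines ("#...").
    set -- -e '/^#/d'

    # Level 1: also drop blank and empty lines.
    if [ "$level" -ge 1 ]; then
        set -- "$@" -e '/^[[:space:]]*$/d'
    fi

    # Level 2: also compress runs of whitespace to one space
    # and strip the whitespace left at the end of a line.
    if [ "$level" -ge 2 ]; then
        set -- "$@" -e 's/[[:space:]][[:space:]]*/ /g' -e 's/ $//'
    fi

    # Level 3: also strip leading whitespace (a single space after level 2).
    if [ "$level" -ge 3 ]; then
        set -- "$@" -e 's/^ //'
    fi

    sed "$@"
}

# Example: apply the level 3 approximation to a small sample.
printf '# comment\n\n  Hello    world  \n<p>text</p>\t\n' | strip_level 3
```

With htmlstrip itself available, the corresponding call according to the synopsis above would be of the form htmlstrip -O 3 -o outputfile inputfile.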