Top Forums UNIX for Dummies Questions & Answers Remove Unicode/special chars from XML Post 302597540 by dsrookie7 on Friday 10th of February 2012 03:11:14 PM
Old 02-10-2012
Remove Unicode/special chars from XML

Hi,

We are receiving an XML file on Unix that has special characters in the text between tags, such as the '^' sequences in this example:

<Tag> 1e^O7f%<2304e.$d8f57e8^Bf-&e.^Zh7/327e^O7 </Tag>

We need to remove all of the special characters (the '^' ones), as well as any '&', '<' or '>' that is sent between the start and close tags, i.e. in the tag text.

The upstream system is sending some Unicode characters, which are being converted to caret symbols on Unix (in addition to the stray '&', '>' and '<'). This causes my XML parser to abort or to drop the rows that contain such data.

Please provide a Perl command to remove them. (We need to remove the '&', '<' and '>' characters that are present in the tag text.)

Thanks
DSR
 

Parser::Style::Stream(3)    User Contributed Perl Documentation    Parser::Style::Stream(3)

NAME
       XML::Parser::Style::Stream - Stream style for XML::Parser

SYNOPSIS
         use XML::Parser;
         my $p = XML::Parser->new(Style => 'Stream', Pkg => 'MySubs');
         $p->parsefile('foo.xml');

         {
           package MySubs;

           sub StartTag {
             my ($e, $name) = @_;
             # do something with start tags
           }

           sub EndTag {
             my ($e, $name) = @_;
             # do something with end tags
           }

           sub Characters {
             my ($e, $data) = @_;
             # do something with text nodes
           }
         }

DESCRIPTION
       This style uses the Pkg option to find subs in a given package to call
       for each event. If none of the subs that this style looks for is there,
       then the effect of parsing with this style is to print a canonical copy
       of the document without comments or declarations. All the subs receive
       as their 1st parameter the Expat instance for the document they're
       parsing.

       It looks for the following routines:

       * StartDocument
           Called at the start of the parse.

       * StartTag
           Called for every start tag with a second parameter of the element
           type. The $_ variable will contain a copy of the tag and the %_
           variable will contain attribute values supplied for that element.

       * EndTag
           Called for every end tag with a second parameter of the element
           type. The $_ variable will contain a copy of the end tag.

       * Text
           Called just before start or end tags with accumulated non-markup
           text in the $_ variable.

       * PI
           Called for processing instructions. The $_ variable will contain a
           copy of the PI and the target and data are sent as 2nd and 3rd
           parameters respectively.

       * EndDocument
           Called at conclusion of the parse.

perl v5.8.4    2003-08-18    Parser::Style::Stream(3)