Find out special characters from xml file


 
# 1  
Old 04-03-2013

Hi. I have an XML file that contains a lot of special characters. I need to find them and write the distinct list of those characters to a text file. The set of special characters is not fixed; it can be different each time.

Can anyone help me find and list them?

I'm scripting in ksh.

Thanks & Regards,
Krishanu Saha
# 2  
Old 04-03-2013
The description seems pretty vague. How about a sample input file, with code tags, and the expected output?
# 3  
Old 04-03-2013
OK, let me explain again. I have an XML file. I need to find every character in it other than A-Z, a-z, 0-9 and the following symbols:

_ (underscore) , (comma) ( ) (parentheses) & (ampersand) ; (semicolon) { } (braces) % (percent) + (plus) < (less than) > (greater than) / (forward slash) : (colon) = (equals) . (dot) ' ' (space) " (double quote) - (hyphen) \ (backslash) $ (dollar) and * (asterisk).

Any character or symbol not in the above list should be treated as invalid, and I need to find all of those.
# 4  
Old 04-03-2013
Does this awk serve your purpose?
Code:
awk '{gsub(/[a-zA-Z0-9_,()&;{}%+<>/:=. "\-\\$*]/,x)}NF' file.xml

# 5  
Old 04-03-2013
Thanks, but it's not working.

I tried:
Code:
awk '{gsub(/[a-zA-Z0-9_,()&;{}%+<>/:=. "-\$*]/,x)}NF' jhfnfull.xml > pqr.txt

but got the following error messages:
Code:
awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: illegal statement near line 1


Last edited by Franklin52; 04-04-2013 at 03:20 AM.. Reason: Please use code tags for data and code samples
# 6  
Old 04-03-2013
Modified code (the forward slash inside the bracket expression is now escaped):
Code:
awk '{gsub(/[a-zA-Z0-9_,()&;{}%+<>\/:=. "\-\\$*]/,x)}NF' file.xml

Note: On SunOS or Solaris, use nawk instead of awk.
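
The one-liner above prints each line with the allowed characters stripped, so the same character can show up many times. For the distinct list asked for in post #1, something along these lines might work. It is only a sketch: the function name list_special_chars is made up for illustration, the allowed set is the one from post #3, and LC_ALL=C is assumed so that every byte counts as one character (a multibyte character is therefore reported as its individual bytes).

```shell
# Hypothetical helper (not from the thread): print every distinct
# character of file "$1" that is NOT in the allowed set, one per line.
list_special_chars() {
    LC_ALL=C awk '
    {
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            # Allowed set from post #3. "\/" and "\\" are an escaped slash
            # and backslash; the hyphen is last so it is taken literally.
            if (c !~ /[a-zA-Z0-9_,()&;{}%+<>\/:=. "\\$*-]/)
                seen[c] = 1        # remember each offending character once
        }
    }
    END { for (c in seen) print c }
    ' "$1" | LC_ALL=C sort
}
```

Usage would be something like `list_special_chars file.xml > special_chars.txt`. On SunOS or Solaris, replace awk with nawk inside the function.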
# 7  
Old 04-03-2013
Thank you, this is working. Let me do some testing.

---------- Post updated at 08:30 PM ---------- Previous update was at 04:36 PM ----------

I need help with one more thing. Now I need to keep A-Z, a-z, 0-9 and the following symbols in the XML file and remove every other character that is not listed here:

_ (underscore) , (comma) ( ) (parentheses) & (ampersand) ; (semicolon) { } (braces) [ ] (square brackets) % (percent) + (plus) < (less than) > (greater than) / (forward slash) : (colon) = (equals) . (dot) ' ' (space) " (double quote) ' (single quote) - (hyphen) \ (backslash) $ (dollar) @ (at sign) and * (asterisk).

Can anyone please help me with this?

Regards,
Krishanu

Last edited by Krishanu Saha; 04-03-2013 at 10:05 PM..
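
One possible approach to the follow-up question, offered as a sketch rather than a tested answer from the thread: it uses tr instead of awk, because tr -cd deletes every byte that is not in a given set. The function name strip_special_chars is hypothetical, the allowed set is the longer list from the post above, and newlines are assumed to be worth keeping so the line structure survives. With LC_ALL=C, tabs, carriage returns and all non-ASCII bytes are removed as well.

```shell
# Hypothetical helper (not from the thread): write file "$1" to stdout
# with every character outside the allowed set removed.
strip_special_chars() {
    # Set for tr: \047 is the single quote, \\ is a backslash, \n keeps
    # line breaks, and the hyphen is last so it is taken literally.
    allowed='a-zA-Z0-9_,()&;{}[]%+<>/:=. "\047\\$@*\n-'
    # -c complements the set, -d deletes: everything not allowed is dropped.
    LC_ALL=C tr -cd "$allowed" < "$1"
}
```

Usage would be something like `strip_special_chars file.xml > cleaned.xml`. Write to a new file rather than back onto the input, since the shell truncates the output file before tr reads anything.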