Grepping an entire XML record that contains a special character


 
# 1  
Old 01-29-2013

I have an XML file that contains the special character Â.
I wrote a grep command to find the special character:
Code:
grep -i 'Â' Filename | grep ShipAddress2

I need help finding special characters such as Â and listing the whole XML record that contains them, given that the file holds more XML data of the same sort for other <OrderNumber> values.

Code:
<order>
<OrderNumber>0214895664*000000</OrderNumber>
<Ship>Next Day Shipment</Ship>
<PortRequest>NO</PortRequest>
<SubChannel>DF GB</SubChannel>
<ShipAddress1>11580 NW 105TH ST</ShipAddress1>
<ShipAddress2>Attn  </ShipAddress2>

<Subscriber>
<SalesCode>2222</SalesCode>
<DesiredState>FL</DesiredState>
<DesiredCity>Miami</DesiredCity>
<DesiredNPA>786</DesiredNPA>
<ProductType>G</ProductType>
<RatePlanCode>ENADON9RP</RatePlanCode>
<RatePlanName>PLAN 9.99</RatePlanName>
<RatePlanMarket>MIF</RatePlanMarket>
<MonthlyCost>  0</MonthlyCost>

<Feature>
<Name>$9 ADD'L</Name>
<Code>ENA</Code>
<Cost>   9.99</Cost>
</Feature>
<Feature>
<Name>400 SMS</Name>
<Code>SMS400</Code>
<Cost>   0.00</Cost>
</Feature>

</Subscriber>
</order>

# 2  
Old 01-29-2013
ISO-8859-1 (Latin-1) extends ASCII with byte values 128 to 255; the visible glyphs run from 161 (¡) to 255 (ÿ). The regex '[¡-ÿ]' should pick up lines containing them, since these are the first and last visible glyphs on my code page. I wrote these two commands in C, one to generate all 256 byte codes and one to display a census of the byte codes in a file:
Code:
$ all256 | census
1       000     00      000     ''
1       001     01      001     ''
1       002     02      002     ''
1       003     03      003     ''
1       004     04      004     ''
1       005     05      005     ''
1       006     06      006     ''
1       007     07      007     ''
1       008     08      010     '\b'
1       009     09      011     '\t'
1       010     0a      012     '\n'
1       011     0b      013     '\v'
1       012     0c      014     '\f'
1       013     0d      015     '\r'
1       014     0e      016     ''
1       015     0f      017     ''
1       016     10      020     ''
1       017     11      021     ''
1       018     12      022     ''
1       019     13      023     ''
1       020     14      024     ''
1       021     15      025     ''
1       022     16      026     ''
1       023     17      027     ''
1       024     18      030     ''
1       025     19      031     ''
1       026     1a      032     ''
1       027     1b      033     ''
1       028     1c      034     ''
1       029     1d      035     ''
1       030     1e      036     ''
1       031     1f      037     ''
1       032     20      040     ' '
1       033     21      041     '!'
1       034     22      042     '"'
1       035     23      043     '#'
1       036     24      044     '$'
1       037     25      045     '%'
1       038     26      046     '&'
1       039     27      047     '''
1       040     28      050     '('
1       041     29      051     ')'
1       042     2a      052     '*'
1       043     2b      053     '+'
1       044     2c      054     ','
1       045     2d      055     '-'
1       046     2e      056     '.'
1       047     2f      057     '/'
1       048     30      060     '0'
1       049     31      061     '1'
1       050     32      062     '2'
1       051     33      063     '3'
1       052     34      064     '4'
1       053     35      065     '5'
1       054     36      066     '6'
1       055     37      067     '7'
1       056     38      070     '8'
1       057     39      071     '9'
1       058     3a      072     ':'
1       059     3b      073     ';'
1       060     3c      074     '<'
1       061     3d      075     '='
1       062     3e      076     '>'
1       063     3f      077     '?'
1       064     40      100     '@'
1       065     41      101     'A'
1       066     42      102     'B'
1       067     43      103     'C'
1       068     44      104     'D'
1       069     45      105     'E'
1       070     46      106     'F'
1       071     47      107     'G'
1       072     48      110     'H'
1       073     49      111     'I'
1       074     4a      112     'J'
1       075     4b      113     'K'
1       076     4c      114     'L'
1       077     4d      115     'M'
1       078     4e      116     'N'
1       079     4f      117     'O'
1       080     50      120     'P'
1       081     51      121     'Q'
1       082     52      122     'R'
1       083     53      123     'S'
1       084     54      124     'T'
1       085     55      125     'U'
1       086     56      126     'V'
1       087     57      127     'W'
1       088     58      130     'X'
1       089     59      131     'Y'
1       090     5a      132     'Z'
1       091     5b      133     '['
1       092     5c      134     '\'
1       093     5d      135     ']'
1       094     5e      136     '^'
1       095     5f      137     '_'
1       096     60      140     '`'
1       097     61      141     'a'
1       098     62      142     'b'
1       099     63      143     'c'
1       100     64      144     'd'
1       101     65      145     'e'
1       102     66      146     'f'
1       103     67      147     'g'
1       104     68      150     'h'
1       105     69      151     'i'
1       106     6a      152     'j'
1       107     6b      153     'k'
1       108     6c      154     'l'
1       109     6d      155     'm'
1       110     6e      156     'n'
1       111     6f      157     'o'
1       112     70      160     'p'
1       113     71      161     'q'
1       114     72      162     'r'
1       115     73      163     's'
1       116     74      164     't'
1       117     75      165     'u'
1       118     76      166     'v'
1       119     77      167     'w'
1       120     78      170     'x'
1       121     79      171     'y'
1       122     7a      172     'z'
1       123     7b      173     '{'
1       124     7c      174     '|'
1       125     7d      175     '}'
1       126     7e      176     '~'
1       127     7f      177     ''
1       128     80      200     ''
1       129     81      201     ''
1       130     82      202     ''
1       131     83      203     ''
1       132     84      204     ''
1       133     85      205     ''
1       134     86      206     ''
1       135     87      207     ''
1       136     88      210     ''
1       137     89      211     ''
1       138     8a      212     ''
1       139     8b      213     ''
1       140     8c      214     ''
1       141     8d      215     ''
1       142     8e      216     ''
1       143     8f      217     ''
1       144     90      220     ''
1       145     91      221     ''
1       146     92      222     ''
1       147     93      223     ''
1       148     94      224     ''
1       149     95      225     ''
1       150     96      226     ''
1       151     97      227     ''
1       152     98      230     ''
1       153     99      231     ''
1       154     9a      232     ''
1       155     9b      233     ''
1       156     9c      234     ''
1       157     9d      235     ''
1       158     9e      236     ''
1       159     9f      237     ''
1       160     a0      240     ' '
1       161     a1      241     '¡'
1       162     a2      242     '¢'
1       163     a3      243     '£'
1       164     a4      244     '¤'
1       165     a5      245     '¥'
1       166     a6      246     '¦'
1       167     a7      247     '§'
1       168     a8      250     '¨'
1       169     a9      251     '©'
1       170     aa      252     'ª'
1       171     ab      253     '«'
1       172     ac      254     '¬'
1       173     ad      255     '­'
1       174     ae      256     '®'
1       175     af      257     '¯'
1       176     b0      260     '°'
1       177     b1      261     '±'
1       178     b2      262     '²'
1       179     b3      263     '³'
1       180     b4      264     '´'
1       181     b5      265     'µ'
1       182     b6      266     '¶'
1       183     b7      267     '·'
1       184     b8      270     '¸'
1       185     b9      271     '¹'
1       186     ba      272     'º'
1       187     bb      273     '»'
1       188     bc      274     '¼'
1       189     bd      275     '½'
1       190     be      276     '¾'
1       191     bf      277     '¿'
1       192     c0      300     'À'
1       193     c1      301     'Á'
1       194     c2      302     'Â'
1       195     c3      303     'Ã'
1       196     c4      304     'Ä'
1       197     c5      305     'Å'
1       198     c6      306     'Æ'
1       199     c7      307     'Ç'
1       200     c8      310     'È'
1       201     c9      311     'É'
1       202     ca      312     'Ê'
1       203     cb      313     'Ë'
1       204     cc      314     'Ì'
1       205     cd      315     'Í'
1       206     ce      316     'Î'
1       207     cf      317     'Ï'
1       208     d0      320     'Ð'
1       209     d1      321     'Ñ'
1       210     d2      322     'Ò'
1       211     d3      323     'Ó'
1       212     d4      324     'Ô'
1       213     d5      325     'Õ'
1       214     d6      326     'Ö'
1       215     d7      327     '×'
1       216     d8      330     'Ø'
1       217     d9      331     'Ù'
1       218     da      332     'Ú'
1       219     db      333     'Û'
1       220     dc      334     'Ü'
1       221     dd      335     'Ý'
1       222     de      336     'Þ'
1       223     df      337     'ß'
1       224     e0      340     'à'
1       225     e1      341     'á'
1       226     e2      342     'â'
1       227     e3      343     'ã'
1       228     e4      344     'ä'
1       229     e5      345     'å'
1       230     e6      346     'æ'
1       231     e7      347     'ç'
1       232     e8      350     'è'
1       233     e9      351     'é'
1       234     ea      352     'ê'
1       235     eb      353     'ë'
1       236     ec      354     'ì'
1       237     ed      355     'í'
1       238     ee      356     'î'
1       239     ef      357     'ï'
1       240     f0      360     'ð'
1       241     f1      361     'ñ'
1       242     f2      362     'ò'
1       243     f3      363     'ó'
1       244     f4      364     'ô'
1       245     f5      365     'õ'
1       246     f6      366     'ö'
1       247     f7      367     '÷'
1       248     f8      370     'ø'
1       249     f9      371     'ù'
1       250     fa      372     'ú'
1       251     fb      373     'û'
1       252     fc      374     'ü'
1       253     fd      375     'ý'
1       254     fe      376     'þ'
1       255     ff      377     'ÿ'
256     Total
$
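
The range '[¡-ÿ]' only behaves as described when grep runs in a Latin-1 locale. A rough sketch of a locale-independent variant follows: it forces the C locale so every byte is tested on its own, then prints each whole order that contains a suspicious byte. The function names are hypothetical, and the sketch assumes every record starts with <order> and ends with </order>. Note that it also flags stray control characters, not only bytes 128 to 255.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Print "line-number:text" for lines holding a byte that is neither
# printable ASCII nor whitespace. In the C locale only 0x20-0x7e are
# [:print:], so every byte from 128 to 255 is caught.
find_high_bytes() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}

# Print every complete <order> ... </order> record in which at least
# one line contains such a byte.
orders_with_high_bytes() {
    LC_ALL=C awk '
        /<order>/   { buf = ""; inside = 1; hit = 0 }   # a new record starts
        inside      { buf = buf $0 "\n"                 # collect the record
                      if ($0 ~ /[^\t\r -~]/) hit = 1 }  # non-ASCII byte seen
        /<\/order>/ { if (inside && hit) printf "%s", buf
                      inside = 0 }                      # the record ends
    ' "$1"
}

# Example call (orders.xml is a placeholder file name):
#   orders_with_high_bytes orders.xml
```

The awk version buffers one record at a time, so it should cope with large files as long as a single order fits in memory.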

# 3  
Old 01-29-2013
If these manual pages exist on your system:
Code:
$ man ascii
$ man iso_8859_1
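
To look a character up in those tables you need its byte value first. A minimal sketch using only standard tools is shown here; byte_census is a hypothetical name and is not the C program from the earlier post, it merely produces a similar count per byte value.

```shell
#!/bin/sh
# Sketch only: byte_census is a hypothetical helper name.

# Print "count decimal-value" for every distinct byte in a file.
byte_census() {
    od -An -v -tu1 "$1" |   # dump every byte as an unsigned decimal
        tr -s ' ' '\n' |    # put one value on each line
        grep . |            # drop the empty lines left by the padding
        sort -n |           # order by byte value
        uniq -c             # count how often each value occurs
}

# Example call (orders.xml is a placeholder file name), keeping only
# the values outside ASCII:
#   byte_census orders.xml | awk '$2 >= 128'
```

If the census shows 194 followed by 160, it may be worth checking whether the file is really UTF-8: a non-breaking space is stored as those two bytes and appears as Â plus a blank when viewed as Latin-1.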
