Remove lines with non-chinese characters from xml file
Post 302501004 by g4rb4g3 on Wednesday 2nd of March 2011 08:31:20 AM
Quote:
Originally Posted by Chubler_XL
...
The range of Chinese Unicode chars is U+4E00 thru U+9FFF (UTF-8 bytes, in octal, 344 270 200 thru 351 277 277), so the test on each byte should be >"\343" and <"\352" (to avoid picking up any 4-byte UTF-8 codes):

Code:
awk '{f=0;for(i=1;i<=length;i++)if(substr($0,i,1)>"\343"&&substr($0,i,1)<"\352")f=1}f' file

Thank you! Works perfectly!
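
For anyone reusing this, here is a minimal sketch of the same byte test written out in long form. It is not the one-liner from the quote, just one way to express the same idea. The function name keep_cjk_lines and the LC_ALL=C prefix are my own additions (assumptions): LC_ALL=C is there so that substr() steps through bytes rather than multibyte characters on awks that are locale-aware. Note that a lead byte of \344 thru \351 actually covers U+4000 thru U+9FFF, so the test is slightly wider than the 4E00 thru 9FFF range named above.

```shell
# Sketch only: print lines containing at least one byte in \344..\351,
# i.e. the lead byte of a 3-byte UTF-8 sequence for U+4000..U+9FFF.
# keep_cjk_lines is a hypothetical name; LC_ALL=C is an assumption that
# forces byte-wise substr() and comparison.
keep_cjk_lines() {
    LC_ALL=C awk '{
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            # The first matching byte is enough: print the line and move on.
            if (c > "\343" && c < "\352") { print; next }
        }
    }' "$@"
}

# Sample input built from octal escapes so the script itself stays ASCII.
# Line 1 is plain ASCII; line 2 contains U+4E2D U+6587.
printf '<a>hello</a>\n<a>\344\270\255\346\226\207</a>\n' | keep_cjk_lines
```

With the sample input only the second line is printed. Stopping at the first hit with next also avoids scanning the rest of a long line once a match is found.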
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.