Need to remove Junk characters
Post 302998796 by MadeInGermany on Wednesday 7th of June 2017 12:50:00 PM
Try this to delete every character that is not in the [:print:] class:
Code:
sed 's/[^[:print:]]//g'

If you would rather mark each run of deleted characters, try
Code:
sed 's/[^[:print:]]\{1,\}/<unprintable>/g'
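
To see what the two commands do, here is a small sketch run on a made-up record. The function names `strip_junk` and `mark_junk` and the sample data are only for illustration, and forcing `LC_ALL=C` is an assumption: it makes every byte outside printable ASCII count as unprintable, whereas in a UTF-8 locale sed may treat multibyte characters differently.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the two sed commands from the post.
# LC_ALL=C is an assumption: it makes [:print:] mean printable ASCII only.
strip_junk() {
    LC_ALL=C sed 's/[^[:print:]]//g'
}

mark_junk() {
    LC_ALL=C sed 's/[^[:print:]]\{1,\}/<unprintable>/g'
}

# Made-up record with one Ctrl-S (octal 023) and two Ctrl-Z (octal 032).
printf 'abc\023def\032\032ghi\n' | strip_junk
# prints: abcdefghi

printf 'abc\023def\032\032ghi\n' | mark_junk
# prints: abc<unprintable>def<unprintable>ghi
```

Note that a tab is not in [:print:] either, so both commands also remove tabs. If the file is tab-delimited, something like `[^[:print:][:blank:]]` would keep them.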

 

jiskanji(5)                     File Formats Manual                     jiskanji(5)

NAME
       jiskanji, jiskanji7, JIS7 - A character encoding system (codeset) for
       Japanese

DESCRIPTION
       JIS Kanji is a codeset that uses the JIS X0202 symbol extension method
       for encoding the JIS X0208 and JIS X0201 character sets. There are two
       types of JIS Kanji encoding: 7-bit JIS Kanji code and 8-bit JIS Kanji
       code.

   7-bit JIS Kanji Code
       In 7-bit JIS Kanji encoding, all character values are 7-bit bytes.
       Characters are interpreted according to preceding in and out sequences
       as follows:

       Kanji in sequence (ESC $ B)
              The code values following the Kanji in sequence (ESC $ B) are
              treated as characters in the JIS X0208 Kanji character set.

       Kanji out sequence (ESC ( B)
              The code values following the Kanji out sequence (ESC ( B) are
              treated as ASCII characters.

       Supplementary Kanji in sequence (ESC $ ( D)
              The code values following the supplementary Kanji in sequence
              (ESC $ ( D) are treated as characters in the JIS X0212
              supplementary Kanji character set.

       User-Defined Character (UDC) in sequence (ESC $ ( 0)
              The code values following the UDC in sequence (ESC $ ( 0) are
              treated as characters in the vendor-defined or user-defined
              character set.

       Kana in (SO) and Kana out (SI) sequences
              The code values following SO(0x0e) and preceding SI(0x0f) are
              treated as characters in the JIS X0201 Katakana character set.

       Katakana in sequence (ESC ( I)
              Code values following the Katakana in sequence (ESC ( I) are
              treated as characters in the JIS X0201 Katakana character set.
              In this case, the Kanji out sequence is used to switch back to
              ASCII code. The Katakana in and Kanji out sequences are an
              alternative to using the Kana in and out sequences (SO/SI).

   8-bit JIS Kanji Code
       In 8-bit JIS Kanji encoding, the JIS X0201 Katakana characters are
       represented as 8-bit bytes. Using this form of encoding, in and out
       sequences have the following effect:

       Kanji in sequence (ESC $ B)
              Code values following the Kanji in sequence (ESC $ B) are
              treated as characters in the JIS X0208 Kanji character set.

       Supplementary Kanji in sequence (ESC $ ( D)
              Code values following the supplementary Kanji in sequence
              (ESC $ ( D) are treated as characters in the JIS X0212
              supplementary Kanji character set.

       User-Defined Character (UDC) in sequence (ESC $ ( 0)
              Code values following the UDC in sequence (ESC $ ( 0) are
              treated as vendor-defined or user-defined characters.

       Kanji out sequence (ESC ( B)
              Code values following the Kanji out sequence (ESC ( B) are
              treated as ASCII characters.

       Kana in and out sequences (SI/SO)
              These sequences are ignored.

   Codeset Conversion
       The following codeset converter pairs are available for converting
       Japanese characters between jiskanji7 or JIS7 and other encoding
       formats. The RESTRICTIONS section discusses some conversion limitations
       that apply to these converters. Refer to iconv_intro(5) for an
       introduction to codeset conversion. For more information about the
       other codeset for which jiskanji7 or JIS7 is the input or output, see
       the reference page specified in the list item.

       deckanji_jiskanji7 or deckanji_JIS7, jiskanji7_deckanji or JIS7_deckanji
              Converting from and to the DEC Kanji codeset: deckanji(5).

       eucJP_jiskanji7 or eucJP_JIS7, jiskanji7_eucJP or JIS7_eucJP
              Converting from and to Japanese Extended UNIX Code: eucJP(5).

       eucTW_jiskanji7 or eucTW_JIS7, jiskanji7_eucTW or JIS7_eucTW
              Converting from and to Taiwanese Extended UNIX Code: eucTW(5).

       sdeckanji_jiskanji7 or sdeckanji_JIS7, jiskanji7_sdeckanji or JIS7_sdeckanji
              Converting from and to the Super DEC Kanji codeset:
              sdeckanji(5).

       SJIS_jiskanji7 or SJIS_JIS7, jiskanji7_SJIS or JIS7_SJIS
              Converting from and to Shift JIS format: SJIS(5).

              Shift JIS encoding format is identical to encoding in Microsoft
              code-pages used on PC systems. Therefore, you can use these
              converters to convert Japanese characters between JIS Kanji and
              PC code-page format. For general information on how the
              operating system supports PC code pages, see code_page(5).

RESTRICTIONS
       The JIS Kanji codeset is not supported directly by a locale but through
       code conversion (through the iconv utility, Japanese terminal (tty)
       code conversion, and so forth).

       In the codeset naming conventions used by the iconv utility, the string
       JIS7 indicates 7-bit JIS Kanji code that follows a Katakana in sequence
       and the string jiskanji7 indicates 7-bit JIS Kanji code entered between
       Kana in and out sequences.

       The following sequences are valid for input to the iconv utility but
       are not generated when code is converted to jiskanji7:

              Kanji in (ESC $ @)
              Kanji in (ESC & @ ESC $ B)
              Kanji in (ESC $ ( B)
              Kanji in (ESC $ ( @)
              Supplementary Kanji in (ESC $ D)
              Kana in (ESC ( J)
              Kana in (ESC ( H)

       In the code naming conventions of the Japanese terminal, the string
       jis7 indicates 7-bit JIS Kanji code and the string jis8 indicates 8-bit
       JIS Kanji code. When the terminal code is set to jis7, the Kana in and
       out sequences (SI/SO) are used for JIS X0201 Katakana character
       representation.

SEE ALSO
       Commands: locale(1)

       Others: ascii(5), code_page(5), deckanji(5), eucJP(5), i18n_intro(5),
       i18n_printing(5), iconv_intro(5), iso2022jp(5), Japanese(5),
       l10n_intro(5), sdeckanji(5), shiftjis(5)
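
As a rough illustration of the 7-bit in and out sequences described above, the sketch below wraps two 7-bit code values in Kanji in (ESC $ B) and Kanji out (ESC ( B) and dumps the resulting bytes. The helper name `jis7_wrap` and the sample value are made up for this example. It only builds the byte stream and does not call any of the converters named in the page.

```shell
#!/bin/sh
# Hypothetical helper: surround raw JIS X0208 code values with the
# Kanji in (ESC $ B) and Kanji out (ESC ( B) sequences.
jis7_wrap() {
    # \033 is ESC; the single quotes keep $B literal.
    printf '\033$B%s\033(B' "$1"
}

# "0!" is the two 7-bit bytes 0x30 0x21. Between the two sequences a
# 7-bit JIS decoder reads them as one JIS X0208 character, not as ASCII.
jis7_wrap '0!' | od -An -tx1
# bytes shown: 1b 24 42 30 21 1b 28 42
```

Because every byte stays in the 7-bit range, text encoded this way contains ESC control characters, which a filter such as `sed 's/[^[:print:]]//g'` would strip and so corrupt.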