Detect lines beginning with double-byte characters (Japanese) and delete Post: 302371632

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

delete lines from file2 beginning w/file1

I've been searching around here and other places, but can't put this together... I've got a unique list of words in file1 (one word on each line). I need to delete each line in file2 that begins with a word from file1. I started this way, but want to know how to use the file1 words instead...

2. Shell Programming and Scripting

delete zero byte file

Hello, I have a requirement where I need to find the zero-byte files in a directory and delete them. Thanks

3. Shell Programming and Scripting

Email a File from UNIX which has Japanese characters in it

Hi, I'm trying to email, from UNIX, a file which has Japanese characters in it (i.e., in the contents, not the filename). The file gets emailed, but the Japanese characters do not show up properly when I open the file on Windows in my Outlook mailbox. I searched a lot of forums but still...

4. Shell Programming and Scripting

How to delete all lines with fewer than 32 characters from a text file?

I need to delete all lines with fewer than 32 characters from a text file. :)

5. Shell Programming and Scripting

Removing one or more blank characters from beginning of a line

Hi, I was trying to remove the blanks from the beginning of a line. When I try sed 's/^ +//' filename, it does not work, but when I try sed 's/^ *//' filename, it works. But I think the first command should also have matched any line starting with one or more blanks. Kindly help me in understanding...

6. Red Hat

How to display Chinese and Japanese characters on RHEL 6?

Hello, I'm trying to figure out how to display Chinese and Japanese characters on my RHEL 6 console. "bogl-bterm" is no longer available for RHEL 6 and is not supported anymore. Is there any way that I could display them? Thank you.

7. SuSE

Display Chinese and Japanese characters on my SLES console.

Hello, I'm trying to figure out how to display Chinese and Japanese Characters on my SLES 11 Console. Is there any way that I could display those characters on my console? Thank you.

8. UNIX for Beginners Questions & Answers

Removing characters from beginning of multiple files

Hi, I have been searching for how to do this but I can't seem to find it. Hopefully someone can help. I have multiple files, hundreds of them, for example 12345-zxys.213423.zyz.txt. I want to be able to take all these files and remove the leading '12345-' from each of the filenames. '12345-' these characters...

9. UNIX for Beginners Questions & Answers

Inserting n characters to beginning of line if match

I would like to insert n number of characters at the beginning of each line that starts with a given character. If possible, I would be most appreciative for a sed or awk solution. Given the data below, I would like to be able to insert either 125 spaces or 125 "-" at the beginning of every line...

LEARN ABOUT SUNOS

ibmjtoeuc

euctoibmj(1)							   User Commands						      euctoibmj(1)

NAME

       euctoibmj, ibmjtoeuc - Code conversion between Japanese EUC and IBM-Japanese

SYNOPSIS

       euctoibmj [-t] [-u code] [-U] [filename...]

       ibmjtoeuc [-u code] [-U] [filename...]

AVAILABILITY

       SUNWjfpu

DESCRIPTION

       euctoibmj converts the contents of the specified files from ASCII/Japanese EUC to EBCDIC/IBM-Japanese. ibmjtoeuc converts the
       contents of the specified files from EBCDIC/IBM-Japanese to ASCII/Japanese EUC. Both commands write the converted output to stdout.
       If filename is not given, input is read from the standard input.

       For Japanese language handling, the euctoibmj/ibmjtoeuc pair of commands provides conversion only between these two code standards.
       Code conversion among Japanese EUC, JIS, and PC kanji is supported by another set of commands: the jistoeuc(1) family or iconv(1).
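
       As a rough illustration of conversion among Japanese EUC, JIS, and PC kanji, the sketch below re-encodes a short string with
       Python's built-in codecs. It does not use jistoeuc(1) or iconv(1). It assumes the Python codecs euc_jp, iso2022_jp, and shift_jis
       are reasonable stand-ins for the three code sets named above; the function name convert is hypothetical.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data from one Japanese code set to another via Unicode."""
    return data.decode(source).encode(target)


# Assumed mapping from the man page's terms to Python codec names:
#   Japanese EUC -> "euc_jp", JIS -> "iso2022_jp", PC kanji -> "shift_jis"
euc = "日本語".encode("euc_jp")
jis = convert(euc, "euc_jp", "iso2022_jp")
sjis = convert(euc, "euc_jp", "shift_jis")

# Show the same three characters in each encoding.
print("EUC :", euc.hex(" "))
print("JIS :", jis)
print("SJIS:", sjis.hex(" "))
```

       Going through Unicode keeps the sketch short. Characters that exist in one code set but not in the other raise
       UnicodeEncodeError here instead of being replaced by a substitute code.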

OPTIONS

       -u code	With this option specified, characters in one code set that do not have corresponding characters in the other are  mapped  to  the
		code  given  in four-digit hexadecimal HOST CODE of IBM Japanese (for euctoibmj) or in four-digit JIS Ku-Ten code (for ibmjtoeuc).
		Without this option, such characters are mapped to HOST CODE 4040 (for euctoibmj) or JIS Ku-Ten code 0101 (for ibmjtoeuc).

       -U	The output is not buffered (the default is buffered output).

       -t	With this option specified, euctoibmj translates Half-Size Katakana (Code Set 2) in Japanese EUC to the corresponding characters
		in Code Set 1 prior to conversion. Without this option, Code Set 2 characters in Japanese EUC are treated as illegal
		characters.
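
       Two details in the options above can be made concrete with a small sketch. The first function turns a four-digit JIS Ku-Ten code
       (two decimal digits for the row, two for the cell) into the corresponding two-byte Japanese EUC sequence, so the default
       substitute 0101 can be inspected. The second approximates what -t describes by folding Half-Size Katakana into their Code Set 1
       counterparts. Both function names are hypothetical, and the folding uses Unicode NFKC normalization as a stand-in; the actual
       table used by euctoibmj is not documented here and may differ.

```python
import re
import unicodedata


def kuten_to_euc(kuten: str) -> bytes:
    """Convert a four-digit JIS Ku-Ten code (row, cell) to EUC-JP bytes."""
    if len(kuten) != 4 or not (kuten.isascii() and kuten.isdigit()):
        raise ValueError(f"expected four decimal digits, got {kuten!r}")
    ku, ten = int(kuten[:2]), int(kuten[2:])
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError(f"row and cell must be in 1..94, got {kuten!r}")
    # Code Set 1 in EUC-JP stores row and cell each offset by 0xA0.
    return bytes((0xA0 + ku, 0xA0 + ten))


# Runs of half-width punctuation and katakana (JIS X 0201 kana in Unicode).
_HALFWIDTH_RUN = re.compile("[\uff61-\uff9f]+")


def fold_halfwidth_katakana(euc: bytes) -> bytes:
    """Replace Code Set 2 kana with Code Set 1 equivalents (NFKC stand-in).

    Assumption: NFKC approximates the tool's mapping. A voiced-sound mark
    that cannot combine with the preceding kana is not handled and makes
    the final encode step raise UnicodeEncodeError.
    """
    text = euc.decode("euc_jp")
    folded = _HALFWIDTH_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )
    return folded.encode("euc_jp")


print(kuten_to_euc("0101").hex(" "))
print(fold_halfwidth_katakana(b"\x8e\xb6").hex(" "))
```

       The HOST CODE side of -u (for example 4040) is not modeled, since that would require the IBM-Japanese conversion table.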

ENVIRONMENT VARIABLES

       The environment variables LC_CTYPE and LANG control character classification in these commands. For euctoibmj and ibmjtoeuc to
       work correctly, one or both of the environment variables must be set to ja or an equivalent locale. On entry to these commands, the
       environment variables are checked in the following order: LC_CTYPE, then LANG. When a valid value is found, the remaining environment
       variables for character classification are ignored.
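
       The lookup order described above can be sketched as follows. This is only an illustration, not the commands' actual code: it
       treats "valid" as "set and non-empty", and the check for "ja or an equivalent locale" is a guess based on the ja prefix. The
       names resolve_ctype and is_japanese are hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple

# Order taken from the text above: LC_CTYPE first, then LANG.
CHECK_ORDER = ("LC_CTYPE", "LANG")


def resolve_ctype(env: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (variable, value) for the first usable setting, else None."""
    for name in CHECK_ORDER:
        value = env.get(name, "")
        if value:  # assumption: "valid" is approximated as set and non-empty
            return name, value  # later variables are ignored
    return None


def is_japanese(locale_name: str) -> bool:
    """Assumption: 'ja' or any 'ja_*' name counts as an equivalent locale."""
    return locale_name == "ja" or locale_name.startswith("ja_")


# Inspect the current process environment.
found = resolve_ctype(os.environ)
if found is None:
    print("neither LC_CTYPE nor LANG is set")
else:
    name, value = found
    print(f"{name}={value} japanese={is_japanese(value)}")
```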

FILES

       /usr/lib/jcodetables/ibmj-euc

	   Code conversion table for IBM Japanese.

SEE ALSO

       iconv(1), jistoeuc(1), iconv_ja(5)

DIAGNOSTICS

       unexpected data encountered in input.

	   An illegal character code was found in the input file.

BUGS

       The ASCII/EBCDIC conversion table is taken from the 256-character standard in the CACM, Nov. 1968. The conversion, while less blessed
       as a standard, corresponds better to certain IBM print train conventions. There is no universal solution.

       The Japanese EUC/IBM Japanese conversion table is based on the  IBM Kanji codebook (4th edition - September 1987),  JIS X 0201, and  JIS  X
       0208-1983.

       If the JIS X 0212 character set is specified as input, euctoibmj cannot convert it correctly.

SunOS 5.10							    10 Jan 2003 						      euctoibmj(1)