lex for Chinese character


 
03-16-2006

Hi,
I need to read a single Chinese character using lex. I tried using "." (period) for pattern matching, but in vain.
Could anyone suggest how I should proceed?
Sample program: read a Chinese character enclosed in single quotes.
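%{
#include <stdio.h>
#include <locale.h>
%}
%%
\'.\'   printf("SUCCESS\n");   /* intended: quote, one character, quote */
.       printf("Failed\n");    /* anything else */
%%
int main(void)
{
    setlocale(LC_ALL, "");     /* pick up the locale from the environment */
    yylex();
    return 0;
}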
This always prints the "Failed" message. I cannot use '.+' for matching in my application.
I can view the Chinese character with the locale settings on my machine. I am using a Linux machine, and the locale is set to zh_CN.utf8.
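
One likely cause, offered as a hedged note rather than a confirmed diagnosis: scanners generated by lex/flex normally match input byte by byte, so "." consumes a single byte, whereas a Chinese character in a UTF-8 locale is usually encoded as three bytes. The C sketch below only illustrates that byte-level view. The function names utf8_seq_len and is_quoted_char are hypothetical helpers invented for this illustration and are not part of lex; the check is simplified and does not reject every invalid UTF-8 sequence.

```c
#include <stddef.h>

/* Return the number of bytes in the UTF-8 sequence that starts with
   lead byte b, or 0 if b cannot start a sequence. */
size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;                  /* ASCII */
    if (b >= 0xC2 && b <= 0xDF) return 2;    /* two-byte sequence */
    if (b >= 0xE0 && b <= 0xEF) return 3;    /* includes most CJK ideographs */
    if (b >= 0xF0 && b <= 0xF4) return 4;    /* four-byte sequence */
    return 0;                                /* continuation or invalid lead byte */
}

/* Return 1 if s is exactly: single quote, one UTF-8 character, single quote.
   Simplified: continuation bytes are checked only for the 10xxxxxx form. */
int is_quoted_char(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n, i;

    if (p[0] != '\'' || p[1] == '\0') return 0;
    n = utf8_seq_len(p[1]);
    if (n == 0) return 0;
    for (i = 1; i < n; i++) {
        if ((p[1 + i] & 0xC0) != 0x80) return 0;   /* also stops at '\0' */
    }
    return p[1 + n] == '\'' && p[2 + n] == '\0';
}
```

Under the same assumption, a flex rule could spell out the three bytes explicitly, for example \'[\xE0-\xEF][\x80-\xBF][\x80-\xBF]\' in place of \'.\'; whether a particular lex implementation accepts 8-bit byte ranges like this would need to be checked.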

Thank You