Exactly how the BOM is encoded in the file depends on whether the text is UTF-8, UTF-16 or UTF-32, and on whether it is big endian or little endian.
The BOM is supposed to be at the very beginning of the text, which is why bipinajith used ^ to anchor the match there. What you show as a BOM denotes UTF-16 big endian. Is that in fact what you have? Because what bipinajith gave you should have worked, which tells me something is not right. Not every BOM appears in the file as the bytes 0xFE 0xFF; in UTF-8, for example, the BOM is the three bytes 0xEF 0xBB 0xBF.
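To make the point about encodings concrete, here is a minimal Python sketch that reports which BOM, if any, a byte string starts with. The names detect_bom and strip_bom are made up for this illustration; the BOM constants come from Python's standard codecs module. Treat the result as a heuristic: the bytes FF FE 00 00 could in principle also be a UTF-16 little endian BOM followed by a NUL character.

```python
import codecs

# Byte signatures of the BOM in each encoding. UTF-32 is listed first because
# the UTF-32 LE signature (FF FE 00 00) begins with the UTF-16 LE one (FF FE).
BOM_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "UTF-32 big endian"),     # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "UTF-32 little endian"),  # FF FE 00 00
    (codecs.BOM_UTF8, "UTF-8"),                     # EF BB BF
    (codecs.BOM_UTF16_BE, "UTF-16 big endian"),     # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16 little endian"),  # FF FE
]


def detect_bom(data: bytes):
    """Return (label, bom_length) for a leading BOM, or (None, 0) if absent."""
    for signature, label in BOM_SIGNATURES:
        if data.startswith(signature):
            return label, len(signature)
    return None, 0


def strip_bom(data: bytes) -> bytes:
    """Return data without its leading BOM (hypothetical helper)."""
    _, length = detect_bom(data)
    return data[length:]


if __name__ == "__main__":
    for sample in (b"\xfe\xff\x00a", b"\xef\xbb\xbfhi", b"plain"):
        print(sample, "->", detect_bom(sample))
```

Running detect_bom on the first few bytes of the file (read in binary mode) shows which byte pattern actually needs to be matched before trying to remove it.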
I have a stream of characters like "\u8BBE\u5907\u7BA1"
and I want to display it.
I have already tried the following without any luck.
1) printf("%s",L("\u8BBE\u5907\u7BA1"));
2) printf("%lc",0x8BBE);
3) setlocale followed by fwide followed by wprintf
4) also changed the locale manually... (3 Replies)
grep for a particular pattern and remove 5 lines above the pattern and 6 lines below the pattern
root@server1 # cat filename
Shell Programming and Scripting test1
Shell Programminsada asda
dasd asd Shell Programming and Scripting Post New Thread
Shell Programming and S sadsa ... (17 Replies)
Hi All,
I have an AIX 5.3 server with one big file, and I want to remove the first 5000 lines from it. Is there a command for this?
Thanks,
Vishal (6 Replies)
Hi,
How do I remove the lines where special characters or Unicode characters appear?
The following query does work but I wonder if there is a better way.
cat test.txt | egrep -v '\)|#|,|&|-|\(|\\|\/|\.'
The following lines show that my query is incomplete.
Warning: The word "*Khan" is... (1 Reply)
Hi,
We are receiving an XML file in Unix which has some special characters between tags like '^' etc
<Tag> 1e^O7f%<2304e.$d8f57e8^Bf-&e.^Zh7/327e^O7 </Tag>
We need to remove all special characters like ^ ones and also any '&' or '<' or '>' being sent within the start and close tags i.e.... (6 Replies)
I don't want the data from the HTML_CONTENT, RICH_CONTENT and TEXT_CONTENT columns in the file; the rest of the data needs to be extracted.
Find the attached file.
Need to extract date in between DI_UX_ROW_END tag.
Can someone help me do this with a Unix command such as awk?
Thanks, (2 Replies)
Hi,
Please excuse me for posting a new thread on control characters.
I am having difficulty removing the control characters from a file extracted from the top command.
I can see the control characters with the more command and in vi, but with cat they are not visible ... (8 Replies)
Dear All
I was wondering if someone could help me in resolving an issue.
I have a file like this:
column1 column2
2 4
3 5
8 9
0 12
0 0
0 0
9 0
87 0
1 0
1 0
1 0
4 0 (2 Replies)
Hello.
Source files are in: /a/b/c/d/e/f/g/some_file
Destination is: /d/e, where sub-directories "f" and "g" may or may not exist.
After copying I want /a/b/c/d/e/f/g/file1 in /d/e/f/g/file1
On source /a is top-level directory
On destination /d is top-level directory
I would like... (2 Replies)
Discussion started by: jcdole
gencfu
GENCFU(1) ICU 50.1.2 Manual GENCFU(1)
NAME
gencfu - Generates Unicode Confusable data files
SYNOPSIS
gencfu [ -h, -?, --help ] [ -V, --version ] [ -c, --copyright ] [ -v, --verbose ] [ -d, --destdir destination ] [ -i, --icudatadir directory ]
-r, --rules rule-file -w, --wsrules whole-script-rule-file -o, --out output-file
DESCRIPTION
gencfu reads confusable character definitions from the input files, which are plain text files in the input format defined by Unicode
UAX39 for the files confusables.txt and confusablesWholeScript.txt. This source (.txt) format is also accepted by ICU spoof detectors.
The files must be encoded in UTF-8, with or without a BOM. Normally the output data file has the .cfu extension.
OPTIONS
-h, -?, --help
Print help about usage and exit.
-V, --version
Print the version of gencfu and exit.
-c, --copyright
Embeds the standard ICU copyright into the output-file.
-v, --verbose
Display extra informative messages during execution.
-d, --destdir destination
Set the destination directory of the output-file to destination.
-i, --icudatadir directory
Look for any necessary ICU data files in directory. For example, the file pnames.icu must be located when ICU's data is not built
as a shared library. The default ICU data directory is specified by the environment variable ICU_DATA. Most configurations of ICU
do not require this argument.
-r, --rules rule-file
The source file to read.
-w, --wsrules whole-script-rule-file
The whole script source file to read.
-o, --out output-file
The output data file to write.
VERSION
1.0
COPYRIGHT
Copyright (C) 2009 International Business Machines Corporation and others
ICU MANPAGE 24 May 2009 GENCFU(1)