How to remove Unicode <feff> from top of file?
Post 302739767 by biztank on Wednesday 5th of December 2012 12:42:13 AM
When I check the encoding, it's UTF-8:

Code:
file xyz.csv
xyz.csv: UTF-8 Unicode text, with very long lines

I tried piconv from UTF-8 to ASCII, and it does convert <feff> to ?.
Then I can grep for ? and delete the first line.
Is that an ideal solution?
I want something robust. What if the file has a ? somewhere else?
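
One way to avoid the ? workaround, sketched under the assumption that <feff> is a UTF-8 byte-order mark (the bytes EF BB BF) sitting at the very start of the file: look at the first three bytes and skip them only when they match. The function name strip_bom is made up for illustration, and head -c is assumed to be available (it is on GNU and BSD systems).

```shell
#!/bin/sh
# strip_bom FILE (hypothetical helper): copy FILE to stdout,
# dropping a leading UTF-8 BOM (bytes EF BB BF) if one is present.
strip_bom() {
    # Dump the first three bytes as hex and squeeze out whitespace,
    # giving a string such as "efbbbf".
    first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \t\n')
    if [ "$first3" = "efbbbf" ]; then
        # BOM found: start the output at byte 4, i.e. skip three bytes.
        tail -c +4 "$1"
    else
        # No BOM: pass the file through unchanged.
        cat "$1"
    fi
}
```

It could be used as strip_bom xyz.csv > xyz_nobom.csv. Only the first three bytes are ever inspected, so a ? or any other character later in the file is left alone, and the first line is kept instead of being deleted.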
 
