UNIX for Dummies Questions & Answers: Issue with UTF-8 BOM character in text file
Post 302657857 by jawsnnn on Monday 18th of June 2012 11:52:43 AM
Unfortunately, downloading and installing tools is not an option. (A controlled environment at work means I would have to go through at least half a dozen people to get something as basic as PuTTY installed on my system.)

Question regarding your first point: wouldn't transferring the file in ASCII mode incorrectly transmit the UTF-8 (Japanese/Spanish) characters? Also, are you suggesting skipping the "copy data from Excel - paste to Notepad - save in UTF-8 format" step? That might again not be possible in my current situation, unless I find a way to convert the data to a proper UTF-8 text file without a BOM using a pre-installed application.
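
For reference, if the BOM cannot be avoided on the Windows side, one option may be to strip it after the file has reached the UNIX box, using only tools that are normally already installed there. The sketch below is an assumption about what would fit this situation, not something confirmed in the thread: `strip_bom` is a made-up helper name, and it relies only on `printf` and `sed`. It removes the three bytes EF BB BF from the very start of the first line and leaves the rest of the file, including any multi-byte Japanese or Spanish characters, untouched.

```shell
#!/bin/sh
# strip_bom: hypothetical helper name (not from the thread).
# Writes the given file to stdout without a leading UTF-8 BOM (EF BB BF).
strip_bom() {
    # Build the three BOM bytes with octal escapes: 357 273 277 == EF BB BF
    bom=$(printf '\357\273\277')
    # Work byte-wise (LC_ALL=C) and touch only the start of line 1
    LC_ALL=C sed "1s/^${bom}//" "$1"
}

# Example use (file names are placeholders):
#   strip_bom input.txt > output.txt
```

Writing to a new file rather than editing in place keeps the original available for comparison, for instance with `od -c input.txt | head -1`.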
 

PPI::Token::BOM(3)       User Contributed Perl Documentation       PPI::Token::BOM(3)

NAME
       PPI::Token::BOM - Tokens representing Unicode byte order marks

INHERITANCE
         PPI::Token::BOM
         isa PPI::Token
             isa PPI::Element

DESCRIPTION
       This is a special token in that it can only occur at the beginning of
       documents. If a BOM byte mark occurs elsewhere in a file, it should be
       treated as PPI::Token::Whitespace. We recognize the byte order marks
       identified at this URL: <http://www.unicode.org/faq/utf_bom.html#BOM>

           UTF-32, big-endian     00 00 FE FF
           UTF-32, little-endian  FF FE 00 00
           UTF-16, big-endian     FE FF
           UTF-16, little-endian  FF FE
           UTF-8                  EF BB BF

       Note that as of this writing, PPI only has support for UTF-8 (namely,
       in POD and strings) and no support for UTF-16 or UTF-32. We support
       the BOMs of the latter two for completeness only.

       The BOM is considered non-significant, like white space.

METHODS
       There are no additional methods beyond those provided by the parent
       PPI::Token and PPI::Element classes.

SUPPORT
       See the support section in the main module

AUTHOR
       Chris Dolan <cdolan@cpan.org>

COPYRIGHT
       Copyright 2001 - 2011 Adam Kennedy.

       This program is free software; you can redistribute it and/or modify
       it under the same terms as Perl itself.

       The full text of the license can be found in the LICENSE file included
       with this module.

perl v5.16.2                        2011-02-25                    PPI::Token::BOM(3)
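
The byte order marks listed in the DESCRIPTION above can also be checked from the shell. The following is a rough sketch rather than anything PPI itself provides: `detect_bom` is a hypothetical function name, and it assumes `head -c` is available (common, but not guaranteed on every UNIX). It reads the first four bytes of a file and compares them against the listed signatures, testing the UTF-32 patterns first because UTF-32 little-endian begins with the same two bytes as UTF-16 little-endian.

```shell
#!/bin/sh
# detect_bom: hypothetical helper name (not part of PPI).
# Prints which byte order mark, if any, a file starts with.
detect_bom() {
    # First four bytes as lowercase hex with no separators
    hex=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$hex" in
        0000feff) echo "UTF-32, big-endian" ;;
        fffe0000) echo "UTF-32, little-endian" ;;  # before UTF-16 LE: shared FF FE prefix
        efbbbf*)  echo "UTF-8" ;;
        feff*)    echo "UTF-16, big-endian" ;;
        fffe*)    echo "UTF-16, little-endian" ;;
        *)        echo "none" ;;
    esac
}

# Example use (file name is a placeholder):
#   detect_bom somefile.txt
```

One limitation of matching on bytes alone: a UTF-16 little-endian file whose first character after the BOM is U+0000 would be reported as UTF-32 little-endian.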