Finding files with UTF-8 BOM


 
05-20-2005

Hi, there:

I am relatively new to Unix, so I am not even sure whether what I am asking is an easy or a difficult task.

I want to run a grep-like command that generates a list of files encoded as UTF-8. I would especially like to know whether each file is UTF-8 or UTF-8N (in other words, whether or not the file starts with a BOM).

I am a PHP developer, and we quickly noticed that PHP does not handle UTF-8 files that have a BOM. So I need to find which files contain a BOM.

Any idea?
kotoponus
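
One possible way to approach the question, sketched in Python rather than grep: a UTF-8 BOM is the three bytes EF BB BF at the very start of a file, so it is enough to read the first three bytes of each file and compare. The function names `has_utf8_bom` and `find_bom_files` and the default `.php` suffix are assumptions made for this illustration, not anything from the thread. Note that this only detects the BOM; a file without one is not proven to be UTF-8.

```python
import os

# The three bytes that make up a UTF-8 byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def has_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 BOM."""
    with open(path, "rb") as handle:
        return handle.read(3) == UTF8_BOM


def find_bom_files(root, suffix=".php"):
    """Walk `root` and return the sorted paths of files whose names end
    in `suffix` and whose contents start with a UTF-8 BOM.

    `suffix` is a hypothetical filter; pass "" to check every file.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            try:
                if has_utf8_bom(path):
                    matches.append(path)
            except OSError:
                # Unreadable files are skipped instead of aborting the scan.
                continue
    return sorted(matches)
```

Calling `find_bom_files(".")` from the top of a source tree would return the PHP files that start with a BOM; reading only three bytes per file keeps the scan cheap even on large trees.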
 
PPI::Token::BOM(3)					User Contributed Perl Documentation					PPI::Token::BOM(3)

NAME
       PPI::Token::BOM - Tokens representing Unicode byte order marks

INHERITANCE
       PPI::Token::BOM
       isa PPI::Token
           isa PPI::Element

DESCRIPTION
       This is a special token in that it can only occur at the beginning of
       documents. If a BOM byte mark occurs elsewhere in a file, it should be
       treated as PPI::Token::Whitespace. We recognize the byte order marks
       identified at this URL: <http://www.unicode.org/faq/utf_bom.html#BOM>

           UTF-32, big-endian     00 00 FE FF
           UTF-32, little-endian  FF FE 00 00
           UTF-16, big-endian     FE FF
           UTF-16, little-endian  FF FE
           UTF-8                  EF BB BF

       Note that as of this writing, PPI only has support for UTF-8 (namely,
       in POD and strings) and no support for UTF-16 or UTF-32. We support
       the BOMs of the latter two for completeness only.

       The BOM is considered non-significant, like white space.

METHODS
       There are no additional methods beyond those provided by the parent
       PPI::Token and PPI::Element classes.

SUPPORT
       See the support section in the main module

AUTHOR
       Chris Dolan <cdolan@cpan.org>

COPYRIGHT
       Copyright 2001 - 2011 Adam Kennedy.

       This program is free software; you can redistribute it and/or modify
       it under the same terms as Perl itself.

       The full text of the license can be found in the LICENSE file included
       with this module.

perl v5.18.2                      2011-02-25                      PPI::Token::BOM(3)
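
The byte order marks listed in the DESCRIPTION can be matched against the first bytes of a file. The following is a minimal Python sketch of that lookup, not PPI's own implementation; the names `BOMS` and `identify_bom` are hypothetical. One detail matters: the UTF-32 little-endian mark (FF FE 00 00) starts with the same two bytes as the UTF-16 little-endian mark (FF FE), so the four-byte signatures are tested first. The sketch assumes that FF FE 00 00 means UTF-32, although the same bytes could in principle be a UTF-16 little-endian BOM followed by a NUL character.

```python
# Signatures are ordered longest first so that the UTF-32 little-endian
# mark is not mistaken for the shorter UTF-16 little-endian mark.
BOMS = [
    (b"\x00\x00\xfe\xff", "UTF-32, big-endian"),
    (b"\xff\xfe\x00\x00", "UTF-32, little-endian"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16, big-endian"),
    (b"\xff\xfe", "UTF-16, little-endian"),
]


def identify_bom(data):
    """Return the name of the BOM that `data` (bytes) starts with,
    or None if it starts with none of the listed marks."""
    for signature, name in BOMS:
        if data.startswith(signature):
            return name
    return None
```

To classify a file, pass its first four bytes, for example `identify_bom(open(path, "rb").read(4))`.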