UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
How to determine if a file is ASCII?
I'm writing a script that takes a filename as an argument and determines the file's type. Is there a command I can use to determine whether a file is ASCII? Thanks in advance for any help.
That test for -f will not help with an ASCII test. It determines whether a filename is a regular file as opposed to a directory, a link, etc., and will return true even for binaries.
But I would start with that test, and if it is a regular file, then the "file" command will give a clue as to its content.
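The two-step approach described above could be sketched roughly as follows. This is only an illustration: the function name classify is made up for this example, and the check for the word "text" in the output of file is an assumption, since the wording of that output varies between systems.

```shell
#!/bin/sh
# Sketch only: "classify" is an illustrative name, not from the thread.
classify() {
    # Step 1: -f is true only for regular files (binaries included).
    if [ ! -f "$1" ]; then
        echo "not a regular file"
        return 0
    fi
    # Step 2: ask file(1) for a description. It prints "name: description";
    # strip the name so that a file name containing "text" cannot match.
    desc=$(file "$1")
    desc=${desc#"$1":}
    # Assumption: descriptions of text files contain the word "text"
    # (for example "ASCII text"); the exact wording is system dependent.
    case $desc in
        *text*) echo "probably text" ;;
        *)      echo "probably not text" ;;
    esac
}

# Classify every file name given on the command line.
for f in "$@"; do
    printf '%s: %s\n' "$f" "$(classify "$f")"
done
```

Saved as a script (the name is up to you), it would be run with one or more file names as arguments. Treat the result as a hint rather than a verdict.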
The only language I have ever encountered with a built-in test for this is perl. I dislike perl and seldom use it, but I did give this feature a try. It seemed broken because it allowed many non-ASCII characters before it finally declared a file to be binary. Since I then had to code my own test, I returned to ksh. But I do prefer perl's terminology: it calls these "text files" and "binary files".
Unless you inspect every byte of the file, you are not going to get this 100% right, and inspecting every byte carries a big performance hit. After some experiments, I settled on an algorithm that works for me: I examine the first line and declare the file to be binary if I encounter even one non-text byte. It seems a little slack, I know, but I seem to get away with it. Here is a little script that demonstrates this. Note that I have used (TAB) to indicate a place where you must actually type the tab character. Code:
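#! /usr/bin/ksh
# Left-justify file names in a 30-character column.
typeset -L30 fmtfile
for file in * ; do
    # Read only the first line. A failed read (for example a directory
    # or an unreadable file) is reported as "unreadable".
    if read -r line < "$file" ; then
        # Binary if the line holds any byte other than tab or space through tilde.
        if [[ "$line" = *[!\(TAB)\ -\~]* ]] ; then
            type=binary
        else
            type=text
        fi
    else
        type=unreadable
    fi 2> /dev/null
    fmtfile=$file
    echo "$fmtfile is a $type file"
done
exit 0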
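For comparison, here is a rough sketch of the trade-off mentioned above, written with tr and wc so that no literal tab has to be typed. It is not the script from this post: the function names are invented for illustration, and the set of "text" bytes (tab, newline, space through tilde) is an assumption modelled on the pattern in the ksh script. One function inspects every byte; the other applies the same test to the first line only.

```shell
#!/bin/sh
# Sketch only: all function names here are illustrative.

# Count the bytes on stdin that are NOT tab, newline, or printable ASCII.
# Octal escapes: \011 = tab, \012 = newline, \040-\176 = space through tilde.
count_nontext() {
    LC_ALL=C tr -d '\011\012\040-\176' | wc -c | tr -d ' '
}

# Every-byte check: slower on large files, but nothing is missed.
whole_file_is_text() {
    if [ ! -f "$1" ] || [ ! -r "$1" ]; then
        return 2
    fi
    [ "$(count_nontext < "$1")" -eq 0 ]
}

# First-line check: the same test, applied to the first line only.
first_line_is_text() {
    if [ ! -f "$1" ] || [ ! -r "$1" ]; then
        return 2
    fi
    [ "$(head -n 1 "$1" | count_nontext)" -eq 0 ]
}

# Report on every file name given on the command line.
for f in "$@"; do
    if whole_file_is_text "$f"; then
        echo "$f: text"
    else
        echo "$f: binary or unreadable"
    fi
done
```

A file whose first line is clean but which has a stray control byte further down passes first_line_is_text and fails whole_file_is_text, which is exactly the slack described above. Unlike the ksh script, this sketch treats an empty file as text.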