file(1) [plan9 man page]

FILE(1) 						      General Commands Manual							   FILE(1)

NAME
file - determine file type

SYNOPSIS
file [ file ... ]

DESCRIPTION
File performs a series of tests on its argument files in an attempt to classify their contents by language or purpose. If no arguments are given, the classification is performed on standard input.

The file types it looks for include directory, device file, zero-filled file, empty file, Plan 9 executable, PAC audio file, cpio archive, tex dvi file, archive symbol table, archive, rc script, sh script, PostScript, troff output file for various devices, GIF, FAX, picfile(9.6), object code, C and Alef source, assembler source, compressed files, encrypted file, English text, Plan 9 bitmap, Plan 9 subfont, Plan 9 font.

If a file has no apparent format, file looks at the character set it uses to classify it according to ASCII, extended ASCII, Latin ASCII, or UTF holding one or more of the following blocks of the Unicode Standard: Extended Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Tibetan, Georgian, Japanese, Chinese, or Korean.

If all else fails, file decides its input is binary.

SOURCE
/sys/src/cmd/file.c

BUGS
It can make mistakes, for example by classifying a file of decimal data as troff(1) input.

FILE(1)
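The character-set fallback described above can be illustrated with a short Python sketch. This is not the logic of /sys/src/cmd/file.c: the function name classify_charset, the NUL-byte test for binary data, and the reduced block table are all assumptions made for illustration, and only a handful of the Unicode blocks listed in the description are included.

```python
# Hypothetical sketch of a character-set fallback.
# NOT the algorithm used by /sys/src/cmd/file.c.

# A few Unicode blocks as (first code point, last code point, name).
BLOCKS = [
    (0x0370, 0x03FF, "Greek"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0590, 0x05FF, "Hebrew"),
    (0x0600, 0x06FF, "Arabic"),
    (0x0E00, 0x0E7F, "Thai"),
]


def classify_charset(data: bytes) -> str:
    """Return a rough character-set label for a byte string."""
    if not data:
        return "empty file"
    # Assumption: a NUL byte is taken as a sign of binary data.
    if b"\x00" in data:
        return "binary"
    if all(byte < 0x80 for byte in data):
        return "ASCII"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # High bytes that do not form valid UTF-8 sequences.
        return "extended ASCII"
    # Collect the names of the known blocks that appear in the text.
    found = sorted(
        {name for ch in text for lo, hi, name in BLOCKS if lo <= ord(ch) <= hi}
    )
    return "UTF " + ", ".join(found) if found else "UTF"


if __name__ == "__main__":
    samples = (b"hello\n", "\u03b1\u03b2\u03b3".encode("utf-8"), b"caf\xe9")
    for sample in samples:
        print(classify_charset(sample))
```

The order of the checks mirrors the description only loosely: simple byte-level tests come first, and the block lookup runs only on input that decodes cleanly as UTF.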

TCS(1)							      General Commands Manual							    TCS(1)

NAME
tcs - translate character sets

SYNOPSIS
tcs [ -slcv ] [ -f ics ] [ -t ocs ] [ file ... ]

DESCRIPTION
Tcs interprets the named file(s) (standard input default) as a stream of characters from the ics character set or format, converts them to runes, and then converts them into a stream of characters from the ocs character set or format on the standard output. The default value for ics and ocs is utf, the UTF encoding described in utf(6).

The -l option lists the character sets known to tcs. Processing continues in the face of conversion errors (the -s option prevents reporting of these errors). The -c option forces the output to contain only correctly converted characters; otherwise, 0x80 characters will be substituted for UTF encoding errors and 0xFFFD characters will be substituted for unknown characters.

The -v option generates various diagnostic and summary information on standard error, or makes the -l output more verbose.

Tcs recognizes an ever-changing list of character sets. In particular, it supports a variety of Russian and Japanese encodings. Some of the supported encodings are

utf        The Plan 9 UTF encoding, known by ISO as UTF-8
utf1       The deprecated original UTF encoding from ISO 10646
ascii      7-bit ASCII
8859-1     Latin-1 (Central European)
8859-2     Latin-2 (Czech .. Slovak)
8859-3     Latin-3 (Dutch .. Turkish)
8859-4     Latin-4 (Scandinavian)
8859-5     Part 5 (Cyrillic)
8859-6     Part 6 (Arabic)
8859-7     Part 7 (Greek)
8859-8     Part 8 (Hebrew)
8859-9     Latin-5 (Finnish .. Portuguese)
koi8       KOI-8 (GOST 19769-74)
jis-kanji  ISO 2022-JP
ujis       EUC-JX: JIS 0208
ms-kanji   Microsoft, or Shift-JIS
jis        (from only) guesses between ISO 2022-JP, EUC or Shift-Jis
gb         Chinese national standard (GB2312-80)
big5       Big 5 (HKU version)
unicode    Unicode Standard 1.0
tis        Thai character set plus ASCII (TIS 620-1986)
msdos      IBM PC: CP 437
atari      Atari-ST character set

EXAMPLES
tcs -f 8859-1
Convert 8859-1 (Latin-1) characters into UTF format.

tcs -s -f jis
Convert characters encoded in one of several shift JIS encodings into UTF format. Unknown Kanji will be converted into 0xFFFD characters.

tcs -lv
Print an up to date list of the supported character sets.

SOURCE
/sys/src/cmd/tcs

SEE ALSO
ascii(1), rune(2), utf(6).

TCS(1)
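For readers without access to tcs, the decode-then-encode pipeline described above (ics to runes, runes to ocs) can be approximated with Python's standard codecs. This is a rough analogue rather than a reimplementation: the function name translate, the only_correct parameter (loosely modelled on -c), and the mapping from tcs names to Python codec names are assumptions. The substitution behaviour also differs: Python substitutes U+FFFD for undecodable input and "?" for unencodable output, whereas tcs is documented as using 0x80 for UTF encoding errors.

```python
# Hypothetical analogue of the tcs pipeline using Python's codecs.
# The name mapping below is an assumption, not taken from /sys/src/cmd/tcs.
CODECS = {
    "utf": "utf-8",
    "ascii": "ascii",
    "8859-1": "iso8859_1",
    "8859-5": "iso8859_5",
    "8859-7": "iso8859_7",
    "ms-kanji": "shift_jis",
    "msdos": "cp437",
}


def translate(data: bytes, ics: str = "utf", ocs: str = "utf",
              only_correct: bool = False) -> bytes:
    """Decode data from ics, then encode it as ocs.

    With only_correct=True, characters that cannot be converted are
    dropped (loosely like -c); otherwise Python's replacement
    characters are substituted.
    """
    errors = "ignore" if only_correct else "replace"
    # Step 1: input bytes -> Unicode text (the "runes" of the description).
    text = data.decode(CODECS[ics], errors=errors)
    # Step 2: Unicode text -> output bytes in the target character set.
    return text.encode(CODECS[ocs], errors=errors)


if __name__ == "__main__":
    # Roughly what `tcs -f 8859-1` does to a Latin-1 byte string.
    print(translate(b"caf\xe9", ics="8859-1"))
```

As with tcs, both character sets default to utf, so calling translate with no options only validates the input and substitutes or drops what cannot be decoded.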