file(1) [plan9 man page]

FILE(1) 						      General Commands Manual							   FILE(1)

NAME

       file - determine file type

SYNOPSIS

       file [ file ... ]

DESCRIPTION

       File performs a series of tests on its argument files in an attempt to classify their contents by language or purpose.  If no arguments are
       given, the classification is performed on standard input.

       The file types it looks for include directory, device file, zero-filled file, empty file, Plan 9 executable, PAC audio file, cpio archive,
       tex dvi file, archive symbol table, archive, rc script, sh script, PostScript, troff output file for various devices, GIF, FAX,
       picfile(9.6), object code, C and Alef source, assembler source, compressed files, encrypted file, English text, Plan 9 bitmap, Plan 9 subfont,
       and Plan 9 font.

       If  a  file has no apparent format, file looks at the character set it uses to classify it according to ASCII, extended ASCII, Latin ASCII,
       or UTF holding one or more of the following blocks of the Unicode Standard: Extended Latin,  Greek,  Cyrillic,  Armenian,  Hebrew,  Arabic,
       Devanagari,  Bengali,  Gurmukhi,  Gujarati,  Oriya,  Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Tibetan, Georgian, Japanese, Chinese, or
       Korean.

       If all else fails, file decides its input is binary.
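
       The character-set fallback described above can be illustrated with a minimal Python sketch.  It is not the logic of
       /sys/src/cmd/file.c; the function name classify_text, the set of tolerated control bytes, and the order of the tests are
       assumptions made for illustration only, and the Unicode block reporting is omitted.

```python
def classify_text(data: bytes) -> str:
    """Rough character-set classification of a byte string (illustrative only)."""
    if not data:
        return "empty file"
    if all(b == 0 for b in data):
        return "zero-filled file"
    # Assumption: control bytes other than these common ones indicate binary data.
    text_controls = {0x08, 0x09, 0x0A, 0x0C, 0x0D, 0x1B}
    if any(b < 0x20 and b not in text_controls for b in data):
        return "binary"
    if all(b < 0x80 for b in data):
        return "ASCII"
    try:
        data.decode("utf-8")  # well-formed UTF-8 throughout?
        return "UTF"
    except UnicodeDecodeError:
        return "extended ASCII"


if __name__ == "__main__":
    for sample in (b"", b"hello\n", "Ελληνικά".encode("utf-8"), b"caf\xe9"):
        print(repr(sample), "->", classify_text(sample))
```

       A real classifier would try the format-specific tests first and fall back to a check of this kind only when none of them match.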

SOURCE

       /sys/src/cmd/file.c

BUGS

       It can make mistakes; for example, it may classify a file of decimal data (.01, .02, etc.) as troff(1) input.

																	   FILE(1)


TCS(1)							      General Commands Manual							    TCS(1)

NAME

       tcs - translate character sets

SYNOPSIS

       tcs [ -slcv ] [ -f ics ] [ -t ocs ] [ file ...  ]

DESCRIPTION

       Tcs interprets the named file(s) (standard input by default) as a stream of characters from the ics character set or format, converts them to
       runes, and then converts them into a stream of characters from the ocs character set or format on the standard output.  The default value
       for ics and ocs is utf, the UTF encoding described in utf(6).  The -l option lists the character sets known to tcs.  Processing continues
       in the face of conversion errors (the -s option prevents reporting of these errors).  The -c option forces the output to contain only
       correctly converted characters; otherwise, 0x80 characters will be substituted for UTF encoding errors and 0xFFFD characters will be
       substituted for unknown characters.
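
       The two-stage conversion (input bytes to runes, runes to output bytes) can be sketched in Python as follows.  This is an analogy,
       not the tcs implementation: the function name convert and the parameter only_correct are hypothetical, the character-set names are
       Python codec names rather than tcs names, and Python's substitutes (U+FFFD when decoding, "?" when encoding) differ from the ones
       tcs uses.

```python
def convert(data: bytes, ics: str = "utf-8", ocs: str = "utf-8",
            only_correct: bool = False) -> bytes:
    """Decode data from ics to code points, then encode them in ocs.

    only_correct loosely mirrors the -c option: characters that cannot be
    converted are dropped instead of being replaced by a substitute.
    """
    errors = "ignore" if only_correct else "replace"
    runes = data.decode(ics, errors=errors)   # bytes -> code points
    return runes.encode(ocs, errors=errors)   # code points -> bytes


if __name__ == "__main__":
    # Roughly what "tcs -f 8859-1" does: Latin-1 in, UTF-8 out.
    print(convert(b"caf\xe9", ics="latin-1"))
    # A malformed UTF-8 byte is replaced by U+FFFD, or dropped with only_correct.
    print(convert(b"a\xffb"))
    print(convert(b"a\xffb", only_correct=True))
```

       Going through code points in the middle means that each character set needs only one decoder and one encoder, rather than a
       converter for every pair of sets.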

       The -v option generates various diagnostic and summary information on standard error, or makes the -l output more verbose.

       Tcs recognizes an ever-changing list of character sets.  In particular, it supports a variety of Russian and Japanese encodings.  Some of
       the supported encodings are:

       utf    The Plan 9 UTF encoding, known by ISO as UTF-8

       utf1   The deprecated original UTF encoding from ISO 10646

       ascii  7-bit ASCII

       8859-1 Latin-1 (Western European)

       8859-2 Latin-2 (Czech .. Slovak)

       8859-3 Latin-3 (Dutch .. Turkish)

       8859-4 Latin-4 (Scandinavian)

       8859-5 Part 5 (Cyrillic)

       8859-6 Part 6 (Arabic)

       8859-7 Part 7 (Greek)

       8859-8 Part 8 (Hebrew)

       8859-9 Latin-5 (Finnish .. Portuguese)

       koi8   KOI-8 (GOST 19769-74)

       jis-kanji
	      ISO 2022-JP

       ujis   EUC-JX: JIS 0208

       ms-kanji
	      Microsoft, or Shift-JIS

       jis    (input only, with -f) guesses among ISO 2022-JP, EUC, or Shift-JIS

       gb     Chinese national standard (GB2312-80)

       big5   Big 5 (HKU version)

       unicode
	      Unicode Standard 1.0

       tis    Thai character set plus ASCII (TIS 620-1986)

       msdos  IBM PC: CP 437

       atari  Atari-ST character set

EXAMPLES

       tcs -f 8859-1
	      Convert 8859-1 (Latin-1) characters into UTF format.

       tcs -s -f jis
	      Convert characters encoded in one of several JIS encodings (ISO 2022-JP, EUC, or Shift-JIS) into UTF format.  Unknown Kanji will
	      be converted into 0xFFFD characters.

       tcs -lv
	      Print an up-to-date list of the supported character sets.
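
       The guessing behaviour of the jis input set can be approximated by trying each candidate encoding in turn and keeping the first
       that decodes without error.  The Python sketch below shows that idea only; how tcs actually chooses is not described here, and the
       function name guess_japanese, the candidate order, and the codec names (Python's, not tcs's) are assumptions.

```python
# Candidate codecs, tried in order. These are Python codec names,
# not tcs character-set names.
CANDIDATES = ("iso2022_jp", "euc_jp", "shift_jis")


def guess_japanese(data: bytes):
    """Return (codec_name, text) for the first candidate that decodes cleanly.

    If no candidate fits, return (None, text) where non-ASCII bytes have
    been replaced by U+FFFD.
    """
    for name in CANDIDATES:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            continue
    return None, data.decode("ascii", errors="replace")


if __name__ == "__main__":
    sample = "日本語".encode("euc_jp")
    codec, text = guess_japanese(sample)
    print(codec, text.encode("utf-8"))
```

       Trial decoding can misidentify short or ambiguous inputs, so an explicit encoding name is preferable whenever the encoding is known.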

SOURCE

       /sys/src/cmd/tcs

SEE ALSO

       ascii(1), rune(2), utf(6).

																	    TCS(1)
