wnintro(5) [centos man page]

WNINTRO(5)						      WordNet™ File Formats							WNINTRO(5)

NAME
wnintro - introduction to descriptions of WordNet file formats

SYNOPSIS
cntlist - format of cntlist and cntlist.rev files
lexnames - list of lexicographer file names and numbers
prologdb - description of Prolog database files
senseidx - format of sense index file
sensemap - mapping from senses in WordNet 2.1 to corresponding 3.0 senses
wndb - format of WordNet database files
wninput - format of WordNet lexicographer files

DESCRIPTION
This section of the WordNet Reference Manual contains manual pages that describe the formats of the various files included in different WordNet 3.0 packages.

NOMENCLATURE
All files are in ASCII. Fields are generally separated by one space, unless otherwise noted, and each line is terminated with a newline character.

In the file format descriptions, terms in italics refer to field names. Characters or strings in boldface represent an actual character or string as it appears in the file. Items enclosed in italicized square brackets ([ ]) may not be present.

Since several files contain fields that have the identical meaning, field names are consistently defined. For example, several WordNet files contain one or more synset_offset fields. In each case, the definition of synset_offset is identical.

SEE ALSO
wnintro(1), wnintro(3), cntlist(5), lexnames(5), prologdb(5), senseidx(5), sensemap(5), wndb(5), wninput(5), wnintro(7), wngloss(7).

Fellbaum, C. (1998), ed. "WordNet: An Electronic Lexical Database". MIT Press, Cambridge, MA.

WordNet 3.0                                    Dec 2006                                    WNINTRO(5)

CNTLIST(5WN)						      WordNet™ File Formats						      CNTLIST(5WN)

NAME
cntlist - file listing number of times each tagged sense occurs in a semantic concordance, sorted most to least frequently tagged
cntlist.rev - file listing number of times each tagged sense occurs in a semantic concordance, sorted by sense key

DESCRIPTION
A cntlist file for a semantic concordance lists the number of times each semantically tagged sense occurs in the concordance and its sense number in the WordNet database. Each line in the file corresponds to a sense in the WordNet database to which at least one semantic tag points. Only senses that are tagged in a concordance are in the concordance's cntlist file.

WordNet Database cntlist File
In the WordNet database, words are assigned sense numbers based on frequency of use in semantically tagged corpora. The cntlist file used by grind(1WN) to build the WordNet database and assign the sense numbers is a union of the cntlist files from the various semantic concordances that were formerly released by Princeton University. This combined cntlist file is provided with the WordNet package and is found in the WNSEARCHDIR directory.

The cntlist.rev file is used at run-time by the WordNet library code and browser interfaces to print in the output display the number of times each sense has been tagged.

File Format
Each line in a cntlist file contains information for one sense. The file is ordered from most to least frequently tagged sense. The fields are separated by one space, and each line is terminated with a newline character. Senses having the same tag_cnt value are listed in reverse alphabetical order of the lemma field of the sense_key.

Each line in cntlist is of the form:

    tag_cnt sense_key sense_number

where tag_cnt is the decimal number of times the sense is tagged in the corresponding semantic concordance. sense_key is a WordNet sense encoding and sense_number is a WordNet sense number, as described in senseidx(5WN).

The cntlist.rev file contains the same fields described above, in the following order:

    sense_key sense_number tag_cnt

NOTES
Princeton no longer maintains or releases the Semantic Concordance files. The cntlist file used to order the senses in WordNet 3.0 was generated from the Semantic Concordance files at the point that they were last updated in 2001. The order of senses presented generally reflects what the user would expect; however, sense ordering is now less reliable than in prior releases and should not be construed as an accurate indicator of frequency of use.

ENVIRONMENT VARIABLES (UNIX)
WNHOME
    Base directory for WordNet. Default is /usr/local/WordNet-3.0.
WNSEARCHDIR
    Directory in which the WordNet database has been installed. Default is WNHOME/dict.

REGISTRY (WINDOWS)
HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome
    Base directory for WordNet. Default is C:\Program Files\WordNet\3.0.
HKEY_CURRENT_USER\SOFTWARE\WordNet\3.0\wnres
    User's default browser options.

FILES
cntlist, cntlist.rev
    File of combined semantic concordance cntlist files. Used to assign sense numbers in the WordNet database.

SEE ALSO
grind(1WN), wnintro(5WN), senseidx(5WN).

WordNet 3.0                                    Dec 2006                                    CNTLIST(5WN)
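
As an illustration of the line layouts given under File Format, the following Python sketch parses one line from either file. It relies only on what this page states: three fields separated by one space, in the order tag_cnt sense_key sense_number for cntlist and sense_key sense_number tag_cnt for cntlist.rev. The names CntEntry, parse_cntlist_line and parse_cntlist_rev_line are hypothetical helpers, not part of the WordNet library, and the sample lines are made up rather than taken from a WordNet distribution.

```python
from typing import List, NamedTuple


class CntEntry(NamedTuple):
    """One record from a cntlist or cntlist.rev file (hypothetical helper type)."""
    tag_cnt: int        # decimal number of times the sense is tagged
    sense_key: str      # WordNet sense encoding, kept as an opaque string
    sense_number: int   # WordNet sense number


def _fields(line: str) -> List[str]:
    # Drop the terminating newline, then split on the single-space separator.
    fields = line.rstrip("\n").split(" ")
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    return fields


def parse_cntlist_line(line: str) -> CntEntry:
    """Parse a cntlist line: tag_cnt sense_key sense_number."""
    tag_cnt, sense_key, sense_number = _fields(line)
    return CntEntry(int(tag_cnt), sense_key, int(sense_number))


def parse_cntlist_rev_line(line: str) -> CntEntry:
    """Parse a cntlist.rev line: sense_key sense_number tag_cnt."""
    sense_key, sense_number, tag_cnt = _fields(line)
    return CntEntry(int(tag_cnt), sense_key, int(sense_number))


if __name__ == "__main__":
    # Made-up sample lines for demonstration only.
    print(parse_cntlist_line("12 example%1:00:00:: 1\n"))
    print(parse_cntlist_rev_line("example%1:00:00:: 1 12\n"))
```

Both parsers return the same record type, so code that consumes the entries does not need to know which of the two files a line came from.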
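
The UNIX defaults listed under ENVIRONMENT VARIABLES can be sketched in Python as follows. This is one possible way for a script to locate the combined cntlist file, not the WordNet library's own lookup code. The function names wn_search_dir and cntlist_path are hypothetical, treating an empty variable as unset is an assumption, and the Windows registry keys are not handled.

```python
import os
import posixpath
from typing import Mapping, Optional

# Default base directory stated on this page for UNIX installations.
DEFAULT_WNHOME = "/usr/local/WordNet-3.0"


def wn_search_dir(env: Optional[Mapping[str, str]] = None) -> str:
    """Return WNSEARCHDIR if set, otherwise WNHOME/dict (hypothetical helper)."""
    if env is None:
        env = os.environ
    search_dir = env.get("WNSEARCHDIR")
    if search_dir:  # assumption: an empty value counts as unset
        return search_dir
    wnhome = env.get("WNHOME") or DEFAULT_WNHOME
    return posixpath.join(wnhome, "dict")


def cntlist_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the expected path of the combined cntlist file (hypothetical helper)."""
    return posixpath.join(wn_search_dir(env), "cntlist")


if __name__ == "__main__":
    print(cntlist_path())
```

Passing the environment as a mapping keeps the lookup easy to exercise without modifying the real process environment.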