CentOS 7.0 - man page for hspell (centos section 3)

hspell(3)				      Ivrix					hspell(3)

NAME
       hspell - Hebrew spellchecker (C API)

SYNOPSIS
       #include <hspell.h>

       int hspell_init(struct dict_radix **dictp, int flags);

       void hspell_uninit(struct dict_radix *dictp);

       int hspell_check_word(struct dict_radix *dict, const char *word, int *preflen);

       void hspell_trycorrect(struct dict_radix *dict, const char *word, struct corlist *cl);

       int corlist_init(struct corlist *cl);

       int corlist_free(struct corlist *cl);

       int corlist_n(struct corlist *cl);

       char *corlist_str(struct corlist *cl, int i);

       unsigned int hspell_is_canonic_gimatria(const char *word);

       typedef int hspell_word_split_callback_func(const char *word, const char *baseword,
                                                   int preflen, int prefspec);

       int hspell_enum_splits(struct dict_radix *dict, const char *word,
                              hspell_word_split_callback_func *enumf);

       void hspell_set_dictionary_path(const char *path);

       const char *hspell_get_dictionary_path(void);

DESCRIPTION
       This  manual  describes	the  C	API  of  the  Hspell Hebrew spellchecker. Please refer to
       hspell(1) for a description of the Hspell project,  its	spelling  standard,  and  how  it
       works.

       The hspell_init() function must be called first to initialize the Hspell library. It sets
       up some global structures (see the CAVEATS section) and then reads the necessary
       dictionary files (whose locations are fixed when the library is built). The 'dictp'
       parameter is a pointer to a struct dict_radix* object, which is modified to point to a
       newly allocated dictionary. A typical hspell_init() call therefore looks like

	  struct dict_radix *dict;
	  hspell_init(&dict, flags);

       Note  that  the	(struct  dict_radix*) type is an opaque pointer - the library user has no
       access to the separate fields in this structure.

       The 'flags' parameter is a bitwise OR of flags that modify Hspell's default behavior.
       Turning on HSPELL_OPT_HE_SHEELA allows Hspell to recognize the interrogative He prefix
       (he ha-she'ela). HSPELL_OPT_DEFAULT is a synonym for turning on no special flag, i.e.,
       it evaluates to 0.

       hspell_init()  returns  0  on  success, or negative numbers on errors. Currently, the only
       error is -1, meaning the dictionary files could not be read.

       The hspell_uninit() function undoes the effects of hspell_init(), freeing any memory  that
       was allocated during initialization.

       The hspell_check_word() function checks whether a certain word is a correct Hebrew word
       (possibly with prefix particles attached in a syntactically correct manner). It returns 1
       if the word is correct, or 0 if it is incorrect.

       The 'word' parameter should be a single Hebrew word, in the iso8859-8 encoding, possibly
       containing the ASCII quote or double-quote characters (signifying the geresh and gershayim
       used in Hebrew for abbreviations, acronyms, and a few foreign sounds). If the calling
       program works with other encodings, it must convert the word to iso8859-8 first. In
       particular, cp1255 (the MS-Windows Hebrew encoding) extensions to iso8859-8, such as
       niqqud characters, geresh or gershayim, are currently not recognized and must be removed
       from the word prior to calling hspell_check_word().
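
       As an illustration of the conversion step described above, here is a minimal sketch
       for a caller that holds its text in UTF-8. The function name utf8_hebrew_to_iso8859_8()
       and the helper is_niqqud() are hypothetical and not part of Hspell. The sketch drops
       niqqud and, as an assumption of this example, maps the Unicode geresh and gershayim to
       the ASCII quote and double-quote rather than removing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Hspell): Hebrew points that have no
 * iso8859-8 equivalent and are simply dropped by the converter below. */
static int is_niqqud(unsigned int cp)
{
    return (cp >= 0x05B0 && cp <= 0x05BD) || cp == 0x05BF ||
           cp == 0x05C1 || cp == 0x05C2 || cp == 0x05C7;
}

/* Hypothetical helper (not part of Hspell): convert a NUL-terminated UTF-8
 * word to iso8859-8. ASCII passes through, Hebrew letters U+05D0..U+05EA
 * map to 0xE0..0xFA, niqqud is dropped, and (an assumption of this sketch)
 * U+05F3 geresh and U+05F4 gershayim become ASCII ' and ".
 * Returns 0 on success, or -1 on an unsupported character, malformed
 * input, or an output buffer that is too small. */
int utf8_hebrew_to_iso8859_8(const char *in, char *out, size_t outsz)
{
    const unsigned char *p;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return -1;

    p = (const unsigned char *)in;
    while (*p) {
        unsigned int cp;
        unsigned char c;

        if (*p < 0x80) {
            cp = *p++;
        } else if ((*p & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            /* Two-byte sequence: enough to cover the whole Hebrew block. */
            cp = ((unsigned int)(*p & 0x1F) << 6) | (unsigned int)(p[1] & 0x3F);
            p += 2;
            if (cp < 0x80)
                return -1;          /* overlong encoding */
        } else {
            return -1;              /* longer or malformed sequence */
        }

        if (cp < 0x80)
            c = (unsigned char)cp;
        else if (cp >= 0x05D0 && cp <= 0x05EA)
            c = (unsigned char)(0xE0 + (cp - 0x05D0));
        else if (is_niqqud(cp))
            continue;               /* not representable: drop it */
        else if (cp == 0x05F3)
            c = '\'';
        else if (cp == 0x05F4)
            c = '"';
        else
            return -1;

        if (n + 1 >= outsz)
            return -1;              /* keep room for the terminator */
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return 0;
}
```

       A caller would run each word through such a converter and pass the resulting buffer to
       hspell_check_word(). A production program would more likely use iconv(3) and strip the
       unsupported characters separately.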

       Into  the 'preflen' parameter, the function writes back the number of characters it recog-
       nized as a prefix particle - the rest of the 'word' is a stand-alone word.  Because Hebrew
       words  typically  can be read in several different ways, this feature (of getting just one
       prefix from one possible reading) is usually not very useful,  and  it  is  likely  to  be
       removed in a future version.

       The hspell_enum_splits() function provides a way to get all possible splits of the
       given 'word' into an optional prefix particle and a stand-alone word. For each possible
       (and legal, as some words cannot accept certain prefixes) split, a user-defined callback
       function is called. This callback function is given the whole word, the stand-alone
       word, the length of the prefix, and a bitfield which describes what types of words this
       prefix can attach to. Note that in some cases, a word beginning with the letter waw has
       this waw doubled before a prefix, so sometimes strlen(word)!=strlen(baseword)+preflen.
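
       The callback mechanism above can be sketched as follows. This is only an
       illustration: record_split(), struct split, g_splits, g_nsplits and MAX_SPLITS are
       hypothetical names, and returning 0 from the callback is an assumption, since this
       manual does not document how the return value is used. Because the callback receives
       no user-data argument, the sketch collects its results in file-scope variables.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SPLITS 16               /* arbitrary limit chosen for this sketch */

/* One recorded split of a word into prefix + stand-alone word. */
struct split {
    int  preflen;                   /* length of the prefix particle */
    int  prefspec;                  /* bitfield passed by the library */
    int  length_mismatch;           /* strlen(word) != strlen(baseword)+preflen */
    char baseword[64];              /* copy of the stand-alone word */
};

static struct split g_splits[MAX_SPLITS];
static int g_nsplits;

/* Has the same signature as hspell_word_split_callback_func, so it could
 * be passed as the 'enumf' argument of hspell_enum_splits(). */
int record_split(const char *word, const char *baseword, int preflen, int prefspec)
{
    assert(word != NULL && baseword != NULL);
    if (g_nsplits < MAX_SPLITS) {
        struct split *s = &g_splits[g_nsplits++];
        s->preflen = preflen;
        s->prefspec = prefspec;
        /* Detects the doubled-waw case mentioned in the text above. */
        s->length_mismatch =
            strlen(word) != strlen(baseword) + (size_t)preflen;
        /* snprintf truncates safely if the base word is too long. */
        snprintf(s->baseword, sizeof s->baseword, "%s", baseword);
    }
    return 0;   /* assumption: the meaning of the return value is undocumented */
}
```

       A caller would reset g_nsplits to 0, call hspell_enum_splits(dict, word, record_split),
       and then read the entries in g_splits. The file-scope state makes this particular
       callback unsuitable for concurrent use from several threads.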

       The hspell_trycorrect() function tries to find a list of possible corrections for an
       incorrect word. Because in Hebrew the word density is high (a random string of letters,
       especially if short, has a high probability of being a correct word), this function
       tries corrections based on the assumption of a spelling error (replacement of letters
       that sound alike, missing or spurious immot qri'a), not a typo (slipped finger on the
       keyboard, etc.) - see also CAVEATS.

       hspell_trycorrect() stores the correction list in a structure of type struct corlist.
       This structure must first be initialized with a call to corlist_init() and subsequently
       freed with corlist_free(). The corlist_n() macro returns the number of words held in an
       initialized corlist, and corlist_str() returns the i'th word. Accordingly, here is an
       example usage of hspell_trycorrect():

	  struct corlist cl;
	  int i;
	  printf ("Found misspelled word %s. Possible corrections:\n", w);
	  corlist_init (&cl);
	  hspell_trycorrect (dict, w, &cl);
	  for (i=0; i<corlist_n(&cl); i++) {
	      printf ("%s\n", corlist_str(&cl, i));
	  }
	  corlist_free (&cl);

       The hspell_is_canonic_gimatria() function checks whether the given word is a canonic
       gimatria - i.e., the proper way to write in gimatria the number it represents. The caller
       might want to accept canonic gimatria as proper Hebrew words, even if hspell_check_word()
       previously reported such a word to be a non-existent word. hspell_is_canonic_gimatria()
       returns the number represented as gimatria in 'word' if it is indeed proper gimatria (in
       canonic form), or 0 otherwise.

       hspell_init() normally reads the dictionary files from a path compiled into the library.
       This makes sense when the library's code and the dictionaries are distributed together,
       but in some scenarios the library user might want to use Hspell dictionaries that are
       already present on the system in an arbitrary path. The hspell_set_dictionary_path()
       function can be used to set this path, and should be called before hspell_init(). The
       given path is that of the word list; the other input files use that path with a suffix
       appended. hspell_get_dictionary_path() can be used to find the current path. On many
       installations, this defaults to "/usr/local/share/hspell/hebrew.wgz".

LINKING
       On most systems, the Hspell library is compiled to use the Zlib library	for  reading  the
       compressed dictionaries. Therefore, a program linking with the Hspell library must also be
       linked with the Zlib library (usually, by adding "-lz" to the compilation line).

       Programs that use autoconf to search for the Hspell library should remember to tell
       AC_CHECK_LIB to also link with the -lz library when checking for -lhspell.


CAVEATS
       While the API described here has been stable for years, it may change in the future. Users
       are encouraged to compare the  values  of  the  integer	macros	HSPELL_VERSION_MAJOR  and
       HSPELL_VERSION_MINOR  to  those	expected  by  the  writer  of the program. A third macro,
       HSPELL_VERSION_EXTRA contains a string which can describe subrelease modifications  (e.g.,
       beta versions).
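
       One way to express the comparison suggested above is sketched below. The helper name
       version_acceptable() and its policy (same major version, minor version at least the one
       the program was written for) are assumptions of this example, not something Hspell
       prescribes.

```c
#include <assert.h>

/* Hypothetical helper (not part of Hspell). Returns 1 when the library
 * version (have_*) is acceptable to a program written against want_*:
 * the major versions must match and the minor version must not be older.
 * This policy is an assumption of the sketch. */
int version_acceptable(int have_major, int have_minor,
                       int want_major, int want_minor)
{
    /* Version numbers are expected to be non-negative integers. */
    assert(have_major >= 0 && have_minor >= 0);
    assert(want_major >= 0 && want_minor >= 0);
    return have_major == want_major && have_minor >= want_minor;
}
```

       A program written against version 1.2 would, after including <hspell.h>, call
       version_acceptable(HSPELL_VERSION_MAJOR, HSPELL_VERSION_MINOR, 1, 2) and refuse to
       continue if the result is 0.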

       The current Hspell C API is very low-level, in the sense that it leaves the user to
       implement many features that users often expect a spell-checker to provide. For
       example, it doesn't provide any facilities for a user-defined personal dictionary. It
       also has separate functions for checking valid Hebrew words and valid gimatria, and no
       function to do both. It is assumed that the caller - a bigger spell-checking library or
       word processor, for example - will already have these facilities. If not, you may wish
       to look at the sources of hspell(1) for an example implementation.

       Currently  there  is  no concept of separate Hspell "contexts" in an application.  Some of
       the context is now global for the entire application: currently, a single  list	of  legal
       prefix-particles is kept, and the dictionary read by hspell_init() is always read from the
       global default place. This may be solved in a later version, e.g., by switching to an  API
       like:

	  context = hspell_new_context();
	  hspell_set_dictionary_path(context, "/some/path/hebrew.wgz");
	  hspell_init(context, flags);
	  ...
	  hspell_check_word(context, word, preflenp);

       Note  that  despite the global context mentioned above, after initialization all functions
       described here are thread-safe, because they only read the dictionary data, not	write  to
       it.

       hspell_trycorrect() is not as powerful as it could be: typos and certain kinds of
       spelling mistakes do not produce useful correction suggestions. Along with more types of
       corrections, hspell_trycorrect() needs a better way to order the corrections by
       likelihood, as an unordered list of 100 corrections would be just as useful (or rather,
       useless) as none.

       In some cases of errors during hspell_init(), warning messages are printed to the standard
       error stream. This is a bad thing for a library to do.

       There are too many CAVEATS in this manual.

VERSION
       The version of hspell described by this manual page is 1.2.

COPYRIGHT
       Copyright  (C)  2000-2012,  Nadav  Har'El  <nyh@math.technion.ac.il>  and  Dan  Kenigsberg
       <danken@cs.technion.ac.il>.

       Hspell  is free software, released under the GNU Affero General Public License (AGPL) ver-
       sion 3.	Note that not only the programs in the	distribution,  but  also  the  dictionary
       files  and the generated word lists, are licensed under the AGPL.  There is no warranty of
       any kind.

       See the LICENSE file for more information and the exact license terms.

       The latest version of this software can be found at http://hspell.ivrix.org.il/

SEE ALSO
       hspell(1)

Hspell 1.2				 28 February 2012				hspell(3)