04-11-2011
Some of the early NAT language packages for C compressed their string tables by exploiting the null-terminated string: any short string that was a suffix of another string was not stored separately. So "1234" might be stored, but "234", "34", "4" and "" were just offset pointers into "1234". This did little for long strings, but it worked well for sets with many short strings.
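To make the suffix trick concrete, here is a minimal sketch of a string pool that hands out offsets and reuses the tail of an already stored string. It is not taken from any of those packages; the names pool_intern and pool_get and the fixed POOL_CAP are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_CAP 4096            /* hypothetical fixed capacity for this sketch */

static char   pool[POOL_CAP];    /* stored strings, back to back, each ending in '\0' */
static size_t pool_used = 0;

/* Return the offset of s inside the pool. If s is a suffix of a string that
 * is already stored, return an offset into that string instead of storing a
 * new copy. Returns (size_t)-1 when the pool is full. */
size_t pool_intern(const char *s)
{
    size_t len = strlen(s);
    size_t pos = 0;

    /* Walk the stored strings one at a time. */
    while (pos < pool_used) {
        size_t stored = strlen(pool + pos);
        if (stored >= len && strcmp(pool + pos + (stored - len), s) == 0)
            return pos + (stored - len);     /* shares the existing terminator */
        pos += stored + 1;
    }

    if (len + 1 > POOL_CAP - pool_used)
        return (size_t)-1;
    memcpy(pool + pool_used, s, len + 1);    /* copy including the '\0' */
    pool_used += len + 1;
    return pool_used - (len + 1);
}

/* Turn an offset back into an ordinary C string. */
const char *pool_get(size_t off)
{
    assert(off < pool_used);
    return pool + off;
}
```

Interning "1234" and then "234", "34", "4" and "" stores only five bytes in total. Note that the sketch only shares a suffix when the longer string went in first; a real package would presumably sort or batch its strings at build time to get the same effect regardless of order.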
I have been working on a high-performance container for a while, and came up with a byte-tree: the first byte of the key is an index into an array of pointers, or a similar structure, so you can quickly traverse an invariant tree one byte of key at a time. Various alternate node types deal with compression: a 'next-n-bytes-must-be' node to swallow invariant stretches of a key; a truncated array of fewer than 256 cells, with a base and a size; a dumb list lookup leveraging strchr(), with a string of key letters in arbitrary order and a like-length array of pointers; and an N-copies-of node for duplicates. The advantages: quick insert, quick access, sorted access, no rebalancing. Linear hash is cute, but if you are not sure of the data's key distribution, it is dicey to go all the way to one key per bucket, so how much linear search do you want?
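For anyone who wants to picture the plain, uncompressed case, here is a rough sketch of a byte-tree that uses only the full 256-pointer node. It is not the container described above, just the basic idea; the names bt_node, bt_new, bt_insert, bt_find and bt_walk are invented for the example, and the compressed node types are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* One node per key byte: child[b] leads to keys whose next byte is b. */
typedef struct bt_node {
    struct bt_node *child[256];
    void *value;                 /* non-NULL if a key ends at this node */
} bt_node;

/* Allocate an empty node, or return NULL if out of memory. */
bt_node *bt_new(void)
{
    bt_node *n = malloc(sizeof *n);
    if (n != NULL) {
        for (int b = 0; b < 256; b++)
            n->child[b] = NULL;
        n->value = NULL;
    }
    return n;
}

/* Insert or overwrite key[0..len-1]. Returns 0 on success, -1 on out of memory. */
int bt_insert(bt_node *root, const unsigned char *key, size_t len, void *value)
{
    assert(root != NULL && value != NULL);
    bt_node *n = root;
    for (size_t i = 0; i < len; i++) {
        if (n->child[key[i]] == NULL) {
            n->child[key[i]] = bt_new();
            if (n->child[key[i]] == NULL)
                return -1;
        }
        n = n->child[key[i]];    /* one array index per key byte, no comparisons */
    }
    n->value = value;
    return 0;
}

/* Return the stored value for the key, or NULL if the key is absent. */
void *bt_find(const bt_node *root, const unsigned char *key, size_t len)
{
    const bt_node *n = root;
    for (size_t i = 0; i < len && n != NULL; i++)
        n = n->child[key[i]];
    return n != NULL ? n->value : NULL;
}

/* Visit every value in ascending byte order of its key. */
void bt_walk(const bt_node *n, void (*visit)(void *value, void *ctx), void *ctx)
{
    if (n == NULL)
        return;
    if (n->value != NULL)
        visit(n->value, ctx);
    for (int b = 0; b < 256; b++)
        bt_walk(n->child[b], visit, ctx);
}
```

Each node here holds 257 pointers, which is why the compressed node types matter in practice: a sparse node with two or three children wastes nearly all of its array.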