Hello,
Over the past few years, I've conducted some rather thorough R&D in the field of lexicon-data-structure optimization.
A Trie is a good place to start, followed by a traditional DAWG.
Smaller means faster, but a traditional DAWG encoding operates as a Boolean graph: it can tell you whether a word is present, but it cannot index the keywords within.
It came to my attention that the world's most powerful lexicon data structure would incorporate postfix compression while eliminating the need to scroll through lists in alphabetical order. Further, the graph would operate as an incremental hash function that is both perfect and complete: every word maps to its own unique index, and every index maps back to a word.
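
To make the "perfect and complete hash" idea concrete, here is a minimal sketch in Python. It is not the CWG encoding: it is a plain trie with no postfix compression, it uses the classic trick of storing a word count in every node, and it still walks child lists in alphabetical order, which is exactly the step a better structure would eliminate. All names (`Node`, `build`, `word_to_index`, `index_to_word`) are made up for illustration.

```python
class Node:
    def __init__(self):
        self.children = {}    # letter -> Node
        self.is_word = False  # True if the path to this node spells a word
        self.count = 0        # words ending at or below this node


def build(words):
    """Build a plain trie, then fill in the per-node word counts."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.is_word = True
    _fill_counts(root)
    return root


def _fill_counts(node):
    node.count = int(node.is_word) + sum(
        _fill_counts(child) for child in node.children.values()
    )
    return node.count


def contains(root, word):
    """Boolean lookup only: all a traditional DAWG can answer."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_word


def word_to_index(root, word):
    """Return the 1-based alphabetical rank of word, or 0 if absent."""
    index = 0
    node = root
    for ch in word:
        if ch not in node.children:
            return 0
        if node.is_word:
            index += 1  # the prefix spelled so far is an earlier word
        for sibling in sorted(node.children):
            if sibling < ch:
                index += node.children[sibling].count  # skip earlier branches
        node = node.children[ch]
    return index + 1 if node.is_word else 0


def index_to_word(root, index):
    """Inverse mapping: return the word with the given rank, or None."""
    if not 1 <= index <= root.count:
        return None
    node, letters = root, []
    while True:
        if node.is_word:
            if index == 1:
                return "".join(letters)
            index -= 1
        for ch in sorted(node.children):
            child = node.children[ch]
            if index <= child.count:
                letters.append(ch)
                node = child
                break
            index -= child.count


lexicon = build(["CAR", "CARE", "CARS", "CAT", "DO", "DOG"])
print(word_to_index(lexicon, "CARS"))  # 3
print(index_to_word(lexicon, 6))       # DOG
```

With N words in the lexicon, each word lands on a distinct index from 1 to N and each index leads back to a word, so no slot is wasted and no two words collide. The count stored in a node depends only on what lies below it, so the same bookkeeping carries over to a graph where identical word endings are shared.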
After a lot of thought and many sessions of careful reckoning, I put together exactly that. I call it the Caroline Word Graph, or CWG, and I have published the documentation on a web page (I updated the DAWG page as well):
CWG
DAWG
Please inform me if you have encountered a similar construct.
All the very best,
JohnPaul Adamovsky