Using gperf for Keyword Lookup in Lexers

I recently found out about gperf, a GNU tool described as “a perfect hash function generator”. So what do we use it for? Any time you need to look up a string in a fixed dataset. It generates a “perfect” hash function that promises to do that lookup using at most one string comparison. A good use for it is in lexers (or scanners, or tokenizers, if you prefer), where we need to distinguish keywords from identifiers. That means scanning an identifier and, before calling it an identifier, comparing it against the keyword list to see whether it is actually a reserved/builtin keyword of the language....
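To make the problem concrete, here is a minimal sketch of the baseline that a generated perfect hash is meant to replace: after scanning a lexeme, the lexer walks a keyword table and compares against each entry in turn. The token kinds, the `keywords` table, and the `classify` function are all hypothetical names for a toy language, not something gperf produces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for a toy language (illustrative only). */
typedef enum { TOK_IDENT, TOK_IF, TOK_ELSE, TOK_WHILE, TOK_RETURN } TokenKind;

typedef struct {
    const char *name;
    TokenKind kind;
} Keyword;

/* The fixed dataset: every reserved word the toy language knows about. */
static const Keyword keywords[] = {
    { "if",     TOK_IF },
    { "else",   TOK_ELSE },
    { "while",  TOK_WHILE },
    { "return", TOK_RETURN },
};

/* Classify an already-scanned lexeme of `len` bytes (need not be
 * NUL-terminated). Worst case: one comparison per keyword in the table. */
TokenKind classify(const char *lexeme, size_t len)
{
    size_t count = sizeof keywords / sizeof keywords[0];
    for (size_t i = 0; i < count; i++) {
        const char *kw = keywords[i].name;
        if (strlen(kw) == len && memcmp(kw, lexeme, len) == 0)
            return keywords[i].kind;
    }
    return TOK_IDENT; /* no keyword matched, so it is a plain identifier */
}
```

With this sketch, `classify("while", 5)` yields `TOK_WHILE` and `classify("whilst", 6)` yields `TOK_IDENT`. The cost grows with the number of keywords, which is exactly the loop a perfect hash function would swap for a single hash computation followed by at most one string comparison.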

July 9, 2022 · 6 min