It is obvious that the hardest part (by far, far, very far) of the work was done a while ago, when the amazing people at the University of Cologne in Germany digitized and published the Cologne Digital Sanskrit Dictionaries. They even made all 36 or more dictionaries available online for anybody to use (and in different formats too).
The only problem with the web interface they provided is that the UI is a little... too academic for my taste. It also searches only one dictionary at a time, and only for exact matches.
It was clear that improving the UI would be a great help to researchers and students, and there were several improvements I wanted:
- Let users type in IAST (Latin transliteration) or Devanagari, since the program can tell the difference by itself
- Make it possible to search in any number of dictionaries at once
- Add fuzzy search (so that it is possible to find approximate matches and similar words)
- Add full-text search across the articles (not just the words), effectively creating the largest bi-directional dictionary out there
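The first item on the list is easier than it sounds, because Devanagari has its own Unicode block (U+0900 to U+097F). Here is a minimal sketch of how the input script could be detected; the function name `detect_script` is made up for illustration and is not necessarily what the site uses.

```python
def detect_script(query: str) -> str:
    """Guess the script of a search query.

    Returns "devanagari" if any character falls inside the Devanagari
    Unicode block (U+0900..U+097F); otherwise assumes IAST / Latin.
    Hypothetical helper, not the site's actual code.
    """
    for ch in query:
        if "\u0900" <= ch <= "\u097F":
            return "devanagari"
    return "iast"


if __name__ == "__main__":
    for q in ("धर्म", "dharma", "kṛṣṇa"):
        print(q, detect_script(q))
```

IAST diacritics such as ṛ, ṣ and ṇ live in the Latin blocks, so they fall through to the default branch.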
So, long story short, I finally made it happen and launched sanskrit.myke.blog with all of the above. I hope you'll enjoy using it!
Some technical stuff for the nerds
When I created the earlier GUI version, I wrote a migrator that moved all the articles into a SQLite database, which shipped with the dictionary. It seemed to make sense in terms of both performance and convenience of search. The resulting program is not slow, but it's not super fast either. I didn't like that.
On this iteration I thought a bit more, and a very obvious idea came up: you only need a database when you're going to be writing to it. That's not what a dictionary does. What it does is read, so if I wanted to make it quick, all I needed was a good data structure that sits in RAM waiting for reads.
As a result, not only is the web version much quicker (minus the network transfer time), but it also ended up being just a fraction of the original code base. I also know there is more performance to be gained by adding indexes, which I can do next time I have inspiration.
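To give an idea of what "a data structure that sits in RAM" can look like, here is a minimal sketch. It is an assumption about the general shape, not the actual code behind the site: the class name, the method names and the dictionary labels ("mw", "ap") are all hypothetical. Exact lookups are plain hash-map reads, and fuzzy matching uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib
from collections import defaultdict


class InMemoryDictionary:
    """Read-only store: headword -> list of (dictionary_id, article)."""

    def __init__(self):
        self._entries = defaultdict(list)

    def add(self, dictionary_id, headword, article):
        # Called once at startup while loading the data files.
        self._entries[headword].append((dictionary_id, article))

    def lookup(self, headword, dictionaries=None):
        # Exact match; optionally restricted to a set of dictionary ids.
        hits = self._entries.get(headword, [])
        if dictionaries is None:
            return list(hits)
        return [hit for hit in hits if hit[0] in dictionaries]

    def fuzzy(self, headword, limit=5, cutoff=0.7):
        # Approximate match over all known headwords.
        return difflib.get_close_matches(
            headword, list(self._entries), n=limit, cutoff=cutoff
        )


if __name__ == "__main__":
    store = InMemoryDictionary()
    store.add("mw", "dharma", "law, duty, virtue")
    store.add("ap", "dharma", "religion, duty")
    store.add("mw", "karma", "act, deed")
    print(store.lookup("dharma"))
    print(store.lookup("dharma", {"ap"}))
    print(store.fuzzy("dharm"))
```

Scanning every headword for fuzzy matches is linear in the number of headwords, so this is only meant to show the idea.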
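I haven't settled on which indexes exactly, but one common candidate for full-text search is an inverted index: a map from each word in the articles to the headwords whose articles contain it. The sketch below is only an illustration under that assumption; the function names and the sample data are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase word tokens; \w is Unicode-aware in Python 3.
    return re.findall(r"\w+", text.lower())


def build_inverted_index(articles):
    """articles: dict of headword -> article text.

    Returns a dict of token -> set of headwords containing that token.
    """
    index = defaultdict(set)
    for headword, text in articles.items():
        for token in tokenize(text):
            index[token].add(headword)
    return index


def search(index, query):
    """Return headwords whose articles contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    articles = {
        "dharma": "law, duty, virtue",
        "karma": "act, deed, duty",
    }
    index = build_inverted_index(articles)
    print(search(index, "duty"))
    print(search(index, "law duty"))
```

Built once at startup, an index like this turns a full scan of every article into a handful of set lookups.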