I was asked in an interview how I would design the Oxford English Dictionary.
I told him that I’d use a TREE data structure, but he replied that it would take a lot of memory. So which other data structure should be used?
One data structure that I have heard was used in mobile phones for storing T9 dictionaries is the following (it addresses only how to store the keys, not how to store the definitions):
Entries are sorted, and each entry starts with an offset into the previous entry, marking the point from which it continues, followed by the continuation itself. For example, taking the offset to be the number of leading characters kept from the previous entry, the sequence 0apple, 4icable, 7tion would decode to apple, applicable, application. However, this might not be that different from tries with merged chains.
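The scheme described above can be sketched in a few lines of Python. This is only an illustration, not the format any particular phone used: the function names encode and decode are made up here, and the offset is assumed to be the length of the prefix shared with the previous word.

```python
import os


def encode(sorted_words):
    """Turn a sorted word list into (shared_prefix_length, suffix) pairs."""
    entries = []
    prev = ""
    for word in sorted_words:
        # Assumption: the offset is the number of leading characters
        # that this word shares with the previous one.
        shared = len(os.path.commonprefix([prev, word]))
        entries.append((shared, word[shared:]))
        prev = word
    return entries


def decode(entries):
    """Rebuild the word list by reusing the prefix of the previous word."""
    words = []
    prev = ""
    for shared, suffix in entries:
        word = prev[:shared] + suffix
        words.append(word)
        prev = word
    return words


words = ["apple", "applicable", "application"]
entries = encode(words)
print(entries)          # [(0, 'apple'), (4, 'icable'), (7, 'tion')]
print(decode(entries))  # ['apple', 'applicable', 'application']
```

The saving comes from the sort order: neighbouring words tend to share long prefixes, so most entries store only a short tail. The cost is that looking up a word requires decoding sequentially from some known starting point.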
Searching Wikipedia turned up the directed acyclic word graph (DAWG), which differs from a tree in that its branches can not only split but also merge again where words share the same suffix. This could indeed be a more compact way to store the keys.
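To make the suffix merging concrete, here is a rough Python sketch that builds an ordinary trie and then folds identical subtrees together, which gives a DAWG-like graph. It is a simple approach under stated assumptions, not the incremental construction a production dictionary would likely use, and the names Node, build_trie, minimize, count_nodes and contains are invented for this example.

```python
class Node:
    def __init__(self):
        self.children = {}    # character -> Node
        self.is_word = False  # True if a word ends at this node


def build_trie(words):
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.is_word = True
    return root


def minimize(node, registry):
    """Merge identical subtrees bottom-up and return the canonical node."""
    for ch, child in list(node.children.items()):
        node.children[ch] = minimize(child, registry)
    # Two nodes are equivalent if they agree on is_word and point to the
    # same canonical children under the same characters.
    key = (node.is_word,
           tuple(sorted((ch, id(c)) for ch, c in node.children.items())))
    return registry.setdefault(key, node)


def count_nodes(root):
    """Count distinct nodes reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.children.values())
    return len(seen)


def contains(root, word):
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_word


trie = build_trie(["tap", "taps", "top", "tops"])
print(count_nodes(trie))       # 8 nodes in the plain trie
dawg = minimize(trie, {})
print(count_nodes(dawg))       # 5 nodes after merging shared suffixes
print(contains(dawg, "tops"))  # True
```

Lookups work exactly as in a trie, so the memory saving costs nothing at query time. What is lost is a unique path per word, which matters once definitions have to be attached to the keys.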