Can anyone show a simple implementation or usage example of a tf-idf algorithm in

Question

Asked: June 14, 2026

Can anyone show a simple implementation or usage example of a tf-idf algorithm in Smalltalk for natural language processing?
I’ve found an implementation in a package called NaturalSmalltalk, but it seems too complicated for my needs. A simple implementation in Python is like this one.

I’ve noticed there is another tf-idf in Hapax, but it seems to be aimed at analyzing the vocabularies of software systems, and I couldn’t find examples of how to use it.


1 Answer

Editorial Team · Answer 1 · 2026-06-14T03:06:25+00:00

I am the author of the original Hapax package for VisualWorks. Hapax is a general-purpose information retrieval package, so it should be able to work with any kind of text file. It just so happens that I used it to analyze source code files.

The class that you are looking for is TermDocumentMatrix. There should be two methods, globalWeighting: and localWeighting:, to which you pass an instance of InverseDocumentFrequency and an instance of either LogTermFrequency or TermFrequency, depending on your needs. Typically, when people refer to tfidf they mean the variant with logarithmic term frequencies.
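To make the local/global weighting split concrete, here is a minimal sketch in Python (the language of the simple implementation the question refers to). It is not Hapax code: the function names, the whitespace tokenizer, and the exact formulas (log(1 + count) for the local weight, log(N / df) for the global weight) are assumptions chosen for illustration, and Hapax's LogTermFrequency and InverseDocumentFrequency may use different variants.

```python
import math
from collections import Counter


def log_term_frequency(count):
    """Local weighting (assumed variant): dampen the raw count with log(1 + count)."""
    return math.log(1 + count)


def inverse_document_frequency(num_docs, doc_freq):
    """Global weighting (assumed variant): log(N / df), zero for terms in every document."""
    return math.log(num_docs / doc_freq)


def tfidf_matrix(documents):
    """Return one {term: weight} dict per document, weight = local * global."""
    # Naive tokenizer for illustration: lowercase and split on whitespace.
    counts = [Counter(doc.lower().split()) for doc in documents]

    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter()
    for term_counts in counts:
        doc_freq.update(term_counts.keys())

    num_docs = len(documents)
    return [
        {
            term: log_term_frequency(count)
            * inverse_document_frequency(num_docs, doc_freq[term])
            for term, count in term_counts.items()
        }
        for term_counts in counts
    ]


corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
weights = tfidf_matrix(corpus)
```

In this small corpus, "cat" occurs in only one document while "the" occurs in two, so "cat" ends up with the higher weight in the first document even though "the" appears there twice. Swapping log_term_frequency for the raw count would correspond to choosing plain term frequency as the local weighting.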

There should be tests demonstrating the TDM class using a small example corpus. If the tests have not been ported to Squeak, please let me know so I can provide you with an example.
