So I am working on a pet project where I’m storing various text files.

Question

Asked: May 24, 2026 (2026-05-24T01:09:44+00:00)

So I am working on a pet project where I’m storing various text files. I have set up my app to save the tags as a string in one of my collections, so an example would be:

tags: "Linux Apache WSGI"

Storing them and searching for them work just fine. My question comes when I want to do something like build a tag cloud, count all the various tags, or make a dynamic selection system based on tags: what is the best way to break them up to work with? Or should I be storing them some other way?

Logically I could scan through every record, get all the tags, split them on spaces, then cache the result somehow. Maybe that’s the right answer, but I wanted to ask for the community’s wisdom.

I’m using pymongo to interact with my database.

1 Answer

Editorial Team · Answer 1 · 2026-05-24T01:09:44+00:00

> Or should I be storing them some other way?

The standard way to store tags is to store them as an array. In your case, the DB would look something like:

tags: ['linux', 'apache', 'wsgi']
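Converting the existing space-separated strings into arrays is a small step. Here is a minimal sketch, assuming tags never contain spaces themselves and that lowercasing them is acceptable. `split_tags` is a hypothetical helper name, not part of pymongo:

```python
def split_tags(tag_string):
    """Turn a space-separated tag string into a normalized list of tags."""
    seen = set()
    tags = []
    # split() with no arguments also collapses repeated whitespace
    for raw in tag_string.split():
        tag = raw.lower()
        if tag not in seen:  # drop duplicates but keep first-seen order
            seen.add(tag)
            tags.append(tag)
    return tags


print(split_tags("Linux Apache WSGI"))  # ['linux', 'apache', 'wsgi']
```

With pymongo, the result could then be written back with something like `collection.update_one({"_id": doc_id}, {"$set": {"tags": split_tags(old_string)}})`, where `collection`, `doc_id` and `old_string` are placeholders. A query such as `collection.find({"tags": "linux"})` matches documents whose array contains that value.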

> … what is the best way to break them up to work with?

This is what Map/Reduce is designed for. It effectively “scans every record”: the map step emits each tag from each document, and the reduce step sums the counts per tag. The output of a Map/Reduce job is another collection that you can query.
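To make the two phases concrete, here is a sketch of the same idea in plain Python over in-memory documents. It is an illustration of the technique, not MongoDB's own Map/Reduce API. The function names and sample documents are made up for the example, and it assumes tags are already stored as arrays:

```python
from collections import defaultdict


def map_tags(doc):
    """Map step: emit a (tag, 1) pair for every tag in one document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per tag."""
    totals = defaultdict(int)
    for tag, value in pairs:
        totals[tag] += value
    return dict(totals)


def count_tags(docs):
    """Run the map step over every document, then reduce the results."""
    emitted = (pair for doc in docs for pair in map_tags(doc))
    return reduce_counts(emitted)


# Hypothetical sample documents, standing in for a scan of the collection.
docs = [
    {"tags": ["linux", "apache", "wsgi"]},
    {"tags": ["linux", "nginx"]},
    {"title": "untagged note"},
]
print(count_tags(docs))
```

On the server side, the same count can be expressed as an aggregation pipeline, for example `collection.aggregate([{"$unwind": "$tags"}, {"$group": {"_id": "$tags", "count": {"$sum": 1}}}])`. Recent MongoDB documentation recommends the aggregation pipeline over map-reduce, so it may be the better fit on a current server.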

However, there’s also another way to do this: keep “counters” and update them. When you save a new document, you also increment the counter for each tag on that document.
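A sketch of the counter approach, assuming a separate counters collection with one document per tag, keyed by the tag name in `_id`. That layout and the helper names are assumptions for illustration. The `counters` argument is expected to be a pymongo collection supplied by the caller:

```python
def tag_counter_updates(tags):
    """Build one (filter, update) pair per distinct tag for an upserting $inc."""
    return [({"_id": tag}, {"$inc": {"count": 1}}) for tag in sorted(set(tags))]


def bump_tag_counters(counters, tags):
    """Apply the updates to a pymongo-style collection passed in by the caller."""
    for query, update in tag_counter_updates(tags):
        # upsert=True creates the counter document the first time a tag is seen
        counters.update_one(query, update, upsert=True)


print(tag_counter_updates(["linux", "apache", "linux"]))
```

Calling `bump_tag_counters(db.tag_counts, doc["tags"])` right after inserting a document would keep the counts current, where `db.tag_counts` is a hypothetical collection name. The counters would also need a matching decrement when a document is deleted or loses a tag, otherwise they drift from the real totals.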
