I have a queuing mechanism in C on Unix. It accepts XML transactions. Some transactions contain records to be stored. Other transactions request those transactions. The transactions get stored in a file, which is a home-grown queue. First in, first out, very simple. Header area at start of file, keeps track of next position to read from, and next position to write to. We use file locking, but not semaphores as retrieval is polled from remote systems. And there’s only one program that accesses the queues. It’s in C. Been working fine for years.
Now we have to expand the system. The transactions will contain an extra XML tag. We have to selectively retrieve based on the values of that tag. We are going from a simple queue to a priority queue. There can be many different values in the tag. Say AX, BX, CX, FL and TS. Transactions get added to the queue in order received. We need to be able to retrieve them either in order they were received, or retrieve the next transaction where the tag is FL. Or TS. Or (CS or FL). Or not AX.
How best to do this?
Simple and fast are what we need. Several options come to mind:
- Use something like Berkely DB to turn the queue into a database of sorts.
- Tap into a PostgreSQL database, create a table that can be used as a priority queue.
- Find a C library that will do what we want.
- Write our own disk-based priority queue.
We have some constraints. Time is ticking away and this needs to be done in a few weeks. C for fast insertion into the system. Maybe Python if we can tapdance fast enough to convert all the other business logic in the program that accesses the queue. Prefer not to use PostgreSQL as we have no control over the database system and the DBA has nasty habits over what he considers ‘his’ and we have no reliability of uptime even though this is a critical system. Politics, huh!! DBA has also said that using a PostgreSQL table is not an efficient way of doing it. We prefer something that is localized so we can control it. Got to be lightning fast to handle a lot of transactions per minute.
I’m open to any suggestions, even far-out ones. The more suggestions the better.
It sounds like all you really need is;
I would suggest the quickest change would be to have each record contain a small header in the file that includes this info (length of record, isValid, tag-info etc). You then start at the first record as normal and iterate over all the records till you encounter one with a tag, and when you do you mark this record invalid so regular processing will ignore it.
Long-term you may want to consider something like sqllite which is freely available, works with regular files, and can be compiled into your executable. For most types of record searching it’s likely to be faster than something you can quickly roll yourself, and a heck lot more flexible 🙂
So I would go with the quick change now and consider reworking your data format in the near future, be it with sqllite or something else. It sounds like the requirements for your queue are becoming more complex so now would be the time to consider something that can be further expanded in the future.