I’m looking to create a table for users that tracks their objectives. The objectives themselves would number in the 100s, if not 1000s, and would be maintained in their own table; that table wouldn’t know who completed them, it would only define which objectives are available.
Objective:
 ID | Name    | Notes
----+---------+---------
    |         |
Now, in the Java environment, each user will have a java.util.BitSet for the objectives, so I can write:
/* in class User */
boolean hasCompletedObjective(int objectiveNum) {
    // BitSet.length() is only the highest set bit + 1, so it cannot serve as an upper bound;
    // bits that were never set simply read as false.
    if (objectiveNum < 0)
        throw new IllegalArgumentException("Objective " + objectiveNum + " is invalid. Use a constant from class Objective.");
    return objectives.get(objectiveNum);
}
I know internally, the BitSet uses a long[] to do its storage. What would be the best way to represent this in my Derby database? I’d prefer to keep it in columns on the AppUser table if at all possible, because they really are elements of the user.
Derby does not support arrays (to my knowledge), and while I’m not sure what the column limit is, something seems wrong with having 1000 columns, especially since I know I will not be querying the database with things like
SELECT *
FROM AppUser
WHERE AppUser.ObjectiveXYZ
What are my options, both for storing it, and marshaling it into the BitSet?
Are there viable alternatives to java.util.BitSet?
Is there a flaw in the general approach? I’m open to ideas!
Thanks!
*EDIT: If at all possible, I would like the ability to add more objectives with only a data modification, not a table modification. But again, I’m open to ideas!
[puts on fake moustache]
Store the bitset as a BLOB. Start by simply serializing it; then, if you want more space efficiency, try pushing the result through a DeflaterOutputStream on its way to the database. For better space and time efficiency, try the bitmap compression method used in FastBit, which breaks the bitset into 31-bit chunks, run-length encodes the all-zero chunks, and packs the literal and run chunks into 32-bit words along with a discriminator bit.
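As a rough sketch of the serialize-then-deflate idea, something like the following might do. Note the assumptions: BitSetCodec and its method names are made up for illustration, it uses BitSet.toByteArray() and BitSet.valueOf() rather than Java object serialization (a swap on my part, since the packed bytes are smaller and not tied to the class's serialized form), and InputStream.readAllBytes() needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.BitSet;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical helper: converts a BitSet to and from the bytes kept in a BLOB column. */
final class BitSetCodec {
    private BitSetCodec() {}

    /** Packs the bits with BitSet.toByteArray() and deflates the result. */
    static byte[] toCompressedBytes(BitSet bits) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the DeflaterOutputStream finishes the compressed stream.
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(bits.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Inflates the stored bytes and rebuilds the BitSet. */
    static BitSet fromCompressedBytes(byte[] stored) {
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(stored))) {
            return BitSet.valueOf(in.readAllBytes());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting byte[] could then be bound to the BLOB column with PreparedStatement.setBinaryStream (wrapping it in a ByteArrayInputStream) and read back through the ResultSet. Because the column just holds bytes, adding objectives only changes the data, not the table.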
If you know you’ll only look at the objective bitset while the ResultSet that brought it from the database is still open, write a new bitset class that wraps the Blob interface and implements get on top of getBytes. This avoids having to read the whole BLOB into memory to check a few specific bits, and at least avoids having to allocate a separate buffer for the bitset if you do want to look at all the values. Note that making this work with a compressed bitset will take substantial ingenuity.
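A minimal sketch of such a wrapper, under a few assumptions: BlobBitSet is a made-up name, the BLOB holds the uncompressed bytes in the layout produced by BitSet.toByteArray() (bit n lives in byte n/8 at position n%8), and the Blob is still valid when get is called.

```java
import java.sql.Blob;
import java.sql.SQLException;

/** Hypothetical read-only view over an uncompressed bitset stored in a BLOB. */
final class BlobBitSet {
    private final Blob blob;

    BlobBitSet(Blob blob) {
        this.blob = blob;
    }

    /** Fetches only the single byte that holds the requested bit. */
    boolean get(int bitIndex) throws SQLException {
        if (bitIndex < 0) {
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
        }
        // Blob positions are 1-based, hence the + 1.
        long bytePos = (bitIndex >>> 3) + 1L;
        if (bytePos > blob.length()) {
            return false; // past the stored bytes, so the bit was never set
        }
        byte[] one = blob.getBytes(bytePos, 1);
        return (one[0] & (1 << (bitIndex & 7))) != 0;
    }
}
```

In use, this would be constructed from ResultSet.getBlob on the objectives column and queried before the ResultSet is closed. Each get is a separate read against the Blob, so for checking many bits it may be cheaper to fetch a larger range in one getBytes call.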
Be aware that this approach gives you no referential integrity, no ability to query on the user-objective relationship, and little flexibility for different uses of the data in the future. It is exactly the kind of thing that Don Knuth warned you about.