I am working on a web site analyser that will analyse our own site based on the Tomcat logs.
We push the Tomcat log to the database (MySQL) every day, and this works well so far. However, I have found a potentially fatal problem!
Until now we have pushed the log to a single table in the database, but the number of log items will soon grow rapidly, especially as we get more users. Obviously a single table cannot hold that many log items (and querying such a large table will also be slow).
We use Hibernate as the persistence layer, and each row in the log table is mapped to a Java object of type LogEntry in the application.
I have thought about creating a new table each month, but how can I make LogEntry map to more than one table and query across tables?
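To make the per-month idea concrete, here is a minimal sketch in plain Java, without Hibernate. It only shows how a log date could be turned into a table name and how a date range could be expanded into the list of tables a cross-table query would have to touch. The class name `MonthlyLogTables` and the table prefix `log_entry_` are assumptions made up for illustration, and the sketch does not answer the mapping question itself.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives per-month table names for log entries.
final class MonthlyLogTables {
    // Assumed naming scheme, e.g. log_entry_2010_03 (not from any real schema).
    private static final String PREFIX = "log_entry_";
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyy_MM");

    // Table that would hold an entry logged on the given date.
    static String tableFor(LocalDate date) {
        return PREFIX + YearMonth.from(date).format(SUFFIX);
    }

    // All tables a query over [from, to] would need to read, oldest first.
    static List<String> tablesBetween(LocalDate from, LocalDate to) {
        List<String> tables = new ArrayList<>();
        YearMonth last = YearMonth.from(to);
        for (YearMonth m = YearMonth.from(from); !m.isAfter(last); m = m.plusMonths(1)) {
            tables.add(PREFIX + m.format(SUFFIX));
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(tableFor(LocalDate.of(2010, 3, 15)));
        System.out.println(tablesBetween(LocalDate.of(2009, 12, 20), LocalDate.of(2010, 2, 3)));
    }
}
```

A query across months would then have to be issued once per table in the returned list (or as a UNION over them), which is exactly the part Hibernate does not handle for a single mapped entity.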
Also, the number of log records may differ from month to month. As an extreme example, what if the number of records exceeds the maximum capacity of a table in the database?
Then I thought about setting a property to limit the maximum number of log records pushed to one table when Hibernate writes to the database. But in that case I have no idea how to tell Hibernate to create a new table and query across tables automatically.
Any ideas?
Update to Sandy:
I see what you mean: the maximum size of a table is determined by the OS, and if I use partitioning it can grow up to the capacity of my disk. So with partitioning I would not need to worry about the maximum size of the table, but if the table holds too many records, performance will still suffer. (BTW, we have not decided to delete the old logs yet.) Another approach I thought of is to create several tables with the same structure. But I am using Hibernate, and all log inserts and queries go through it, so can one entity (POJO) be mapped to more than one table?
Have a look at Hibernate Shards (database sharding is a method of horizontal partitioning). Although this subproject is not very active and has some limitations (refer to the documentation), it’s stable and usable (Hibernate Shards was contributed by Max Ross from Google, who is using it internally).
Monitor your database/tables and anticipate the required maintenance.
Hibernate won’t do that automatically; it will be part of the maintenance of the database and of the sharding configuration (see also the section about Virtual Shards).
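The idea behind virtual shards can be illustrated with a small plain-Java sketch. It deliberately does not use the Hibernate Shards API; the class name `VirtualShardSketch`, the count of 12 virtual shards, and the month-based routing rule are all assumptions chosen for illustration. The point is that entries are assigned to a fixed set of virtual shards, and only the virtual-to-physical map changes when you add or rebalance databases during maintenance.

```java
import java.time.YearMonth;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of virtual shards; not the Hibernate Shards API.
final class VirtualShardSketch {
    // Assumption: the number of virtual shards is fixed up front and never changes.
    static final int VIRTUAL_SHARDS = 12;

    // Maintained by hand (or by a maintenance job): virtual shard id -> physical shard id.
    private final Map<Integer, Integer> virtualToPhysical = new HashMap<>();

    // Initially spread the virtual shards round-robin over the physical shards.
    VirtualShardSketch(int physicalShards) {
        for (int v = 0; v < VIRTUAL_SHARDS; v++) {
            virtualToPhysical.put(v, v % physicalShards);
        }
    }

    // Assumed routing rule: January -> 0, ..., December -> 11.
    static int virtualShardFor(YearMonth month) {
        return month.getMonthValue() - 1;
    }

    // Physical shard currently responsible for entries of the given month.
    int physicalShardFor(YearMonth month) {
        return virtualToPhysical.get(virtualShardFor(month));
    }

    // Maintenance step: point a virtual shard at a different physical shard.
    // (Moving the existing rows is a separate, manual operation.)
    void reassign(int virtualShard, int physicalShard) {
        virtualToPhysical.put(virtualShard, physicalShard);
    }
}
```

Because the routing rule only ever produces virtual shard ids, adding a physical database later means updating the map and migrating the affected rows, not changing how entries are assigned.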