I want to make a document management system (interface in Ruby).
What do professional solutions (Alfresco, Liferay Social Office, others) use for storing and versioning documents?
What else can I use?
Key points:
- storage space optimization (deltas, compression …)
- versioning
- ability to index docs (can be external)
- ability to make backups at runtime (live hot-backup)
- locking?
- scalability on large data volume
- ensure data integrity (hashing?)
- permissions
- transactional
- Workflow support (optional)
Bonus points:
- how does KnowledgeTree do it?
- how does Liferay Social Office do it (JCR?)?
- how does Alfresco do it?
Any books on this issue?
Most of the enterprise document management solutions I’ve seen (Cimage, Documentum, LiveLink) definitely don’t care about #1 (storage space optimization). Storage is relatively cheap, especially when you weigh storage against processing (store and retrieve). They mostly rely on filesystem-based storage, perhaps with name abstraction such that `ShoppingList.doc` becomes `20100909100101a.doc.rev1`, with a database tracking the given name, the stored name, revisions, and various other data (MIME type, headers and properties, etc.). By not generating deltas or compressing, you get indexing very easily from any number of existing products/algorithms. Versioning is also extremely simple with this approach.
Depending on the size and scale you’re building, you could also store versioned files within a database.
An (S)FTP or CIFS storage process would also allow your software to run on an app server with modest space, but store the files+history on a file or cloud server of some sort – although this isn’t much different from filesystem based storage.
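Since the interface is in Ruby, here is a minimal sketch of the filesystem-plus-catalog layout described above. It is an illustration under stated assumptions, not how any of the named products actually work: the `DocumentStore` class, its method names, and the stored-name format are all hypothetical, and an in-memory Hash stands in for the tracking database. The SHA-256 check is one possible way to address the data integrity point from the question.

```ruby
require "digest"
require "fileutils"
require "securerandom"
require "tmpdir"

# Hypothetical sketch: whole-file revisions on disk, metadata in a catalog.
class DocumentStore
  class IntegrityError < StandardError; end

  # One catalog row per stored revision (a real system would keep this in a database).
  Revision = Struct.new(:given_name, :stored_name, :revision, :mime_type, :sha256)

  def initialize(root)
    @root = root
    FileUtils.mkdir_p(@root)
    # Stand-in for the database: given name => list of Revision records.
    @catalog = Hash.new { |hash, key| hash[key] = [] }
  end

  # Writes the full content as a new revision under a generated name.
  def add(given_name, content, mime_type: "application/octet-stream")
    revision = @catalog[given_name].size + 1
    stamp = Time.now.utc.strftime("%Y%m%d%H%M%S")
    # Assumed naming scheme: timestamp, random suffix, original extension, revision.
    stored_name = "#{stamp}-#{SecureRandom.hex(4)}#{File.extname(given_name)}.rev#{revision}"
    File.binwrite(File.join(@root, stored_name), content)
    record = Revision.new(given_name, stored_name, revision, mime_type,
                          Digest::SHA256.hexdigest(content))
    @catalog[given_name] << record
    record
  end

  # Returns the content of the requested revision (latest by default),
  # verifying the stored checksum before handing it back.
  def fetch(given_name, revision: nil)
    records = @catalog.fetch(given_name) # raises KeyError for unknown documents
    record = revision ? records.find { |r| r.revision == revision } : records.last
    raise KeyError, "no revision #{revision} of #{given_name}" unless record

    content = File.binread(File.join(@root, record.stored_name))
    unless Digest::SHA256.hexdigest(content) == record.sha256
      raise IntegrityError, "checksum mismatch for #{record.stored_name}"
    end
    content
  end
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    store = DocumentStore.new(dir)
    store.add("ShoppingList.doc", "milk")
    store.add("ShoppingList.doc", "milk, eggs")
    puts store.fetch("ShoppingList.doc", revision: 1)
    puts store.fetch("ShoppingList.doc")
  end
end
```

Because every revision is kept as a complete, uncompressed file, an external indexer can read the storage directory directly, which is the trade-off the answer describes. Locking, permissions, and transactions are deliberately left out of this sketch.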