I’m attempting to develop a simple online editor that allows for real-time collaboration (written in Java). In this editor, I want clients to be able to edit the source code at arbitrary points (e.g. add the letter ‘d’ to the source code file at row 11, column 20). I’m not sure how to design these source code file objects in an efficient way, while still allowing for letter-by-letter client-server synchronization (similar to how Google Docs works).
I considered using a RandomAccessFile, but after reading this post, I don’t think that would be an efficient approach. Inserting a letter near the beginning of the file would involve changing everything after it.
My current plan is to represent both the source files on the server and client using a StringBuilder object and its insert/delete/append methods. On the server-side, this StringBuilder would be converted to an actual file as necessary.
I’m curious as to whether there might be a better approach for solving this problem. Any ideas?
You will want something like a rope as the underlying data structure. A balanced rope supports inserts, deletes, appends and concatenation in O(log n) time, so you don’t need to worry about edits in the middle of a large document.
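To make the idea concrete, here is a minimal sketch of an immutable rope. The names (`Rope`, `Leaf`, `Concat`, `split`) are my own and are not taken from any particular library. It also never rebalances the tree, so the O(log n) bound only holds while the tree happens to stay shallow. A real implementation would rebalance and cap leaf sizes.

```java
/** Minimal immutable rope: a binary tree whose leaves hold string fragments. */
abstract class Rope {
    abstract int length();
    abstract char charAt(int index);
    abstract void appendTo(StringBuilder out);
    /** Splits this rope into [0, index) and [index, length). */
    abstract Rope[] split(int index);

    static Rope of(String text) {
        return new Leaf(text);
    }

    /** Joins two ropes without copying their text; empty sides are dropped. */
    Rope concat(Rope other) {
        if (length() == 0) return other;
        if (other.length() == 0) return this;
        return new Concat(this, other);
    }

    /** Returns a new rope with text inserted at the given offset. */
    Rope insert(int index, String text) {
        Rope[] parts = split(index);
        return parts[0].concat(of(text)).concat(parts[1]);
    }

    /** Returns a new rope with the range [start, end) removed. */
    Rope delete(int start, int end) {
        Rope[] head = split(start);
        Rope[] tail = head[1].split(end - start);
        return head[0].concat(tail[1]);
    }

    @Override
    public String toString() {
        StringBuilder out = new StringBuilder(length());
        appendTo(out);
        return out.toString();
    }

    private static final class Leaf extends Rope {
        private final String text;

        Leaf(String text) { this.text = text; }

        int length() { return text.length(); }
        char charAt(int index) { return text.charAt(index); }
        void appendTo(StringBuilder out) { out.append(text); }

        Rope[] split(int index) {
            // substring throws if index is outside [0, length]
            return new Rope[] { new Leaf(text.substring(0, index)), new Leaf(text.substring(index)) };
        }
    }

    private static final class Concat extends Rope {
        private final Rope left;
        private final Rope right;
        private final int length; // cached so length() is O(1)

        Concat(Rope left, Rope right) {
            this.left = left;
            this.right = right;
            this.length = left.length() + right.length();
        }

        int length() { return length; }

        char charAt(int index) {
            return index < left.length() ? left.charAt(index) : right.charAt(index - left.length());
        }

        void appendTo(StringBuilder out) {
            left.appendTo(out);
            right.appendTo(out);
        }

        Rope[] split(int index) {
            if (index < left.length()) {
                Rope[] parts = left.split(index);
                return new Rope[] { parts[0], parts[1].concat(right) };
            }
            if (index == left.length()) {
                return new Rope[] { left, right };
            }
            Rope[] parts = right.split(index - left.length());
            return new Rope[] { left.concat(parts[0]), parts[1] };
        }
    }
}
```

For example, `Rope.of("hello world").insert(5, ",").toString()` gives `"hello, world"`. An insert only touches the nodes on the path to the split point, and because the rope is immutable, old versions remain valid and can be kept as cheap snapshots.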
Two open source libraries to consider:
On top of this you will need to build logic for merging and publishing concurrent changes. This is actually the tricky part: you’ll need to decide how to resolve conflicts when two clients edit at the same time, and how to transmit “deltas” to the clients.
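One common way to handle this is operational transformation: each delta is a small operation, and an operation is shifted to account for concurrent operations that were applied before it. The sketch below is only an illustration of that idea for insert-only deltas. The `InsertOp` class, its fields and the client-id tie-break are my own assumptions, and a real editor also needs deletes, versioning and a server-side ordering of operations.

```java
/** A hypothetical insert-only delta: "insert text at offset", tagged with its author. */
final class InsertOp {
    final int offset;
    final String text;
    final int clientId; // used only to break ties between inserts at the same offset

    InsertOp(int offset, String text, int clientId) {
        this.offset = offset;
        this.text = text;
        this.clientId = clientId;
    }

    /** Applies this delta to a document held in a StringBuilder. */
    void applyTo(StringBuilder doc) {
        doc.insert(offset, text);
    }

    /**
     * Returns this operation adjusted so that it can be applied after a
     * concurrent operation `other` has already been applied to the document.
     */
    InsertOp transformAgainst(InsertOp other) {
        boolean otherGoesFirst = other.offset < offset
                || (other.offset == offset && other.clientId < clientId);
        if (otherGoesFirst) {
            // other's text now sits before our insertion point, so shift right
            return new InsertOp(offset + other.text.length(), text, clientId);
        }
        return this;
    }
}
```

If client 1 sends `new InsertOp(1, "X", 1)` and client 2 concurrently sends `new InsertOp(2, "YY", 2)` against the document `abc`, each side applies its own operation first and then the other one transformed against it. Both end up with `aXbYYc`.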
I would treat persistence / copying to permanent storage as a separate problem: best to get everything working well with in-memory data structures first. Then at periodic points you can flush the data out to persistent storage. I’d suggest something like Git, or if you are particularly adventurous you could try something like Datomic (which is essentially a database that works like Git, and keeps a history of all updates).
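As a rough sketch of the periodic flush, assuming plain files rather than Git or Datomic: the hypothetical `PeriodicSnapshot` class below writes the current in-memory text to a temporary file and then moves it over the target on a fixed schedule. The class name, the `Supplier<String>` hook and the `.tmp` suffix are my own choices. The supplier is called from the scheduler thread, so it must return a consistent copy of the document (a plain `StringBuilder` is not thread-safe on its own).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Periodically writes an in-memory document to a file (illustrative only). */
final class PeriodicSnapshot implements AutoCloseable {
    private final Supplier<String> document; // must be safe to call from another thread
    private final Path target;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicSnapshot(Supplier<String> document, Path target) {
        this.document = document;
        this.target = target;
    }

    /** Starts flushing every periodSeconds seconds. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(
                this::flushQuietly, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Writes to a sibling temp file first so a crash never leaves a half-written target. */
    void flush() throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(temp, document.get().getBytes(StandardCharsets.UTF_8));
        Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    private void flushQuietly() {
        try {
            flush();
        } catch (IOException e) {
            // Keep the schedule alive; the next run will try again.
            System.err.println("Snapshot failed: " + e.getMessage());
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```

Keeping this behind a small class means the editing and merging code never touches the disk, and the file-based flush can later be swapped for a commit to whatever store you choose.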