Firstly, let me explain what I’m building:
- I have a D3.js Force Layout graph which is rooted at the center, and has a bunch of nodes spread around it. The center node is an Entity of some sort, the nodes around it are other Entities which are somehow related to the root. The edges are the actual relations (i.e. how the two are related).
- The outer nodes can be clicked to center the target Entity and load its relations.
- This graph is “Egocentric” in the sense that every time a node is clicked, it becomes the center, and only relations directly involved with itself are displayed.
My Setup, in case any of it matters:
- I’m serving an API through Node.js, which translates requests into queries to a CouchDB server with huge data sets.
- D3.js is used for layout, and aside from jQuery and Bootstrap, I’m not using any other client-side libraries. If any would help with this caching task, I’m open to suggestions 🙂
My Ideas:
- I could easily grab a few levels of the graph each time (recursively listing and expanding children a few levels deep), but since clicking on any given node loads completely unrelated data, there is no guarantee that much of the prefetched data would overlap with what was loaded for the root. This seems like a complete waste, and actually a step in the opposite direction: I’d end up doing more processing this way!
- I can easily maintain a hash table of Entities that have already been retrieved, and check the table before requesting data for that entity from the server. I’ll probably end up doing this regardless of the cache strategy I implement, since it’s a really simple way of reducing queries.
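A minimal sketch of that hash-table lookup, under some assumptions: `fetchRelations(id)` is a hypothetical function that asks the API for one Entity's relations and returns a Promise, and storing the Promise itself (rather than the resolved data) is a design choice made here so that two quick clicks on the same node share a single request.

```javascript
// Hypothetical cache keyed by entity id.
// `fetchRelations` is an assumed function: (id) => Promise of relations.
function createEntityCache(fetchRelations) {
  const cache = new Map(); // entity id -> Promise of that entity's relations

  return function getRelations(id) {
    if (!cache.has(id)) {
      // Cache the promise so concurrent lookups reuse one in-flight request.
      const pending = fetchRelations(id).catch((err) => {
        cache.delete(id); // do not keep failures; a later call may retry
        throw err;
      });
      cache.set(id, pending);
    }
    return cache.get(id);
  };
}
```

Usage would look something like `const getRelations = createEntityCache(myAjaxLoader);`, after which every lookup goes through `getRelations(id)` and only hits the server on a miss.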
Now, how do you suggest I cache this data?
Is there any super-effective strategy you can think of for doing this kind of caching? Both server-and-client-side options are greatly welcomed. A ton of data is involved in this process, and any reduction of querying/processing puts me miles ahead of the game.
Thanks!
On the client side I would have nodes whose children are either an array of children or a function that serves as a promise of those children. When you click on a given node, if you already have its data, display it immediately. Otherwise, send off an AJAX request to fill it in.
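One way this node shape might look, as a rough sketch. It uses a native Promise in place of "a function that serves as a promise", and `loadChildren(id)` and `render(node, children)` are hypothetical placeholders for the AJAX call and the D3 redraw.

```javascript
// Hypothetical node shape: `children` is null (never requested),
// a Promise (request in flight), or an array (already loaded).
function makeNode(id) {
  return { id, children: null };
}

// `loadChildren` is an assumed function: (id) => Promise of child ids.
function ensureChildren(node, loadChildren) {
  if (Array.isArray(node.children)) {
    return Promise.resolve(node.children); // data is already here
  }
  if (node.children === null) {
    node.children = loadChildren(node.id).then(
      (ids) => {
        node.children = ids.map(makeNode); // swap the promise for real data
        return node.children;
      },
      (err) => {
        node.children = null; // reset so a later click can retry
        throw err;
      }
    );
  }
  return node.children; // the pending promise
}

// Click handler sketch: show cached data at once, otherwise wait for it.
function onNodeClick(node, loadChildren, render) {
  if (Array.isArray(node.children)) {
    render(node, node.children);
    return Promise.resolve();
  }
  return ensureChildren(node, loadChildren).then((children) =>
    render(node, children)
  );
}
```

Keeping the in-flight Promise on the node means a second click while the request is pending does not trigger a second request.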
Whenever you display a node that is not the center, queue up asynchronous AJAX requests for the children of the displayed nodes and start requesting them. That way, when the user clicks, there is a chance that you already have the data cached. And if not, well, you tried, and it cost them nothing.
Once you have it working, decide how many levels deep it makes sense to go. My guess is that the magic number is likely to be 1. Beyond that the return in responsiveness falls off rapidly, while the server load rises rapidly. But having clicks come back ASAP is a pretty big UI win.
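The prefetching idea with a configurable depth could be sketched roughly like this. It assumes a hypothetical `getRelations(id)` that returns a Promise of related entity ids and is already backed by a cache, so repeated ids do not cause repeated requests; `depth` is an illustrative parameter, with 1 meaning "only the nodes currently on screen".

```javascript
// `getRelations` is an assumed cached loader: (id) => Promise of related ids.
// `displayedIds` are the outer (non-centered) nodes currently shown.
function prefetch(displayedIds, getRelations, depth = 1) {
  if (depth < 1) return Promise.resolve();
  const requests = displayedIds.map((id) =>
    getRelations(id)
      // Look one level further only if depth allows it.
      .then((childIds) => prefetch(childIds, getRelations, depth - 1))
      // Prefetching is best effort, so a failed request is ignored.
      .catch(() => {})
  );
  return Promise.all(requests).then(() => undefined);
}
```

Because the number of requests grows with every extra level, starting with `depth = 1` and measuring server load before raising it seems like the safer default.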