I’m working on a project where we have an array of atoms which acts as a hash. Whenever a user connects to the server, a certain value is hashed, and that hash is used as an index to look up the element in the array and return it. “Outside forces” (which are handled by a long-running gen_server) are able to change this array, so I can’t simply hardcode it. My problem is how to “host” this array.
My first implementation was a simple gen_server which kept a copy of the array around and sent it to whoever asked for it. The process asking for it could then traverse it and get the index it wanted. This implementation used an inordinate amount of memory, which I attributed to there being so many copies of the same array floating around.
My current implementation has a central gen_server which handles the state of this array, and children which handle the actual requests. When the state changes, the central gen_server updates the children. When a process wants to find its hash result, it sends its index number to the central gen_server, which forwards the request to one of the children. The child traverses its “local” list and sends the resulting atom back to the original process.
The problem with the current implementation is that it gets bogged down at high traffic. I’ve tried using more and more children, but I’m pretty sure the central gen_server is the bottleneck.
Does anyone have any ideas on a better solution to my problem?
EDIT: %s/array/list/g
I suggest that you use ETS tables. I think that the array method is not efficient enough. With an ETS table, created as public within the application backend, any process can look up an item as soon as it needs it. ETS tables in the newer versions of Erlang have the capability for concurrent access.
With this kind of arrangement, you will avoid a single point of failure arising from a single gen_server holding the data. This data is needed by many processes and hence should not be held by a single process. That’s where a table comes in: it is accessible by any process at any time, as soon as that process needs to make a lookup.
The values in the array should be converted to records of the form element and then inserted into the ETS table.
Advantages of this approach:
1. We can create as many ETS tables as we need.
2. An ETS table can handle many more elements than a data structure such as a list or an array, with comparatively much lower memory consumption.
3. ETS tables can be concurrently accessed by any process within reach, hence you will not need a central process or server to handle the data.
4. A single process or gen_server holding this data means that if it is compromised (goes down due to a full mailbox), the data will be unavailable, hence the processes which need the array will have to wait for this one server to restart, or I don’t know…
5. Accessing the array data by sending request messages, plus making copies of the same array for each process that needs it, is not “Erlangic” design.
6. Finally, ETS table ownership can be transferred from process to process. When the owning process is crashing (only gen_servers can detect that they are dying [take note of this]), it can transfer the ETS table to another process to take over. Check here: ETS Give Away

That’s my thinking.
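To make the suggestion above a bit more concrete, here is a minimal sketch of what a public, named ETS table for this lookup might look like. The module name hash_ring, the table name, and the {Index, Atom} tuple layout are all my own assumptions for illustration; nothing here comes from the question's actual code. The long-running gen_server would be the one calling start/0 and load/1, while every connection process calls lookup/1 directly without sending a message to anybody.

```erlang
-module(hash_ring).
-export([start/0, load/1, lookup/1, hand_over/1]).

%% Hypothetical names: the module, the table name and the
%% {Index, Atom} tuple layout are assumptions for this sketch.
-define(TABLE, hash_ring_table).

%% Create the table once, from a long-lived owner process
%% (e.g. in the init/1 of the gen_server that tracks changes).
%% public           - any process may read and write
%% named_table      - the table is addressed by its name
%% read_concurrency - optimise for many concurrent readers
start() ->
    ?TABLE = ets:new(?TABLE, [set, public, named_table,
                              {read_concurrency, true}]),
    ok.

%% Store a list of atoms under the indices 0..N-1.
%% ets:insert/2 with a list inserts all objects atomically.
%% Stale entries past the new length are removed afterwards, so
%% readers may briefly still see them while a shorter list loads.
load(Atoms) when is_list(Atoms) ->
    N = length(Atoms),
    Indexed = lists:zip(lists:seq(0, N - 1), Atoms),
    true = ets:insert(?TABLE, Indexed),
    _ = ets:select_delete(?TABLE,
                          [{{'$1', '_'}, [{'>=', '$1', N}], [true]}]),
    ok.

%% Called directly by any process; no message passing involved.
lookup(Index) when is_integer(Index), Index >= 0 ->
    case ets:lookup(?TABLE, Index) of
        [{Index, Atom}] -> {ok, Atom};
        []              -> error
    end.

%% Must be called by the current owner. The new owner receives
%% {'ETS-TRANSFER', Table, FromPid, GiftData} in its mailbox.
hand_over(NewOwner) when is_pid(NewOwner) ->
    true = ets:give_away(?TABLE, NewOwner, handed_over),
    ok.
```

With this in place, hash_ring:lookup(Index) returns {ok, Atom} or error and never touches the gen_server's mailbox, so the central process stops being a bottleneck for reads. As far as I know, ets:new/2 also accepts an {heir, Pid, HeirData} option, which hands the table to another process automatically when the owner terminates; that may be a safer choice than relying on the owner to call ets:give_away/3 while it is going down.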