I programmed a simple WCF service that stores messages sent by users and sends these messages to the intended user when asked for. For now, the persistence is implemented by creating username.xml files with the following structure:
<messages recipient="username">
<message sender="otheruser">
...
</message
</messages>
It is possible for more than one user to send a message to the same recipient at the same time, possibly causing the xml file to be updated concurrently. The WCF service is currently implemented with basicHttp binding, without any provisions for concurrent access.
What concurrency risks are there? How should I deal with them? A ReadWrite lock on the xml file being accessed?
Currently the service runs with 5 users at the most, this may grow up to 50, but no more.
EDIT:
As stated above the client will instantiate a new service class with every call it makes. (InstanceContext is PerCall, ConcurrencyMode irrelevant) This is inherent to the use of basicHttpBinding with default settings on the service.
The code below:
public class SomeWCFService:ISomeServiceContract
{
ClassThatTriesToHoldSomeInfo useless;
public SomeWCFService()
{
useless=new ClassThatTriesToHoldSomeInfo();
}
#region Implementation of ISomeServiceContract
public void IncrementUseless()
{
useless.Counter++;
}
#endregion
}
behaves is if it were written:
public class SomeWCFService:ISomeServiceContract
{
ClassThatTriesToHoldSomeInfo useless;
public SomeWCFService()
{}
#region Implementation of ISomeServiceContract
public void IncrementUseless()
{
useless=new ClassThatTriesToHoldSomeInfo();
useless.Counter++;
}
#endregion
}
So concurrency is never an issue until you try to access some externally stored data as in a database or in a file.
The downside is that you cannot store any data between method calls of the service unless you store it externally.
If your WCF service is a singleton service and guaranteed to be that way, then you don’t need to do anything. Since WCF will allow only one request at a time to be processed, concurrent access to the username files is not an issue unless the operation that serves that request spawns multiple threads that access the same file. However, as you can imagine, a singleton service is not very scalable and not something you want in your case I assume.
If your WCF service is not a singleton, then concurrent access to the same user file is a very realistic scenario and you must definitely address it. Multiple instances of your service may concurrently attempt to access the same file to update it and you will get a ‘can not access file because it is being used by another process‘ exception or something like that. So this means that you need to synchronize access to user files. You can use a monitor (lock), ReaderWriterLockSlim, etc. However, you want this lock to operate on per file basis. You don’t want to lock the updates on other files out when an update on a different file is going on. So you will need to maintain a lock object per file and lock on that object e.g.
Note that that dictionary is also a shared resource that will require synchronized access.
Now, dealing with concurrency and ensuring synchronized access to shared resources at the appropriate granularity is very hard not only in terms of coming up with the right solution but also in terms of debugging and maintaining that solution. Debugging a multi-threaded application is not fun and hard to reproduce problems. Sometimes you don’t have an option but sometimes you do. So, Is there any particular reason why you’re not using or considering a database based solution? Database will handle concurrency for you. You don’t need to do anything. If you are worried about the cost of purchasing a database, there are very good proven open source databases out there such as MySQL and PostgreSQL that won’t cost you anything.
Another problem with the xml file based approach is that updating them will be costly. You will be loading the xml from a user file in memory, create a message element, and save it back to file. As that xml grows, that process will take longer, require more memory, etc. It will also hurt your scalibility because the update process will hold onto that lock longer. Plus, I/O is expensive. There are also benefits that come with a database based solution: transactions, backups, being able to easily query your data, replication, mirroring, etc.
I don’t know your requirements and constraints but I do think that file-based solution will be problematic going forward.