I have a class which acts as a simple crawler and I want to invoke this class within a servlet.
My idea is to get a URL from the user; the request is passed to the servlet, the servlet passes the URL to the class, and the class starts crawling. I want my servlet to create only one instance of this class. The data retrieved by the crawler will be added to the DB directly by the class.
I want to control the behavior of the class (running/halting/stopping) from the servlet.
(For this, I think I could create a simple XML file shared between the servlet and the class: if the servlet changes the status code, the class should respond to the change.)
But I have some doubts about how to control the class, i.e. how to command it to run/halt/stop. Since my class is not multithreaded, I don’t know what will happen to it after it is invoked from the servlet, and since it needs to read from the network, it will obviously block for some time while running.
How can I solve the problem of concurrency in this situation? Or, in other words, will I have any concurrency issues at all?
regards.
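The details depend on the Servlet container you are using, but containers normally handle each user request on its own thread (usually taken from a pool), and a single servlet instance serves all of those threads. This is almost always the desired behavior, so you should definitely design for concurrency.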
You can make the Servlet class implement SingleThreadModel; then you can call the crawler class code directly in the service method, as only one thread at a time will enter service on a given servlet instance. Note that SingleThreadModel has been deprecated since Servlet 2.4. This also implies that only one URL can be processed at a given time, which is probably not what you want. So instead of that, don’t implement SingleThreadModel, and create a single shared executor service in the init method.
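The init-time executor described above could look roughly like the sketch below. The names are hypothetical: CrawlerServletSketch stands in for the real servlet so the example compiles without the Servlet API, and its init() mirrors the servlet's init method. The choice of a single-thread executor is an assumption; the answer does not say which kind of executor to use.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the servlet class (not the real Servlet API).
class CrawlerServletSketch {
    // One executor shared by every request-handling thread.
    private ExecutorService executor;

    // In a real servlet this would be the overridden init() method,
    // which the container calls once before any request is served.
    void init() {
        // Assumption: a single worker thread, so URLs are crawled
        // one at a time in submission order.
        executor = Executors.newSingleThreadExecutor();
    }

    // Accessor used by the request-handling code to reach the shared executor.
    ExecutorService executor() {
        return executor;
    }
}
```

If several URLs should be crawled in parallel, Executors.newFixedThreadPool(n) would be the natural substitute, provided the crawler class itself is safe to run on several threads.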
Then, in the service method, create a new CrawlingTask (a Runnable) with the URL specified in the request, and submit the task to the executor. Because the servlet holds the executor, it can also shut the crawling down when needed.
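As a rough sketch of submitting and shutting down, under some assumptions: CrawlingTask here only records the URL (a real one would call the crawler class and write to the DB), and CrawlerControl is a hypothetical helper whose submit and stop methods correspond to what the servlet's service and destroy methods might do.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical task: a real one would invoke the crawler and store results in the DB.
class CrawlingTask implements Runnable {
    // Records processed URLs so the sketch has an observable result.
    static final List<String> CRAWLED = new CopyOnWriteArrayList<>();
    private final String url;

    CrawlingTask(String url) {
        this.url = url;
    }

    @Override
    public void run() {
        // Give up early if shutdownNow() has interrupted the worker thread.
        if (Thread.currentThread().isInterrupted()) {
            return;
        }
        CRAWLED.add(url);
    }
}

// Hypothetical helper holding the shared executor.
class CrawlerControl {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // What service() would do with the URL taken from the request.
    void submit(String url) {
        executor.submit(new CrawlingTask(url));
    }

    // What destroy() (or a "stop" request) could do.
    // Returns true if all queued tasks finished within the timeout.
    boolean stop(long timeoutSeconds) {
        executor.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        executor.shutdownNow(); // interrupt the running task, discard the rest
        return false;
    }
}
```

After stop has been called, further calls to submit are rejected with a RejectedExecutionException, so a servlet that must keep accepting URLs would need to create a fresh executor first.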
As ExecutorService is thread-safe, you don’t have to worry about concurrency when enqueuing tasks.