I have a program that requests a web page, parses it, and decides what to do next depending on the result.
For example: the program should obtain some element from the page; let’s call it “aimElement”. If the program makes a request and gets “someOtherElement”, it continues execution. If it gets “aimElement”, the Executor should click some links, execution ends, and we go on to the next task. The program has 20 attempts to obtain “aimElement”, and “aimElement” could change in the future.
It seems simple, but I don’t really know how to implement this with good OO design. I am thinking of two objects: Task and Executor. Task contains all the conditions, and Executor receives a Task as an argument and makes requests based on the Task’s needs. But if Executor returns the raw page to the Task, the Task will be too complicated and the two will be tightly coupled. If Executor returns already-parsed elements of the page, then Executor needs to know what to parse and how, so it will also be complicated and the coupling will still be tight.
I don’t know if my explanation is clear, but maybe you could suggest a design pattern or just share your experience with similar problems.
I would just use a push parser and allow Tasks to register themselves for whatever events they are interested in. Basically your parser parses the document once and informs all subscribers about the kind of things they are interested in (say “img with URL X” or “link to Y”).
You can then either query the subscribers to see what to do next (at the end of the parse or after every event), or have them notify you themselves through a listener.
The best part is that I’m sure there’s already a push parser for HTML in Java, so you avoid quite some work.
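To make the idea concrete, here is a minimal sketch of the subscriber approach. Everything in it is an assumption for illustration: the names TagListener, ToyPushParser, AimElementTask and TaskExecutor are made up, the "aim element" is assumed to be identified by an id attribute, and ToyPushParser is only a regex-based stand-in for a real HTML push parser (it handles nothing beyond simple start tags with double-quoted attributes). The page source is passed in as a Supplier so the example runs without any network access.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical callback: a task implements this to hear about start tags. */
interface TagListener {
    void onStartTag(String tagName, Map<String, String> attributes);
}

/** Toy stand-in for a real push parser: one pass, every start tag is pushed to all listeners. */
class ToyPushParser {
    private static final Pattern TAG = Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)([^>]*)>");
    private static final Pattern ATTR = Pattern.compile("([a-zA-Z-]+)\\s*=\\s*\"([^\"]*)\"");
    private final List<TagListener> listeners = new ArrayList<>();

    void register(TagListener listener) {
        listeners.add(listener);
    }

    void parse(String html) {
        Matcher tag = TAG.matcher(html);
        while (tag.find()) {
            Map<String, String> attrs = new HashMap<>();
            Matcher attr = ATTR.matcher(tag.group(2));
            while (attr.find()) {
                attrs.put(attr.group(1), attr.group(2));
            }
            for (TagListener listener : listeners) {
                listener.onStartTag(tag.group(1).toLowerCase(), attrs);
            }
        }
    }
}

/** The task only knows what it is looking for; it never sees the raw page. */
class AimElementTask implements TagListener {
    private final String aimId;
    private boolean found;
    private final List<String> links = new ArrayList<>();

    AimElementTask(String aimId) {
        this.aimId = aimId;
    }

    @Override
    public void onStartTag(String tagName, Map<String, String> attributes) {
        if (aimId.equals(attributes.get("id"))) {
            found = true;
        }
        if ("a".equals(tagName) && attributes.containsKey("href")) {
            links.add(attributes.get("href"));
        }
    }

    /** Clears the state collected from the previous page. */
    void reset() {
        found = false;
        links.clear();
    }

    boolean isSatisfied() {
        return found;
    }

    List<String> linksToClick() {
        return links;
    }
}

/** The executor only fetches, parses and asks the task whether it is done. */
class TaskExecutor {
    static final int MAX_ATTEMPTS = 20; // the limit mentioned in the question

    /** Returns the attempt number on which the task was satisfied, or -1 if it never was. */
    static int run(AimElementTask task, Supplier<String> fetchPage) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            task.reset();
            ToyPushParser parser = new ToyPushParser();
            parser.register(task);
            parser.parse(fetchPage.get());
            if (task.isSatisfied()) {
                return attempt;
            }
        }
        return -1;
    }
}
```

With this split, the executor does not need to know what the task is looking for, and the task does not need to know how pages are fetched or parsed. If “aimElement” changes later, only the task (or the value passed to its constructor) has to change. After run returns a positive attempt number, the executor could go through task.linksToClick() and follow those links.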