As part of a bigger project I'm developing a web parser that, according to predefined rules, extracts data from web pages and puts it into objects. I write in C# using .NET 4.0 and SQL Server 2008.
The idea is this:
Suppose I have a Car { Power, Weight, Price } object (which is a POCO used in EF to store data).
I create rules for my parser, so that when it scans a site it tries to fill in the properties of the object.
For example: Car { Power=100Hp, Weight=1000kg, Price=7000$ }
All of this is working. But since all sites are different, and even within one site the data can be structured in different ways, I need some way to verify my parsing rules.
I want to create a tester app that will:
- Take a list of item URLs, download them, and store them in some format.
- Let me manually check which properties of the items were parsed correctly and mark them as correct.
- Run in verification mode, where it loads everything from the net again and compares the result with the data I marked as correct.
This way I get a sort of unit test for the rules while I improve them.
Now for the actual question: to implement all of this I don't want to modify my entities. I would like to have some wrapper like:
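As a rough sketch of the verification step described above: the check could compare a stored baseline object with a freshly parsed one, but only on the properties already marked as correct. The names `RuleVerifier` and `FindRegressions` are hypothetical, and the `Car` properties are assumed to be strings purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Assumed shape of the entity; the real POCO may use other property types.
public class Car
{
    public string Power { get; set; }
    public string Weight { get; set; }
    public string Price { get; set; }
}

// Hypothetical helper, not part of any library.
public static class RuleVerifier
{
    // Returns the names of verified properties whose freshly parsed value
    // no longer equals the stored baseline value.
    public static List<string> FindRegressions<T>(
        T baseline, T fresh, IEnumerable<string> verifiedProperties)
    {
        var failures = new List<string>();
        foreach (string name in verifiedProperties)
        {
            PropertyInfo prop = typeof(T).GetProperty(name);
            if (prop == null)
                throw new ArgumentException("Unknown property: " + name);

            object expected = prop.GetValue(baseline, null);
            object actual = prop.GetValue(fresh, null);
            if (!object.Equals(expected, actual))
                failures.Add(name);
        }
        return failures;
    }
}

public static class VerifierDemo
{
    public static void Main()
    {
        var baseline = new Car { Power = "100Hp", Weight = "1000kg", Price = "7000$" };
        var fresh = new Car { Power = "100Hp", Weight = "1000 kg", Price = "7500$" };

        // Only Power and Weight were manually marked as correct,
        // so the changed Price is not reported.
        List<string> failures = RuleVerifier.FindRegressions(
            baseline, fresh, new[] { "Power", "Weight" });

        Console.WriteLine("Regressions: " + string.Join(", ", failures));
        // prints "Regressions: Weight"
    }
}
```

Properties that were never marked as correct are ignored, so a rule can keep changing for them without producing false failures.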
Car {
Power { Value, IsCorrect},
Weight {Value, IsCorrect},
Price {Value, IsCorrect}}
That way all my existing code can work as before, and only my test code uses the annotation info. Also, of course, I want this info to be serializable, either to the database or to XML.
Right now my general idea is to serialize Car into XML, then manually add the annotation properties and check them. By manually I mean using XmlDocument methods. This should work, but I have not come to a conclusion on how to store and access the annotations in memory (I don't like the idea of doing all validation via XmlDocument).
Perhaps in some dynamic language, like JavaScript, I would simply add annotations like this:
Car c = { Power = 1, Weight=2 ...};
c["Power_IsCorrect"] = true;
c["Weight_IsCorrect"] = true;
///... etc
I can’t imagine something like this in C# 🙁
You could create a little wrapper class that identifies the properties either by strings or by expression trees.
Example for a class that uses strings:
You could then use it like this:
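A minimal sketch of what such a wrapper and its usage might look like. The class name `Annotated<T>` and its members are hypothetical, and the `Car` properties are assumed to be strings. It keeps the flags in a dictionary keyed by property name and offers both a string overload and an expression tree overload, so the entity itself stays untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Assumed shape of the entity; the real POCO may differ.
public class Car
{
    public string Power { get; set; }
    public string Weight { get; set; }
    public string Price { get; set; }
}

// Hypothetical wrapper: holds the untouched entity plus per-property flags.
public class Annotated<T>
{
    private readonly Dictionary<string, bool> _isCorrect =
        new Dictionary<string, bool>();

    public Annotated(T value) { Value = value; }

    public T Value { get; private set; }

    // String-based access: flexible, but typos are only caught at run time.
    public void SetCorrect(string propertyName, bool isCorrect)
    {
        if (typeof(T).GetProperty(propertyName) == null)
            throw new ArgumentException("Unknown property: " + propertyName);
        _isCorrect[propertyName] = isCorrect;
    }

    // Properties that were never annotated count as not correct.
    public bool IsCorrect(string propertyName)
    {
        bool flag;
        return _isCorrect.TryGetValue(propertyName, out flag) && flag;
    }

    // Expression-based access: the compiler checks that the property exists.
    public void SetCorrect<TProp>(Expression<Func<T, TProp>> property, bool isCorrect)
    {
        SetCorrect(GetName(property), isCorrect);
    }

    public bool IsCorrect<TProp>(Expression<Func<T, TProp>> property)
    {
        return IsCorrect(GetName(property));
    }

    private static string GetName<TProp>(Expression<Func<T, TProp>> property)
    {
        var member = property.Body as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a simple property access such as c => c.Power.");
        return member.Member.Name;
    }
}

public static class AnnotatedDemo
{
    public static void Main()
    {
        var car = new Car { Power = "100Hp", Weight = "1000kg", Price = "7000$" };
        var annotated = new Annotated<Car>(car);

        annotated.SetCorrect("Power", true);       // by string
        annotated.SetCorrect(c => c.Weight, true); // by expression tree

        Console.WriteLine(annotated.IsCorrect(c => c.Power)); // prints "True"
        Console.WriteLine(annotated.IsCorrect("Weight"));     // prints "True"
        Console.WriteLine(annotated.IsCorrect("Price"));      // prints "False"
    }
}
```

Existing code keeps working with the plain `Car` via `annotated.Value`, and only the test code touches the flags. To persist the annotations, the dictionary could be written out as a list of name/flag pairs next to the serialized entity.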