I am playing around with parsing RSS feeds looking for references to countries. At the moment I am using Python, but I think this question is fairly language agnostic (in theory).
Let’s say I have three related lists:
- Countries – Nouns (e.g. England, Norway, France)
- Countries – Adjectives (e.g. English, Norwegian, French)
- Cities (e.g. London, Newcastle, Birmingham)
My aim is to begin by parsing the feeds for these strings.
So, for example, if ‘London’ was found, the country would be ‘England’; if ‘Norwegian’ was found, it would be ‘Norway’, etc.
What would be the optimal method for working with this data? Would it be JSON, pulled in to create nested dictionaries? Sets? Or some type of database?
At the moment this is only intended to be used on a local machine.
This is a debatable question, and there are multiple reasonable solutions. If I were you, I would simply create a small DB in MongoDB with three collections (MongoDB's equivalent of tables) like these:
- Countries – columns: id, name
- Adjectives – columns: id, name, country_id
- Cities – columns: id, name, country_id
Then simple queries, looking a term up by name and following country_id back to the country, would give you the desired results.
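As a rough sketch of that three-table layout, here is one way it might look. Note that this swaps MongoDB for Python's built-in sqlite3 module so the example runs locally without a server. The table names (countries, adjectives, cities) and the helper functions country_for and countries_in are my own illustrative choices, not anything prescribed above.

```python
import re
import sqlite3

# In-memory SQLite database standing in for the small local DB (assumption:
# SQLite instead of MongoDB, so no server is needed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE adjectives (id INTEGER PRIMARY KEY, name TEXT,
                         country_id INTEGER REFERENCES countries(id));
CREATE TABLE cities     (id INTEGER PRIMARY KEY, name TEXT,
                         country_id INTEGER REFERENCES countries(id));
""")

# Sample rows taken from the lists in the question.
conn.executemany("INSERT INTO countries (id, name) VALUES (?, ?)",
                 [(1, "England"), (2, "Norway"), (3, "France")])
conn.executemany("INSERT INTO adjectives (name, country_id) VALUES (?, ?)",
                 [("English", 1), ("Norwegian", 2), ("French", 3)])
conn.executemany("INSERT INTO cities (name, country_id) VALUES (?, ?)",
                 [("London", 1), ("Newcastle", 1), ("Birmingham", 1)])


def country_for(term):
    """Return the country name for a noun, adjective, or city, else None."""
    row = conn.execute("""
        SELECT c.name FROM countries c WHERE c.name = :t
        UNION
        SELECT c.name FROM adjectives a
          JOIN countries c ON c.id = a.country_id WHERE a.name = :t
        UNION
        SELECT c.name FROM cities ci
          JOIN countries c ON c.id = ci.country_id WHERE ci.name = :t
    """, {"t": term}).fetchone()
    return row[0] if row else None


def countries_in(text):
    """Return the set of countries referenced by single words in text."""
    words = re.findall(r"[A-Za-z]+", text)
    found = {country_for(w) for w in words}
    found.discard(None)
    return found
```

For example, countries_in("A Norwegian ship docked in London") looks up each word and collects the matching countries. This sketch only matches single-word, case-sensitive terms; multi-word names (such as "New York") would need a different matching step.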