I need advice about a database structure. I need to capture data from the web about one specific subject on a few specific websites and insert that data into a database.
The problem with this task is that the information is not linear: if I try to design tables with fields for all possible data, I will end up with many rows whose fields contain NULL values. Is there any problem with that (ending up with many NULL fields)? Or should I use another kind of structure? For example, I could store the data in a single field that contains an associative array.
What I mean by non-linear data is the following:
array(
'name' => 'Don',
'age' => '31'
);
array(
'name' => 'Peter',
'age' => '28',
'car' => 'ford',
'km' => '2000'
);
For a search on one website I will store only “name” and “age”, and for another website I will store “name”, “age”, “car” and “km”.
I don’t know if I have explained my problem well. My English is not very good.
Best Regards.
OK, let's step back and assume you are most comfortable with relational databases. You can always break a non-linear structure down into a linear one; the only cost is query performance.
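As a rough illustration of breaking the scraped arrays down into linear rows, here is a minimal sketch using Python's built-in sqlite3 module. The table names (record, record_field), the column names and the source labels (site_a, site_b) are assumptions made up for this example, not something from the question. Each key/value pair becomes one row, so no NULL padding is needed, but rebuilding a record takes an extra query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one row per scraped record...
    CREATE TABLE record (
        id     INTEGER PRIMARY KEY,
        source TEXT NOT NULL
    );
    -- ...and one row per key/value pair belonging to that record.
    CREATE TABLE record_field (
        record_id INTEGER NOT NULL REFERENCES record(id),
        field     TEXT NOT NULL,
        value     TEXT,
        PRIMARY KEY (record_id, field)
    );
""")

# The two sample arrays from the question; the site labels are invented.
scraped = [
    ("site_a", {"name": "Don", "age": "31"}),
    ("site_b", {"name": "Peter", "age": "28", "car": "ford", "km": "2000"}),
]

for source, fields in scraped:
    cur = conn.execute("INSERT INTO record (source) VALUES (?)", (source,))
    conn.executemany(
        "INSERT INTO record_field (record_id, field, value) VALUES (?, ?, ?)",
        [(cur.lastrowid, key, value) for key, value in fields.items()],
    )

def load_record(record_id):
    """Rebuild the original associative array for one record."""
    rows = conn.execute(
        "SELECT field, value FROM record_field WHERE record_id = ?",
        (record_id,),
    )
    return dict(rows.fetchall())

peter = load_record(2)
field_count = conn.execute("SELECT COUNT(*) FROM record_field").fetchone()[0]
print(peter, field_count)
```

The trade-off is the one mentioned above: every read has to reassemble the record from several rows.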
There is no problem with rows that have a lot of NULL values. The details depend on the database implementation, but I have seen such designs before and they are pretty flexible.
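For comparison, a minimal sketch of the wide-table design with NULLs, again using Python's sqlite3. The table name person and the column types are assumptions; the field names come from the sample arrays in the question. Fields that a website does not provide are simply stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age  INTEGER,
        car  TEXT,    -- NULL when the website does not provide it
        km   INTEGER  -- NULL when the website does not provide it
    )
""")

# The two sample records from the question.
records = [
    {"name": "Don", "age": 31},
    {"name": "Peter", "age": 28, "car": "ford", "km": 2000},
]

for rec in records:
    # dict.get returns None for a missing key, which sqlite3 stores as NULL.
    conn.execute(
        "INSERT INTO person (name, age, car, km) VALUES (?, ?, ?, ?)",
        (rec["name"], rec["age"], rec.get("car"), rec.get("km")),
    )

rows = conn.execute("SELECT name, age, car, km FROM person ORDER BY id").fetchall()
print(rows)
```

Querying stays simple here (one table, no joins); the cost is that each new kind of field means a schema change.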
Let me give an example.
Let's say we have to store hours worked per week, but, as in your case, a week can have any number of days.
So you define a table with columns like:
StartDate, Id, MondayHour, TuesdayHour, and so on up to SundayHour
If you want to add another hours field, such as MondayHour1, just add the column and modify your queries.
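Adding a column to such a table could look roughly like this in SQLite (via Python's sqlite3). The table name WeeklyHours, the column types and the sample date are assumptions, and only two of the day columns are shown to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Shortened version of the wide table: only two of the day columns.
conn.execute("""
    CREATE TABLE WeeklyHours (
        Id          INTEGER PRIMARY KEY,
        StartDate   TEXT NOT NULL,
        MondayHour  REAL,
        TuesdayHour REAL
    )
""")
conn.execute(
    "INSERT INTO WeeklyHours (StartDate, MondayHour, TuesdayHour) VALUES (?, ?, ?)",
    ("2024-01-01", 8.0, 7.5),  # illustrative values
)

# Add the extra column later; existing rows get NULL for it.
conn.execute("ALTER TABLE WeeklyHours ADD COLUMN MondayHour1 REAL")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(WeeklyHours)")]
existing = conn.execute(
    "SELECT MondayHour, MondayHour1 FROM WeeklyHours"
).fetchone()
print(columns, existing)
```

Rows inserted before the change keep working; only queries that need the new column have to be updated.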
To store the same structure in a linear (normalized) way (not sure if "linear" is the right word here), first define a day table with the columns DayID, DayName.
Then your hours table will have StartDate, ID, DayID, Hours.
The only difference is that you now need a join between the two tables.
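A minimal sketch of the normalized version and its join, using Python's sqlite3. The table names Days and HoursWorked, the column types and the sample values are assumptions; the column names follow the ones listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table: one row per kind of day.
    CREATE TABLE Days (
        DayID   INTEGER PRIMARY KEY,
        DayName TEXT NOT NULL
    );
    -- One row per (week, day) instead of one column per day.
    CREATE TABLE HoursWorked (
        ID        INTEGER PRIMARY KEY,
        StartDate TEXT NOT NULL,
        DayID     INTEGER NOT NULL REFERENCES Days(DayID),
        Hours     REAL NOT NULL
    );
    INSERT INTO Days (DayID, DayName) VALUES (1, 'Monday');
    INSERT INTO Days (DayID, DayName) VALUES (2, 'Tuesday');
    INSERT INTO HoursWorked (StartDate, DayID, Hours) VALUES ('2024-01-01', 1, 8.0);
    INSERT INTO HoursWorked (StartDate, DayID, Hours) VALUES ('2024-01-01', 2, 7.5);
""")

# The join brings the day name back next to the hours.
rows = conn.execute("""
    SELECT d.DayName, h.Hours
    FROM HoursWorked AS h
    JOIN Days AS d ON d.DayID = h.DayID
    WHERE h.StartDate = ?
    ORDER BY d.DayID
""", ("2024-01-01",)).fetchall()
print(rows)
```

Adding a new kind of day is then just another row in Days, with no change to the table definitions.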
I hope I have understood and answered your question correctly.