I have an XML file that I want to parse into a database in a normalized fashion. Table 2 is my idea for a one-to-many relationship table. The name and title never change within a file group, but the download paths differ.
Table 1
id | name | title | download_path
----------------------------------------------------------------------
1 | FileGroup 1 | This is the first file group | /this/1/1.zip
2 | FileGroup 1 | This is the first file group | /this/1/2.zip
3 | FileGroup 2 | This is the second file group | /this/2/1.zip
4 | FileGroup 2 | This is the second file group | /this/2/2.zip
5 | FileGroup 3 | This is the third file group | /this/3/1.zip
XML File
<Item>
<Name>File Group 1</Name>
<Title>This is the first file group</Title>
<DownloadPath>/this/1/1.zip</DownloadPath>
</Item>
<Item>
<Name>File Group 1</Name>
<Title>This is the first file group</Title>
<DownloadPath>/this/1/2.zip</DownloadPath>
</Item>
<Item>
<Name>File Group 2</Name>
<Title>This is the second file group</Title>
<DownloadPath>/this/2/1.zip</DownloadPath>
</Item>
<Item>
<Name>File Group 2</Name>
<Title>This is the second file group</Title>
<DownloadPath>/this/2/2.zip</DownloadPath>
</Item>
<Item>
<Name>File Group 3</Name>
<Title>This is the third file group</Title>
<DownloadPath>/this/3/1.zip</DownloadPath>
</Item>
Table 2
group_id | file_id
-----------------------------
1 | 1
1 | 2
2 | 3
2 | 4
3 | 5
What is the best way to do this while parsing the XML? If I put the XML data into an array and foreach through each item, I need to be able to group the items on the fly and create the relationships in Table 2. I also considered creating Table 1 first and building the relationship table afterwards, but even then I don't know how best to group them. Nothing in the XML says the items are grouped other than the name and title. Each group can have any number of download paths.
I do not have any say in how the XML file is created. It's all I have to work with.
Your table structure is not normalized. Because the name and title are repeated on every row of a group, you can update them on one row without updating the others, which leaves the group inconsistent. Instead, the name and title should be in one table, and each download path should be in another table that points back to its file group.
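A minimal sketch of that two-table layout, using Python's sqlite3 module purely for illustration. The table and column names (file_group, file_download, group_id) are assumptions, not anything from your existing schema.

```python
import sqlite3

# Hypothetical table and column names; adjust to your own schema.
SCHEMA = """
CREATE TABLE file_group (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    title TEXT NOT NULL,
    UNIQUE (name, title)  -- each name/title pair is stored exactly once
);
CREATE TABLE file_download (
    id            INTEGER PRIMARY KEY,
    group_id      INTEGER NOT NULL REFERENCES file_group(id),
    download_path TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

With the group_id foreign key on the download table, a separate group_id | file_id table is probably not needed for a one-to-many relationship.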
As for organizing the DB based on the XML, imagine you are parsing the XML node by node:
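One way that node-by-node pass could look, sketched in Python with sqlite3 and ElementTree. It assumes the items sit under a single root element (the <Items> wrapper here is made up), that name plus title identifies a group, and it uses hypothetical table names. The idea is to remember each (name, title) pair the first time it appears and reuse its id for every later download path.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in a hypothetical <Items> root.
XML = """<Items>
<Item><Name>File Group 1</Name><Title>This is the first file group</Title><DownloadPath>/this/1/1.zip</DownloadPath></Item>
<Item><Name>File Group 1</Name><Title>This is the first file group</Title><DownloadPath>/this/1/2.zip</DownloadPath></Item>
<Item><Name>File Group 2</Name><Title>This is the second file group</Title><DownloadPath>/this/2/1.zip</DownloadPath></Item>
<Item><Name>File Group 2</Name><Title>This is the second file group</Title><DownloadPath>/this/2/2.zip</DownloadPath></Item>
<Item><Name>File Group 3</Name><Title>This is the third file group</Title><DownloadPath>/this/3/1.zip</DownloadPath></Item>
</Items>"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_group (id INTEGER PRIMARY KEY, name TEXT, title TEXT,
                         UNIQUE (name, title));
CREATE TABLE file_download (id INTEGER PRIMARY KEY,
                            group_id INTEGER REFERENCES file_group(id),
                            download_path TEXT);
""")

group_ids = {}  # (name, title) -> file_group.id, filled in as nodes are visited

for item in ET.fromstring(XML).iter("Item"):
    key = (item.findtext("Name"), item.findtext("Title"))
    if key not in group_ids:
        # First time this group is seen: create its row and remember the id.
        cur = conn.execute(
            "INSERT INTO file_group (name, title) VALUES (?, ?)", key
        )
        group_ids[key] = cur.lastrowid
    # Every item contributes one download row linked to its group.
    conn.execute(
        "INSERT INTO file_download (group_id, download_path) VALUES (?, ?)",
        (group_ids[key], item.findtext("DownloadPath")),
    )
conn.commit()
```

The same pattern should carry over to a foreach loop over an array: keep a lookup keyed by name and title, insert the group only on a miss, and always insert the path.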
The XML in your example is also invalid, since it has several top-level <Item> elements and no single root element, so good luck processing it if that's how it really is.
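If the file really does arrive without a root element, one possible workaround is to wrap the text in a root before parsing. This is only a sketch in Python: the <Items> wrapper name is made up, and the two-item sample is shortened from the question.

```python
import xml.etree.ElementTree as ET

# Shortened sample: two top-level <Item> elements and no root.
raw = """<Item><Name>File Group 1</Name></Item>
<Item><Name>File Group 2</Name></Item>"""

try:
    ET.fromstring(raw)
    parsed_without_root = True
except ET.ParseError:
    # The parser rejects content after the first top-level element.
    parsed_without_root = False

# Wrapping in a hypothetical <Items> root makes the text well-formed.
root = ET.fromstring("<Items>" + raw + "</Items>")
names = [item.findtext("Name") for item in root.iter("Item")]
```

This would not work as-is if the file starts with an XML declaration, because the declaration has to come before the root element. In that case it would need to be stripped or kept ahead of the wrapper.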