I’m working with two hierarchical datasets that contain a complex relation (I’m not using SQL) and they don’t share their primary look-up keys. We use this process to keep the two datasets synchronized.
Each dataset is currently stored as a dictionary keyed by that dataset’s own key. Once the complex relation is determined, I store each entry’s counterpart key from the other dataset as an attribute (fkey). This has led to some odd-looking helper functions for following the parent-child relationships.
I was wondering if there might be a more effective or faster method to this madness, as I currently have to pass both datasets to any processing function that needs to parse relationships.
Examples:
leftdataset = {'10000': {'key': '10000', 'fkey': 'asdf', 'parent': '10001'},
    '10001': {'key': '10001', 'fkey': 'qwer', 'parent': ''}}
rightdataset = {'asdf': {'key': 'asdf', 'fkey': '10000', 'parent': 'qwer'},
    'qwer': {'key': 'qwer', 'fkey': '10001', 'parent': ''}}
In order to find the parent’s fkey I need to:
fkey = leftdataset[leftdataset['10000']['parent']]['fkey']
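To make the cost of this concrete, here is a minimal sketch of the kind of helper the chained lookup turns into, using the sample data above. The function names parent_fkey and related_parent are hypothetical, chosen only for illustration, and the sketch assumes an empty 'parent' string marks a root entry.

```python
# Sample data from above.
leftdataset = {
    '10000': {'key': '10000', 'fkey': 'asdf', 'parent': '10001'},
    '10001': {'key': '10001', 'fkey': 'qwer', 'parent': ''},
}
rightdataset = {
    'asdf': {'key': 'asdf', 'fkey': '10000', 'parent': 'qwer'},
    'qwer': {'key': 'qwer', 'fkey': '10001', 'parent': ''},
}


def parent_fkey(dataset, key):
    """Return the fkey of the parent of `key`, or None for a root entry."""
    parent_key = dataset[key]['parent']
    if not parent_key:  # assumption: '' means "no parent"
        return None
    return dataset[parent_key]['fkey']


def related_parent(dataset, other, key):
    """Return the entry in `other` that matches the parent of `key`."""
    fkey = parent_fkey(dataset, key)
    return other[fkey] if fkey is not None else None
```

Note that related_parent has to take both dictionaries, which is exactly the awkwardness described above.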
I was toying with the idea of keeping a list of key-pair tuples and then searching it for the key I need, such as:
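keys = [('10000', 'asdf'), ('10001', 'qwer')]

def find_key(key, keyset):
    # Linear scan over the pairs that were passed in.
    for keypair in keyset:
        if key in keypair:
            k1, k2 = keypair
            # Return whichever half of the pair is not the key we were given.
            if k1 == key:
                return k2
            else:
                return k1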
But this sounds even less efficient than what I’m doing now. Am I just heading down the wrong path?
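For comparison, the same pair lookup can be done without scanning by building one dictionary per direction. This is only a sketch, and it assumes every key is unique and that no key appears on both the left and the right side; the names left_to_right and right_to_left are made up for the example.

```python
keys = [('10000', 'asdf'), ('10001', 'qwer')]

# Build both directions once; each later lookup is a single dict access.
left_to_right = dict(keys)
right_to_left = {right: left for left, right in keys}


def find_key(key):
    """Return the partner of `key` from either side, or None if unknown."""
    if key in left_to_right:
        return left_to_right[key]
    return right_to_left.get(key)
```

The trade-off is a second dictionary in memory in exchange for average constant-time lookups instead of a scan over every pair.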
Is this usage appealing to you?
Easy look-up and usage of single entries:
Easy look-up of parents:
Easy look-up of related entries:
Particularly here is the example in your question: parent’s foreign key:
If so, then here is the code! Classes:
And the usage for the data you’ve provided:
Explanation: Wrap each dict value in an Entry class with __getitem__ defined to make it usable as a dict (more or less). Have a Dataset class that maps primary keys to these Entry objects. Provide the Entry access to this dataset and provide convenient methods .parent() and .related(). In order for .related() to work, set which dataset the “related” one should be with set_related_dataset and it all ties together.
Now you can even just pass Entry objects and you’ll be able to access the related entries without needing to pass both datasets in.
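The design described in the explanation could be sketched roughly as follows. Only the names Entry, Dataset, __getitem__, .parent(), .related() and set_related_dataset come from the explanation; the constructor signatures and the attributes data, dataset, entries and related_dataset are assumptions made for this sketch, as is treating an empty 'parent' string as "no parent".

```python
class Entry:
    """Wraps one record and remembers which Dataset it belongs to."""

    def __init__(self, dataset, data):
        self.dataset = dataset  # assumed attribute: the owning Dataset
        self.data = data        # assumed attribute: the original dict

    def __getitem__(self, name):
        # Lets entry['fkey'] keep working like plain dict access.
        return self.data[name]

    def parent(self):
        """Return the parent Entry, or None for a root entry."""
        parent_key = self.data['parent']
        return self.dataset[parent_key] if parent_key else None

    def related(self):
        """Return the matching Entry in the related dataset.

        Requires set_related_dataset() to have been called first.
        """
        return self.dataset.related_dataset[self.data['fkey']]


class Dataset:
    """Maps primary keys to Entry objects."""

    def __init__(self, raw):
        self.entries = {key: Entry(self, value) for key, value in raw.items()}
        self.related_dataset = None

    def __getitem__(self, key):
        return self.entries[key]

    def set_related_dataset(self, other):
        self.related_dataset = other


left = Dataset({
    '10000': {'key': '10000', 'fkey': 'asdf', 'parent': '10001'},
    '10001': {'key': '10001', 'fkey': 'qwer', 'parent': ''},
})
right = Dataset({
    'asdf': {'key': 'asdf', 'fkey': '10000', 'parent': 'qwer'},
    'qwer': {'key': 'qwer', 'fkey': '10001', 'parent': ''},
})
left.set_related_dataset(right)
right.set_related_dataset(left)

entry = left['10000']
print(entry['fkey'])                    # asdf
print(entry.parent()['key'])            # 10001
print(entry.related()['key'])           # asdf
print(entry.parent()['fkey'])           # qwer (the parent's foreign key)
print(entry.parent().related()['key'])  # qwer
```

Because each Entry holds a reference to its Dataset, and each Dataset knows its related one, a processing function can be handed a single Entry and still reach parents and counterparts on either side.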