I have a mySQL table which represents a company’s products. The table shows whether or not two products are compatible with each other.
The table looks somewhat like
Product1 Product2 Compatible?
A A Yes
A B ?
A C No
A D ?
B A ?
B B Yes
B C ?
B D Maybe
C A No
C B ?
C C Yes
C D ?
D A ?
D B Maybe
D C ?
D D Yes
Note that while none of the rows are duplicates, some data is redundant. If A is not compatible with C, then C is obviously not compatible with A. This renders the one row redundant. The reason why I have these rows to begin with is that I built the table using a nested for loop. Would you recommend deleting the rows with duplicate meaning for the sake of saving space? Or leave them there for (possibly?) easier maintenance?
It is almost always worthwhile to properly normalize your data. Lower volume of data and no chance of inconsistency are just the two most obvious reasons why. So if your compatibility is in fact guaranteed to be symmetric, and remain symmetric forever (and not some kind of upward- vs. backward-compatibility…), then yes, you should delete the redundant rows.
The only caveat is that in the future you must either query the compatibility in the canonical order (with the lower product, however you define that, in the first slot of your query), or use a disjunctive query, otherwise you might miss a legitimate combination. (The first of those options is obviously the better solution, since the second reintroduces unnecessary processing effort.)