I’m working on a problem where I’m instantiating many instances of an object. Most of the time the instantiated objects are identical. To reduce memory overhead, I’d like to have all the identical objects point to the same address. When I modify the object, though, I’d like a new instance to be created, essentially copy-on-write behavior. What is the best way to achieve this in Python?
The Flyweight Pattern comes close. An example (from http://codesnipers.com/?q=python-flyweights):
import weakref

class Card(object):
    _CardPool = weakref.WeakValueDictionary()

    def __new__(cls, value, suit):
        obj = Card._CardPool.get(value + suit, None)
        if not obj:
            obj = object.__new__(cls)
            Card._CardPool[value + suit] = obj
            obj.value, obj.suit = value, suit
        return obj
This behaves as follows:
>>> c1 = Card('10', 'd')
>>> c2 = Card('10', 'd')
>>> id(c1) == id(c2)
True
>>> c2.suit = 's'
>>> c1.suit
's'
>>> id(c1) == id(c2)
True
The desired behavior would be:
>>> c1 = Card('10', 'd')
>>> c2 = Card('10', 'd')
>>> id(c1) == id(c2)
True
>>> c2.suit = 's'
>>> c1.suit
'd'
>>> id(c1) == id(c2)
False
Update: I came across the Flyweight Pattern and it seemed to almost fit the bill. However, I’m open to other approaches.
Do you need id(c1) == id(c2) to be true, or is that just a demonstration, where the real objective is to avoid creating duplicated objects?
One approach would be to have each object be distinct, but hold an internal reference to the ‘real’ object like you have above. Then, on any __setattr__ call, change the internal reference.
I’ve never done __setattr__ stuff before, but I think the override would point the internal reference at a different ‘real’ object rather than modifying the shared one. Similarly, expose the attributes through __getattr__.
You’d still have lots of duplicated objects, but only one copy of the ‘real’ backing object behind them. So this would help if each object is massive, and wouldn’t help if they are lightweight but you have millions of them.
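For what it's worth, here is a minimal sketch of how that could look, assuming Python 3. It is only one way to do it, not a definitive implementation. The names _CardData, _pool and _intern are hypothetical and chosen just for this example. Each Card is a small handle, and the value/suit state lives in a pooled backing object that is never modified after creation. Note that the handles themselves stay distinct objects, so id(c1) == id(c2) is False from the start; it is the backing data that is shared.

```python
import weakref


class _CardData:
    """Hypothetical shared backing record; never mutated once pooled."""
    # __weakref__ is required so instances can live in a WeakValueDictionary.
    __slots__ = ("value", "suit", "__weakref__")

    def __init__(self, value, suit):
        self.value = value
        self.suit = suit


class Card:
    """Small handle that forwards attribute access to shared data."""
    __slots__ = ("_data",)
    # (value, suit) -> _CardData; entries vanish when no handle uses them.
    _pool = weakref.WeakValueDictionary()

    def __init__(self, value, suit):
        # Bypass our own __setattr__ while wiring up the handle.
        object.__setattr__(self, "_data", self._intern(value, suit))

    @classmethod
    def _intern(cls, value, suit):
        """Return the pooled record for this state, creating it if needed."""
        key = (value, suit)
        data = cls._pool.get(key)
        if data is None:
            data = cls._pool[key] = _CardData(value, suit)
        return data

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for value and suit.
        if name == "_data":  # guard against recursion on a half-built handle
            raise AttributeError(name)
        return getattr(self._data, name)

    def __setattr__(self, name, new_value):
        # Copy-on-write: leave the shared record alone and rebind this
        # handle to the pooled record that matches the new state.
        fields = {"value": self._data.value, "suit": self._data.suit}
        if name not in fields:
            raise AttributeError(name)
        fields[name] = new_value
        object.__setattr__(self, "_data", self._intern(**fields))
```

With this, c1 = Card('10', 'd') and c2 = Card('10', 'd') share one backing record, and after c2.suit = 's' only c2 is rebound, so c1.suit is still 'd'. The per-handle cost is one slot, which is why this only pays off when the backing data is large compared with the handle.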