I am currently refactoring a very useful but poorly designed C++ class, and I’m running into a problem with the design: rather than passing data around using arguments to methods, the data is passed around by setting private state variables in the class. This makes it very difficult for me to diagram how data moves through functions. It’s my weekend task to remove this style of passing data around as much as possible, as it makes the program nearly impossible to understand from the method signatures alone; the signatures only tell part of the story. I’ve decided
My current approach to testing whether a method communicates through private class-level variables is the following:
- Edit the method and make it a function rather than a method, which removes its access to the state variables in the class.
- Edit all of the calls to the method so that they call the function rather than the method.
- Compile, see if anything breaks. Make a list of accessors to add to the original class.
- Run the unit tests to see if I’ve broken anything in a very subtle way.
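As a rough illustration of the steps above, here is a minimal sketch. The class `Legacy`, its members, and the free function `computeTotal` are hypothetical names invented for this example, not taken from the real code. It shows a method whose inputs are hidden in private state, the accessors the compiler errors would prompt you to add, and the free-function version whose signature lists every input.

```cpp
#include <cassert>
#include <vector>

// Hypothetical "before" shape: total() depends on state set by earlier calls.
class Legacy {
public:
    void load(const std::vector<int>& values) { values_ = values; }
    void setScale(int scale) { scale_ = scale; }

    // The signature says nothing about values_ or scale_.
    int total() const {
        int sum = 0;
        for (int v : values_) sum += v * scale_;
        return sum;
    }

    // Accessors added after compiling the free function showed what it needs.
    const std::vector<int>& values() const { return values_; }
    int scale() const { return scale_; }

private:
    std::vector<int> values_;
    int scale_ = 1;
};

// "After": the same logic as a free function; every input is an argument.
int computeTotal(const std::vector<int>& values, int scale) {
    int sum = 0;
    for (int v : values) sum += v * scale;
    return sum;
}

// Stand-in for the unit-test step: both versions must agree on the same data.
void check_equivalence() {
    Legacy legacy;
    legacy.load({1, 2, 3});
    legacy.setScale(2);
    assert(legacy.total() == computeTotal(legacy.values(), legacy.scale()));
}
```

Once the free function compiles, its parameter list is effectively the list of hidden dependencies the original method had.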
Is there a better way of doing this, perhaps one that can be easily automated? Is this refactoring a well-known technique that I can cite if I show it to other people?
The only mention of this problem that I’ve found so far is this quote from Coders at Work via the Object-oriented programming Wikipedia entry:
“The problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.” – Joe Armstrong
Edit in response to a good question from Oli Charlesworth:
I understand that the point of OOP is to sometimes communicate through state variables of the class. The difficulty with my current case is that there are currently 78 different data members in the class, many of which are key-value pairs of strings to other data types, and there are undocumented implicit dependencies on the order in which they need to be initialized. It’s possible that a sufficiently smart programmer would find this class easy to work with, but it’s currently very difficult for me. I think that several of these data types could be abstracted into their own classes, but before I can do that I need to understand more clearly how the data members interact with each other.
Given the clarification in the question, my “are you sure it’s not just that you don’t like the other programmer’s style” comment dies a death 😉
Personally I’d just refactor normally. That is, with 78 data members and lots of bits that are related but not in a class of their own I’d start by grouping the related data and extracting the functionality that works on it. There’s no need, IMHO, to go through a stage where you explicitly pass the data into the functions in the existing class. Just pick a group of related data items, come up with a decent name, extract them and work out where they were used and how you need to move functionality into the new class.
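To make that concrete, here is a small sketch of what extracting one group might look like. Everything in it is assumed for illustration: `BigClass` stands in for the 78-member class, and `Endpoint` is a made-up name for a pair of related members (a host and a port) together with the one piece of functionality that used them.

```cpp
#include <string>
#include <utility>

// Hypothetical group of related data, pulled out and given a name.
class Endpoint {
public:
    Endpoint(std::string host, int port)
        : host_(std::move(host)), port_(port) {}

    // Functionality that only touched host and port moves here with them.
    std::string address() const {
        return host_ + ":" + std::to_string(port_);
    }

private:
    std::string host_;
    int port_;
};

// Stand-in for the large class after one extraction.
class BigClass {
public:
    BigClass(std::string host, int port)
        : endpoint_(std::move(host), port) {}

    std::string describe() const {
        return "connecting to " + endpoint_.address();
    }

private:
    Endpoint endpoint_;  // replaces the separate host and port members
    // ... the remaining members stay as they were for now
};
```

Each extraction like this removes a few members from the big class and makes one set of interactions explicit, so the remaining dependencies get easier to see as you go.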
Ideally, I’d start writing unit tests for the main class and the new broken out classes as I went along…