Having just read the first four chapters of Refactoring: Improving the Design of Existing Code, I embarked on my first refactoring and almost immediately came to a roadblock. It stems from the requirement that before you begin refactoring, you should put unit tests around the legacy code. That allows you to be sure your refactoring didn’t change what the original code did (only how it did it).
So my first question is this: how do I unit-test a method in legacy code? How can I put a unit test around a 500 line (if I’m lucky) method that doesn’t do just one task? It seems to me that I would have to refactor my legacy code just to make it unit-testable.
Does anyone have any experience refactoring using unit tests? And, if so, do you have any practical examples you can share with me?
My second question is somewhat hard to explain. Here’s an example: I want to refactor a legacy method that populates an object from a database record. Wouldn’t I have to write a unit test that compares an object retrieved using the old method, with an object retrieved using my refactored method? Otherwise, how would I know that my refactored method produces the same results as the old method? If that is true, then how long do I leave the old deprecated method in the source code? Do I just whack it after I test a few different records? Or, do I need to keep it around for a while in case I encounter a bug in my refactored code?
Lastly, since a couple people have asked…the legacy code was originally written in VB6 and then ported to VB.NET with minimal architecture changes.
Good example of theory meeting reality. Unit tests are meant to test a single operation and many pattern purists insist on Single Responsibilty, so we have lovely clean code and tests to go with it. However, in the real (messy) world, code (especially legacy code) does lots of things and has no tests. What this needs is dose of refactoring to clean the mess.
My approach is to build tests, using the Unit Test tools, that test lots of things in a single test. In one test, I may be checking the DB connection is open, changing lots of data, and doing a before/after check on the DB. I inevitably find myself writing helper classes to do the checking, and more often than not those helpers can then be added into the code base, as they have encapsulated emergent behaviour/logic/requirements. I don’t mean I have a single huge test, what I do mean is mnay tests are doing work which a purist would call an integration test – does such a thing still exist? Also I’ve found it useful to create a test template and then create many tests from that, to check boundary conditions, complex processing etc.
BTW which language environment are we talking about? Some languages lend themselves to refactoring better than others.