When adding a new feature to an existing system, if you come across an existing function that almost does what you need, is it best practice to:
- Copy the existing function and make your changes on the new copy (knowing that copying code makes your fellow devs cry).
-or-
- Edit the existing function to handle both the existing case and your new case, risking that you may introduce new bugs into existing parts of the system (which makes the QA team cry).
- If you edit the existing function, where do you draw the line before you should just create a new independent function (based on a copy): at 10% of the function, 50% of the function?
The rule of thumb I tend to follow is that if I can cover the new behaviour by adding an extra parameter (or new valid value) to the existing function, while leaving the code more-or-less “obviously the same” in the existing case, then there’s not much danger in changing the function.
For example, old code:
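A minimal sketch of such a function, assuming Python 3 and that `utf8len` returns the number of bytes in the UTF-8 encoding of a string. The body is an assumption for illustration; only the name and its behaviour on `None` are described here.

```python
def utf8len(s):
    # Length in bytes of the UTF-8 encoding of s (assumed behaviour).
    # Passing None raises AttributeError, because None has no .encode method.
    return len(s.encode("utf8"))
```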
New use case: I’m writing some code in a style that uses the null object pattern, so I want `utf8len(None)` to return `None` instead of throwing an exception. I could define a new function `utf8len_nullobjectpattern`, but that’s going to get quite annoying quite quickly, so instead I change `utf8len` itself: it returns `None` for a `None` input and otherwise runs the old body unchanged.

Then even if the unit tests for `utf8len` were incomplete, I can bet that I haven’t changed the behavior for any input other than `None`. I also need to check that nobody was ever relying on `utf8len` to throw an exception for a `None` input, which is a question of (1) quality of documentation and/or tests; and (2) whether people actually pay any attention to defined interfaces, or just Use The Source. If the latter, I need to look at calling sites, but if things are done well then I pretty much don’t.

Whether the old allowed inputs are still treated “obviously the same” isn’t really a question of what percentage of code is modified, it’s how it’s modified. I’ve picked a deliberately trivial example, since the whole of the old function body is visibly still there in the new function, but I think it’s something that you know when you see it.

Another example would be making something that was fixed configurable (perhaps by passing a value, or a dependency that’s used to get a value) with a default parameter that just provides the old fixed value. Every instance of the old fixed thing is replaced with (a call to) the new parameter, so it’s reasonably easy to see on a diff what the change means. You have (or write) at least some tests to give confidence that you haven’t broken the old inputs via some stupid typo, so you can go ahead even without total confidence in your test coverage.
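The two kinds of change described above might look roughly like this in Python 3. This is a sketch under assumptions: the body of `utf8len` is a guess at what such a function would contain, and `encoded_len` is a hypothetical name introduced only to illustrate the default-parameter idea.

```python
def utf8len(s):
    if s is not None:
        # Old code path: the original body, visibly unchanged.
        return len(s.encode("utf8"))
    # New use case: a None input yields None instead of raising.
    return None


def encoded_len(s, encoding="utf8"):
    # Hypothetical variant: the previously fixed encoding is now a parameter
    # whose default reproduces the old behaviour.
    return len(s.encode(encoding))
```

Existing callers of `encoded_len(s)` get the same result as before; only callers that pass `encoding` explicitly see anything new, which is what keeps the diff easy to reason about.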
Of course you want comprehensive testing, but you don’t necessarily have it. There are also two competing maintenance imperatives here: 1 – don’t duplicate code, since if it has bugs in it, or behavior that might need to change in future, then you’re duplicating the bugs / current behavior. 2 – the open/closed principle, which is a bit high-falutin’ but basically says, “write stuff that works and then don’t touch it”. 1 says that you should refactor to share code between these two similar operations, 2 says no, you’ve shipped the old one, either it’s usable for this new thing or it isn’t, and if it isn’t then leave it alone.