I participate in a TDD Coding Dojo, where we try to practice pure TDD on simple problems. It occurred to me, however, that the code which emerges from the unit tests isn’t the most efficient. This is fine most of the time, but what if usage of the code grows to the point where efficiency becomes a problem?
I love the way the code emerges from unit testing, but is it possible to make the efficiency property emerge through further tests?
Here is a trivial example in Ruby: prime factorization. I followed a pure TDD approach, making the tests pass one after the other and validating against my original acceptance test (commented out at the bottom).
What further steps could I take if I wanted to make one of the generic prime factorization algorithms emerge? To reduce the problem domain, let’s say I want to get a quadratic sieve implementation… In this precise case I know the “optimal” algorithm, but in most cases the client will simply add a requirement that the feature run in less than “x” time in a given environment.
require 'shoulda'
require 'lib/prime'

class MathTest < Test::Unit::TestCase
  context "The math module" do
    should "have a method to get primes" do
      assert Math.respond_to? 'primes'
    end
  end

  context "The primes method of Math" do
    should "return [] for 0" do
      assert_equal [], Math.primes(0)
    end
    should "return [1] for 1" do
      assert_equal [1], Math.primes(1)
    end
    should "return [1,2] for 2" do
      assert_equal [1,2], Math.primes(2)
    end
    should "return [1,3] for 3" do
      assert_equal [1,3], Math.primes(3)
    end
    should "return [1,2,2] for 4" do
      assert_equal [1,2,2], Math.primes(4)
    end
    should "return [1,5] for 5" do
      assert_equal [1,5], Math.primes(5)
    end
    should "return [1,2,3] for 6" do
      assert_equal [1,2,3], Math.primes(6)
    end
    should "return [1,3,3] for 9" do
      assert_equal [1,3,3], Math.primes(9)
    end
    should "return [1,2,5] for 10" do
      assert_equal [1,2,5], Math.primes(10)
    end
  end

  # context "Functional Acceptance test 1" do
  #   context "the prime factors of 14101980 are 1,2,2,3,5,61,3853" do
  #     should "return [1,2,2,2,2,3,3,5,5,61,61,3853,3853] for #{14101980*14101980}" do
  #       assert_equal [1,2,2,2,2,3,3,5,5,61,61,3853,3853], Math.primes(14101980*14101980)
  #     end
  #   end
  # end
end
And here is the naive algorithm I created with this approach:
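module Math
  # Returns the prime factors of n in ascending order, prefixed with 1
  # (the tests treat 1 as a factor); returns [] for 0.
  def self.primes(n)
    if n == 0
      return []
    else
      primes = [1]
      # The range 2..n is evaluated once, so the loop always runs up to the
      # original n, even after n has been divided down.
      for i in 2..n do
        while n % i == 0
          primes << i
          n = n / i
        end
      end
      primes
    end
  end
end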
Edit 1: Judging from the first answers, I guess I wasn’t clear in my initial description: the performance test is not a standard part of my unit tests; it is a new acceptance test written to answer a specific requirement from the client.
Edit 2: I know how to test the execution time, but moving from the trivial algorithm to the optimized one seems like a huge step. My question is how to make the optimal code emerge; in other words, how do you decompose the migration from the trivial code to the optimal one?
Some mentioned that this is a problem-specific approach, so I provided a sample problem for which I don’t know how to continue.
I also used to participate in a weekly TDD Coding Dojo, and we tried some experiments to see whether TDD could be used for algorithmic purposes (finding a better algorithm, or finding an algorithm where there is no obvious one) or for built-in performance constraints.
When using TDD in the Dojo, we try to follow the rules below
Given these rules, we have more room to experiment than is obvious at first sight. We can tweak the definition of "simplest" and add code smells that take efficiency into account. Basically: if we can think of several easy ways of implementing something, we prefer the most efficient one; and if we know of a more efficient, but still simple or well-known, algorithm than the one used in our code, that is a smell.
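As an illustration of that kind of smell-driven refactoring, here is a minimal sketch based on the factorization example from the question. It is plain trial division that stops at the square root of what remains of n, not a quadratic sieve, and the module name `TrialDivision` is a hypothetical one chosen for this example. The behaviour is meant to match the question's unit tests (1 listed first, an empty list for 0), so those tests would act as the safety net for the refactoring.

```ruby
# Hypothetical module name; behaviour mirrors the unit tests in the question.
module TrialDivision
  def self.primes(n)
    return [] if n == 0

    factors = [1]
    i = 2
    # A composite number always has a factor no larger than its square root,
    # so the loop can stop once i * i exceeds what is left of n.
    while i * i <= n
      while n % i == 0
        factors << i
        n /= i
      end
      i += 1
    end
    # Whatever remains above 1 is itself a prime factor.
    factors << n if n > 1
    factors
  end
end

p TrialDivision.primes(10)
p TrialDivision.primes(14101980)
```

No new unit test forces this change; it comes from recognising a well-known improvement during the refactoring step, while the existing tests stay green.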
In summary, the result was that TDD itself is not well suited to predicting overall code performance or to producing efficient code from the start, even though with TDD and refactoring we did gain better insight into our code and could improve its readability and avoid some obvious performance bottlenecks. Trying to insert performance constraints into the code at that test level was usually disastrous: the code and tests became much too complex, and the code was often broken or too complex to change.
One reason is that with TDD we usually work with a very small test set (the simplest test that fails). Performance problems, on the other hand, mostly occur with real data sets, which fit very poorly with the above rules. Performance tests, even if formally still unit tests, are more like functional tests. Common optimization strategies involve adding caches, taking into account some property of the real data distribution, or negotiating changes in user stories when a feature of small benefit has a large negative impact on performance. None of these can really be built in through TDD; they are more likely to be found while profiling the code.
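To make the point about caches concrete, here is a small sketch. The names `naive_primes` and `CachedFactorizer` are hypothetical, and the factorization routine is only a stand-in modelled on the question's naive code. The cached and uncached versions return exactly the same values, so no example-based unit test can tell them apart; only a measurement (here, a miss counter) shows the difference.

```ruby
# Stand-in for the naive factorization from the question (hypothetical name).
def naive_primes(n)
  return [] if n == 0

  factors = [1]
  (2..n).each do |i|
    while n % i == 0
      factors << i
      n /= i
    end
  end
  factors
end

# Hypothetical wrapper that memoizes the results of any callable factorizer.
class CachedFactorizer
  attr_reader :misses

  def initialize(factorizer)
    @factorizer = factorizer
    @cache = {}
    @misses = 0
  end

  def primes(n)
    # Hash#fetch runs the block only when n has not been stored yet.
    @cache.fetch(n) do
      @misses += 1
      @cache[n] = @factorizer.call(n)
    end
  end
end

cached = CachedFactorizer.new(method(:naive_primes))
first  = cached.primes(360)
second = cached.primes(360)
puts "same result: #{first == second}, computations: #{cached.misses}"
```

Whether such a cache pays off depends on how often the same inputs recur in real data, which is the kind of information profiling provides and a minimal test set does not.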
I believe performance goals are basically a functional testing problem.
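A performance requirement of that kind could be checked separately from the unit tests, roughly as in the sketch below. The one-second budget, the helper `timed`, and the method name `prime_factors` are assumptions made for this example; in practice the budget would come from the client's requirement and be tied to a reference environment, and the implementation under test would replace the stand-in.

```ruby
# Hypothetical budget; a real one comes from the client's requirement.
TIME_BUDGET_SECONDS = 1.0

# Stand-in for the implementation under test (trial division up to sqrt(n)).
def prime_factors(n)
  return [] if n == 0

  factors = [1]
  i = 2
  while i * i <= n
    while n % i == 0
      factors << i
      n /= i
    end
    i += 1
  end
  factors << n if n > 1
  factors
end

# Runs the block and returns [result, elapsed seconds] using a monotonic clock.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

result, elapsed = timed { prime_factors(14101980) }
correct = (result == [1, 2, 2, 3, 5, 61, 3853])
within_budget = elapsed < TIME_BUDGET_SECONDS
puts "correct: #{correct}, within budget: #{within_budget}"
```

Because the outcome depends on the machine and the input size, a check like this belongs with acceptance or functional tests run in a known environment rather than in the fast red-green-refactor loop.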