This weekend I was working on a project and I needed to use a binomial distribution to test the probability of an event (the probability that x of y characters would be alphanumeric given random bytes). My first solution was to write the test myself since it is rather simple.
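def factorial(n):
    # Recursive factorial; assumes n is a non-negative integer.
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

def binomial_prob(n, k, p):
    # Probability of exactly k successes in n trials, each succeeding with probability p.
    # Integer division keeps the binomial coefficient exact.
    bin_coeff = factorial(n) // (factorial(k) * factorial(n-k))
    return bin_coeff * pow(p, k) * pow(1 - p, n-k)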
And I used that. However, SciPy includes a binom_test method that does exactly this. For distribution, though, depending on it would probably increase the size significantly (both SciPy and NumPy would be required), and all for a relatively simple test. I suppose an auxiliary question is how intelligent py2exe is: does it bundle just the modules I use from SciPy and NumPy, or the whole libraries? I expect just the modules that I reference, but then the next question is how many modules scipy.stats depends on. But I digress… So my question is this: when should I use code already written at the cost of including far more than I need, and when should I just write my own implementation?
(I tagged this as python, but I suppose it could be a more general question)
“When should I use code already written at the cost of including far more than I need?”
Always.
“When should I just write my own implementation?”
Never.
The “including far more than I need” question is generally quite silly. What do you care how much is “included”?
The only time this can ever matter is when you’re writing embedded software and are severely memory-constrained.
For all other programming — All other programming — don’t think twice. Include pre-written code early and often. Write less. Solve problems more quickly. The operating system will swap the unused pages out of memory. You can safely ignore them.
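As a rough sketch of what that looks like for this question: assuming Python 3.8 or later, where `math.comb` is part of the standard library, the hand-rolled factorial disappears entirely. The names `binomial_pmf` and `binomial_tail`, and the 62/256 figure for "alphanumeric", are my own illustrative assumptions and not something taken from the question.

```python
from math import comb  # standard library, Python 3.8+


def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_pmf(n, i, p) for i in range(k, n + 1))


# Assumption for illustration: 62 of the 256 possible byte values
# are ASCII letters or digits.
p_alnum = 62 / 256

# Chance that exactly 3 of 10 random bytes are alphanumeric.
print(binomial_pmf(10, 3, p_alnum))
```

If SciPy is already a dependency, `scipy.stats.binom.pmf(k, n, p)` should give the same number; the point is only that the combinatorics need not be written by hand either way.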
Programming is about solving problems, not producing code. Less code is better. No code is best.