I need logit and inverse logit functions so that logit(inv_logit(n)) == n. I use numpy and here is what I have:
import numpy as np

def logit(p):
    return np.log(p) - np.log(1 - p)

def inv_logit(p):
    return np.exp(p) / (1 + np.exp(p))
And here are the values:
print logit(inv_logit(2))
2.0
print logit(inv_logit(10))
10.0
print logit(inv_logit(20))
20.000000018 #well, pretty close
print logit(inv_logit(50))
Warning: divide by zero encountered in log
inf
Now let’s test negative numbers:
print logit(inv_logit(-10))
-10.0
print logit(inv_logit(-20))
-20.0
print logit(inv_logit(-200))
-200.0
print logit(inv_logit(-500))
-500.0
print logit(inv_logit(-2000))
Warning: divide by zero encountered in log
-inf
So my questions are: what is the proper way to implement these functions so that the requirement logit(inv_logit(n)) == n holds for any n in as wide a range as possible (at least [-1e4; 1e4])?
And also (and I’m sure this is connected to the first one), why are my functions more stable with negative values than with positive ones?
Either use
1. The bigfloat package, which supports arbitrary-precision floating-point operations.
2. The SymPy symbolic math package.
I’ll give examples of both:
First, bigfloat:
http://packages.python.org/bigfloat/
Here’s a simple example:
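A minimal sketch of the arbitrary-precision idea, written with the standard library's decimal module as a stand-in rather than bigfloat itself (so none of the calls below are bigfloat's API). The function names mirror the question; round_trip and its 300-digit default are assumptions chosen for illustration.

```python
from decimal import Decimal, localcontext

def inv_logit(x):
    # x is a Decimal; exp() is evaluated at the active context's precision
    e = x.exp()
    return e / (1 + e)

def logit(p):
    # p is a Decimal strictly between 0 and 1
    return p.ln() - (1 - p).ln()

def round_trip(n, digits=300):
    # Hypothetical helper: compute logit(inv_logit(n)) with `digits`
    # significant decimal digits of working precision.
    with localcontext() as ctx:
        ctx.prec = digits
        return logit(inv_logit(Decimal(n)))

for n in (50, -500, 500):
    print(n, float(round_trip(n)))
```

The precision has to cover the gap between p and 1. For n near 1e4, 1 - p is roughly 10^-4343, so the working precision would need to exceed about 4343 decimal digits.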
This is really slow. You may want to consider restructuring your problem and doing some parts analytically. Cases like these are rare in real problems, so I’m curious about what kind of problem you are working on.
Example installation:
As for why your functions worked better with negative values, consider inv_logit of a large negative number, which is very close to 0, and inv_logit of a large positive number, which is very close to 1.
In the first case, floating point represents the value easily: the exponent absorbs the leading zeros (0.0000…), so they do not need to be stored. In the second case, all the leading nines (0.999…) must be stored, so you need all that extra precision to get an exact result when logit() later computes 1 - p.
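The asymmetry can be seen directly in ordinary double precision. This is a small sketch using only the math module; the inputs -40 and 40 are arbitrary choices for illustration, and the function names simply mirror the question.

```python
import math

def inv_logit(x):
    return math.exp(x) / (1.0 + math.exp(x))

def logit(p):
    return math.log(p) - math.log(1.0 - p)

near_zero = inv_logit(-40.0)  # tiny, but still a distinct nonzero float
near_one = inv_logit(40.0)    # the leading nines do not fit, so it rounds to 1.0

print(near_zero)              # small positive number
print(near_one == 1.0)        # the information about 1 - p is already gone
print(1.0 - near_one == 0.0)  # so logit() would have to take log(0)
print(logit(near_zero))       # the negative side still round-trips
```

With numpy, log(0) gives -inf plus the divide-by-zero warning from the question; math.log(0.0) raises ValueError instead, which is why logit(near_one) is not called here.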
Here’s the symbolic math way (significantly faster!):
SymPy is documented at http://docs.sympy.org/. On Ubuntu it can be installed via Synaptic.
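The reason a symbolic approach sidesteps the rounding problem is that the composition simplifies exactly before any number is evaluated. Worked by hand (this is a derivation from the definitions in the question, not SymPy output):

```latex
\begin{align*}
\operatorname{inv\_logit}(x) &= \frac{e^{x}}{1+e^{x}},
\qquad 1-\operatorname{inv\_logit}(x) = \frac{1}{1+e^{x}} \\
\operatorname{logit}\bigl(\operatorname{inv\_logit}(x)\bigr)
&= \ln\frac{e^{x}}{1+e^{x}} - \ln\frac{1}{1+e^{x}} \\
&= x - \ln\bigl(1+e^{x}\bigr) + \ln\bigl(1+e^{x}\bigr) = x
\end{align*}
```

The troublesome quantity 1 - p never has to be stored as a number, which is exactly the step that fails in floating point.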