I have been asked to test a library provided by a 3rd party. The library is known to be accurate to n significant figures. Any less significant errors can safely be ignored. I want to write a function to help me compare the results:
def nearlyequal( a, b, sigfig=5 ):
The purpose of this function is to determine if two floating-point numbers (a and b) are approximately equal. The function will return True if a==b (exact match) or if a and b have the same value when rounded to sigfig significant-figures when written decimal.
Can anybody suggest a good implementation, I've written a mini unit-test. Unless you can see a bug in my tests then a good implementation should be pass the following:
assert nearlyequal(1, 1, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(-1e-9, 1e-9, 5)
assert nearlyequal(1e9, 1e9 + 1 , 5)
assert not nearlyequal( 1e4, 1e4 + 1, 5)
assert nearlyequal( 0.0, 1e-15, 5 )
assert not nearlyequal( 0.0, 1e-4, 6 )
Additional notes:
Values a and b might be of type int, float or numpy.float64. Values a and b will always be of the same type. It's vital that conversion does not introduce additional error into the function.
Lets keep this numerical, so functions that convert to strings or use non-mathematical tricks are not ideal. This program will be audited by somebody who is a mathematician who will want to be able to prove that the function does what it is supposed to do.
Speed... I've got to compare a lot of numbers so the faster the better.
I've got numpy, scypy and the standard-library. Anything else will be hard for me to get, especially for such a small part of the project.
Thanks!
-
Here's a take.
def nearly_equal(a,b,sig_fig=5): return ( a==b or int(a*10**sig_fig) == int(b*10**sig_fig) )
S.Lott : +1: Beat me by 30 seconds!Triptych : Haha @S.Lott - I've definitely been on the other end of that.SilentGhost : you don't really need that if statement. you can just return result of comparison: it's either True or False, it's shorter and surely more pythonicnlucaroni : will this work for the _nearlyequal((1e9,1+1e9, 5)_?Triptych : good point @silentghost. Made the change.Salim Fadhley : This fails nearlyequal(1e9, 1e9 + 1 , 5) - it does not work for big numbers.S.Lott : The issue is that "significant figures" depends on whether the number is >1 or <1. >1 you have to use sig_fig-math.log10(n) to shift "to the right". If <1 you simple use sig_fig to shift "to the left".Salim Fadhley : For the purposes of the problem, the numbers can be any number which can be represented by floating-point. -
"Significant figures" in decimal is a matter of adjusting the decimal point and truncating to an integer.
>>> int(3.1415926 * 10**3) 3141 >>> int(1234567 * 10**-3) 1234 >>>
Salim Fadhley : Can you provide an algorithm to do this?S.Lott : Different from the code I already posted? What more do you need to know? -
There is a function
assert_approx_equal
innumpy.testing
(source here) which may be a good starting point.>>> help(assert_approx_equal) Help on function assert_approx_equal in module numpy.testing.utils: assert_approx_equal(actual, desired, significant=7, err_msg='', verbose=True) Raise an assertion if two items are not equal. I think this should be part of unittest.py Approximately equal is defined as the number of significant digits correct
Salim Fadhley : Great! This will do exactly what I need. Thank you. -
I believe your question is not enough well defined, and the unit-tests you present prove it:
If by 'round to N sig-fig decimal places' you mean 'N decimal places to the right of the decimal point', then the test "assert nearlyequal(1e9, 1e9 + 1 , 5)" should fail because even when you round 1000000000 and 1000000001 to 0.00001 accuracy, they are still different.
And if y 'round to N sig-fig decimal places' you mean 'The N most significant digits, regardless of the decimal point', then the test "assert nearlyequal(-1e-9, 1e-9, 5)" should fail, because 0.000000001 and -0.000000001 are totally different when viewed this way.
If you meant the first definition, then the first answer on this page (by Triptych) is good. If you meant the second definition, please say it, I promise to think about it :-)
Salim Fadhley : Actually, you are right about "assert nearlyequal(-1e-9, 1e-9, 5)" - it breaks the rules! +1 -
I would actually say:
def nearlyequal( a, b, sigfig=5 ): return round(a, sigfig) == round(b, sigfig)
as the methods above using
int()
always round down (ie.int(0.9)
goes to0
).EDIT: This is horribly wrong. I just wanted to make the point that int rounds down.
Salim Fadhley : Totaly wrong, but +1 for being honest! -
This is a fairly common issue with floating point numbers. I solve it based on the discussion in Section 1.5 of Demmel[1]. (1) Calculate the roundoff error. (2) Check that the roundoff error is less than some epsilon. I haven't used python in some time and only have version 2.4.3, but I'll try to get this correct.
Step 1. Roundoff error
def roundoff_error(exact, approximate): return abs(approximate/exact - 1.0)
Step 2. Floating point equality
def float_equal(float1, float2, epsilon=2.0e-9): return (roundoff_error(float1, float2) < epsilon)
There are a couple obvious deficiencies with this code.
- Division by zero error if the exact value is Zero.
- Does not verify that the arguments are floating point values.
Revision 1.
def roundoff_error(exact, approximate): if (exact == 0.0 or approximate == 0.0): return abs(exact + approximate) else: return abs(approximate/exact - 1.0) def float_equal(float1, float2, epsilon=2.0e-9): if not isinstance(float1,float): raise TypeError,"First argument is not a float." elif not isinstance(float2,float): raise TypeError,"Second argument is not a float." else: return (roundoff_error(float1, float2) < epsilon)
That's a little better. If either the exact or the approximate value is zero, than the error is equal to the value of the other. If something besides a floating point value is provided, a TypeError is raised.
At this point, the only difficult thing is setting the correct value for epsilon. I noticed in the documentation for version 2.6.1 that there is an epsilon attribute in sys.float_info, so I would use twice that value as the default epsilon. But the correct value depends on both your application and your algorithm.
[1] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
-
Oren Shemesh got part of the problem with the problem as stated but there's more:
assert nearlyequal( 0.0, 1e-15, 5 )
also fails the second definition (and that's the definition I learned in school.)
No matter how many digits you are looking at, 0 will not equal a not-zero. This could prove to be a headache for such tests if you have a case whose correct answer is zero.
-
There are already plenty of great answers, but here's a think:
def closeness(a, b): """Returns measure of equality (for two floats), in unit of decimal significant figures.""" if a == b: return float("infinity") difference = abs(a - b) avg = (a + b)/2 return math.log10( avg / difference ) if closeness(1000, 1000.1) > 3: print "Joy!"
-
There is a interesting solution to this by B. Dawson (with C++ code) at "Comparing Floating Point Numbers". His approach relies on strict IEEE representation of two numbers and the enforced lexicographical ordering when said numbers are represented as unsigned integers.
0 comments:
Post a Comment