"how numbers are stored in computers""hownumbersarestoredincomputers"
Exact multiplications and divisions
This theorem addresses the behavior of floating-point arithmetic if integers and satisfy and has the form . In this case, the operation yields exactly , despite the involvement of division and multiplication.
The proof shows that scaling by powers of 2 doesn't affect the significand, so we can normalize and such that and has at most one bit after the binary point, to ensure rounding will be exact. The special structure of helps control rounding errors, particularly in edge cases like halfway rounding.
Theorem
Let . If and are integers such that and , then , assuming floating-point operations are exactly rounded.
Proof
Since multiplying or dividing by powers of two only shifts the exponent, we can scale and without loss of generality.
Scale so that , and scale accordingly to ensure satisfies . This implies , so has significant bits and at most one bit right of the binary point. We may assume without loss of generality.
Let . To show , it suffices to prove that the rounding error satisfies .
Since has at most one fractional bit, this ensures that rounds to . In the exact halfway case where the error equals , 's least significant bit is zero (due to the original bound on ), so it will round back to .
Now write the binary expansion of as , and let . Then:
where is an odd integer. Since and , we can write for some . Substituting:
The numerator is odd, so the expression is at least . Assuming (the opposite case is similar), we get:
Now plug in :
Discussion
The theorem holds true for any base , as long as is replaced by . As gets larger, however, denominators of the form are farther and farther apart. Most importantly, this theorem emphasizes that accurate rounding in basic operations is crucial not only for precision but also for proving exact algebraic relationships in floating-point arithmetic, even when inputs are only approximations of real values.
References
1. What Every Computer Scientist Should Know About Floating-Point Arithmetic David Goldberg