#Python tip: Given inexact data, subtracting nearly equal
values increases relative error significantly more
than absolute error.
4.6 ± 0.2 Age of Earth (4.3%)
4.2 ± 0.1 Age of Oceans (2.4%)
___
0.4 ± 0.3 Huge relative error (75%)
This is called “catastrophic cancellation”.
1/
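A quick sketch of the arithmetic above (assuming worst-case absolute uncertainties simply add under subtraction):

```python
# Error propagation for (4.6 ± 0.2) - (4.2 ± 0.1).
# Assumption: worst-case absolute errors add when subtracting.
earth, earth_err = 4.6, 0.2
oceans, oceans_err = 4.2, 0.1

diff = earth - oceans
diff_err = earth_err + oceans_err

print(f"{diff:.1f} ± {diff_err:.1f}")            # 0.4 ± 0.3
print(f"relative error: {diff_err / diff:.0%}")  # 75%
```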
The subtractive cancellation issue commonly arises in floating point arithmetic. Even if the inputs are exact, intermediate values may not be exactly representable and will have an error bar. Subsequent operations can amplify the error.
1/7th is inexact but within ± ½ ulp.
2/
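The ½-ulp claim can be checked with the standard library, comparing the stored float against the exact rational:

```python
from fractions import Fraction
from math import ulp

x = 1 / 7                                # inexact in binary floating point
err = abs(Fraction(x) - Fraction(1, 7))  # exact representation error
print(err <= Fraction(ulp(x)) / 2)       # True: correctly rounded, within ± ½ ulp
```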
A commonly given example arises when estimating derivatives with f′(x) ≈ Δf(x) / Δx.
Intuitively, the estimate improves as Δx approaches zero, but in practice, the large relative error from the deltas can overwhelm the result.
Make Δx small, but not too small.
3/
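A sketch of the trade-off, using sin/cos as a stand-in function (not from the thread): shrinking Δx first reduces truncation error, then cancellation in f(x+Δx) − f(x) takes over and the error grows again:

```python
from math import sin, cos

def deriv(f, x, dx):
    # Forward-difference estimate of f'(x)
    return (f(x + dx) - f(x)) / dx

x = 1.0
for dx in (1e-1, 1e-8, 1e-15):
    print(dx, abs(deriv(sin, x, dx) - cos(x)))
# Error shrinks from dx=1e-1 to dx=1e-8, then blows up at dx=1e-15.
```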
While people tend to think of this as a floating point arithmetic problem, it arises anytime data has a range of uncertainty.
Example with integer arithmetic:
(50 ± 2) - (45 ± 2) ⟶ (5 ± 4)
4% relative errors turn into 80%.
4/
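The same bookkeeping in code, using a hypothetical helper (again assuming worst-case absolute errors add under subtraction):

```python
def sub_uncertain(a, a_err, b, b_err):
    """Subtract two uncertain values; absolute errors add."""
    return a - b, a_err + b_err

value, err = sub_uncertain(50, 2, 45, 2)
print(f"({value} ± {err})")                  # (5 ± 4)
print(f"relative error: {err / value:.0%}")  # 80%
```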
Part of the art of numeric computing is algebraically rearranging calculations to avoid subtracting nearly equal values.
A famous example is rewriting the quadratic formula as:
𝑥 = 2𝑐 ÷ (−𝑏 ∓ √(𝑏²−4𝑎𝑐))
This is used when |4ac| is small relative to |b²|.
5/
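A sketch with made-up coefficients (a=1, b=1e8, c=1, chosen so |4ac| ≪ b²): the textbook formula loses the small root to cancellation, while the rearranged form keeps it:

```python
from math import sqrt

a, b, c = 1.0, 1e8, 1.0
d = sqrt(b*b - 4*a*c)

naive = (-b + d) / (2*a)   # -b and d nearly cancel: digits lost
stable = (2*c) / (-b - d)  # rearranged form: no cancellation
print(naive, stable)       # true small root is about -1e-8
```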
The “loss of significance” problem also arises when computing time differences:
Bad:
from time import time
start = time()
delta = time() - start
Better way:
from time import perf_counter
start = perf_counter()
delta = perf_counter() - start
The former measures from 1970. The latter measures from the start of the process.
6/
If you only have 16 decimal places of precision in a floating point number, why waste 9 of them by measuring from 1970?
>>> time()
1593041920.6464949
>>> time()
1593041922.272326
>>> len('159304192')
9
>>> perf_counter()
8.867890209
>>> perf_counter()
10.422069702
7/
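One way to see the wasted precision is to measure the ulp spacing at each magnitude (the sample values 1.6e9 and 10.0 are illustrative, standing in for typical time() and perf_counter() readings):

```python
from math import ulp

# One ulp is the gap to the next representable double:
# the finest tick resolvable at that magnitude.
print(ulp(1.6e9))  # a time()-sized value: ~2.4e-7 s resolution
print(ulp(10.0))   # a perf_counter()-sized value: ~1.8e-15 s
```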