# Difference between revisions of "Integers too big for floats"

(describe the problem) |
(implementations and optimizations) |
||

Line 25: | Line 25: | ||

Before <hask>fromRational</hask> can perform the imprecise division, |
Before <hask>fromRational</hask> can perform the imprecise division, |
||

the <hask>%</hask> operator will cancel the fraction precisely. |
the <hask>%</hask> operator will cancel the fraction precisely. |
||

− | You may use the <hask>Rational</hask> constructor <hask |
+ | You may use the <hask>Rational</hask> constructor <hask>:%</hask> instead. |

However that's a hack, since it is not sure that other operations work well on non-cancelled fractions. |
However that's a hack, since it is not sure that other operations work well on non-cancelled fractions. |
||

You had to import <hask>GHC.Real</hask>. |
You had to import <hask>GHC.Real</hask>. |
||

+ | |||

+ | But since we talk about efficiency let's go on to the next paragraph, |
||

+ | where we talk about ''real'' performance. |
||

+ | |||

+ | |||

+ | == Avoid big integers at all == |
||

+ | |||

+ | The example seems to be stupid, because you may think that nobody divides <hask>factorial 777</hask> by <hask>factorial 778</hask> |
||

+ | without noticing, that this can be greatly simplified. |
||

+ | So let's take the original task which led to the problem above. |
||

+ | The problem is to compute the reciprocal of <math>\pi</math> using [http://en.wikipedia.org/wiki/Chudnovsky_algorithm Chudnovsky's] algorithm: |
||

+ | :<math> |
||

+ | \frac{1}{\pi} = 12 \sum^\infty_{k=0} \frac{(-1)^k (6k)! (13591409 + 545140134k)}{(3k)!(k!)^3 640320^{3k + 3/2}}\ . |
||

+ | </math> |
||

+ | |||

+ | A possible [http://paste.pocoo.org/show/102801/ Haskell implementation] is: |
||

+ | <haskell> |
||

+ | -- An exact division |
||

+ | -- Courtesy of Max Rabkin |
||

+ | (/.) :: (Real a, Fractional b) => a -> a -> b |
||

+ | x /. y = fromRational $ toRational x / toRational y |
||

+ | |||

+ | -- Compute n! |
||

+ | fac :: Integer -> Integer |
||

+ | fac n = product [1..n] |
||

+ | |||

+ | -- Compute n! / m! efficiently |
||

+ | facDiv :: Integer -> Integer -> Integer |
||

+ | facDiv n m |
||

+ | | n > m = product [n, n - 1 .. m] |
||

+ | | n == m = 1 |
||

+ | | otherwise = facDiv m n |
||

+ | |||

+ | |||

+ | -- Compute pi using the specified number of iterations |
||

+ | pi' :: Integer -> Double |
||

+ | pi' steps = 1.0 / (12.0 * s / f) |
||

+ | where |
||

+ | s = sum [chudnovsky n | n <- [0..steps]] |
||

+ | f = fromIntegral c ** (3.0 / 2.0) -- Common factor in the sum |
||

+ | |||

+ | -- k-th term of the Chudnovsky serie |
||

+ | chudnovsky :: Integer -> Double |
||

+ | chudnovsky k |
||

+ | | even k = num /. den |
||

+ | | otherwise = -num /. den |
||

+ | where |
||

+ | num = (facDiv (6 * k) (3 * k)) * (a + b * k) |
||

+ | den = (fac k) ^ 3 * (c ^ (3 * k)) |
||

+ | |||

+ | a = 13591409 |
||

+ | b = 545140134 |
||

+ | c = 640320 |
||

+ | |||

+ | main = print $ pi' 1000 |
||

+ | </haskell> |
||

+ | |||

+ | To be honest, this program doesn't really need much more optimization than limiting the number of terms to 2, |
||

+ | since the subsequent terms are much below the precision of <hask>Double</hask>. |
||

+ | For these two terms it is not a problem to convert the <hask>Integer</hask>s to <hask>Double</hask>s. |
||

+ | |||

+ | But assume these conversions are a problem. |
||

+ | We will show a way to avoid them. |
||

+ | The trick is to compute the terms incrementally. |
||

+ | We do not need to compute the factorials for each term, |
||

+ | instead we compute each term using the term before. |
||

+ | <haskell> |
||

+ | start :: Floating a => a |
||

+ | start = |
||

+ | 12 / sqrt 640320 ^ 3 |
||

+ | |||

+ | arithmeticSeq :: Num a => [a] |
||

+ | arithmeticSeq = |
||

+ | iterate (545140134+) 13591409 |
||

+ | |||

+ | factors :: Floating a => [a] |
||

+ | factors = |
||

+ | -- note canceling of product[(6*k+1)..6*(k+1)] / product[(3*k+1)..3*(k+1)] |
||

+ | map (\k -> -(6*k+1)*(6*k+3)*(6*k+5)/(320160*(k+1))^3) $ iterate (1+) 0 |
||

+ | |||

+ | summands :: Floating a => [a] |
||

+ | summands = |
||

+ | zipWith (*) arithmeticSeq $ scanl (*) start factors |
||

+ | |||

+ | recipPi :: Floating a => a |
||

+ | recipPi = |
||

+ | sum $ take 2 summands |
||

+ | </haskell> |
||

## Revision as of 16:17, 6 February 2009

Although floating point types can represent a large range of magnitudes,
you will sometimes have to cope with integers that are larger than what is representable by `Double`

.

## Dividing large integers to floats

Consider

```
factorial :: (Enum a, Num a) => a -> a
factorial k = product [1..k]
```

You will find that `factorial 777`

is not representable by `Double`

.
However it is representable by an `Integer`

.
You will find that `factorial 777 / factorial 778`

is representable as `Double`

but not as `Integer`

,
but the temporary results are representable by `Integer`

s and not by `Double`

s.
Is there a variant of division which accepts big integers and emits floating point numbers?

Actually you can represent the fraction `factorial 777 / factorial 778`

as `Rational`

and convert that to a floating point number:

```
fromRational (factorial 777 % factorial 778)
```

Fortunately `fromRational`

is clever enough to handle big numerators and denominators.

But there is an efficiency problem:
Before `fromRational`

can perform the imprecise division,
the `%`

operator will cancel the fraction precisely.
You may use the `Rational`

constructor `:%`

instead.
However that's a hack, since it is not sure that other operations work well on non-cancelled fractions.
You had to import `GHC.Real`

.

But since we talk about efficiency let's go on to the next paragraph,
where we talk about *real* performance.

## Avoid big integers at all

The example seems to be stupid, because you may think that nobody divides `factorial 777`

by `factorial 778`

without noticing, that this can be greatly simplified.
So let's take the original task which led to the problem above.
The problem is to compute the reciprocal of using Chudnovsky's algorithm:

A possible Haskell implementation is:

```
-- An exact division
-- Courtesy of Max Rabkin
(/.) :: (Real a, Fractional b) => a -> a -> b
x /. y = fromRational $ toRational x / toRational y
-- Compute n!
fac :: Integer -> Integer
fac n = product [1..n]
-- Compute n! / m! efficiently
facDiv :: Integer -> Integer -> Integer
facDiv n m
| n > m = product [n, n - 1 .. m]
| n == m = 1
| otherwise = facDiv m n
-- Compute pi using the specified number of iterations
pi' :: Integer -> Double
pi' steps = 1.0 / (12.0 * s / f)
where
s = sum [chudnovsky n | n <- [0..steps]]
f = fromIntegral c ** (3.0 / 2.0) -- Common factor in the sum
-- k-th term of the Chudnovsky serie
chudnovsky :: Integer -> Double
chudnovsky k
| even k = num /. den
| otherwise = -num /. den
where
num = (facDiv (6 * k) (3 * k)) * (a + b * k)
den = (fac k) ^ 3 * (c ^ (3 * k))
a = 13591409
b = 545140134
c = 640320
main = print $ pi' 1000
```

To be honest, this program doesn't really need much more optimization than limiting the number of terms to 2,
since the subsequent terms are much below the precision of `Double`

.
For these two terms it is not a problem to convert the `Integer`

s to `Double`

s.

But assume these conversions are a problem. We will show a way to avoid them. The trick is to compute the terms incrementally. We do not need to compute the factorials for each term, instead we compute each term using the term before.

```
start :: Floating a => a
start =
12 / sqrt 640320 ^ 3
arithmeticSeq :: Num a => [a]
arithmeticSeq =
iterate (545140134+) 13591409
factors :: Floating a => [a]
factors =
-- note canceling of product[(6*k+1)..6*(k+1)] / product[(3*k+1)..3*(k+1)]
map (\k -> -(6*k+1)*(6*k+3)*(6*k+5)/(320160*(k+1))^3) $ iterate (1+) 0
summands :: Floating a => [a]
summands =
zipWith (*) arithmeticSeq $ scanl (*) start factors
recipPi :: Floating a => a
recipPi =
sum $ take 2 summands
```

## See also

- Haskell Cafe on "about integer and float operations"
- Generic number type