For the Purely Lazy

The Low Down

  • Looking at Lazy Evaluation in the purely functional language Haskell.
  • All about the lazy, purity just along for the ride.
  • What is Lazy Evaluation?
  • What is it good for?
  • What are the gotchas?

Evaluation strategies

Eager or Lazy

  • In Haskell-like languages, evaluation is reducing expressions to their simplest form.
    (5 + 4 - 2) * (5 + 4 - 2) ⇒ 49

  • Two options when expression includes function application.
    square (5 + 4 - 2) ⇒ 49

    • Reduce arguments first (Innermost reduction / Eager Evaluation)
      square (5 + 4 - 2) ⇒ square 7 ⇒ 7 * 7 ⇒ 49

    • Apply function first (Outermost reduction / Lazy Evaluation)
      square (5 + 4 - 2) ⇒ (5 + 4 - 2) * (5 + 4 - 2) ⇒ 7 * 7 ⇒ 49

  • Final answer == expression in normal form
    49 is the normal form of square (5 + 4 - 2)
  • Eager also called strict.
  • Lazy also called non-strict (sort of)
    • Lazy is non-strict, but non-strict is not necessarily lazy.
    • Haskell requires non-strictness
    • Most implementations are Lazy
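Non-strictness is easy to see directly; a minimal sketch (constOne and demo are illustrative names, not from the slides):

```haskell
-- constOne ignores its argument entirely.
constOne :: a -> Int
constOne _ = 1

-- Under non-strict evaluation the argument is never reduced,
-- so passing a diverging expression causes no harm:
demo :: Int
demo = constOne (error "never evaluated")
```

In an eager language the same call would crash before constOne ever ran.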

Lazy is a bit more involved

  • Outermost reduction
  • Only evaluate as needed
    • Not to normal form
    • To head normal form / weak head normal form
    • Evaluates to the outermost constructor
  • Shares sub-expressions when expanding
square (5 + 4)
 let x = 5 + 4 in square x
 x * x
 let x = 7 in x * x
 7*7
 49
(5, square (5 + 4))
 let a = 5, b = square (5 + 4) in (a, b)  -- outermost constructor reached: stop
Lazy evaluation showing sharing and evaluation stopping at weak head normal form

What's the difference?

Given

fst (a, b) = a
fst (0, square (5 + 4)) 

Eager

fst (0, square (5 + 4)) 
 fst (0, square 9) 
 fst (0, 9 * 9)
 fst (0, 81)
 0

Lazy

fst (0, square (5 + 4)) 
 let a = 0, b = square (5 + 4) in fst (a, b) 
 a
 0

Lazy never evaluated square (5 + 4), whereas eager did.

Lazy vs Eager evaluation

Given

fst (0, ⊥) 

Eager

fst (0, ⊥) 
 ⊥ (evaluating the ⊥ argument diverges)

Lazy

fst (0, ⊥) 
 let a = 0, b = ⊥ in fst (a, b) 
 a
 0
  • In the presence of bottom (⊥: undefined / non-termination)
  • Lazy evaluation can still return a result
  • Lazy version never evaluated the ⊥ argument.
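The same reduction can be checked in Haskell with undefined standing in for ⊥ (a sketch; safeFirst is my name, not from the slides):

```haskell
-- fst only needs the first component, so the undefined
-- second component is never forced under lazy evaluation.
safeFirst :: Int
safeFirst = fst (0, undefined)
```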

Lazy is always best! Well, no.

  • Lazy evaluation has a bookkeeping overhead
  • Unevaluated expressions build up in memory
  • Eager is not always better than Lazy
  • Lazy is not always better than Eager
  • Just a note: technically, primitive operations in GHC are not lazy.
The cost of lazy evaluation
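The thunk build-up cost can be seen with foldl, which accumulates unevaluated (+) applications, versus foldl', which forces the accumulator at each step (a standard illustration, not from the slides):

```haskell
import Data.List (foldl')

-- foldl builds ((0 + 1) + 2) + ... as a chain of thunks
-- before any addition happens:
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- foldl' evaluates the accumulator at every step,
-- so it runs in constant space:
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0
```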

The Good, The Bad and The Smugly

The Good and Bad

Pros

  • More efficient?
    • This is a red herring.
    • It can be, but it can also be slower.
  • Modularity!
  • Efficient tricks for pure languages.
    • A contradiction with the point above? Nope.
    • Memoization preserving purity and abstraction.
    • Caching preserving purity and abstraction.

Cons

  • Difficult to reason about time and space usage.
    • Time? Not sold.
      • Lazy never needs more reduction steps than eager.
      • When the cost is paid can matter, though: latency is a concern.
    • Space ?
      • Unfortunately yes.
      • The dreaded space leak.
  • Parallel unfriendly

So Don’t be Smug

  • Laziness is not a silver bullet.
  • Laziness is not a scarlet letter.
  • Most eager languages have lazy constructs.
  • Most lazy languages have eager constructs.

Modularity

Lazy Aids Composition

  • From Why Functional Programming Matters
  • Can structure whole programs as function composition.
  • (f . g) input == f (g input)
  • g only consumes input as f needs it.
  • f knows nothing of g
  • g knows nothing of f
  • When f terminates g terminates
  • Allows termination conditions to be separated from loop bodies.
  • Can modularise as generators and selectors.
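A tiny generator/selector sketch of this idea (the names here are mine):

```haskell
-- Generator: an infinite list, produced only on demand.
naturals :: [Integer]
naturals = [0..]

-- Selector: the termination condition lives with the consumer;
-- the generator knows nothing about it.
firstFiveEven :: [Integer]
firstFiveEven = take 5 (filter even naturals)
```

Laziness is what lets the two halves be written separately: take stops demanding elements, so the infinite generator stops producing.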

Lazy Composition Sqrt

  • Example from Why Functional Programming Matters
  • Newton-Raphson to calculate square root approximation.
  • One function generates sequence of approximations.
  • Two options to choose when an approximation is good enough.
    • Neither cares what the approximations are of.
  • The pieces combine to give different square root approximations.
  • Paper expands on examples of Laziness aiding composition.
-- Generate a sequence of sqrt approximations using Newton-Raphson
nextSqrtApprox n x = (x + n/x) / 2
-- iterate f x == [x, f x, f (f x), ...]
sqrtApprox n = iterate (nextSqrtApprox n) (n/2)
-- first element whose difference with its predecessor is below threshold
within eps (a:b:bs) | abs (a-b) <= eps = b
                    | otherwise = within eps (b:bs)
-- Calculate approximate sqrt using within
withinSqrt eps n = within eps (sqrtApprox n)
-- first element whose ratio with its predecessor is close to 1
relative eps (a:b:bs) | abs (a-b) <= eps * abs b = b
                      | otherwise = relative eps (b:bs)
-- Calculate approximate sqrt using relative
relativeSqrt eps n = relative eps (sqrtApprox n)

Caveats about Lazy Composition

  • When using Lazy IO
    • Promptness of resource finalization is a problem.
  • Prefer streaming libraries that manage resources deterministically.

Lazy Tricks

Memoization

  • Use Laziness to incrementally build the list of all Fibonacci numbers.
  • The list refers to itself to reuse calculations.
  • The list is in constant applicative form (CAF), so it is memoized.
  • See Haskell wiki for more tricks using infinite data structures to memoize functions.
fib_list :: [Integer]
fib_list = 0:1:1:2:[n2 + n1| (n2, n1) <- 
                    zip (drop 2 fib_list) (drop 3 fib_list)]
fib_best :: Int -> Integer
fib_best n = fib_list !! n

GHCi with timing

*Main> 5 < fib_best 100000
True
(0.86 secs, 449258648 bytes)
*Main> 5 < fib_best 100001
True
(0.02 secs, 0 bytes)
*Main> 

Caching

  • So you have some data/results and some expensive queries against it.
  • You want to perform the queries only when they are needed.
  • You want to perform the queries only once.
  • You must preserve purity.
  • What do you do?
  • Perform the queries and store in the data structure.
  • Laziness means they will only be evaluated once and only when needed.
  • Canned example using really expensive fib.
  • Fibber is a Num type you can do arithmetic on.
  • You can ask it for the value of the Fibonacci number it represents.
  • You won't export the constructor.
fib_worst :: Int -> Integer
fib_worst 0 = 0
fib_worst 1 = 1
fib_worst 2 = 1
fib_worst n = fib_worst (n-2) + fib_worst (n-1)

data Fibber = Fibber{fibNum :: Int, fibValue :: Integer}
makeFibber :: Int -> Fibber
makeFibber a = Fibber a (fib_worst a)
instance Eq Fibber where a == b = fibNum a == fibNum b
instance Ord Fibber where a <= b = fibNum a <= fibNum b
instance Show Fibber where show a = show . fibNum $ a
instance Num Fibber where
    a + b = makeFibber (fibNum a + fibNum b)
    a * b = makeFibber (fibNum a * fibNum b)
    abs = makeFibber . abs . fibNum
    signum = makeFibber . signum . fibNum
    fromInteger = makeFibber . fromInteger
    negate = makeFibber . negate . fibNum

GHCi

*Main> let fibber30 = makeFibber 30
(0.00 secs, 0 bytes)
*Main> let fibber25 = makeFibber 25
(0.00 secs, 0 bytes)
*Main> fibValue (fibber30 - fibber25)
5
(0.00 secs, 0 bytes)
*Main> fibValue fibber30
832040
(1.22 secs, 127472744 bytes)
*Main> fibValue fibber30
832040
(0.00 secs, 0 bytes)
*Main> fibValue fibber25
75025
(0.11 secs, 10684624 bytes)
*Main> 

Time and Space

Time's up

Space Leaks

  • Space Leak - program / expression uses more memory than required.
    • Memory is eventually released
    • You think it will run in constant memory but it doesn’t
    • Building up unevaluated thunks.
    • Unnecessarily keeping reference to data alive.
  • Memory leak - program allocates memory that is never reclaimed.
  • Pure Haskell code can only space leak (no memory leaks).
  • Most other Haskell code should also only space leak (mostly no memory leaks).

Examples from “Leaking Space”

  • Some space leak examples from [Leaking Space - Eliminating memory hogs][8]

    xs = delete "dead" ["alive", "dead"]
  • Lazy evaluation will keep "dead" alive until the evaluation of xs is forced.
  • One form of space leak results from adding to and removing from lists without ever forcing evaluation (to reduce them).

    xs = let xs' = delete "dead" ["alive", "dead"] in xs' `seq` xs'
  • Why not always strict/eager?
    • Composition.
    • Composing strictly requires arguments to be evaluated fully.
    • sum [1..n] will consume O(n) space when evaluated strictly.
    • sum [1..n] can consume O(1) space when evaluated lazily (depending on the implementation).
    • It is usually easier to introduce strictness into lazy code than laziness into strict code.
  • This definition is O(n) space.
  • The list is not actually kept in memory.
  • It accumulates unevaluated (+) operations instead.

    sum1 (x:xs) = x + sum1 xs 
    sum1 [] = 0
  • This definition is O(n) space.
  • Also accumulates (+) operations.

    sum2 xs = sum2' 0 xs 
       where 
       sum2' a (x:xs) = sum2' (a+x) xs 
       sum2' a [] = a
  • This definition is O(1) space.

    sum3 xs = sum3' 0 xs 
       where 
       sum3' !a (x:xs) = sum3' (a+x) xs 
       sum3' !a [] = a
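The bang patterns in sum3 need GHC's BangPatterns extension; the same effect can be sketched with plain seq (sumSeq is my name, not from the article):

```haskell
-- seq reduces its first argument to weak head normal form before
-- returning its second, so the accumulator is kept evaluated
-- instead of growing into a chain of (+) thunks.
sumSeq :: [Integer] -> Integer
sumSeq = go 0
  where
    go a []     = a
    go a (x:xs) = let a' = a + x in a' `seq` go a' xs
```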
  • With optimizations in GHC sum2 may be transformed into sum3 during strictness analysis.
  • The article has more examples of space leaks.

[8]: http://queue.acm.org/detail.cfm?id=2538488 (Leaking Space - Eliminating memory hogs, by Neil Mitchell)

When Strictness can make things slow

  • Maybe one should just throw strictness in everywhere?
  • Something I did not know: pattern matches are strict.
  • Example from the Haskell Wiki.
  • The strict pattern match forces all the recursive calls to splitAt before anything is returned.
import Debug.Trace (trace)

-- Strict
splitAt_sp n xs = trace ("splitAt_sp " ++ show n) $
    if n <= 0
        then ([], xs)
        else
            case xs of
                [] -> ([], [])
                y:ys ->
                    case splitAt_sp (n-1) ys of
                        -- pattern match is strict
                        (prefix, suffix) -> (y : prefix, suffix)
-- Lazy
splitAt_lp n xs = trace ("splitAt_lp " ++ show n) $
    if n <= 0
        then ([], xs)
        else
            case xs of
                [] -> ([], [])
                y:ys ->
                    case splitAt_lp (n-1) ys of
                        -- pattern match is lazy
                        ~(prefix, suffix) -> (y : prefix, suffix)

GHCi

*Main> sum . take 5 . fst . splitAt_sp 10000000 $ repeat 1
5
(20.78 secs, 3642437376 bytes)
*Main> sum . take 5 . fst . splitAt_lp 10000000 $ repeat 1
5
(0.00 secs, 0 bytes)

What to do with thorny space leaks

  • Pinpoint leaks using GHC’s profiling tools.
  • For some domains, libraries exist that eliminate large classes of space leaks by design.
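For the profiling route, GHC's heap profiler attributes memory to cost centres; one can annotate suspect expressions with SCC pragmas and build with profiling enabled (the function below is purely illustrative):

```haskell
-- An SCC pragma names a cost centre; in a profiled build
-- (ghc -prof -fprof-auto, run with +RTS -hc) the heap profile
-- shows how much memory this expression is holding onto.
-- Without profiling the pragma is simply ignored.
suspect :: [Int] -> Int
suspect xs = {-# SCC "suspect_sum" #-} sum xs
```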

Further reading