Wednesday, January 27, 2016

The Book of F#, Chapter 5: Let's Get Functional

This chapter looks at what functional programming really is. In it there are a number of concepts that we’ve either seen in code or alluded to in text as part of the previous chapters, and it was valuable to get more detailed information about things that I had ‘seen but not understood’.

I also feel vindicated for skipping Chapter 4 (for the time being): the author suggests that we should always favour a functional approach when developing in F#, which is precisely the reason that I didn’t want to read about its object-oriented capabilities before understanding its functional core.

Also, the chapter is quite long. As such, I’ll probably write this post in two halves and upload the first once it’s finished, with the second half following a couple of days afterwards.


What is Functional Programming

We’ve touched on this before, but it’s nice to think of functional programming as applying idempotent functions to data. That is, the data structures and objects we create are immutable, and the functions we write will always evaluate to the same output for a fixed input.

Bringing these two concepts together is referential transparency, which encapsulates the situation I have just described by stating that an expression can be directly replaced by its result without affecting the program’s execution. An example would be the ability to replace let x = 2 + 3 with let x = 5, which means that the additional operator + is referentially transparent.

Whilst these are only a couple of functional programming concepts, they are two that make a lot of sense to me, and towards which I like to strive even when writing object-oriented code.


Programming with Functions

F# functions are built from the premise that they always accept a single input, and always return a single output. We discussed this, and the handy unit type, in my previous post, but it comes up again here. Specifically, this fact allows the compiler to assume that the last expression evaluated in the function is the return value, so you don’t have to explicitly specify this with a keyword.

Functions as Data

Similar to delegation in C# is the treatment of functions as data. In F# this is given first-class support, which is one of the features that distinguishes it as a functional language. As a result, we can pass around functions like we can any other object, and we’ll see how useful this is later on in the post. For the time being we will just note the use of ->, which is an infix token that means ‘take something of the left-hand type and return something of the right-hand type’, e.g. int -> int.

The concept of higher-order functions is also introduced here, albeit briefly. To the untrained eye this makes function signatures long and confusing (if you haven’t seen it before, could you possibly understand (string -> string) -> string -> unit?), but hopefully I will get used to it given time.


Currying

A lot of detailed stuff is written about curried functions. I am choosing to deliberately simplify my understanding of it as such:

In a language where functions accept exactly one input, currying is how we create things with more than one input.

In essence, if we want to be able to write things like let add a b = a + b, then there has to be something behind the scenes that means this doesn’t type check to a function taking two inputs. This thing is currying, or automatic function chaining, which will look at the statement about and say:

Hey, if add a b needs to return an int, then add a needs to return something that takes b and returns an int.

As you might have figured out, this means the type of add is val add : a:int -> b:int -> int. In English, add is a function that takes an integer a and returns another function. The function it returns takes another integer b and returns a third integer (here a + b).

The author uses a nice trick to show another way of thinking about this: if we write the function as let add a = fun b -> (+) a b, it’s pretty clear that add a returns a function!

This is all very interesting from a theoretical point of view, but it seems to me that in day-to-day development, you really don’t need to know or care that your functions are being curried behind the scenes. Perhaps it helps in understanding the types that are inferred from your code, but beyond that, currying alone does not seem much to write home about.

Partial application

This is another subtle distinction that I really don’t want to get in to. Partial application means that if you call a function with an incomplete set of arguments, it will be evaluated as far as possible.

This seems to have huge benefit for confusion — what if you accidentally miss out a parameter somewhere in your calling code? Rather than not compiling, your work will be valid but ultimately output something of the wrong type, which might be a bit of a shock when you come to use it days or weeks later.

Pipelining and Composition

If I sounded nonplussed about currying, the opposite is true for two other functional features that we are introduced to here.

The ability to send the result of one function to another using one of the pipeline operators (|> and <| for forwards and backwards respectively) seems incredibly useful at first glance. So much of the code that I write comprises half a dozen or so procedural steps, the result of each being passed in to the next statement. The fact that this can be done with an operator in F# is great. Here’s the example from the book:

let fahrenheitToCelsius degreesF = (degreesF - 32.0) * (5.0 / 9.0)

let marchHighTemps = [ 33.0; 30.0; 33.0; 38.0; 36.0; 31.0; 35.0;
                   42.0; 53.0; 65.0; 59.0; 42.0; 31.0; 41.0;
                   49.0; 45.0; 37.0; 42.0; 40.0; 32.0; 33.0;
                   42.0; 48.0; 36.0; 34.0; 38.0; 41.0; 46.0;
                   54.0; 57.0; 59.0 ]
marchHighTemps
|> List.average
|> fahrenheitToCelsius
|> printfn "March Average (C): %f"

Simple, clear, and so much nicer than object-oriented code!

Similar in nature and style is the concept of function composition, which takes two functions and returns a third. Keeping with the example above, we could define

let averageInCelsius = List.average >> fahrenheitToCelsius

which binds two steps of the computation in one function. Pretty damn cool!

I reckon these two features, combined together and in the right project, could be reason alone to pick F# over C#. It’s a bold statement, but notice I said in the right project. I can foresee that in a lot of applications, this kind of syntax won’t be use nearly often enough to reap any reward.


Recursive Functions

We said in the previous post that recursion is the preferred method of looping in a functional language, and now we’ll get to see why, and how.

What I (naïvely) didn’t realise is that all the traditional lopping constructs (while, for and foreach) rely on a chance in state to terminate, and so don’t really mesh with pure functional programming so well. Instead, if we have a recursive function that calls itself, all we need is some base case that causes it to eventually terminate.

This looks very similar to induction to my maths brain, in that we say what to do in the base case as well how any other case relates to it’s predecessor (i.e. the inductive step).

In F#, we do recursion with the rec keyword. The example given is the classic factorial computation:

let rec factorial v =
  match v with | 1 -> 1
               | _ -> v * factorial (v - 1)

This reads almost as-is: if the input is 1 then return 1, otherwise return v times the same function called with v - 1.

On the negative side, this is not tail-call optimised, and is the kind of function that gave rise to the most famous of programming websites, Stack Overflow. Basically, writing the function like this means that if we start with a really big number, we will end up with a new stack frame for each function call, which will cause a Stack Overflow!

We get around this fairly simply by ensuring that the recursive call is done last (it might look like this is the case for our current implementation, but it’s actually the multiplication operation that end up in the tail position). One way to do this is by keeping track of both the current number we are ‘down to’ as well as the product computed so far:

let factorial v =
  let rec fact current product =
    match current with | 0 -> product
                       | _ ->   fact <| current - 1 <| current * product
   fact v 1

As shown here, this form will allow the program to use a JUMP statement or similar and replace the arguments to fact on each iteration, rather than calling the function again.

Whilst you can obviously write a recursive function in C#, the compiler doesn’t support tail recursion, so this gives F# a big thumbs up from me.

Mutually Recursive Functions

I’m not entirely sure how often I will use this concept, but as the name suggests, two functions can call each other recursively. The example from the book is the fibonacci sequence:

let fibonacci n =
  let rec f = function
          | 1 -> 1
          | n -> g (n - 1)
  and g = function
      | 1 -> 0
      | n -> g (n - 1) + f (n - 1)
  f n + g n

Nice, but I cannot immediately think of an example in my current work where I would use this.


Lambda Expressions

These are something that are very familiar to me, as they are used extensively in C#. They seem pretty similar in F# to be honest, using the same -> token and with broadly similar use cases (i.e. when we have inline expressions that are only ever used in a very narrow scope.


Next Time

I’ll leave it there for the time being, and finish this post off within a few days. Here’s a sneak peek of what’s left to cover in this chapter.

  • Closures
  • Functional Types
  • Discriminated Unions
  • Lazy Evaluation
  • Summary

Reprise

As promised (though a week or so after originally planned), it’s time to finish off my overview of what functional programming comprises. It looks like the book is quite in-depth at parts, so I will skip anything that is more details than necessary.


Closures

These are a big feature that I’m already quite familiar with from C#, and are quite well documented (see this post by Jon Skeet as an example).

In a nutshell, they allow a function to capture a value that is visible to it, even if it’s not part of the function. A good example of this is something that increments a global counter:

let g() = 
    let mutable counter = 0;
    fun () -> counter <- counter + 1

The only thing I noticed here was something that I first encountered back in Chapter 3: as of F# 4.0, you can mutable keyword for values that change inside closures. Beforehand this wasn’t possible and you needed a reference cell instead.


Functional Types

This rather long section boils down to Tuples, Records, and a lot of syntax. On first reading this, I thought there was a lot to take in. However, I since read some notes the form part of the University of Washington’s Programming Language Course which explained that tuples are just syntactic sugar for records with particular component naming conventions. I will talk a lot more about this when I eventually go through those notes in a subsequent post, so here I will focus more on the F# syntax rather than the more precise semantics.

Tuples

Tuples are a way to group an arbitrary number of values in a single construct. In F# they are created by separating their arguments with ,, and have a type signature where elements are separated by *:

> let tuple1 = 1.0, "hello";;

val tuple1 : float * string = (1.0, "hello")

They are also, as of .NET 4, in C#. I haven’t worked with them much in a C# context as rather than writing stuff like

var tuple = new Tuple<string, int>("Doug", 28)
Console.WriteLine(tuple.Item1 + ": " + tuple.Item2)

I would rather write

var person = new Person("Doug", 28)
Console.WriteLine(person.Name + ": " + person.Age)

The second version makes it so much more obvious what’s going on.

In F# they seem to have much better first-class support (probably because F# is based off ML, which has first-class tuple support), and so are a bit easier to work with. There are still limitations, though. You can access the first and second elements of a pair with fst and snd, but to access all three elements of a triple you would have to use Item1 etc. or deconstruct the tuple:

let triple = "Doug", 28, "London"
  let name, age, city = triple 
  in (...)

Essentially, you construct only to deconstruct at first use, which doesn’t immediately chime with me.

It’s worth bearing in mind how often tuples are used within F#, though. Consider calling a .NET library function that takes two arguments. We’ve seen already that every function in F# takes in one arguments and returns one result. So how can we call String.Format, for example? Using a tuple! F# converts the function signature of things like this to be a function from the required tuple type, to the output type. This gives the illusion of calling things in C# style, which provides some comfort I suppose.

Furthermore, F# uses tuples in place of out parameters. To be honest, I don’t like using these in C# (the TryGetxxx functions make me squirm!), but using the magic of tuples F# can return two results that are boxed up into one:

let parseSuccessful, parsedValue = Int32.TryParse "10"

This, I admit, is nicer that C#. I’ve already written too many custom return structs for functions where I want to return more than one thing, and F# solves these issues in a trifle.

So, tuples are great for returning stuff and calling .NET library functions, but look a bit rubbish when explicitly creating them yourself.

Record Types

I’ve already said that tuples are just record types with a special syntax. So what does one look like in F#?

type person = { Name : string; Age : int; City : string }

Isn’t this exactly what I wanted earlier in C# and wrote a class for? Yes it is! We get the best of both world here, in my opinion: creating custom data types without needing whole classes, and being able to access the bits of a class in a strongly-typed and clear manner.

There are a few other syntax bits with tuples that might be useful: you can do a ‘copy and update’:

let doug = { Name = "Doug"; Age = 27; City = "London" }
let itsMyBirthday = { doug with Age = 28}

You can also declare individual bits of a tuple as mutable, but that sounds like a road I’d rather not go down given that I am trying to learn a relatively pure functional paradigm.

Finally, you can add a member to a record:

type person = { Name : string; Age : int; City : string }
              member x.Description() =
                x.Name + " lives in " x.City 

This is because they are syntactic sugar for classes! All along we really wanted a Person class, and now we have it. But we also have a really nice way of declaring it, and excellent type-safety.


Discriminated Unions

Returning to the UW Programming Languages course, we can now see that tuples and records are ‘each of’ types.

The F# version of a ‘one of’ type is the discriminated union. This is something where the values are restricted to be one of a set of cases. Each case can be of any type, and these can be recursive:

type Tree =
    | Empty
    | Node of int * Tree * Tree

The above type says that a tree is either the tip of the tree, or it is a node that has a value int, and left and right Tree instances.

This is another area that I’ll cover in more detail when I go through the UW course in a future post, but I think it’s still useful to know now that all this data type is doing is providing us with a way to ‘tag’ data with it’s type, so in the above example we can say that something that’s empty is really Empty, and something that looks like int * Tree * Tree is really a Node.

As of F# 3.1, you can actually create instances of these types using named arguments, so we might have actually written

type Tree =
    | Empty
    | Node of value : int * left : Tree * right : Tree

let newNode = Node (value = 3, left = EmptyTree, right = EmptyTree)

The power of discriminated unions is really in their ability to be pattern matched. As an example, summing the nodes of the tree could be done as:

let rec sumTree tree =
    match tree with
    | Empty -> 0
    | Node(value, left, right) ->
        value + sumTree(left) + sumTree(right)

let myTree = Node(0, Node(1, Node(2, Empty, Empty), Node(3, Empty, Empty)), Node(4, Empty, Empty))

let resultSumTree = sumTree myTree

I don’t think I’ve ever summed the contents of a tree in C#, but it would definitely have more if or switch statements than that!

The final couple of points on discriminated unions are that they can be convenient aliases, and they can be extended with additional members. Coincidentally, I have been reading Eric Lippert’s blog series on OCaml and in one of the posts he uses a very similar aliasing technique to ensure type safety when passing around primitive stuff. That is, we could treat a binary tree value as

type NodeValue = NodeValue of int

In a more complex module, this would prevent us passing in another random int as a node value (e.g. the height of the tree). This syntax is so helpful and so lightweight, I am amazed it isn’t more ubiquitous in programming languages.


Lazy Evaluation

This is another feature that has made it across to C#. Once again it’s something I haven’t used explicitly that much. However, I am familiar with the concept from things like IEnumerable<T>.

In F#, we have syntactic support for lazy evaluation with the lazy keyword, which creates an instance of Lazy<T> that can be executed using the Force() function or Value property. There’s not much more to say at this stage, lazy evaluation is a useful concept and simple to implement in F#!


Summary

There was a lot in this chapter. Some of the concepts made a lot more sense after reading around the topic, espeically in the area of tuples, records and discrinimated unions. Yet further topics were already familiar from C# such as closures and lazy evaluation, and others from my brief forays in to Haskell such as currying and partial application.

The topics slowly sink in, but will take more time! Next up I will break away from the book and go back to some more academic and basic principles to cement my knowledge as well as do some real exercises.

No comments:

Post a Comment