Thursday, October 29, 2009

How do I compile, and what the heck is a monad, anyway?

Compiling

So you've written a Haskell program. It's the most fantastically awesome thing ever saved to a hard drive. It will surely make you millions of dollars. But there's a problem. How do you compile it?

Fortunately, compiling is very simple. Just run this from a terminal:
ghc foo.hs

(.hs is a commonly-used extension for Haskell source code files.)
Much like gcc, this will compile the source in foo.hs to a.out. (GHC 7.0 and later name the executable after the source file instead, so there you get foo.) If you want to choose the file name yourself, you can use the -o flag, like so:
ghc -o bar foo.hs

This will compile foo.hs to the file bar.
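If you want something small to try the commands above on, here is a minimal sketch of what foo.hs could contain. The name greeting and the message are made up for illustration; any program that defines main will do.

```haskell
-- foo.hs: a tiny program to try the compiler on.

-- The text to print (a made-up example value).
greeting :: String
greeting = "Hello from Haskell!"

-- Every compiled Haskell program starts at main.
main :: IO ()
main = putStrLn greeting
```

After compiling with ghc -o bar foo.hs, running ./bar should print the greeting.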

What's this monad thing?

Monads are weird. There's really no better word for them. You see, functions in pure functional languages aren't supposed to have any side effects. They're supposed to return a value and do nothing else. This means that, technically, a pure functional language can't do some of those less important things - you know, things that nobody ever uses. Things like random numbers. Things like I/O.

Fortunately, being a bunch of really smart guys, the creators of Haskell decided this was ridiculous. The solution to this problem? Monads. Monads allow Haskell to do things like I/O that pure functional languages can't normally do. Random numbers can be handled without monads, but we'll start with them anyway.

In the Random module (called System.Random in current versions of the random package), there's a function called mkStdGen. It takes an Int seed and returns a StdGen (a specific implementation of a random number generator) with that seed. There's also a function called next. Let's look at the type.
> import Random
> :t next
next :: (RandomGen g) => g -> (Int, g)

In both the interactive shell and programs to be compiled, you can bring in external modules with import.
RandomGen is a type class for generators of random stuff. (Surprising, I know.) The next function takes a random generator and returns a pair containing a number (your random number) and a new random generator. This new random generator is the key.
In a language like C++ or Java, the next function would return only the number, and modify the existing random number generator. In Haskell, we can't do that; that would be a side effect. So, instead of modifying the generator, we just make a new one. It essentially works like this: instead of modifying the current generator so it will produce the next number in the sequence on the next call, we create a new one which will produce the next number in the sequence.
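To make the "new generator out" idea concrete, here is a rough sketch that avoids depending on the Random library at all. Gen, mkGen and nextInt are hypothetical stand-ins I made up for StdGen, mkStdGen and next; the arithmetic is a toy formula, not what the real library does.

```haskell
-- A toy generator: just a wrapper around its current state.
-- (Hypothetical stand-in for StdGen, not the real implementation.)
newtype Gen = Gen Int

-- Build a generator from a seed, in the spirit of mkStdGen.
mkGen :: Int -> Gen
mkGen = Gen

-- Same shape as next: hand back a number plus a *new* generator.
-- The old generator is left untouched.
nextInt :: Gen -> (Int, Gen)
nextInt (Gen s) = (s', Gen s')
  where s' = (s * 1103515245 + 12345) `mod` 2147483648

-- Draw two numbers by threading the generator through by hand.
twoNumbers :: Gen -> (Int, Int, Gen)
twoNumbers g0 =
  let (a, g1) = nextInt g0   -- first number, and the generator to use next
      (b, g2) = nextInt g1   -- second number comes from g1, not g0
  in  (a, b, g2)

main :: IO ()
main = do
  let (a, b, _) = twoNumbers (mkGen 42)
  print (a, b)
```

Notice that calling nextInt on the same generator twice gives the same number twice. To get a different number you have to use the generator that came back from the previous call.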

That's the basic trick that monads build on. (StdGen itself isn't a monad; here we are threading the state by hand, which is exactly the bookkeeping a monad can hide.) Instead of modifying the state (which we aren't allowed to do), we pass it in as a parameter, and get a new one out as a result. For the remainder of this post, I'll mostly be looking at the IO monad. This monad lets us essentially treat I/O actions like values.

Let's look at the type of the function putStrLn, which prints a String to the console.
> :t putStrLn
putStrLn :: String -> IO ()

Notice those parentheses? IO isn't a single type; it's a whole family of them. (This isn't specific to IO; it's the case with all monads in Haskell.) IO on its own is a type constructor: it needs a type argument saying what the action produces. Here the argument is (), the unit type, so putStrLn gives back an IO action that carries no useful result. Compare that to the type signature of another function, getLine. This one reads a line from the console.
> :t getLine
getLine :: IO String

getLine takes no parameters; it is an IO action that produces a String when it runs. Let's look at a simple Haskell program using these two functions.
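Since an IO () is just a value until something runs it, you can pass actions around like any other data. Here is a small sketch of that; greet and twice are names I invented for the example.

```haskell
-- An action that prints a fixed message. Defining it prints nothing.
greet :: IO ()
greet = putStrLn "hello"

-- Put the same action into a list twice. Still nothing is printed.
twice :: IO () -> [IO ()]
twice act = [act, act]

-- Only when main runs the list do the messages appear.
main :: IO ()
main = sequence_ (twice greet)
```

sequence_ comes from the Prelude and runs each action in the list in order, so this program prints hello two times.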

main = (getLine) >>= (\x -> putStrLn x)

First, the backslash. This is used for lambda expressions - basically functions that are defined inline. That >>= works somewhat like bash's pipe. It passes the value produced by the action on its left to the function on its right. If you have a lot of actions, this could get unwieldy, so Haskell provides an alternative syntax:
main = do
    x <- getLine
    putStrLn x

do hooks everything following it together with >>=, with the added benefit of being able to bind values to names (that's what the <- does). This program will read a line from the console and print it back. Now, let's try something a bit more complicated.
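To see how the two notations line up once there is more than one bind, here is a sketch of the same program written both ways. The names echoTwiceBind, echoTwiceDo and joinLines are mine, chosen just for this example.

```haskell
-- Pure helper: glue two lines together with a space.
joinLines :: String -> String -> String
joinLines first second = first ++ " " ++ second

-- Two reads and a print, written with explicit binds.
echoTwiceBind :: IO ()
echoTwiceBind =
  getLine >>= \first ->
  getLine >>= \second ->
  putStrLn (joinLines first second)

-- The same program in do notation.
echoTwiceDo :: IO ()
echoTwiceDo = do
  first  <- getLine
  second <- getLine
  putStrLn (joinLines first second)

main :: IO ()
main = echoTwiceDo
```

Each <- line in the do version corresponds to one >>= and one lambda in the explicit version, and the name on the left of <- is the lambda's parameter.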

main = do
    x <- getLine
    x1 <- reverse x
    putStrLn x1

Looking at it, it should do the same thing as the last program, except it should print the line back reversed. However, if we try to compile this, we get an error:
cat.hs:3:1:
    Couldn't match expected type `[Char]'
           against inferred type `IO Char'
    In a stmt of a 'do' expression: x1 <- reverse x
    In the expression:
        do x <- getLine
           x1 <- reverse x
           putStrLn x1
    In the definition of `main':
        main = do x <- getLine
                  x1 <- reverse x
                  putStrLn x1

Every statement in a do expression must be monadic, and reverse x is just a plain String. So how do we include the results of non-monadic functions? We do this:
main = do
    x <- getLine
    x1 <- return (reverse x)
    putStrLn x1

See return? It does not do what Java or C's return does. Take a look at the type:
> :t return
return :: (Monad m) => a -> m a

Monad is a type class; it should be pretty obvious what kind of types it contains. So return takes a something with type a, and gives back a monad containing something with type a. In fact, these two "somethings" are the same something. Return wraps a value in a monad so it can be used in a monadic sequence.
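Two things about return are easier to see in code: it works for any monad, not just IO, and it doesn't end the function the way Java's or C's return does. The sketch below uses Maybe and lists purely as other examples of types in the Monad class; asMaybe and asList are made-up names.

```haskell
-- return specialised to two other monads from the Prelude.
asMaybe :: Maybe Int
asMaybe = return 3   -- wraps 3 in the Maybe monad

asList :: [Int]
asList = return 3    -- wraps 3 in the list monad

main :: IO ()
main = do
  -- This wraps a String in IO and then throws it away.
  -- It does NOT exit main.
  _ <- return "this does not exit main"
  putStrLn "still running"
```

Running this prints still running, which shows that the line after return is still executed.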

6 comments:

  1. Um, you're not supposed to use <- and then use "return", that seems ridiculously redundant and is an instant sign of Haskell newbie.

    Why use "<- return" when you can just go, for your example:

    main = do
        x <- getLine
        let x1 = reverse x
        putStrLn x1

    ?

  2. Keep in mind that >>= and do notation are equivalent. What you're writing is:

    getLine >>= (\x -> (return (reverse x) >>= (\x1 -> putStrLn x1)))

    You're tying way too much into the monad there, you should go:

    getLine >>= (\x -> let x1 = reverse x in putStrLn x1)

  3. Also, The Standard Number generator you used there wasn't actually a monad at all, and you weren't 100% clear on that. It would only become a monad if you used a State monad with the StdGen.

    Also, IO is a really, really, really bad example to teach Monads. It's basically only decent for teaching IO.

  4. "...that seems ridiculously redundant and is an instant sign of Haskell newbie." <-- Hey, be nice!

    Master_bratac: if you haven't already, you should drop by the #haskell IRC channel on freenode.net: it's a friendly place with lots of people that like answering questions about Haskell. =)

  5. Also:
    main = getLine >>= (putStrLn . reverse)

  6. Something else that may be of interest to you:

    > ghc foo.hs

    While this will work for the simplest case, it will fail with linking errors as soon as you start using functions that aren't in the base package (which only contains the most basic commodities). You could specify each package used by your program, and that's what "cabal" does, since it needs to be precise about the versions and names of the packages used to ensure compatibility. But there's a better solution for your own usage:

    > ghc --make foo.hs

    With "--make", ghc will find the dependencies of your program itself and give the linker the references it needs. It will even compile those dependencies if they need it (if they changed since the last time or weren't compiled in the first place).
