Deploying Haskell applications with ECS, Docker, and Nix

April 09, 2019

I've usually had a lot of problems deploying Haskell applications, but as it turns out, using Terraform and Nix, we can easily get reproducible, single-command builds and deployments. I haven't tried integrating it into CI yet, but it should be fairly easy given working installations of all the build tools.

None of this infrastructure is really specific to Haskell, except for the Nix-specific build support for Haskell, but I think this is a useful example regardless.

The output

At the end of this whole process, we'll have:

- A working Haskell web application that both sends and accepts HTTP requests
  - Deployed on AWS behind a load balancer
- A build script (via Nix) that gives us reproducible builds and Docker images for our Haskell application with a simple nix build .
- An infrastructure script (via Terraform) that reproducibly sets up all the AWS resources (load balancers, compute instances, etc.) that we need

The goal isn't just to have something that's working, it's also to make sure that if someone else walks into our project with no knowledge of how it works, they should be able to get the very same application we have up and running in the cloud with a nix build . followed by a terraform apply.

What you'll need

An AWS account. Note that creating servers on the cloud will cost money, but if you clean up everything afterwards, this shouldn't cost you more than 1 USD.
Nix.
Docker.
Terraform.

The build

For our build process, we need to be able to generate a Docker image containing our application, which makes it super easy for AWS to spin up new servers for us automatically. If you don't have any experience with Docker, that's okay! For our purposes, it's a commonly-used way to package up an application and all its dependencies so that it's easy to deploy somewhere, whether that's AWS, Google Cloud, your own server box, whatever.

We'll be using Nix to generate Docker images for us.¹ Again, if you don't have any experience with Nix, that's okay! For us, Nix handles pulling down package dependencies and producing output, like Stack does, but it works for much more than just Haskell programs.

But before we even get into packaging up our application for deploy, we need an application!

Create a new Haskell application called haskell-cloud-app. I suggest putting the actual application directory inside another directory within the toplevel project directory. That is, you'd have something like this as your directory structure:

haskell-cloud-app/
 └─ haskell-cloud-app/
     ├─ stack.yaml
     ├─ package.yaml
     …  <all the other application code>

This might seem a little redundant, but we'll see why we structure it this way in a bit.

We'll create a simple application that takes in some text and replies with the list of words in said text, using Servant.

Add Servant to your dependencies:

# package.yaml

…
executables:
  haskell-cloud-app-exe:
    dependencies:
      - aeson
      - servant-server
      - warp
…

...and add the following to your Main.hs:

-- Main.hs

{-# LANGUAGE DataKinds         #-}
{-# LANGUAGE DeriveAnyClass    #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeOperators     #-}

module Main where

import Data.Aeson
import GHC.Generics
import Network.Wai.Handler.Warp
import Servant

type WordsAPI = "words"
             :> ReqBody '[PlainText] String
             :> Post '[JSON] Words

data Words = Words { words :: [String] }
  deriving (Eq, Show, Generic, ToJSON)

server :: Proxy WordsAPI
server = Proxy

handler :: Server WordsAPI
handler text = pure $ Words { Main.words = Prelude.words text }

main :: IO ()
main = runSettings appSettings $ serve server handler
  where appSettings = setPort 8000 $ setHost "0.0.0.0" $ defaultSettings

And with that, we can now launch our server with a stack build && stack exec haskell-cloud-app-exe, and send it a request:

$ curl -X POST \
    -H "Content-Type: text/plain;charset=utf-8" \
    -H "Accept: application/json;charset=utf-8" \
    -d "lorum ipsum dolor sit amet" \
    localhost:8000/words

{"words":["lorum","ipsum","dolor","sit","amet"]}
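Since the handler is just Prelude.words wrapped in a record, its behaviour can be checked without starting a server at all. Here's a minimal sketch using only base. Note that wordsOf and renderWordsJson are hypothetical helpers that aren't part of the app (the real server lets aeson do the encoding), and the hand-rolled encoder leans on show for string escaping, which only matches JSON for plain ASCII input.

```haskell
module Main where

import Data.List (intercalate)

-- Hypothetical helper: the pure core of the /words handler.
wordsOf :: String -> [String]
wordsOf = words

-- Hypothetical stand-in for aeson's encoding of the Words record.
-- 'show' escaping matches JSON only for plain ASCII text.
renderWordsJson :: [String] -> String
renderWordsJson ws = "{\"words\":[" ++ intercalate "," (map show ws) ++ "]}"

main :: IO ()
main = putStrLn (renderWordsJson (wordsOf "lorum ipsum dolor sit amet"))
```

This also shows that any run of whitespace (spaces, tabs, newlines) separates words, so the endpoint never returns empty strings.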

Building our app with Nix

Nix provides some utilities for producing Docker images, but in order to use them, we need to write a Nix 'derivation' (Nix's equivalent to a 'package') so that it knows how to build our program. Thankfully, most of the heavy lifting is already done for us by cabal2nix. If you already have Nix installed, you can install cabal2nix by running nix-env -iA cabal2nix -f '<nixpkgs>'.

Before doing anything with cabal2nix, make sure that the Stack project doesn't have a library component in the package.yaml. This is just to ensure that we don't package the library and all its dependencies with our app.

Run cabal2nix . from the root of the Haskell application (i.e. whichever directory has the cabal file/package.yaml in it), and it should spit out something like this:

## haskell-cloud-app.nix

# Arguments to the derivation.
{ mkDerivation, aeson, base, hpack, lib, servant-server, warp }:

# The output.
mkDerivation {
  pname = "haskell-cloud-app";
  version = "0.1.0.0";
  src = ./.;
  isLibrary = false;  # must stay false, so that only the executable is packaged
  isExecutable = true;
  libraryToolDepends = [ hpack ];
  executableHaskellDepends = [ aeson base servant-server warp ];
  testHaskellDepends = [ base ];
  prePatch = "hpack";
  homepage = "https://github.com/githubuser/haskell-cloud-app#readme";
  license = lib.licenses.bsd3;
  mainProgram = "haskell-cloud-app-exe";
}

Save this into haskell-cloud-app.nix. What this says, in Nix-speak, is: haskell-cloud-app is a function that takes in a bunch of dependencies (aeson, hpack, servant-server, etc.) and produces a Haskell program. Which is exactly what we want! We just need to feed this function the arguments it wants.

Thankfully, Nix already has a repository of Haskell packages that we can use.

## default.nix

{ pkgs ? import <nixpkgs> {} }:

with pkgs;

haskell.packages.ghc925.callPackage ./haskell-cloud-app.nix {}

Save this into default.nix. Notice how we specifically use the packages from GHC 9.2.5. If the Stack resolver that you're using uses a different version of GHC, specify that instead.
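If the "function that gets fed its arguments" idea feels abstract, here is a rough analogy in Haskell. This is only a sketch of the idea, not how Nix is implemented: Derivation, PackageFn, and toySet are made-up names, and the real callPackage reads the argument names off the Nix function itself and also accepts overrides (that's the trailing {} in default.nix).

```haskell
module Main where

import qualified Data.Map.Strict as Map

-- A stand-in for a built package; real Nix derivations carry much more.
newtype Derivation = Derivation String
  deriving (Eq, Show)

-- A package description: the argument names it asks for, plus a builder
-- that turns the resolved dependencies into a derivation.
data PackageFn = PackageFn
  { argNames :: [String]
  , build    :: [Derivation] -> Derivation
  }

type PackageSet = Map.Map String Derivation

-- Look up every requested argument by name; report the first one missing.
callPackage :: PackageSet -> PackageFn -> Either String Derivation
callPackage set fn = build fn <$> traverse resolve (argNames fn)
  where
    resolve name =
      maybe (Left ("missing dependency: " ++ name)) Right (Map.lookup name set)

-- Mirrors the argument list of haskell-cloud-app.nix (mkDerivation omitted).
haskellCloudApp :: PackageFn
haskellCloudApp = PackageFn
  { argNames = ["aeson", "base", "hpack", "servant-server", "warp"]
  , build    = \deps -> Derivation ("haskell-cloud-app built with "
                                    ++ show (length deps) ++ " dependencies")
  }

-- A toy package set standing in for haskell.packages.ghc925.
toySet :: PackageSet
toySet = Map.fromList
  [ (n, Derivation n) | n <- ["aeson", "base", "hpack", "servant-server", "warp"] ]

main :: IO ()
main = do
  print (callPackage toySet haskellCloudApp)
  print (callPackage (Map.delete "warp" toySet) haskellCloudApp)
```

The second call fails with a "missing dependency" message, which is roughly the kind of failure you'd hit if a dependency you added to package.yaml isn't in the package set you picked.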

And that's it; everything Nix needs to build our program. Run nix-build default.nix, and Nix should successfully complete and place the output in result/. It might take a while the first time. (You may need to delete Stack's autogenerated .cabal file before building, if you're using hpack.)

Building the Docker image

Now all we need is a Docker image that contains our application and runs our server on startup. Create a directory called docker/ in the outermost project directory, and add another Nix file:

## docker/default.nix

{ pkgs ? import <nixpkgs> {} }:

let
  haskell-cloud-app = import ../haskell-cloud-app { inherit pkgs; };
in

with pkgs;

dockerTools.buildImage {
  name = "haskell-cloud-app-image";

  copyToRoot = buildEnv {
    name = "image-root";
    paths = [ haskell-cloud-app ];
  };

  config = {
    Cmd = [ "${haskell-cloud-app}/bin/haskell-cloud-app-exe" ];
    ExposedPorts = {
      "8000/tcp" = {};
    };
  };
}

(We create it in the outer directory so that changes to the Docker setup don't require Nix to rebuild the entire application.)

Here, Cmd specifies what this image should run on startup; we also need to explicitly tell Docker which ports this image needs to use.

Run a nix-build . in the Docker directory and Nix should successfully give us back a completed Docker image! (Once again, Nix puts the output by default into result.) Load the result into Docker using docker load -i result, and it should spit out a Docker image name that we can finally run!²

$ docker run -p 8000:8000 \
  haskell-cloud-app-image:6c2gzf7qz9a9laha7vyaw76d17na1slx

# in another terminal...
$ curl -X POST \
    -H "Content-Type: text/plain;charset=utf-8" \
    -H "Accept: application/json;charset=utf-8" \
    -d "lorum ipsum dolor sit amet" \
    localhost:8000/words

{"words":["lorum","ipsum","dolor","sit","amet"]}

With that, all the Docker-related setup is done, and we can move onto setting up the AWS infrastructure.

To recap, at this point your directory structure should look like this:

haskell-cloud-app/
 ├─ haskell-cloud-app/
 │   ├─ stack.yaml
 │   ├─ package.yaml
 │   ├─ haskell-cloud-app.nix
 │   ├─ default.nix
 │   …  <all the other application code>
 └─ docker/
     └─ default.nix

The deploy

First things first: Let's make sure that our Terraform installation is up and running by creating the smallest possible piece of our infrastructure: a single EC2 instance to run our code.

Create a terraform/ directory underneath the toplevel, and put the following in main.tf:

provider "aws" {
  region = "us-west-2"
}

resource "aws_security_group" "haskell-cloud-app-access" {
  name = "haskell-cloud-app-access"

  ingress {
    from_port = 8000
    to_port = 8000
    protocol = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port = 0
    to_port = 0
    protocol = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "haskell-cloud-app" {
  ami = "ami-0302f3ec240b9d23c"
  instance_type = "t2.medium"
  vpc_security_group_ids = [
    aws_security_group.haskell-cloud-app-access.id
  ]

  tags = {
    Name = "haskell-cloud-app"
  }
}

Some notes here: we boot our instance from a specific AMI (Amazon Machine Image), ami-0302f3ec240b9d23c, because doing so will make it easier to plug our app into Amazon ECS later, so that we can use the Docker image we've already created to launch our app. (AMI IDs are region-specific, so keep the region set to us-west-2 if you use this one.) We also set up the accessible ports for our instance: we need to allow incoming requests on port 8000, and we allow all outbound access because we'll later be modifying our application to make HTTP requests as well.

If you haven't already, get some AWS access tokens and configure your AWS CLI with aws configure. When it asks for region, specify us-west-2. Then run:

$ terraform init
$ terraform apply

And once it finishes running, voila! You should have a server up and running in the cloud, which you can check by going to the EC2 dashboard from AWS's web interface.

Actually running our app

Well, we've got a server up and running now, but how do we actually run our code on it? AWS has a service called ECS (Elastic Container Service) that can automatically deploy Docker images on our server for us! So let's use that.

First, we need to set up an ECS cluster and connect our server to it. Our server also needs to run with a special set of IAM permissions that allow ECS to manage what runs on it.

Create an IAM role called ecs-instance-role and attach the AmazonEC2ContainerServiceforEC2Role policy to it. Then modify your Terraform file:

resource "aws_ecs_cluster" "haskell-cloud-app" {
  name = "haskell-cloud-app"
}

resource "aws_instance" "haskell-cloud-app" {
  ...

  iam_instance_profile = "ecs-instance-role"

  user_data = <<-EOF
    #!/bin/bash
    hostname ${aws_ecs_cluster.haskell-cloud-app.name}-server
    echo ECS_CLUSTER=${aws_ecs_cluster.haskell-cloud-app.name} \
      >> /etc/ecs/ecs.config
    echo ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION='30m' \
      >> /etc/ecs/ecs.config
  EOF
}

Notice that we specify a script for user_data, which gets run when our instance boots up. Because of the AMI we picked, our instance is already running a program called ecs-agent, which handles registering our instance with our ECS cluster.

Apply everything and you should have an ECS cluster up and running! Check that your EC2 instance shows up under the ECS Instances tab of your cluster. If it does, you're all good!

However, we're still not running any code on our server yet.

Getting ECS to run our Docker image

First things first: we need to put our Docker image somewhere where AWS can reach it!

Let's have Terraform create a Docker repository for us:

resource "aws_ecr_repository" "haskell-cloud-app" {
  name = "haskell-cloud-app"
}

At this point, you know the drill: run terraform apply. Once that's done, tag the Docker image you created as latest and upload it to the newly-created repository; the repository page should have instructions for this.

Finally, we just need to tell our ECS cluster how to launch an instance of this Docker image:

resource "aws_ecs_service" "haskell-cloud-app" {
  name = "haskell-cloud-app-server"
  cluster = aws_ecs_cluster.haskell-cloud-app.id
  task_definition = aws_ecs_task_definition.haskell-cloud-app.arn
  desired_count = 1
  launch_type = "EC2"
}

data "template_file" "task-definition" {
  template = file("haskell-cloud-app-service.json")
  vars = {
    repository_url = aws_ecr_repository.haskell-cloud-app.repository_url
  }
}

resource "aws_ecs_task_definition" "haskell-cloud-app" {
  family = "haskell-cloud-app-service"
  container_definitions = data.template_file.task-definition.rendered
}

ECS also requires us to provide a JSON document describing the parameters of our Docker image, so let's add that to haskell-cloud-app-service.json:

[
  {
    "name": "haskell-cloud-app",
    "image": "${repository_url}:latest",
    "cpu": 10,
    "memory": 1024,
    "essential": true,
    "portMappings": [
      {
        "containerPort": 8000,
        "hostPort": 8000
      }
    ]
  }
]

Notice how we take in the Docker repository URL as a template parameter in our JSON definition, and pass it in using template_file in our Terraform specification.
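To make the templating step concrete, here's a small sketch of the kind of substitution involved: every ${name} placeholder is swapped for the matching variable. The render function is a hypothetical stand-in written for illustration, not Terraform's implementation, and the URL is a placeholder rather than a real repository. For simplicity it leaves unknown placeholders untouched.

```haskell
module Main where

import Data.List (isPrefixOf)

-- Replace each "${name}" with its value from the variable list.
-- Unknown or unterminated placeholders are left untouched in this sketch.
render :: [(String, String)] -> String -> String
render _ [] = []
render vars s@(c:rest)
  | "${" `isPrefixOf` s
  , (name, '}':after) <- break (== '}') (drop 2 s)
  , Just value <- lookup name vars
  = value ++ render vars after
  | otherwise = c : render vars rest

main :: IO ()
main = putStrLn $
  render [("repository_url", "example.invalid/haskell-cloud-app")]
         "\"image\": \"${repository_url}:latest\""
```

Running it prints the "image" line with the placeholder filled in, which is the shape of what ends up in the rendered container definition.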

Before applying this, you may have to rerun terraform init to pull down a templating plugin. Apply these changes and, hey presto, you should see a deployment start on your ECS cluster, eventually resulting in a single running task of the Haskell web app.

Connecting to the outside world

We're nearly there! While our code is running successfully, our web application still isn't accessible from the internet. To fix this, we're going to put a load balancer in front of our ECS cluster. It's not really necessary to have the performance gain and flexibility of a load balancer right now, but our company has big dreams: from these humble beginnings we'll build a multibillion-dollar company!

resource "aws_security_group" "load-balancer-access" {
  name = "load-balancer-access"

  ingress {
    from_port = 80
    to_port = 80
    protocol = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port = 0
    to_port = 0
    protocol = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_alb" "haskell-cloud-app" {
  name = "haskell-cloud-app-lb"
  internal = false
  load_balancer_type = "application"
  security_groups = [aws_security_group.load-balancer-access.id]

  # make sure to set these!!
  subnets = [ "<subnet1>", "<subnet2>" ]
}

resource "aws_alb_target_group" "haskell-cloud-app" {
  name = "haskell-cloud-app-tg"
  port = 8000
  protocol = "HTTP"
  deregistration_delay = 30
  target_type = "instance"

  # make sure to set this as well!!
  vpc_id = "<vpc>"

  health_check {
    path = "/internal/health"
    timeout = 5
    unhealthy_threshold = 10
    healthy_threshold = 2
    interval = 30
  }

  depends_on = [aws_alb.haskell-cloud-app]
}

resource "aws_alb_listener" "http_forward" {
  load_balancer_arn = aws_alb.haskell-cloud-app.arn
  port = "80"
  protocol = "HTTP"

  default_action {
    type = "forward"
    target_group_arn = aws_alb_target_group.haskell-cloud-app.arn
  }
}

Again, you know the drill. Make sure to set some subnet IDs for the load balancer. You should already have some default subnets on your VPC dashboard which you can use; just make sure to include whichever subnet your EC2 instance is in! Additionally, set the vpc_id of the target group to whatever VPC your EC2 instance is in. (Unfortunately, it doesn't seem like there's an easy way to have Terraform automatically get the VPC ID of our EC2 instance.)

We've created our load balancer, but we haven't yet connected our ECS cluster to it to forward traffic to. There's one little snag we need to iron out before we can do that: our load balancer runs a health check on our application to decide whether to forward traffic to it, and we don't have an endpoint for that right now. Fortunately, it's easy enough to add:

-- Main.hs

type HealthCheck = "internal" :> "health" :> Get '[JSON] NoContent

type API = WordsAPI :<|> HealthCheck

server :: Proxy API
server = Proxy

handler :: Server API
handler = wordsHandler :<|> healthCheckHandler
  where wordsHandler :: String -> Handler Words
        wordsHandler text = pure $
          Words { Main.words = Prelude.words text }

        healthCheckHandler :: Handler NoContent
        healthCheckHandler = pure NoContent
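It's worth seeing how the thresholds in the health_check block above play out. The sketch below models the rule that a target flips state only after enough consecutive results disagree with its current state, using the values we configured (2 to become healthy, 10 to become unhealthy). Target, step, and runChecks are hypothetical names, and treating a brand-new target as Unhealthy is a simplification of the load balancer's initial state.

```haskell
module Main where

import Data.List (foldl')

data Health = Healthy | Unhealthy
  deriving (Eq, Show)

-- Current status plus the length of the run of results that disagree with it.
data Target = Target
  { status :: Health
  , streak :: Int
  } deriving (Eq, Show)

-- Values taken from the health_check block above.
healthyThreshold, unhealthyThreshold :: Int
healthyThreshold   = 2
unhealthyThreshold = 10

-- Feed in one check result (True = the check passed).
step :: Target -> Bool -> Target
step (Target Unhealthy n) True
  | n + 1 >= healthyThreshold = Target Healthy 0
  | otherwise                 = Target Unhealthy (n + 1)
step (Target Healthy n) False
  | n + 1 >= unhealthyThreshold = Target Unhealthy 0
  | otherwise                   = Target Healthy (n + 1)
-- A result that agrees with the current status resets the streak.
step (Target s _) _ = Target s 0

-- Simplification: a new target starts out as Unhealthy.
runChecks :: [Bool] -> Target
runChecks = foldl' step (Target Unhealthy 0)

main :: IO ()
main = do
  print (status (runChecks [True]))
  print (status (runChecks [True, True]))
  print (status (runChecks ([True, True] ++ replicate 9 False)))
  print (status (runChecks ([True, True] ++ replicate 10 False)))
```

With interval = 30, this means a fresh task needs about two passing checks before it receives traffic, while a broken one keeps receiving traffic until ten checks in a row have failed.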

Build a new Docker image for the latest version of our application and push it up to AWS. Then kill the currently-running task in ECS and start a new deployment.

With that, we're finally ready to flip the switch and start talking to the real world!

# main.tf

resource "aws_ecs_service" "haskell-cloud-app" {
  ...

  load_balancer {
    target_group_arn = aws_alb_target_group.haskell-cloud-app.arn
    container_port = 8000
    container_name = "haskell-cloud-app"
  }

  ...
}

output "haskell-cloud-app-endpoint" {
  value = aws_alb.haskell-cloud-app.dns_name
}

Thanks to the output stanza, Terraform even helpfully spits out the URL of our now-public application, which we can now send requests to!

$ curl -X POST \
    -H "Content-Type: text/plain;charset=utf-8" \
    -H "Accept: application/json;charset=utf-8" \
    -d "lorum ipsum dolor sit amet" \
    http://haskell-cloud-app-lb-322127660.us-west-2.elb.amazonaws.com/words

{"words":["lorum","ipsum","dolor","sit","amet"]}

Success!

Speaking as well as listening

One last tweak we want to make to our application before we publish it to the whole wide world: have our application make some HTTP requests as well.

These days, there are so many third-party APIs that you need to talk to in order to get anything done, it would be weird if our application didn't send some requests as well as receiving them!

An easy way to check if this is working is to use ngrok and have our application send requests to a server running locally. So let's boot up an ngrok server with ngrok http 3000 and use wreq to have our application send stuff:

-- Main.hs
--   don't forget to add wreq to your dependencies
--   in haskell-cloud-app.nix and package.yaml!

import Control.Monad.IO.Class
import Network.Wreq hiding ( Proxy )

type PingAPI = "ping" :> Post '[JSON] NoContent

type API = WordsAPI :<|> PingAPI :<|> HealthCheck

handler :: Server API
handler = wordsHandler :<|> pingHandler :<|> healthCheckHandler
  where ...

        pingHandler :: Handler NoContent
        pingHandler = do
          _ <- liftIO $ post "https://4ce0a7ed.ngrok.io" ([] :: [Part])
          pure NoContent

This compiles just fine, but if we create the Docker image and test it locally, we run into a problem. Sending a request to the /ping endpoint results in this error:

HttpExceptionRequest Request {
  -- a bunch of stuff
}
ConnectionFailure Network.BSD.getProtocolByName:
 does not exist (no such protocol name: tcp)

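The problem is that the Docker image we created is pretty minimal; it only contains the executables we need to run our application. Two things that a normal server would have are missing from our image. The first is /etc/protocols, the file that maps protocol names like tcp to protocol numbers, which is what the error above is complaining about. The second is the set of root SSL certificates; without these, no secure HTTPS requests can be made!

But, as is often the case with installing things with Nix, adding these back in is a cinch: the iana-etc package provides /etc/protocols, and cacert provides the root certificates. Modify docker/default.nix: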

# docker/default.nix

...

dockerTools.buildImage {
  ...

  copyToRoot = buildEnv {
    ...
    paths = [ haskell-cloud-app iana-etc cacert ];
  };

  ...
}

and rebuild the Docker image. Run the image and send a POST request to /ping, and voila! You should see a request pop up in the ngrok terminal, which then 502s. Our application has successfully(?) made an HTTPS request!
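If you'd rather catch this class of problem at startup than on the first outbound request, a small sanity check is one option. The sketch below is an assumption-laden example, not part of the app: requiredFiles and missing are hypothetical names, and the two paths are my assumption about where iana-etc and cacert place their files under the image root, so verify them against your own image.

```haskell
module Main where

import System.Directory (doesFileExist)

-- Files the HTTP client stack expects; the paths are assumptions based on
-- where the iana-etc and cacert packages place them under the image root.
requiredFiles :: [(FilePath, String)]
requiredFiles =
  [ ("/etc/protocols", "protocol names such as tcp (iana-etc)")
  , ("/etc/ssl/certs/ca-bundle.crt", "root certificates (cacert)")
  ]

-- Pure core: given an existence result per path, list what is missing.
missing :: [(FilePath, Bool)] -> [FilePath]
missing checks = [path | (path, exists) <- checks, not exists]

main :: IO ()
main = do
  checks <- mapM (\(path, _) -> (,) path <$> doesFileExist path) requiredFiles
  case missing checks of
    []    -> putStrLn "all runtime files present"
    paths -> mapM_ (\p -> putStrLn ("missing: " ++ p)) paths
```

Calling something like this before runSettings would turn a confusing runtime exception into an obvious message in the container logs.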

Wrapping up

With that, we've set up easy processes around all the fundamental things we need to build and deploy a modern cloud-based web application!

Before we burn any more money, let's tear down all of the AWS infrastructure by running terraform destroy.

Now, this is pretty good as a start for our application, but we shouldn't stop here! There are lots of important enhancements we can make on the infrastructural side of our app.

For instance, our load balancer currently only supports HTTP; we should make it use HTTPS by default, and forward any HTTP traffic to HTTPS. Doing this will require setting up SSL certificates, which you can do using either AWS's own Certificate Manager or a free service like Let's Encrypt.

Another thing that would be good is setting up an actual domain name for our application using Route 53 and connecting it to our load balancer; that way, our users don't have to use the long, icky domain name that Elastic Load Balancer generates by default.

Eventually, we'll probably also want a database to store... data. Terraform can handle setting this up for us as well!

But for a first pass, this is pretty good! We have a complete end-to-end working application in the cloud with minimal build and deployment effort; now all we have to worry about is developing a great, user-friendly application.

Here's the full application. Feel free to use it as a template for building your own projects.

Having trouble getting your own Haskell app to run in Docker/AWS? Got more questions about how to make other bits of infrastructure play nice? Talk to me!

Footnotes

¹ You might ask, why not just use Dockerfiles to build our application/Docker image? We could, but honestly, Dockerfiles are just unpleasant to work with, for a bunch of different reasons:

The results of Dockerfile builds aren’t reproducible! Even if you haven’t changed a single bit of code, trying to build a Dockerfile you wrote months or years ago might mysteriously break things; say, if the build installs a different version of a library or CLI tool which isn’t compatible with your program. This isn’t just a purely academic concern, either; I’ve run into this footgun many times, and every time it’s just as annoying as you might expect.
Build caching in Docker depends on the ordering of statements, even if different orders would produce the same results. This unnecessarily slows down builds.
Getting minimal Docker images requires a lot of manual work. This is helped somewhat by Docker’s multi-stage builds, but it still requires you to know all the runtime dependencies of your application. Nix handles this for you automatically.

² You might need to do this on a Linux box; at least, I had some issues with getting this to work properly on OSX.
