Create a Test User in Cognito

If you’re using AWS CDK and Cognito, you probably want a test user account. I use one mainly for testing GraphQL queries and mutations in the AppSync console, which requires you to provide a user pool username and password.

Here’s how to create one, using a CloudFormation custom resource to set the password:

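// Assuming CDK v1 package names, as in the other posts here;
// in CDK v2 these live under aws-cdk-lib.
import { CfnUserPoolUser } from "@aws-cdk/aws-cognito"
import {
  AwsCustomResource,
  AwsCustomResourcePolicy,
  PhysicalResourceId,
} from "@aws-cdk/custom-resources"

// The rest goes inside your stack or construct, where this.userPool is your
// user pool and TEST_USER_EMAIL / TEST_USER_PASSWORD are defined elsewhere.

// create the test user with a pre-verified email address
const testUser = new CfnUserPoolUser(this, "TestUser", {
  userPoolId: this.userPool.userPoolId,
  username: TEST_USER_EMAIL,
  userAttributes: [
    { name: "email", value: TEST_USER_EMAIL },
    { name: "email_verified", value: "true" },
  ],
  desiredDeliveryMediums: ["EMAIL"],
})

// set test user password
const setPassword = new AwsCustomResource(this, "SetTestUserPassword", {
  onCreate: {
    service: "CognitoIdentityServiceProvider",
    action: "adminSetUserPassword",
    parameters: {
      UserPoolId: this.userPool.userPoolId,
      Username: TEST_USER_EMAIL,
      Password: TEST_USER_PASSWORD,
      Permanent: true,
    },
    physicalResourceId: PhysicalResourceId.of("SetTestUserPassword"),
  },
  policy: AwsCustomResourcePolicy.fromSdkCalls({
    resources: AwsCustomResourcePolicy.ANY_RESOURCE,
  }),
})
// the user has to exist before its password can be set
setPassword.node.addDependency(testUser)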

Frameworkless Web Applications

Since we have (mostly) advanced beyond CGI scripts and PHP, the default tool many people reach for when building a web application is a framework. Like drafting a standard legal contract or making a successful Hollywood film, it’s good to have a template to work off of. A framework lends structure to your application and saves you from having to reinvent a bunch of wheels. It’s a solid foundation to build on, which can be a substantial “batteries included” model (Rails, Django, Spring Boot, Nest) or a lightweight “slap together whatever shit you need outta this” sort of deal (Flask, Express).

The idea of a web framework is that there are certain basic features that most web apps need and that these services should be provided as part of the library. Nearly all web frameworks will give you some custom implementation of some or all of:

Configuration
Logging
Exception trapping
Parsing HTTP requests
Routing requests to functions
Serialization
Gateway adaptor (WSGI, Rack, WAR)
Middleware architecture
Plugin architecture
Development server

There are many other possible features but these are extremely common. Just about every framework has its own custom code to route a parsed HTTP request to a handler function, as in “call hello() when a GET request comes in for /hello.”
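To make that concrete, here is a toy version of the routing piece. It is a minimal sketch, not how any particular framework does it; `addRoute`, `dispatch`, and the `:name` parameter syntax are made up for the example.

```typescript
// A toy request router: the kind of lookup most web frameworks implement internally.
type Handler = (params: Record<string, string>) => string;

interface Route {
  method: string;
  segments: string[]; // e.g. ["hello", ":name"]
  handler: Handler;
}

const routes: Route[] = [];

function splitPath(path: string): string[] {
  return path.split("/").filter((s) => s.length > 0);
}

function addRoute(method: string, path: string, handler: Handler): void {
  routes.push({ method, segments: splitPath(path), handler });
}

// Find the first route matching the method and path, then call its handler.
function dispatch(method: string, path: string): string {
  const parts = splitPath(path);
  for (const route of routes) {
    if (route.method !== method || route.segments.length !== parts.length) continue;
    const params: Record<string, string> = {};
    const matched = route.segments.every((seg, i) => {
      if (seg.charAt(0) === ":") {
        params[seg.slice(1)] = parts[i]; // capture a path parameter
        return true;
      }
      return seg === parts[i];
    });
    if (matched) return route.handler(params);
  }
  return "404 Not Found";
}

addRoute("GET", "/hello", () => "Hello, world!");
addRoute("GET", "/hello/:name", (p) => `Hello, ${p.name}!`);
```

Calling `dispatch("GET", "/hello")` runs the first handler, and anything unregistered falls through to the 404 string.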

There are many great things to say about this approach. The ability to run your application on any sort of host from DigitalOcean to Heroku to EC2 is something we take for granted, as well as being able to easily run a web server on your local environment for testing. The price is a learning curve: every framework has its own way to register a URL route, log a debug message, or add a custom serializer field.

But maybe we shouldn’t assume that our web apps always need to be built with a framework. Instead of being the default tool we grab without a moment’s reflection, now is a good time to reevaluate our assumptions.

Serverless

What struck me is that a number of the functions that frameworks provide are not needed if I go all-in on AWS. Long ago I decided I’m fine with Bezos owning my soul and acceded to writing software for this particular vendor, much as many engineers have built successful applications locked in to various layers of software abstraction. Early programmers had to decide which ISA or OS they wanted to couple their application to; today we’re still forced to make non-portable decisions, just at a higher layer of abstraction. My Python or JavaScript code will run on any CPU architecture or UNIX OS, but features from my cloud provider may restrict me to that cloud. Which I am totally fine with.

I’ve long been a fan of and written about serverless applications on this blog because I enjoy abstracting out as much of my infrastructure as possible so as to focus on the logic of my application that I’m interested in. My time is best spent concerning myself with business logic and not wrangling containers or deployments or load balancer configurations or gunicorn.

I’ve had a bit of a journey over the years adopting the serverless mindset, but one thing has been holding me back and it’s my attachment to web frameworks. While it’s quite common and appropriate to write serverless functions as small self-contained scripts in AWS Lambda, building a larger application in this fashion feels like trying to build a house without a foundation. I’ve done considerable experimentation mostly with trying to cram Flask into Lambda, where you still have all the comforts of your familiar framework and it handles all the routing inside a single function. You also have the flexibility to easily take your application out of AWS and run it elsewhere.

There are a number of issues with the approach of putting a web framework into a Lambda function. For one, it’s cheating. For another, when your application grows large enough the cold start time becomes a real problem. Web frameworks have the side-effect of loading your entire application code on startup, so any time a request comes in and there isn’t a warm handler to process it, the client must wait for your entire app to be imported before handling the request. This means users occasionally experience an extra few seconds of delay on a request, not good from a performance standpoint. There are simple workarounds like provisioned concurrency but it is a clear sign there is a flaw in the architecture.

Classic web frameworks are not appropriate for building a truly serverless application. It’s the wrong tool for the architecture.

The Anti-Framework

Assuming you are fully bought in to AWS and have embraced the lock-in lifestyle, life is great. AWS acts like a framework of its own providing all of the facilities one needs for a web application but in the form of web services of the Amazonian variety. If we’re talking about RESTful web services, it’s possible to put together an extremely scalable, maintainable, and highly available application.

Logging, monitoring: CloudWatch
Tracing: X-Ray
Alerting: Incident Manager
Configuration, exception trapping, execution: Lambda
HTTP request parsing, request routing: API Gateway
Relational database: Aurora Serverless
Configuration, secrets: Secrets Manager

No Docker, Kubernetes, or load balancers to worry about. You can even skip the VPC if you use the Aurora Data API to run SQL queries.

The above list could go on for a very long time but you get the point. If we want to be as lazy as possible and leverage cloud services as much as possible then what we really want is a tool for composing these services in an expressive and familiar fashion. Amazon’s new Cloud Development Kit (CDK) is just the tool for that. If you’ve never heard of CDK you can read a friendly introduction here or check out the official docs.

In short CDK lets you write high-level code in Python, TypeScript, Java or .NET, and compile it to a CloudFormation template that describes your infrastructure. A brief TypeScript example from cursed-webring:

import { Cors, LambdaIntegration, RestApi } from "@aws-cdk/aws-apigateway";
import { Tracing } from "@aws-cdk/aws-lambda";
import { NodejsFunction } from "@aws-cdk/aws-lambda-nodejs";

// (excerpt from inside a construct's constructor)
// API Gateway with CORS enabled
const api = new RestApi(this, "cursed-api", {
  restApiName: "Cursed Service",
  defaultCorsPreflightOptions: {
    allowOrigins: Cors.ALL_ORIGINS,
  },
  deployOptions: { tracingEnabled: true },
});
// defines the /sites/ resource in our API
const sitesResource = api.root.addResource("sites");
// get all sites handler, GET /sites/
const getAllSitesHandler = new NodejsFunction(
  this,
  "GetCursedSitesHandler",
  {
    entry: "resources/cursedSites.ts",
    handler: "getAllHandler",
    tracing: Tracing.ACTIVE,
  }
);
sitesResource.addMethod("GET", new LambdaIntegration(getAllSitesHandler));

Is CDK a framework? It depends how you define “framework” but I consider it more to be infrastructure as code. By allowing you to effortlessly wire up the services you want in your application, CDK removes the need for any sort of traditional web framework when it comes to features like routing or responding to HTTP requests.

While CDK provides a great way to glue AWS services together it has little to say when it comes to your application code itself. I believe we can sink even lower into the proverbial couch by decorating our application code with metadata that generates the CDK resources our application declares, specifically Lambda functions and API Gateway routes. I call it an anti-framework.
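Here is a rough sketch of the idea: handlers declare their routing metadata, and a generator turns that metadata into descriptions of resources. This is not the @jetkit/cdk API. `lambdaRoute`, `registry`, and `generateResources` are hypothetical names, the wrapper is a plain function rather than a real decorator, and the output is plain objects rather than CDK constructs.

```typescript
interface HandlerResult {
  statusCode: number;
  body: string;
}
type Handler = (event: { path: string }) => HandlerResult;

interface RouteMeta {
  name: string;
  method: "GET" | "POST";
  path: string;
  memorySize: number;
}

// Registry of handlers plus the metadata they were declared with.
const registry: Array<{ meta: RouteMeta; handler: Handler }> = [];

// Wraps a handler, recording its metadata as a side effect.
function lambdaRoute(meta: RouteMeta, handler: Handler): Handler {
  registry.push({ meta, handler });
  return handler;
}

// Application code: the handler and its infrastructure needs live side by side.
const alive = lambdaRoute(
  { name: "alive", method: "GET", path: "/alive", memorySize: 128 },
  () => ({ statusCode: 200, body: "i'm alive" })
);

// "Infrastructure" code: describe what should be created for each handler.
interface ResourceDescription {
  kind: "LambdaFunction" | "ApiRoute";
  id: string;
  props: Record<string, string | number>;
}

function generateResources(): ResourceDescription[] {
  const out: ResourceDescription[] = [];
  for (const { meta } of registry) {
    out.push({
      kind: "LambdaFunction",
      id: `${meta.name}Function`,
      props: { memorySize: meta.memorySize },
    });
    out.push({
      kind: "ApiRoute",
      id: `${meta.name}Route`,
      props: { routeKey: `${meta.method} ${meta.path}` },
    });
  }
  return out;
}
```

A real generator would create a Lambda function and an API Gateway route for each entry instead of returning descriptions, but the flow from metadata to resources is the same.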

@JetKit/CDK

Why TypeScript?

As an aside, TypeScript is now my preferred choice for backend development. JavaScript no, but TypeScript yes. The rapid evolution and improvements in the language with Microsoft behind it have been impressive. The language is as strict as you want it to be. Having one set of tooling, CI/CD pipelines, docs, libraries and language experience in your team is much easier than supporting two. All the frontends we work with are React and TypeScript, so why not use the same linters, type checking, commit hooks, package repository, formatting configuration, and build tools instead of maintaining, say, one set for a Python backend and another for a TypeScript frontend?

Python is totally fine except for its lack of type safety. Do not even attempt to blog at me ✋🏻 about mypy or pylance. It is like saying a Taco Bell is basically a real taqueria. Might get you through the day but it’s not really the same thing 🌮

Construct Generation

So we’ve seen the decorated application code; how does it get turned into cloud resources? With the ResourceGeneratorConstruct, a CDK construct that takes your functions and classes as input and generates AWS resources as output.

import { CorsHttpMethod, HttpApi } from "@aws-cdk/aws-apigatewayv2"
import { Construct, Duration, Stack, StackProps, App } from "@aws-cdk/core"
import { ResourceGeneratorConstruct } from "@jetkit/cdk"
import { aliveHandler, AlbumApi } from "../backend/src"  // your app code
export class InfraStack extends Stack {
  constructor(scope: App, id: string, props?: StackProps) {
    super(scope, id, props)
    // create API Gateway
    const httpApi = new HttpApi(this, "Api", {
      corsPreflight: {
        allowHeaders: ["Authorization"],
        allowMethods: [CorsHttpMethod.ANY],
        allowOrigins: ["*"],
        maxAge: Duration.days(10),
      },
    })
    // transmute your app code into infrastructure
    new ResourceGeneratorConstruct(this, "Generator", {
      resources: [AlbumApi, aliveHandler], // supply your API views and functions here
      httpApi,
    })
  }
}

It is necessary to explicitly pass the functions and classes you want resources for to the generator because otherwise esbuild will optimize them out of existence.

Try It Out

I’m a big fan of Serverless Stack, a lightweight toolkit for doing CDK-driven development with a few very useful features like the most advanced local development environment for AWS serverless applications.

Also please take a look at my starter kit for new serverless TypeScript applications.

Woodworker Designs and Builds the Perfect Tiny House Boat called the Le Koroc — Maybe a foundation isn’t needed after all

Web Services with AWS CDK

If you want to build a cloud-native web service, consider reaching for the AWS Cloud Development Kit. CDK is a new generation of infrastructure-as-code (IaC) tools designed to make packaging your code and infrastructure together as seamless and powerful as possible. It’s great for any application running on AWS, and it’s especially well-suited to serverless applications.

The CDK consists of a set of libraries containing resource definitions and higher-level constructs, and a command line interface (CLI) that synthesizes CloudFormation from your resource definitions and manages deployments. You can imperatively define your cloud resources like Lambda functions, S3 buckets, APIs, DNS records, alerts, DynamoDB tables, and everything else in AWS using TypeScript, Python, .NET, or Java. You can then connect these resources together and into more abstract groupings of resources and finally into stacks. Typically one entire service would be one stack.

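import { App, Stack, StackProps } from "@aws-cdk/core";
import * as s3 from "@aws-cdk/aws-s3";

class HelloCdkStack extends Stack {
  constructor(scope: App, id: string, props?: StackProps) {
    super(scope, id, props);

    // a versioned S3 bucket, the only resource in this stack
    new s3.Bucket(this, 'MyFirstBucket', {
      versioned: true
    });
  }
}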

CDK doesn’t exactly replace CloudFormation because it generates CloudFormation markup from your resource and stack definitions. But it does mean that if you use CDK you never have to write CloudFormation by hand again. CloudFormation is a declarative language, which makes it challenging and cumbersome to do simple things like conditionals, for example changing a parameter value or not including a resource when your app is being deployed to production. When using a typed language you get the benefit of writing IaC with type checking and code completion, and the ability to connect resources together with a very natural syntax. One of the real time-saving benefits of CDK is that you can group logical collections of resources into reusable classes, defining higher level constructs like CloudWatch canary scripts, NodeJS functions, S3-based websites with CloudFront, and your own custom constructs of whatever you find yourself using repeatedly.
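As a rough illustration of why a general-purpose language helps with conditionals, the sketch below builds a CloudFormation-style template as a plain object and uses an ordinary `if` to vary it by stage. It does not use the CDK itself; `buildTemplate`, the `Stage` type, and the bucket names are made up for this example.

```typescript
type Stage = "dev" | "prod";

interface Resource {
  Type: string;
  Properties?: Record<string, unknown>;
}

interface Template {
  Resources: Record<string, Resource>;
}

function buildTemplate(stage: Stage): Template {
  const resources: Record<string, Resource> = {
    UploadsBucket: {
      Type: "AWS::S3::Bucket",
      Properties: {
        // change a property value depending on the stage
        VersioningConfiguration: {
          Status: stage === "prod" ? "Enabled" : "Suspended",
        },
      },
    },
  };

  // only include this resource when deploying to production
  if (stage === "prod") {
    resources.AccessLogsBucket = { Type: "AWS::S3::Bucket" };
  }

  return { Resources: resources };
}
```

`JSON.stringify(buildTemplate("prod"), null, 2)` gives you something shaped like a template. In the real CDK the same `if` simply wraps a `new s3.Bucket(...)` call.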

The CLI for CDK gives you a set of tools mostly useful for deploying your application. A simple cdk deploy parses your stacks and resources, synthesizes CloudFormation, and deploys it to AWS. The CLI is basic and relatively new, so don’t expect a ton of mature features just yet. I am still using the Serverless framework for serious applications because it has a wealth of built-in functionality and useful plugins for things like testing applications locally and tailing CloudWatch logs. AWS’s Serverless Application Model (SAM) is sort of equivalent to Serverless, but feels very Amazon-y and more like a proof-of-concept than a tool with any user empathy. The names of all of these tools are somewhat uninspired and can understandably cause confusion, so don’t feel bad if you feel a little lost.

Sample CDK Application

I built a small web service to put the CDK through its paces. My application has a React frontend that fetches a list of really shitty websites from a Lambda function and saves them in the browser’s IndexedDB, a sort of in-browser object database. The user can view the different shitty websites with previous and next buttons and submit a suggestion of a terrible site to add to the webring. You can view the entire source here and the finished product at cursed.lol.

To kick off a CDK project, run the init command: cdk init app --language typescript.

This generates an application scaffold we can fill in, beginning with the bin/cdk.ts script if using TypeScript. Here you can optionally configure environments and import your stacks.

#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "@aws-cdk/core";
import { CursedStack } from "../lib/stack";

const envProd: cdk.Environment = {
  account: "1234567890",
  region: "eu-west-1",
};

const app = new cdk.App();
new CursedStack(app, "CursedStack", { env: envProd });

The environment config isn’t required; by default your application can be deployed into any region and AWS account, making it easy to share and create development environments. However if you want to pre-define some environments for dev/staging/prod you can do that explicitly here. The documentation suggests using environment variables to select the desired AWS account and region at deploy-time and then writing a small shell script to set those variables when deploying. This is a very flexible and customizable way to manage your deployments, but it lacks the simplicity of Serverless which has a simple command-line option to select which stage you want. CDK is great for customizing to your specific needs, but doesn’t quite have that out-of-the-box user friendliness.
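A small helper along these lines can pick the target from environment variables. `CDK_DEFAULT_ACCOUNT` and `CDK_DEFAULT_REGION` are set by the CDK CLI; `CDK_DEPLOY_ACCOUNT` and `CDK_DEPLOY_REGION` are a convention you set yourself before deploying, and `resolveEnv` is a hypothetical helper name.

```typescript
interface DeployEnv {
  account: string | undefined;
  region: string | undefined;
}

type EnvVars = Record<string, string | undefined>;

// Prefer an explicitly chosen deploy target, falling back to the CLI's defaults.
function resolveEnv(vars: EnvVars): DeployEnv {
  return {
    account: vars.CDK_DEPLOY_ACCOUNT || vars.CDK_DEFAULT_ACCOUNT,
    region: vars.CDK_DEPLOY_REGION || vars.CDK_DEFAULT_REGION,
  };
}
```

In bin/cdk.ts you would then pass `{ env: resolveEnv(process.env) }` when instantiating the stack, and a wrapper script can export the two `CDK_DEPLOY_*` variables before calling cdk deploy.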

DynamoDB

Let’s take a look at a construct that defines a DynamoDB table for storing user submissions:

import * as core from "@aws-cdk/core";
import * as dynamodb from "@aws-cdk/aws-dynamodb";

export class CursedDB extends core.Construct {
  submissionsTable: dynamodb.Table;

  constructor(scope: core.Construct, id: string) {
    super(scope, id);

    this.submissionsTable = new dynamodb.Table(this, "SubmissionsTable", {
      partitionKey: {
        name: "id",
        type: dynamodb.AttributeType.STRING,
      },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });
  }
}

Here we create a table that has a string id primary key. In this example we save the table as a public property (this.submissionsTable) on the instance of our Construct because we will want to reference the table in our Lambda function in order to grant write access and provide the name of the table to the function so that it can write to the table. This concept of using a class property to keep track of resources you want to pass to other constructs isn’t anything particular to CDK – it’s just something I decided to do on my own to make it easy to connect different pieces of my service together.

Lambda Functions

Here I declare a construct which defines two Lambda functions. One function fetches a list of websites for the user to browse, and the other handles posting submissions, which are saved into our DynamoDB submissionsTable as well as sent to me on Slack. I am extremely lazy and manage most of my applications this way. We use the convenient NodejsFunction high-level construct to make our lives easier. This is the most complex construct of our stack. It:

Loads a secret containing our Slack webhook URL
Defines a custom property submissionsTable that it expects to receive
Defines an API Gateway with CORS enabled
Creates an API resource (/sites/) to hold our function endpoints
Defines two Lambda NodeJS functions (note that our source files are TypeScript – compilation happens automatically)
Connects the Lambda functions to the API resource as GET and POST endpoints
Grants write access to the submissionsTable to the submitSiteHandler function

import * as core from "@aws-cdk/core";
import * as apigateway from "@aws-cdk/aws-apigateway";
import * as sm from "@aws-cdk/aws-secretsmanager";
import { NodejsFunction } from "@aws-cdk/aws-lambda-nodejs";
import { LambdaIntegration, RestApi } from "@aws-cdk/aws-apigateway";
import { Table } from "@aws-cdk/aws-dynamodb";

// ARN of a secret containing the slack webhook URL
const slackWebhookSecret =
  "arn:aws:secretsmanager:eu-west-1:178183757879:secret:cursed/slack_webhook_url-MwQ0dY";

// required properties to instantiate our construct
// here we pass in a reference to our DynamoDB table
interface CursedSitesServiceProps {
  submissionsTable: Table;
}

export class CursedSitesService extends core.Construct {
  constructor(
    scope: core.Construct,
    id: string,
    props: CursedSitesServiceProps
  ) {
    super(scope, id);

    // load our webhook secret at deploy-time
    const secret = sm.Secret.fromSecretCompleteArn(
      this,
      "SlackWebhookSecret",
      slackWebhookSecret
    );

    // our API Gateway with CORS enabled
    const api = new RestApi(this, "cursed-api", {
      restApiName: "Cursed Service",
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
      },
    });

    // defines the /sites/ resource in our API
    const sitesResource = api.root.addResource("sites");

    // get all sites handler, GET /sites/
    const getAllSitesHandler = new NodejsFunction(
      this,
      "GetCursedSitesHandler",
      {
        entry: "resources/cursedSites.ts",
        handler: "getAllHandler",
      }
    );
    sitesResource.addMethod("GET", new LambdaIntegration(getAllSitesHandler));

    // submit, POST /sites/
    const submitSiteHandler = new NodejsFunction(
      this,
      "SubmitCursedSiteHandler",
      {
        entry: "resources/cursedSites.ts",
        handler: "submitHandler",
        environment: {
          // let our function access the webhook and dynamoDB table
          SLACK_WEBHOOK_URL: secret.secretValue.toString(),
          CURSED_SITE_SUBMISSIONS_TABLE_NAME: props.submissionsTable.tableName,
        },
      }
    );
    // allow submit function to write to our dynamoDB table
    props.submissionsTable.grantWriteData(submitSiteHandler);
    sitesResource.addMethod("POST", new LambdaIntegration(submitSiteHandler));
  }
}

While there’s a lot going on here it is very readable if taken line-by-line. I think this showcases some of the real expressiveness of CDK. That props.submissionsTable.grantWriteData(submitSiteHandler) stanza is really 👨🏻‍🍳👌🏻. It grants that one function permission to write to the DynamoDB table that we defined in our first construct. We didn’t have to write any IAM policy statements, reference CloudFormation resources, or even look up exactly which actions this statement needs to consist of. This gives you a bit of the flavor of CDK’s simplicity compared to writing CloudFormation by hand.
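For a sense of what that one line saves you, the policy statement you would otherwise write looks roughly like the object below. This is a hand-written approximation, not CDK output: the exact list of actions depends on the CDK version, `writeDataStatement` is a hypothetical helper, and the table ARN is a placeholder.

```typescript
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
}

// Approximation of a write-only grant on a single DynamoDB table.
function writeDataStatement(tableArn: string): PolicyStatement {
  return {
    Effect: "Allow",
    Action: [
      "dynamodb:BatchWriteItem",
      "dynamodb:PutItem",
      "dynamodb:UpdateItem",
      "dynamodb:DeleteItem",
    ],
    Resource: [tableArn],
  };
}

const submitPolicy = writeDataStatement(
  "arn:aws:dynamodb:eu-west-1:123456789012:table/SubmissionsTable" // placeholder ARN
);
```

On top of this you would still need to attach the statement to the function’s execution role, which the grant call also takes care of.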

If you’d like to look at the source code of these Lambdas you can find it here. Fetching the list of sites is accomplished by loading a Google Sheet as a CSV (did I mention I’m really lazy?) and the submission handler does a simple DynamoDB Put call and hits the Slack webhook with the submission. I love this kind of web service setup because once it’s deployed it runs forever and I never have to worry about managing it again, and it costs roughly $0 per month. If a website is submitted I can evaluate it and decide if it’s shitty enough to be included, and if so I can just add it to the Google Sheet. And I have a record of all submissions in case I forget or one gets lost in Slack or something.
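The shape of the submission handler can be sketched as follows. This is not the actual source: `parseSubmission`, `makeSubmitHandler`, and the injected `putItem`, `postToSlack`, and `newId` functions are hypothetical stand-ins for the DynamoDB Put call and the webhook request, so the logic can be read without the AWS SDK.

```typescript
interface Submission {
  id: string;
  url: string;
}

interface SubmitDeps {
  putItem: (item: Submission) => Promise<void>; // stand-in for a DynamoDB Put
  postToSlack: (text: string) => Promise<void>; // stand-in for the webhook POST
  newId: () => string;
}

interface HandlerResult {
  statusCode: number;
  body: string;
}

// Validate the request body; returns the submitted URL, or null if unusable.
function parseSubmission(body: string | null): string | null {
  if (!body) return null;
  try {
    const parsed = JSON.parse(body);
    const url = typeof parsed.url === "string" ? parsed.url.trim() : "";
    return /^https?:\/\//.test(url) ? url : null;
  } catch {
    return null; // not JSON, or not an object
  }
}

function makeSubmitHandler(deps: SubmitDeps) {
  return async (event: { body: string | null }): Promise<HandlerResult> => {
    const url = parseSubmission(event.body);
    if (url === null) {
      return { statusCode: 400, body: "expected JSON with a url field" };
    }
    const submission: Submission = { id: deps.newId(), url };
    await deps.putItem(submission); // record first, so nothing is lost if Slack fails
    await deps.postToSlack(`New cursed site submission: ${url}`);
    return { statusCode: 201, body: JSON.stringify(submission) };
  };
}
```

Writing to the table before notifying Slack is a design choice in this sketch: the table is the record of all submissions, so it should succeed even when the notification does not.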

CloudFront CDN

Let’s take a look at one last construct I put together for this application, a CloudFront CDN distribution in front of an S3 static website bucket. I realized the need to mirror many of these lame websites because due to their inherent crappiness they were slow, didn’t support HTTPS (needed when iFraming), and might not stay up forever. A little wget --mirror magic fixed that right up.

It’s important to preserve these treasures

Typically defining a CloudFront distribution with HTTPS support is a bit of a headache. Again the high-level constructs you get included with CDK really shine here and I made use of the CloudFrontWebDistribution construct to define just what I needed:

import {
  CloudFrontWebDistribution,
  OriginProtocolPolicy,
} from "@aws-cdk/aws-cloudfront";
import * as core from "@aws-cdk/core";

// cursed.llolo.lol ACM cert
const certificateArn =
  "arn:aws:acm:us-east-1:1234567890:certificate/79e60ba9-5517-4ce3-8ced-2d9d1ddb1d5c";

export class CursedMirror extends core.Construct {
  constructor(scope: core.Construct, id: string) {
    super(scope, id);

    new CloudFrontWebDistribution(this, "cursed-mirrors", {
      originConfigs: [
        {
          customOriginSource: {
            domainName: "cursed.llolo.lol.s3-website-eu-west-1.amazonaws.com",
            httpPort: 80,
            originProtocolPolicy: OriginProtocolPolicy.HTTP_ONLY,
          },
          behaviors: [{ isDefaultBehavior: true }],
        },
      ],
      aliasConfiguration: {
        acmCertRef: certificateArn,
        names: ["cursed.llolo.lol"],
      },
    });
  }
}

This creates an HTTPS-enabled CDN in front of my existing S3 bucket with static website hosting. I could have created the bucket with CDK as well, but since there can only be one bucket with this particular domain that seemed a bit overkill. If I wanted to make this more reusable these values could be stack parameters.

The Stack

Finally the top-level Stack contains all of our constructs. Here you can see how we pass the DynamoDB table provided by the CursedDB construct to the CursedSitesService containing our Lambdas.

import * as cdk from "@aws-cdk/core";
import { CursedMirror } from "./cursedMirror";
import { CursedSitesService } from "./cursedSitesService";
import { CursedDB } from "./db";

export class CursedStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const db = new CursedDB(this, "CursedDB");
    new CursedSitesService(this, "CursedSiteServices", {
      submissionsTable: db.submissionsTable,
    });
    new CursedMirror(this, "CursedSiteMirrorCDN");
  }
}

Putting it all together, all that’s left to do is run cdk deploy to summon our cloud resources into existence and write our frontend.

Security Warnings

It’s great that CDK asks for confirmation before opening up ports.

Is This Better?

Going through this exercise of creating a real service using nothing but CDK was a great way for me to get more comfortable with the tools and concepts behind it. Once I wrapped my head around the way the constructs fit together and started discovering all of the high-level constructs already provided by the libraries I really started to dig it. Need to load some secrets? Need to define Lambda functions integrated to API Gateway? Need a CloudFront S3 bucket website distribution? Need CloudWatch canaries? It’s already there and ready to go along with strict compile-time checking of your syntax and properties. I pretty much never encountered a situation where my code compiled but the deployment was invalid, a vastly improved state of affairs from trying to write CloudFormation manually.

And what about Terraform? In my humble opinion if you’re going to build cloud-native software it’s a waste of effort to abstract out your cloud provider and their resources. Better to embrace the tooling and particulars of one provider and specialize instead of pursuing some idealistic cloud-agnostic setup at a great price of efficiency. Multi-cloud is the worst practice.

The one thing that I missed most from the Serverless framework was tailing my CloudWatch logs. When I had issues in my Lambda logic (not something the CDK can fix for you) I had to go into the CloudWatch console to look at the logs instead of simply being able to tail them from the command line. The upshot though is that CDK is simply code, and writing your own tooling around it using the AWS API should be straightforward enough. I expect SAM and the CDK CLI to only get more mature and user-friendly over time, so I imagine I’ll be building projects of increasing seriousness with them as time progresses.
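As an example of such tooling, a log tailer is mostly a polling loop that remembers what it has already printed. The sketch below shows a single polling step with the fetch function injected and kept synchronous for brevity. `pollOnce`, `TailState`, and `FetchEvents` are hypothetical names; in practice the fetch function would wrap a CloudWatch Logs query such as FilterLogEvents.

```typescript
interface LogEvent {
  eventId: string;
  timestamp: number; // milliseconds since epoch
  message: string;
}

// Returns all events at or after startTime (stand-in for a CloudWatch Logs query).
type FetchEvents = (startTime: number) => LogEvent[];

interface TailState {
  startTime: number;
  seenIds: string[]; // ids already emitted with timestamp === startTime
}

// One polling step: fetch, drop events we already emitted, advance the cursor.
function pollOnce(
  state: TailState,
  fetchEvents: FetchEvents
): { fresh: LogEvent[]; next: TailState } {
  const batch = fetchEvents(state.startTime);
  const fresh = batch.filter((e) => state.seenIds.indexOf(e.eventId) === -1);
  if (batch.length === 0) return { fresh, next: state };

  // Move the cursor to the newest timestamp seen so far. Events sharing that
  // timestamp will be returned again next time, so remember their ids.
  const latest = batch.reduce((max, e) => Math.max(max, e.timestamp), state.startTime);
  const seenIds = batch.filter((e) => e.timestamp === latest).map((e) => e.eventId);
  return { fresh, next: { startTime: latest, seenIds } };
}
```

A tail command would call this in a loop with a short sleep, printing `fresh` each time and feeding `next` back in as the new state.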

If you want to learn more, start with the CDK docs. And if you know of any cursed websites please feel free to mash that submit button.

AWS Orchestration / Unemployed DevOps Professionals

Now that we live in the age of ~the cloud~ it strikes me that for many software projects the traditional role of system administrator and its more recently rebranded “DevOps” equivalent are not strictly required.

I can’t speak much to other cloud hosting platforms but Amazon Web Services really is the flyest shit. All of Amazon’s internal infrastructure has been built with APIs to control everything for many years by decree of Jeff Bezos, the CEO. This was a brilliant requirement that he mandated because it allowed Amazon to become one of the first companies able to re-sell its spare server capacity and make their automated platform services available to everyday regular developers such as myself. They’ve been in the business of providing service-oriented infrastructure to anyone with a credit card longer than most everything, and their platform is basically unmatched. Whereas before setting up a fancy HTTPS load balancer or highly available database cluster with automated backups was a time-consuming process, now a few clicks or API calls will set you up with nearly any infrastructure your application requires, with as much reliability and horsepower as you’re willing to pay for.

I’ve heard some people try to argue that AWS is expensive. This is true if you use it like a traditional datacenter, which it’s not. If you try running all your own services on EC2 and pay an army of expensive “DevOps” workers to waste time playing with Puppet or Chef or some other nonsense then perhaps it’s a bit costly. Though compared with power, bandwidth, datacenter, sysadmin and hardware costs and the maintenance overhead of running on your own metal I still doubt AWS is going to run you any more. In all likelihood your application really isn’t all that special. You probably have a webserver, a database, store some files somewhere, maybe a little memcached cluster and load balancer or two. All this can be had for cheap in AWS and any developer could set up a highly available production-ready cluster in a few hours.

Us software engineers, we write applications. These days a lot of them run on the internet and you wanna put them somewhere on some computers connected to the internet. Back in “the day” you might have put them on some servers in a datacenter (or your parents’ basement). Things are a little different today.

Some time ago I moved my hosting from a traditional datacenter to AWS. I really didn’t know a lot about it so I asked the advice of some smart and very experienced nerds. I thought it would be pretty much the same as what I was doing but using elastic compute instances instead of bare metal. They all told me “AWS is NOT a datacenter in the cloud. Stop thinking like that.”

For example, you could spin up some database server instances to run MySQL or PostgreSQL OR you could just have AWS do it for you. You could set up HAProxy and get really expensive load balancers, or simply use an elastic load balancer. Could run a mail server if you’re into that, but I prefer SES. Memcached? Provided by ElastiCache. Thinking of setting up Nagios and Munin? CloudWatch is already integrated into everything.

Point being: all the infrastructure you need is provided by Amazon and you don’t need to pay DevOps jokers to set it up for you. AWS engineers have already done all the work. Don’t let smooth-talking Cloud Consultants talk you into any sort of configuration management time-wasters like Puppet. Those tools impose extra overhead to make your systems declaratively configured rather than imperatively because they are designed for people who maintain systems. In EC2-land you can and should be able to kill off any instance at any time and a new one will pop up in its place, assuming you’re using autoscaling groups. You are using ASgroups, right? You will be soon!

When you can re-provision any instance at will, there is no longer any need to maintain and upgrade configuration. Just make a new instance and terminate the old ones. Provision your systems using bash. Or RPMs if you want to get really fancy. You really don’t need anything else.

I’m a fan of Amazon Linux, which is basically just CentOS. I use a nifty yum plugin that lets me store RPMs in the Simple Storage Service (S3) and have instances authenticate via IAM instance roles. This is a supremely delightful way of managing dependencies and provisioning instances.

The last piece of the puzzle is orchestration; once you have all of your infrastructure in place you still need to perform tasks to maintain it. Updating your launch configurations and autoscaling groups, deploying code to your servers, terminating and rebuilding clusters, declaring packages to be installed on EC2 with cloud-init and so on. You could do all of this by hand maybe or script it, except that you don’t have to because I already did it for you!

To be totally honest, my AWS setup is pretty freaking sweet. The reason it is freaking sweet is because I listened to grumpy old AWS wizards and took notes and built their recommendations into a piece of software I call Udo – short for Unemployed DevOps.

Udo is a pretty straightforward application. It essentially provides a configuration-driven command-line interface to Boto, which is the Python library for interfacing with the AWS APIs. It is mostly centered around autoscaling groups, which are a very powerful tool not only for performing scaling tasks but also for logically grouping your instances. In your configuration file you can define multiple clusters to group your ASgroups, and then define “roles” within your clusters. I use this system to create clusters for development, QA, staging and production, and then in each cluster I have “webapp” roles and “worker” roles, to designate instances which should handle web requests vs. asynchronous job queue workers. You can of course structure your setup however you want though.

Using Udo is as simple as it gets. It’s a Python module you can install like any other (sudo easy_install udo). Once it’s installed you create a configuration file for your application’s setup and a Boto credentials file if you don’t already have one. Then you can take it for a spin.

The cluster/role management feature is central to the design. It makes it so you never have to keep track of individual instances or keep track of IP addresses or run any sort of agents on your instances. Finding all of your stage webapp server IPs for example is as easy as looking up the instances in the stage-webapp autoscaling group. You can easily write tools to automate tasks with this information. We have a script that allows you to send commands to an autoscaling group via SSH, which works by reading the external IPs of the instances in the group. This is so useful we plan on adding it to Udo sometime in the near future, but it’s an example of the sort of automation that would normally require fancy tools and daemons or keeping track of IPs in some database somewhere, but is totally simplified by making use of the tools which Amazon already provides you.
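The lookup described above can be sketched like this. It is not Udo’s code: `Ec2Instance`, `asgName`, `ipsInGroup`, and `sshCommands` are hypothetical, the instance list would come from the autoscaling API rather than a literal, and the cluster-role group naming follows the stage-webapp example.

```typescript
interface Ec2Instance {
  instanceId: string;
  autoScalingGroup: string;
  publicIp: string | null; // instances that are still launching may not have one
}

// Group names follow the "<cluster>-<role>" convention, e.g. "stage-webapp".
function asgName(cluster: string, role: string): string {
  return `${cluster}-${role}`;
}

// External IPs of every instance in the given autoscaling group.
function ipsInGroup(instances: Ec2Instance[], group: string): string[] {
  const ips: string[] = [];
  for (const inst of instances) {
    if (inst.autoScalingGroup === group && inst.publicIp !== null) {
      ips.push(inst.publicIp);
    }
  }
  return ips;
}

// Wrap a string in single quotes so the remote shell sees it as one argument.
function shellQuote(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Build one ssh invocation per instance; actually running them is left to the caller.
function sshCommands(
  instances: Ec2Instance[],
  group: string,
  user: string,
  command: string
): string[] {
  return ipsInGroup(instances, group).map(
    (ip) => `ssh ${user}@${ip} ${shellQuote(command)}`
  );
}
```

No agent, inventory file, or database is involved: the autoscaling group itself is the source of truth for which machines exist.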

Udo has a few nifty features on offer. One handy command is “updatelc” – update launchconfiguration. Normally you cannot modify a launch configuration attached to an autoscaling group, so Udo will instead create a copy of your existing launchconfig and then replace the existing launchconfig on your asgroup, allowing you to apply udo.yml configuration changes without terminating your asgroup. Very handy for not having to bring down production to make changes.
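The copy-and-swap sequence can be sketched against a small interface. This illustrates the steps only and is not Udo’s implementation: the `AutoScalingApi` interface, `updateLaunchConfig`, and the suffix-based naming scheme are all assumptions.

```typescript
interface LaunchConfig {
  name: string;
  imageId: string;
  instanceType: string;
}

// The three operations the sequence needs; a real version would wrap the AWS API.
interface AutoScalingApi {
  getGroupLaunchConfig(group: string): LaunchConfig;
  createLaunchConfig(config: LaunchConfig): void;
  setGroupLaunchConfig(group: string, configName: string): void;
}

function updateLaunchConfig(
  api: AutoScalingApi,
  group: string,
  changes: Partial<LaunchConfig>,
  suffix: string
): LaunchConfig {
  const current = api.getGroupLaunchConfig(group);
  // an attached launch configuration can't be edited, so make a modified copy under a new name
  const updated: LaunchConfig = { ...current, ...changes, name: `${group}-${suffix}` };
  api.createLaunchConfig(updated);
  // point the group at the copy; the group and its running instances stay up
  api.setGroupLaunchConfig(group, updated.name);
  return updated;
}
```

New instances launched by the group pick up the new configuration, while existing ones keep running until they are replaced.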

Another powerful feature is tight integration with CodeDeploy, a recent addition to the AWS ops tools suite. As far as I’m aware Udo is the first and only application to support CodeDeploy at this time and I actually have an epic support ticket open with a sizable pile of feature requests and bug reports. Despite its rather alpha level of quality it is extremely handy and we are already using it in production. It allows AWS to deploy a revision straight from GitHub or S3 to all instances in an autoscaling group or with a particular tag, all without any intervention on your part other than issuing an API call to create a deployment. You can add some hooks to be run at various stages of the deployment for tasks like making sure all your dependencies are installed or restarting your app. I’d honestly say it’s probably the final nail in the coffin for the DevOps industry.

Mischa Spiegelmock

All the text that's fit to blog
