A Balanced Look at GraphQL

Is GraphQL a tool that will make your life better or worse?

TL;DR: as with any technology it depends on your situation and needs. There are real benefits for developer experience and solving many related problems with a cohesive architecture, but it’s not just a simple drop-in replacement for REST, HATEOAS or {g,t}RPC. GraphQL is better for serious projects with interactive user interfaces and a rich relational data model. If you are willing to adopt a new paradigm and use it as a foundation for a new application, it can be worthwhile and make your life moderately easier.

I’ll summarize the tradeoffs here and go into more detail with examples and explanations afterwards.

Pros:

Can obviate the need for a web framework or server entirely
End-to-end type safety
Great for language-agnostic APIs (vs. something like tRPC) and code generation
Presents a unified client interface that can weave together independent services or functions on the backend
Implement simple handlers without even talking to your application backend
Add (simple) declarative access controls
Smart caching
Real-time push events (WebSockets) are solved for you
Documentation generation for free (like OpenAPI but better)
Client has more control over what data is fetched retrieved
Designed for data structured as a graph (obviously) or hierarchical/relational data
Fetch all the data the application needs in a single request (great for mobile apps)
Can segregate your application services into independent units to get the benefits of a microservice architecture without the overhead and complexity

Cons:

Requires some specialized tools to work with compared to REST
Not as universally understood and adopted as REST
Can be more strict than necessary for your project
Requires some sort of extra schema definition step to add a new “endpoint”
Not necessarily suitable for public/unauthenticated APIs
May involve code generation, an extra build step
Unusual HTTP semantics
- All requests are to the same endpoint
- All responses are 200 OK
- Everything is a POST
No support for untyped/unstructured fields, just “string”

Caveats

I’m writing based on my experience with particular implementations of GraphQL servers and clients. My setup has been AWS AppSync, graphql-code-generator, TypeScript, React, and Apollo Client. Some of my opinions are relevant to this setup and some are broadly applicable. I think AWS and React are common enough technologies that it’s worth looking more at some concrete implementation details there.

Other Opinions

GraphQL was designed and created by Facebook, a company that seems to make a lot of money and build performant software and complex user interfaces that work across a variety of platforms. They are also behind React and are worth taking seriously when it comes to web technologies, even if they were originally cursed with PHP and MySQL.

Anecdotally I’ve seen some articles and videos across the transom bagging on GraphQL. I recently listened to a terrific interview on the Changelog podcast diving deep into API design, and the design of hypermedia and HATEOAS which I highly recommend. There was some very light and admitted highly subjective criticism of GraphQL on it but they are keeping an open mind. I hope to fill some open minds with my arguments for and against GraphQL, which is the best you can do with any technology other than say “it depends.” The episode and transcript can be found here:

Changelog & Friends 24: HATEOAS corpus – Listen on Changelog.com

Amazon has some arguments for GraphQL over REST as well.

What is GraphQL?

GraphQL is an attempt to add another layer of abstraction and standardization on top of traditional HTTP-based APIs. There’s nothing amazingly groundbreaking about it, but it provides for a standard architecture that enforces many practices that can be beneficial for people building moderately serious apps. You must define a schema for the types of requests your clients can make, what fields appear in the request and response bodies, what changes can be subscribed to (think WebSockets), and extensions that can be implemented by the gateway (think permissions or @deprecated).

This schema can be derived from code or written before code. It contains enough information to generate types so your clients and servers know exactly what will be in a given request, which is also enabled by the validation that will be automatically performed to ensure all requests and responses strictly conform to the declared schema. So as far as checking types of data or getting only the fields you explicitly allow in a request, that is one less thing to worry about on the server side and is a more secure default than REST APIs which allow the client to send any sort of data and fields it pleases with a request. The schema also enables documentation, client generation, a fully discoverable API, and a playground for testing requests. Very similar to what you get if you use OpenAPI (aka Swagger) generation tools with REST APIs, but in my opinion it’s much smoother and pleasant compared with OpenAPI. When you define an operation (equivalent of an endpoint), it must be a query, mutation, or subscription. This is very much like HTTP verbs like GET and POST, but more helpfully explicit and strict about the semantics.

The API discovery of fields and operations gives a bit more flexibility in API design and the actual work that must be done to make sure the client can retrieve all the data it needs in a single request. I’m going to borrow the examples from graphql.org here:

On the left is a query which is asking for two separate pieces of information at the same time: give me the `name` of `hero` and give me the `name` of the `droid` with `id` `2000`.

The schema definition and server implementation can make many fields on objects like hero or droid available to be queried, optionally spawning additional requests to fulfill the request. This lets the API designer expose all the data they deem necessary for the client to have at some point without having to define endpoints or parameters to handle every permutation of data fetches and information a client may wish to display in a given interface. It pushes a little more work onto the client side of things but in a very natural way. Let the client decide for a given situation what data it needs, ask the server, which then will issue the appropriate requests to fulfill the request. This may sound a bit like a “backend for frontend” often implemented in microservice architectures. GraphQL is a wonderful replacement for a BFF.

A Replacement for Web Frameworks

This is going to be controversial statement but you can replace your web server, web framework, DTOs (data transfer objects, also called schemas), OpenAPI generation tooling, BFF, and application load balancer with GraphQL. It solves all of these problems for you in an elegant and well-thought-out cohesive architecture. But it is a different paradigm than your typical backend developer is used to.

Boring History

As a certified Old Person building web applications since about 2000, I’ve encountered a good number of web frameworks. Perl, Python, Java, Node, Ruby, they’re all the exact same thing from Catalyst to Flask to Spring to Express to Nest to Rails. In a nutshell they:

Let you map routes to function handlers
Define what shape data should be in for requests and responses
Validate API requests
Run in some container to service requests

They typically have some other helpful things like ORM integration and middleware but these are trivial to incorporate if desired.

What? How Can I Live Without A Web Framework??

Do I detect a note of panic? Relax. GraphQL handles these things for you. It maps requests to handlers, strictly defines what shape data will be in going in and out of your endpoints, and takes care of the HTTP request handling layer for you. So you can define a schema and then functions which are invoked as needed to service client requests, possibly in parallel or in response to certain fields being requested by the client.

Arguably at the end of the day we have in its end result the same functionality whether we use a web framework or GraphQL, but I believe the bundling of this functionality with the extra thought and care that has gone into making GraphQL a more modern solution is worth serious consideration for new projects. Web application frameworks haven’t changed in twenty years, best practices have evolved since then. In the end you can achieve the same results with CGI scripts or PHP 3, or <form> tags and jQuery, but we know that there have been advancements in tooling and architecture since then.

Maybe I Do Need A Web Framework?

As I started off with, I am absolutely not arguing GraphQL is better in every way or should be used in all situations. There are a great many valid reasons for not choosing to adopt GraphQL for your project. A few off the top of my head:

Adding a new GraphQL-based service when you already have other APIs (REST or otherwise). GraphQL is a terrific unifier of services but a big ask of teams to adopt incrementally. Managing tooling for two API paradigms sounds like extra suffering. I will point out that you can use a REST API Gateway as a data source for an AppSync GraphQL service though.
Small projects. Often I am working on projects that are business-critical, involve large relational databases, will be worked on and maintained actively for years. If you want to throw together a small project or a prototype, the extra overhead is not worth it. The payoffs come with larger applications.
When your frontend and backend are tightly coupled. GraphQL adds an extra layer of abstraction which is extraneous if your backend and frontend are part of the same application. For example if you are using NextJS as both client and server, I would recommend server actions or tRPC.
Public APIs. If you are building an API which does not require authentication, or is designed to be easy to consume by clients outside your organization, I would advise sticking to something everyone understands and can be easily tested with tools like CURL. If you want other developers to get up and running and consuming your API with a minimum of extra obstacles, just stick with REST like everyone else.

However if you are embarking on a greenfield, serious project and you expect your API to scale with your organization and provide a high level of strictness and clarity and documentation, GraphQL may be a strong candidate.

Type Safety

I have similar feelings about GraphQL as I do about TypeScript. Sure, people have been building web applications just fine thank you with REST and JavaScript for many years, maybe around 25 at this point which is a lifetime in an engineer’s career. But there comes a point, maybe after another late-night debugging session where you eventually discover some mistyped parameter or property that shouldn’t have existed at runtime is ruining your weekend. Maybe you wonder, why do all these things have to be runtime errors? Maybe the Python heads don’t know there is a better way and think all errors should be runtime errors. I don’t know.

GraphQL and TypeScript ask a bit more of you, the person designing the API or writing the code to implement it. It’s more steps to solve the same problem and you just want to close this ticket. Why waste my time? The answer is that our tools can yell at us before we ship the code, not after. Obviously not in all cases, but many classes of bugs become less likely to happen at runtime, which does save us time and suffering in the long run. They increase maintainability, readability, code completion, AI copilots, and documentationability.

Same reason we write tests. It takes more time to write tests than to not write tests, but it pays off in the long run if we think about the runtime bugs we avoid and the safety we get when refactoring code, and the lower likelihood of someone less familiar with the system breaking other things in a few years time when they try to change something. Again not every software project needs tests, but you know who you are.

GraphQL’s strictness is especially helpful if annoying in the context of an API. A published API is a contract with your clients you must not break. If a client relies on a username field being present in some endpoint, you must keep that field around. If you want to move or rename the field or make it a different type, you risk breaking every old version of your clients which still expect that field to exist with a particular type. GraphQL doesn’t solve the problem by allowing you to just change schemas up on your clients as you like, but it can alert you to the fact you are making a backwards-incompatible schema change and yell at you. This is a very useful CI action to run. For example the graphql-inspector:

Because there’s nothing worse than accidentally pushing an API change that breaks every client. These are the kinds of validations we can leverage with GraphQL.

GraphQL Server

You do of course need some software that performs the duties of a GraphQL API server. There are many very featureful and robust options like Hasura and Apollo, with features like taking your existing Postgres database and generating operations and types, or federation where you can have multiple servers handling different pieces of your graph.

Myself, I always default to using an AWS service if it is good enough to suit my needs. It’s not because AWS has the greatest version of every service you will never need. Reasons why:

The more infrastructure Amazon runs instead of me, the better.
- It will be more reliable than if I run it myself
- I don’t have to maintain it
- They have great documentation and support and monitoring
- I don’t ever have to think about scaling anything
AWS services are designed to work with each other. You end up with a cohesive architecture that is more than the sum of its parts.
- Integration with distributed tracing
- Integration with other AWS data sources
  - Can run queries on RDS directly
  - Can query DynamoDB directly
- Integrated authentication with Cognito
Pay-as-you-go billing. Great for prototyping on the cheap but being able to scale to handle serious volume without changing anything.

AWS AppSync

So I’ve been using AWS AppSync. It’s a bit oddly named but it’s a serverless GraphQL API. You upload a schema to it, define how users authenticate to it, and then connect operations to resolvers. You can tell it to directly query RDS or DynamoDB, you can write JavaScript resolvers that are executed inside AppSync (this is very cool but also maybe unwieldy to maintain or debug, I would compare it to using stored procedures in your DB instead of application code), and you can use lambda functions to handle requests. It can handle caching and tracing for you.

A powerful architecture is to use a serverless GraphQL server and functions defined for most of your application query, mutation, and subscription operations. Depending on your setup and needs this can give you many of the benefits of a microservices-style architecture without all the headaches and overhead introduced. You may have different problems (mostly cold starts) but I am a big of it regardless.

To give an example, suppose a client runs a getProfile query. You want some business logic to grab the currently authenticated user’s profile information from the database and return it. You would create a standalone lambda function in whatever language you like that is called when the getProfile query is performed, do your business logic, and return the result. AppSync handles the request/response validation and authentication and dispatch. This function is self-contained, not connected to anything else in your application, which has some consequences:

You can deploy new versions of this function without changing anything else in your application, if you want.
You can define scaling limits, memory limits, and timeout for this function, if you want.
You can define exactly what permissions the function has if you want to, what security groups it’s in, its IAM role. This is the absolute maximum application of the principle of least privilege, if fine-grained security is important for you.
You can write it in whatever language you want, build it with whatever tooling you want.
Different teams and projects can manage different subsets of functions, enabling teams to work independently but each be contributing to a unified service.

These are all optional benefits. You can also use a single IAM role and toolchain and language and limits for all functions in your application, but you also have the flexibility to slice up pieces of your services into independent units to avoid a monolith or big ball of mud situation. You can create a “monolith” library of all of your business logic and repositories but package it as separate functions. Again, if you want. You can also create a single resolver function that handles all requests if you prefer simplicity. Or do both!

This architecture is most suited to compiled languages (like Rust or Go) or JavaScript/TypeScript because each function should be small and self-contained. This is where JavaScript’s tree-shaking tools really shine. If your function loads some data from the database and imports some functions, that is all of the code that will be in your function. Your codebase may grow but the resolver functions stay the same size. If you want you can have one large monolithic-style library, or a set of libraries, where each resolver function only pulls out the pieces it needs from the library and compiles or bundles it into a small piece of code that is invoked on-demand by your GraphQL server. Basically any language should be fine for this setup unless it’s Python.

In my opinion this is a wonderful modern application of the UNIX tool philosophy: do one thing and do it pretty well. Create small self-contained units that can be assembled by the user to suit their needs.

Another thing AppSync handles for you are subscriptions. These let your client application subscribe to events or data updates in real-time. Setting up and administering your own WebSocket services can be a giant pain in the ass, especially if behind a load balancer. GraphQL and AppSync make this Just Work and Not Your Problem.

My biggest problem with AppSync is that it does not support unauthenticated requests. Any application will have a need for these, for example handling signup or forgot password flows (arguably these can be handled out of the box by Cognito but you will have other needs). So your application may need something like a REST API to fall back on for these types of requests or to use an unauthenticated Cognito identity pool role (cool although a pain to set up). I hope AWS adds a better solution some day.

Just REST With Extra Steps?

Some feedback from my coworkers when building a project making heavy use of GraphQL was “it feels like REST with extra steps.” Which is absolutely true, though the number of extra steps can depend quite a bit on your setup, like if you’re doing code-first or schema-first, need to run a code generation step, and depending on how you consume the API.

Speaking as generally as possible, it’s likely building a REST API will be faster and easier than GraphQL. If these are your priority, you probably want REST. It’s standard, you can build new endpoints quickly without having to define the request and response shapes, polymorphic types are much easier to deal with, you probably don’t need any code generation or extra tooling, and everyone’s familiar with it.

GraphQL will probably be more complex but this complexity pays off depending on your application. Your technical and business requirements are the deciding factor. If end-to-end type safety and validation is important to you, or self-documenting APIs, or real-time subscriptions, or you need to do relational queries via your API, or want client generation for different platforms, GraphQL may save you time and be more maintainable.

If your application isn’t so serious, or is small and unlikely to grow into a large API surface, if it’s just you or a small team, or if your frontend and backend can be tightly coupled, I would probably do a REST API due to its simplicity and low overhead.

GraphQL vs. REST: When to Use Each

When to Use GraphQL

GraphQL is particularly effective in scenarios with:

Complex Data Requirements: When applications need to retrieve complex, nested data structures in a single request, GraphQL provides a more efficient way to query the exact data required.
Dynamic Queries: In cases where clients may need to change their data requirements frequently, GraphQL enables clients to request only the fields they need, reducing the payload size and improving performance.
Multiple Resources: If your application architecture involves multiple resources that often link together, GraphQL allows for a more streamlined approach to fetching related data in a single round trip.
Real-Time Functionality: Applications that benefit from real-time updates, such as chat applications or live dashboards, can leverage GraphQL’s subscription capabilities for handling real-time data effortlessly.

When to Use REST

If you’re looking for:

Simplicity and Speed: For simple applications with straightforward API needs, REST provides a quick and easy method to implement CRUD operations with standard HTTP verbs.
Less Frequent Changes: If your API endpoints and data requirements are stable and unlikely to change frequently, REST’s fixed endpoints reduce complexity and can be more straightforward for consumer integration.
Well-Defined Resource Models: When your application is built around a clear resource structure that maps directly to HTTP verbs, REST’s resource-oriented approach is a natural fit, making it easy to understand and implement.
Support for Caching: RESTful services often leverage HTTP caching mechanisms effectively. If caching responses is a key consideration for performance, REST can provide built-in support for caching through standard HTTP headers.

In conclusion, evaluating your application’s specific needs will guide your choice between GraphQL and REST. For complex, dynamic data interactions with a focus on developer flexibility and efficiency, opt for GraphQL. For simpler use cases with stable requirements and resource-oriented designs, REST remains a robust and effective solution.

Mastering JavaScript Tree-Shaking

One fantastic feature of JavaScript (compared to say, Python) is that it is possible to bundle your code for packaging; removing everything that is not needed to run your code. We call this “tree-shaking” because the stuff you aren’t using falls out leaving the strongly-attached bits. This reduces the size of your output, whether it’s a NPM module, code to run in a browser, or a NodeJS application.

By way of illustration, suppose we have a script that imports a function from a library. Here it calls apple1() from lib.ts:

If we bundle script.ts using esbuild:

esbuild --bundle script.ts > script-bundle.js

Then we get the resulting output:

The main point here being that only apple1 is included in the bundled script. Since we didn’t use apple2 it gets shaken out like an overripe piece of fruit.

Motivation

There are many reasons this is a valuable feature, the main one is performance. Less code means less time spent parsing JavaScript when your code is executed. It means your web app loads faster, your docker image is smaller, your serverless function cold start time is reduced, your NPM module takes up less disk space.

A robust application architecture can be a serverless architecture where your application is composed of discrete functions. These functions can function like an API if you put an API gateway or GraphQL server that invokes the functions for different routes and queries, or can be triggered by messages in queues or files being uploaded to a bucket or as regularly scheduled events. In this setup each function is self-contained, only containing whatever code is needed for its specific functionality and no unrelated code. This is in contrast to a monolith or microservice where the entire application must be loaded up in order to handle a request. No matter how large your project gets, each function remains about the same size.

I build applications using Serverless Stack, which has a terrific developer experience focused on building serverless applications on AWS with TypeScript and CDK in a local development environment. Under the hood it uses esbuild. Let’s peek under the hood.

Mechanics

Conceptually tree-shaking is pretty straightforward; just throw away whatever code our application doesn’t use. However there are a great number of caveats and tricks needed to debug and finesse your bundling.

Tree-shaking is a feature provided by all modern JavaScript bundlers. These include tools like Webpack, Turbopack, esbuild, and rollup. If you ask them to produce a bundle they will do their best to remove unused code. But how do they know what is unused?

The fine details may vary from bundler to bundler and between targets but I’ll give an overview of salient properties and options to be aware of. I’ll use the example of using esbuild to produce node bundles for AWS Lambda but these concepts apply generally to anyone who wants to reduce their bundle size.

Measuring

Before trying to reduce your bundle size you need to look at what’s being bundled, how much space everything takes up, and why. There are a number of tools at our disposal which help visualize and trace the tree-shaking process.

Bundle Buddy

This is one of the best tools for analyzing bundles visually and it has very rich information. You will need to ask your bundler to produce a meta-analysis of the bundling process and run Bundle Buddy on it (it’s all local browser based). It supports webpack, create-react-app, rollup, rome, parcel, and esbuild. For esbuild you specify the --metafile=meta.json option.

When you upload your metafile to Bundle Buddy you will be presented with a great deal of information. Let’s go through what some of it indicates.

Let’s start with the duplicate modules.

This section lets you know that you have multiple versions of the same package in your bundle. This can be due to your dependencies or your project depending on different versions of a package which cannot be resolved to use the same version for some reason.

Here you can see I have versions 3.266.0 and 3.272 of the AWS SDK and two versions of fast-xml-parser. and The best way to hunt down why different versions may be included is to simply ask your package manager. For example you can ask pnpm:

$ pnpm why fast-xml-parser

dependencies:
@aws-sdk/client-cloudformation 3.266.0
├─┬ @aws-sdk/client-sts 3.266.0
│ └── fast-xml-parser 4.0.11
└── fast-xml-parser 4.0.11
@aws-sdk/client-cloudwatch-logs 3.266.0
└─┬ @aws-sdk/client-sts 3.266.0
  └── fast-xml-parser 4.0.11
...
@prisma/migrate 4.12.0
└─┬ mongoose 6.8.1
  └─┬ mongodb 4.12.1
    └─┬ @aws-sdk/credential-providers 3.282.0
      ├─┬ @aws-sdk/client-cognito-identity 3.282.0
      │ └─┬ @aws-sdk/client-sts 3.282.0
      │   └── fast-xml-parser 4.1.2
      ├─┬ @aws-sdk/client-sts 3.282.0
      │ └── fast-xml-parser 4.1.2
      └─┬ @aws-sdk/credential-provider-cognito-identity 3.282.0
        └─┬ @aws-sdk/client-cognito-identity 3.282.0
          └─┬ @aws-sdk/client-sts 3.282.0
            └── fast-xml-parser 4.1.2

So if I want to shrink my bundle I need to figure out how to get it so that both @aws-sdk/client-* and @prisma/migrate can agree on a common version to share so that only one copy of fast-xml-parser needs to end up in my bundle. Since this function shouldn’t even be importing @prisma/migrate (and certainly not mongodb) I can use that as a starting point for tracking down an unneeded import which will discuss shortly. Alternatively you can open a PR for one of the dependencies to use a looser version spec (e.g. ^4.0.0) for fast-xml-parser or @aws-sdk/client-sts.

With duplicate modules out of the way, the main meat of the report is the bundled modules. This will usually be broken up into your code and stuff from node_modules:

When viewing in Bundle Buddy you can click on any box to zoom in for a closer look. We can see that of the 1.63MB that comprises our bundle, 39K is for my actual function code:

This is interesting but not where we need to focus our efforts.

Clearly the prisma client and runtime are taking up sizable parcels of real-estate. There’s not much you can do about this besides file a ticket on GitHub (as I did here with much of this same information).

But looking at our node_modules we can see at a glance what is taking up the most space:

This is where you can survey what dependencies are not being tree-shaken out. You may have some intuitions about what belongs here, doesn’t belong here, or seems too large. For example in the case of my bundle the two biggest dependencies are on the left there, @redis-client (166KB) and gremlin 97KB). I do use redis as a caching layer for our Neptune graph database, of which gremlin is a client library that one uses to query the database. Because I know my application and this function I know that this function never needs to talk to the graph database so it doesn’t need gremlin. This is another starting point for me to trace why gremlin is being required. We’ll look at that later on when we get into tracing. Also noteworthy is that even though I only use one redis command, the code for handling all redis commands get bundled, adding a cost of 109KB to my bundle.

Finally the last section in the Bundle Buddy readout is a map of what files import other files. You can click in for what looks like a very interesting and useful graph but it seems to be a bit broken. No matter, we can see this same information presented more clearly by esbuild.

esbuild –analyze

Your bundler can also operate in a verbose mode where it tells you WHY certain modules are being included in your bundle. Once you’ve determined what is taking up the most space in your bundle or identified large modules that don’t belong, it may be obvious to you what the problem is and where and how to fix it. Oftentimes it may not be so clear why something is being included. In my example above of including gremlin, I needed to see what was requiring it.

We can ask our friend esbuild:

esbuild --bundle --analyze --analyze=verbose script.ts --outfile=tmp.js 2>&1 | less

The important bit here being the --analyze=verbose flag. This will print out all traces of all imports so the output gets rather large, hence piping it to less. It’s sorted by size so you can start at the top and see why your biggest imports are being included. A couple down from the top I can see what’s pulling in gremlin:

   ├ node_modules/.pnpm/gremlin@3.6.1/node_modules/gremlin/lib/process/graph-traversal.js ─── 13.1kb ─── 0.7%
   │  └ node_modules/.pnpm/gremlin@3.6.1/node_modules/gremlin/index.js
   │     └ backend/src/repo/gremlin.ts
   │        └ backend/src/repo/repository/skillGraph.ts
   │           └ backend/src/repo/repository/skill.ts
   │              └ backend/src/repo/repository/vacancy.ts
   │                 └ backend/src/repo/repository/candidate.ts
   │                    └ backend/src/api/graphql/candidate/list.ts

This is extremely useful information for tracking down exactly what in your code is telling the bundler to pull in this module. After a quick glance I realized my problem. The file repository/skill.ts contains a SkillRepository class which contains methods for loading a vacancy’s skills which is used by the vacancy repository which is eventually used by my function. Nothing in my function calls the SkillRepository methods which need gremlin, but it does include the SkillRepository class. What I foolishly assumed was that the methods on the class I don’t call will get tree-shaken out. This means that if you import a class, you will be bringing in all possible dependencies any method of that class brings in. Good to know!

@next/bundle-analyzer

This is a colorful but limited tool for showing you what’s being included in your NextJS build. You add it to your next.config.js file and when you do a build it will pop open tabs showing you what’s being bundled in your backend, frontend, and middleware chunks.

The amount of bullshit @apollo/client pulls in is extremely aggravating to me.

Modularize Imports

It was helpful for learning that using top-level Material-UI imports such as import { Button, Dialog } from "@mui/material" will pull in ALL of @mui/material into your bundle. Perhaps this is because NextJS still is stuck on CommonJS, although that is pure speculation on my part.

While you can fix this by assiduously doing import { Button } from "@mui/material/Button" everywhere this is hard to enforce and tedious. There is a NextJS config option to rewrite such imports:

  modularizeImports: {
    "@mui/material": {
      transform: "@mui/material/{{member}}",
    },
    "@mui/icons-material": {
      transform: "@mui/icons-material/{{member}}",
    },
  },

Webpack Analyzer

Has a spiffy graph of imports and works with Webpack.

Tips and Tricks

CommonJS vs. ESM

One factor that can affect bundling is using CommonJS vs. EcmaScript Modules (ESM). If you’re not familiar with the difference, the TypeScript documentation has a nice summary and the NodeJS package docs are quite informative and comprehensive. But basically CommonJS is the “old, busted” way of defining modules in JavaScript and makes use of things like require() and module.exports, whereas ESM is the “cool, somewhat less busted” way to define modules and their contents using import and export keywords.

Tree-shaking with CommonJS is possible but it is more wooley due to the more procedural format of defining exports from a module whereas ESM exports are more declarative. The esbuild tool is specifically built around ESM, in the docs it says:

This way esbuild will only bundle the parts of your packages that you actually use, which can sometimes be a substantial size savings. Note that esbuild’s tree shaking implementation relies on the use of ECMAScript module import and export statements. It does not work with CommonJS modules. Many packages on npm include both formats and esbuild tries to pick the format that works with tree shaking by default. You can customize which format esbuild picks using the main fields and/or conditions options depending on the package.

So if you’re using esbuild, it won’t even bother trying unless you’re using ESM-style imports and exports in your code and your dependencies. If you’re still typing require then you are a bad person and this is a fitting punishment for you.

As the documentation highlights, there is a related option called mainFields which affects which version of a package esbuild resolves. There is a complicated system for defining exports in package.json which allows a module to contain multiple versions of itself depending on how it’s being used. It can have one entrypoint if it’s require‘d, a different one if imported, or another if used in a browser.

The upshot is that you may need to tell your bundler explicitly to prefer the ESM (“module“) version of a package instead of the fallback CommonJS version (“main“). With esbuild the option looks something like:

esbuild --main-fields=module,main --bundle script.ts

Setting this will ensure the ESM version is preferred, which may result in improved tree-shaking.

Minification

Tree-shaking and minification are related but distinct optimizations for reducing the size of your bundle. Tree-shaking eliminates dead code, whereas minification rewrites the result to be smaller, for example replacing a function identifier e.g. “frobnicateMajorBazball” with say “a1“.

Usually enabling minification is a simple option in your bundler. This bundle is 2.1MB minified, but 4.5MB without minification:

❯ pnpm exec esbuild --format=esm --target=es2022 --bundle --platform=node --main-fields=module,main  backend/src/api/graphql/candidate/list.ts --outfile=tmp.js

  tmp.js  4.5mb ⚠️

⚡ Done in 239ms


❯ pnpm exec esbuild --minify --format=esm --target=es2022 --bundle --platform=node --main-fields=module,main  backend/src/api/graphql/candidate/list.ts --outfile=tmp-minified.js

  tmp-minified.js  2.1mb ⚠️

⚡ Done in 235ms

Side effects

Sometimes you may want to import a module not because it has a symbol your code makes use of but because you want some side-effect to happen as a result of importing it. This may be an import that extends jest matchers, or initializes a library like google analytics, or some initialization that is performed when a file is imported.

Your bundler doesn’t always know what’s safe to remove. If you have:

import './lib/initializeMangoids'

In your source, what should your bundler do with it? Should it keep it or remove it in tree-shaking?

If you’re using Webpack (or terser) it will look for a sideEffects property in a module’s package.json to check if it’s safe to assume that simply importing a file does not do anything magical:

{
  "name": "your-project",
  "sideEffects": false
}

Code can also be annotated with /*#__PURE__ */ to inform the minifier that this code has no side effects and can be tree-shaken if not referred to by included code.

var Button$1 = /*#__PURE__*/ withAppProvider()(Button);

Read about it in more detail in the Webpack docs.

Externals

Not every package you depend on needs to necessarily be in your bundle. For example in the case of AWS lambda the AWS SDK is included in the runtime. This is a fairly hefty dependency so it can shave some major slices off your bundle if you leave it out. This is done with the external flag:

❯ pnpm exec esbuild --minify --format=esm --target=es2022 --bundle --platform=node --main-fields=module,main  backend/src/api/graphql/candidate/list.ts --outfile=tmp-minified.js

  tmp-minified.js  2.1mb 


❯ pnpm exec --external:aws-sdk --minify --format=esm --target=es2022 --bundle --platform=node --main-fields=module,main  backend/src/api/graphql/candidate/list.ts --outfile=tmp-minified.js

  tmp-minified.js  1.8mb

One thing worth noting here is that there are different versions of packages depending on your runtime language and version. Node 18 contains the AWS v3 SDK (--external:@aws-sdk/) whereas previous versions contain the v2 SDK (--external:aws-sdk). Such details may be hidden from you if using the NodejsFunction CDK construct or SST Function construct.

On the CDK slack it was recommended to me to always bundle the AWS SDK in your function because you may be developing against a different version than what is available in the runtime. Or you can pin your package.json to use the exact version in the runtime.

Another reason to use externals is if you are using a layer. You can tell your bundler that the dependency is already available in the layer so it’s not needed to bundle it. I use this for prisma and puppeteer.

Performance Impacts

For web pages the performance impacts are instantly noticeable with a smaller bundle size. Your page will load faster both over the network and in terms of script parsing and execution time.

Another way to get an idea of what your node bundle is actually doing at startup is to profile it. I really like the 0x tool which can run a node script and give you a flame graph of where CPU time is spent. This can be an informative visualization and let you dig into what is being called when your script runs:

For serverless applications you can inspect the cold start (“initialization”) time for your function on your cloud platform. I use the AWS X-Ray tracing tool. Compare before and after some aggressive bundle size optimizations:

The cold-start time went from 2.74s to 1.60s. Not too bad.

Frameworkless Web Applications

Since we have (mostly) advanced beyond CGI scripts and PHP the default tool many people reach for when building a web application is a framework. Like drafting a standard legal contract or making a successful Hollywood film, it’s good to have a template to work off of. A framework lends structure to your application and saves you from having to reinvent a bunch of wheels. It’s a solid foundation to build on which can be a substantial “batteries included” model (Rails, Django, Spring Boot, Nest) or a lightweight “slap together whatever shit you need outta this” sort of deal (Flask, Express).

The idea of a web framework is that there are certain basic features that most web apps need and that these services should be provided as part of the library. Nearly all web frameworks will give you some custom implementation of some or all of:

Configuration
Logging
Exception trapping
Parsing HTTP requests
Routing requests to functions
Serialization
Gateway adaptor (WSGI, Rack, WAR)
Middleware architecture
Plugin architecture
Development server

There are many other possible features but these are extremely common. Just about every framework has its own custom code to route a parsed HTTP request to a handler function, as in “call hello() when a GET request comes in for /hello.”

There are many great things to say about this approach. The ability to run your application on any sort of host from DigitalOcean to Heroku to EC2 is something we take for granted, as well as being able to easily run a web server on your local environment for testing. There is always some learning curve as you learn the ins and outs of how you register a URL route in this framework or log a debug message in that framework or add a custom serializer field.

But maybe we shouldn’t assume that our web apps always need to be built with a framework. Instead of being the default tool we grab without a moment’s reflection, now is a good time to reevaluate our assumptions.

Serverless

What struck me is that a number of the functions that frameworks provide are not needed if I go all-in on AWS. Long ago I decided I’m fine with Bezos owning my soul and acceded to writing software for this particular vendor, much as many engineers have built successful applications locked in to various layers of software abstraction. Early programmers had to decide which ISA or OS they wanted to couple their application to, later we’re still forced to make non-portable decisions but at a higher layer of abstraction. My python or JavaScript code will run on any CPU architecture or UNIX OS, but features from my cloud provider may restrict me to that cloud. Which I am totally fine with.

I’ve long been a fan of and written about serverless applications on this blog because I enjoy abstracting out as much of my infrastructure as possible so as to focus on the logic of my application that I’m interested in. My time is best spent concerning myself with business logic and not wrangling containers or deployments or load balancer configurations or gunicorn.

I’ve had a bit of a journey over the years adopting the serverless mindset, but one thing has been holding me back and it’s my attachment to web frameworks. While it’s quite common and appropriate to write serverless functions as small self-contained scripts in AWS Lambda, building a larger application in this fashion feels like trying to build a house without a foundation. I’ve done considerable experimentation mostly with trying to cram Flask into Lambda, where you still have all the comforts of your familiar framework and it handles all the routing inside a single function. You also have the flexibility to easily take your application out of AWS and run it elsewhere.

There are a number of issues with the approach of putting a web framework into a Lambda function. For one, it’s cheating. For another, when your application grows large enough the cold start time becomes a real problem. Web frameworks have the side-effect of loading your entire application code on startup, so any time a request comes in and there isn’t a warm handler to process it, the client must wait for your entire app to be imported before handling the request. This means users occasionally experience an extra few seconds of delay on a request, not good from a performance standpoint. There are simple workarounds like provisioned concurrency but it is a clear sign there is a flaw in the architecture.

Classic web frameworks are not appropriate for building a truly serverless application. It’s the wrong tool for the architecture.

The Anti-Framework

Assuming you are fully bought in to AWS and have embraced the lock-in lifestyle, life is great. AWS acts like a framework of its own providing all of the facilities one needs for a web application but in the form of web services of the Amazonian variety. If we’re talking about RESTful web services, it’s possible to put together an extremely scalable, maintainable, and highly available application.

Logging, monitoring: CloudWatch
Tracing: X-Ray
Alerting: Incident Manager
Configuration, exception trapping, execution: Lambda
HTTP request parsing, request routing: API Gateway
Relational database: Aurora Serverless
Configuration, secrets: Secrets Manager

No docker, kubernetes, or load balancers to worry about. You can even skip the VPC if you use the Aurora Data API to run SQL queries.

The above list could go on for a very long time but you get the point. If we want to be as lazy as possible and leverage cloud services as much as possible then what we really want is a tool for composing these services in an expressive and familiar fashion. Amazon’s new Cloud Development Kit (CDK) is just the tool for that. If you’ve never heard of CDK you can read a friendly introduction here or check out the official docs.

In short CDK lets you write high-level code in Python, TypeScript, Java or .NET, and compile it to a CloudFormation template that describes your infrastructure. A brief TypeScript example from cursed-webring:

// API Gateway with CORS enabled
const api = new RestApi(this, "cursed-api", {
  restApiName: "Cursed Service",
  defaultCorsPreflightOptions: {
    allowOrigins: apigateway.Cors.ALL_ORIGINS,
  },
  deployOptions: { tracingEnabled: true },
});
// defines the /sites/ resource in our API
const sitesResource = api.root.addResource("sites");
// get all sites handler, GET /sites/
const getAllSitesHandler = new NodejsFunction(
  this,
  "GetCursedSitesHandler",
  {
    entry: "resources/cursedSites.ts",
    handler: "getAllHandler",
    tracing: Tracing.ACTIVE,
  }
);
sitesResource.addMethod("GET", new LambdaIntegration(getAllSitesHandler));

Is CDK a framework? It depends how you define “framework” but I consider more to be infrastructure as code. By allowing you to effortlessly wire up the services you want in your application, CDK more accurately removes the need for any sort of traditional web framework when it comes to features like routing or responding to HTTP requests.

While CDK provides a great way to glue AWS services together it has little to say when it comes to your application code itself. I believe we can sink even lower into the proverbial couch by decorating our application code with metadata that generates the CDK resources our application declares, specifically Lambda functions and API Gateway routes. I call it an anti-framework.

@JetKit/CDK

Why TypeScript?

As an aside, TypeScript is now my preferred choice for backend development. JavaScript no, but TypeScript yes. The rapid evolution and improvements in the language with Microsoft behind it have been impressive. The language is as strict as you want it to be. Having one set of tooling, CI/CD pipelines, docs, libraries and language experience in your team is much easier than supporting two. All the frontends we work with are React and TypeScript, why not use the same linters, type checking, commit hooks, package repository, formatting configuration, and build tools instead of maintaining say, one set for a Python backend and another for a TypeScript frontend?

Python is totally fine except for its lack of type safety. Do not even attempt to blog at me ✋🏻 about mypy or pylance. It is like saying a Taco Bell is basically a real taqueria. Might get you through the day but it’s not really the same thing 🌮

Construct Generation

So we’ve seen the decorated application code, how does it get turned into cloud resources? With the ResourceGeneratorConstruct, a CDK construct that takes your functions and classes as input and generates AWS resources as output.

import { CorsHttpMethod, HttpApi } from "@aws-cdk/aws-apigatewayv2"
import { Construct, Duration, Stack, StackProps, App } from "@aws-cdk/core"
import { ResourceGeneratorConstruct } from "@jetkit/cdk"
import { aliveHandler, AlbumApi } from "../backend/src"  // your app code
export class InfraStack extends Stack {
  constructor(scope: App, id: string, props?: StackProps) {
    super(scope, id, props)
    // create API Gateway
    const httpApi = new HttpApi(this, "Api", {
      corsPreflight: {
        allowHeaders: ["Authorization"],
        allowMethods: [CorsHttpMethod.ANY],
        allowOrigins: ["*"],
        maxAge: Duration.days(10),
      },
    })
    // transmute your app code into infrastructure
    new ResourceGeneratorConstruct(this, "Generator", {
      resources: [AlbumApi, aliveHandler], // supply your API views and functions here
      httpApi,
    })
  }
}

It is necessary to explicitly pass the functions and classes you want resources for to the generator because otherwise esbuild will optimize them out of existence.

Try It Out

I’m a big fan of Serverless Stack, a lightweight toolkit for doing CDK-driven development with a few very useful features like the most advanced local development environment for AWS serverless applications.

Also please take a look at my starter kit for new serverless TypeScript applications.

Woodworker Designs and Builds the Perfect Tiny House Boat called the Le Koroc — Maybe a foundation isn’t needed after all

Web Services with AWS CDK

If you want to build a cloud-native web service, consider reaching for the AWS Cloud Development Kit. CDK is a new generation of infrastructure-as-code (IaC) tools designed to make packaging your code and infrastructure together as seamless and powerful as possible. It’s great for any application running on AWS, and it’s especially well-suited to serverless applications.

The CDK consists of a set of libraries containing resource definitions and higher-level constructs, and a command line interface (CLI) that synthesizes CloudFormation from your resource definitions and manages deployments. You can imperatively define your cloud resources like Lambda functions, S3 buckets, APIs, DNS records, alerts, DynamoDB tables, and everything else in AWS using TypeScript, Python, .NET, or Java. You can then connect these resources together and into more abstract groupings of resources and finally into stacks. Typically one entire service would be one stack.

class HelloCdkStack extends Stack {
  constructor(scope: App, id: string, props?: StackProps) {
    super(scope, id, props);

    new s3.Bucket(this, 'MyFirstBucket', {
      versioned: true
    });
  }
}

CDK doesn’t exactly replace CloudFormation because it generates CloudFormation markup from your resource and stack definitions. But it does mean that if you use CDK you don’t really ever have to manually write CloudFormation ever again. CloudFormation is a declarative language, which makes it challenging and cumbersome to do simple things like conditionals, for example changing a parameter value or not including a resource when your app is being deployed to production. When using a typed language you get the benefit of writing IaC with type checking and code completion, and the ability to connect resources together with a very natural syntax. One of the real time-saving benefits of CDK is that you can group logical collections of resources into reusable classes, defining higher level constructs like CloudWatch canary scripts, NodeJS functions, S3-based websites with CloudFront, and your own custom constructs of whatever you find yourself using repeatedly.

The CLI for CDK gives you a set of tools mostly useful for deploying your application. A simple cdk deploy parses your stacks and resources, synthesizes CloudFormation, and deploys it to AWS. The CLI is basic and relatively new, so don’t expect a ton of mature features just yet. I am still using the Serverless framework for serious applications because it has a wealth of built-in functionality and useful plugins for things like testing applications locally and tailing CloudWatch logs. AWS’s Serverless Application Model (SAM) is sort of equivalent to Serverless, but feels very Amazon-y and more like a proof-of-concept than a tool with any user empathy. The names of all of these tools are somewhat uninspired and can understandably cause confusion, so don’t feel bad if you feel a little lost.

Sample CDK Application

I built a small web service to put the CDK through its paces. My application has a React frontend that fetches a list of really shitty websites from a Lambda function and saves them in the browser’s IndexedDB, a sort of browser SQL database. The user can view the different shitty websites with previous and next buttons and submit a suggestion of a terrible site to add to the webring. You can view the entire source here and the finished product at cursed.lol.

To kick off a CDK project, run the init command: cdk init app --language typescript.

This generates an application scaffold we can fill in, beginning with the bin/cdk.ts script if using TypeScript. Here you can optionally configure environments and import your stacks.

#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "@aws-cdk/core";
import { CursedStack } from "../lib/stack";

const envProd: cdk.Environment = {
  account: "1234567890",
  region: "eu-west-1",
};

const app = new cdk.App();
new CursedStack(app, "CursedStack", { env: envProd });

The environment config isn’t required; by default your application can be deployed into any region and AWS account, making it easy to share and create development environments. However if you want to pre-define some environments for dev/staging/prod you can do that explicitly here. The documentation suggests using environment variables to select the desired AWS account and region at deploy-time and then writing a small shell script to set those variables when deploying. This is a very flexible and customizable way to manage your deployments, but it lacks the simplicity of Serverless which has a simple command-line option to select which stage you want. CDK is great for customizing to your specific needs, but doesn’t quite have that out-of-the-box user friendliness.

DynamoDB

Let’s take a look at a construct that defines a DynamoDB table for storing user submissions:

import * as core from "@aws-cdk/core";
import * as dynamodb from "@aws-cdk/aws-dynamodb";

export class CursedDB extends core.Construct {
  submissionsTable: dynamodb.Table;

  constructor(scope: core.Construct, id: string) {
    super(scope, id);

    this.submissionsTable = new dynamodb.Table(this, "SubmissionsTable", {
      partitionKey: {
        name: "id",
        type: dynamodb.AttributeType.STRING,
      },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });
  }
}

Here we create a table that has a string id primary key. In this example we save the table as a public property (this.submissionsTable) on the instance of our Construct because we will want to reference the table in our Lambda function in order to grant write access and provide the name of the table to the function so that it can write to the table. This concept of using a class property to keep track of resources you want to pass to other constructs isn’t anything particular to CDK – it’s just something I decided to do on my own to make it easy to connect different pieces of my service together.

Lambda Functions

Here I declare a construct which defines two Lambda functions. One function fetches a list of websites for the user to browse, and the other handles posting submissions which saved into our DynamoDB submissionsTable as well as Slacked to me. I am extremely lazy and manage most of my applications this way. We use the convenient NodejsFunction high-level construct to make our lives easier. This is the most complex construct of our stack. It:

Loads a secret containing our Slack webhook URL
Defines a custom property submissionsTable that it expects to receive
Defines an API Gateway with CORS enabled
Creates an API resource (/sites/) to hold our function endpoints
Defines two Lambda NodeJS functions (note that our source files are TypeScript – compilation happens automatically)
Connects the Lambda functions to the API resource as GET and POST endpoints
Grants write access to the submissionsTable to the submitSiteHandler function

import * as core from "@aws-cdk/core";
import * as apigateway from "@aws-cdk/aws-apigateway";
import * as sm from "@aws-cdk/aws-secretsmanager";
import { NodejsFunction } from "@aws-cdk/aws-lambda-nodejs";
import { LambdaIntegration, RestApi } from "@aws-cdk/aws-apigateway";
import { Table } from "@aws-cdk/aws-dynamodb";

// ARN of a secret containing the slack webhook URL
const slackWebhookSecret =
  "arn:aws:secretsmanager:eu-west-1:178183757879:secret:cursed/slack_webhook_url-MwQ0dY";

// required properties to instantiate our construct
// here we pass in a reference to our DynamoDB table
interface CursedSitesServiceProps {
  submissionsTable: Table;
}

export class CursedSitesService extends core.Construct {
  constructor(
    scope: core.Construct,
    id: string,
    props: CursedSitesServiceProps
  ) {
    super(scope, id);

    // load our webhook secret at deploy-time
    const secret = sm.Secret.fromSecretCompleteArn(
      this,
      "SlackWebhookSecret",
      slackWebhookSecret
    );

    // our API Gateway with CORS enabled
    const api = new RestApi(this, "cursed-api", {
      restApiName: "Cursed Service",
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
      },
    });

    // defines the /sites/ resource in our API
    const sitesResource = api.root.addResource("sites");

    // get all sites handler, GET /sites/
    const getAllSitesHandler = new NodejsFunction(
      this,
      "GetCursedSitesHandler",
      {
        entry: "resources/cursedSites.ts",
        handler: "getAllHandler",
      }
    );
    sitesResource.addMethod("GET", new LambdaIntegration(getAllSitesHandler));

    // submit, POST /sites/
    const submitSiteHandler = new NodejsFunction(
      this,
      "SubmitCursedSiteHandler",
      {
        entry: "resources/cursedSites.ts",
        handler: "submitHandler",
        environment: {
          // let our function access the webhook and dynamoDB table
          SLACK_WEBHOOK_URL: secret.secretValue.toString(),
          CURSED_SITE_SUBMISSIONS_TABLE_NAME: props.submissionsTable.tableName,
        },
      }
    );
    // allow submit function to write to our dynamoDB table
    props.submissionsTable.grantWriteData(submitSiteHandler);
    sitesResource.addMethod("POST", new LambdaIntegration(submitSiteHandler));
  }
}

While there’s a lot going on here it is very readable if taken line-by-line. I think this showcases some of the real expressibility of CDK. That props.submissionsTable.grantWriteData(submitSiteHandler) stanza is really 👨🏻‍🍳👌🏻. It grants that one function permission to write to the DynamoDB table that we defined in our first construct. We didn’t have to write any IAM policy statements, reference CloudFormation resources, or even look up exactly which actions this statement needs to consists of. This gives you a bit of the flavor of CDK’s simplicity compared to writing CloudFormation by hand.

If you’d like to look at the source code of these Lambdas you can find it here. Fetching the list of sites is accomplished by loading a Google Sheet as a CSV (did I mention I’m really lazy?) and the submission handler does a simple DynamoDB Put call and hits the Slack webhook with the submission. I love this kind of web service setup because once it’s deployed it runs forever and I never have to worry about managing it again, and it costs roughly $0 per month. If a website is submitted I can evaluate it and decide if it’s shitty enough to be included, and if so I can just add it to the Google Sheet. And I have a record of all submissions in case I forget or one gets lost in Slack or something.

CloudFront CDN

Let’s take a look at one last construct I put together for this application, a CloudFront CDN distribution in front of a S3 static website bucket. I realized the need to mirror many of these lame websites because due to their inherent crappiness they were slow, didn’t support HTTPS (needed when iFraming), and might not stay up forever. A little curl --mirror magic fixed that right up.

It’s important to preserve these treasures

Typically defining a CloudFront distribution with HTTPS support is a bit of a headache. Again the high-level constructs you get included with CDK really shine here and I made use of the CloudFrontWebDistribution construct to define just what I needed:

import {
  CloudFrontWebDistribution,
  OriginProtocolPolicy,
} from "@aws-cdk/aws-cloudfront";
import * as core from "@aws-cdk/core";

// cursed.llolo.lol ACM cert
const certificateArn =
  "arn:aws:acm:us-east-1:1234567890:certificate/79e60ba9-5517-4ce3-8ced-2d9d1ddb1d5c";

export class CursedMirror extends core.Construct {
  constructor(scope: core.Construct, id: string) {
    super(scope, id);

    new CloudFrontWebDistribution(this, "cursed-mirrors", {
      originConfigs: [
        {
          customOriginSource: {
            domainName: "cursed.llolo.lol.s3-website-eu-west-1.amazonaws.com",
            httpPort: 80,
            originProtocolPolicy: OriginProtocolPolicy.HTTP_ONLY,
          },
          behaviors: [{ isDefaultBehavior: true }],
        },
      ],
      aliasConfiguration: {
        acmCertRef: certificateArn,
        names: ["cursed.llolo.lol"],
      },
    });
  }
}

This creates a HTTPS-enabled CDN in front of my existing S3 bucket with static website hosting. I could have created the bucket with CDK as well but, since there can only be one bucket with this particular domain that seemed a bit overkill. If I wanted to make this more reusable these values could be stack parameters.

The Stack

Finally the top-level Stack contains all of our constructs. Here you can see how we pass the DynamoDB table provided by the CursedDB construct to the CursedSitesService containing our Lambdas.

import * as cdk from "@aws-cdk/core";
import { CursedMirror } from "./cursedMirror";
import { CursedSitesService } from "./cursedSitesService";
import { CursedDB } from "./db";

export class CursedStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const db = new CursedDB(this, "CursedDB");
    new CursedSitesService(this, "CursedSiteServices", {
      submissionsTable: db.submissionsTable,
    });
    new CursedMirror(this, "CursedSiteMirrorCDN");
  }
}

Putting it all together, all that’s left to do is run cdk deploy to summon our cloud resources into existence and write our frontend.

Security Warnings

It’s great that CDK asks for confirmation before opening up ports:

Is This Better?

Going through this exercize of creating a real service using nothing but CDK was a great way for me to get more comfortable with the tools and concepts behind it. Once I wrapped my head around the way the constructs fit together and started discovering all of the high-level constructs already provided by the libraries I really started to dig it. Need to load some secrets? Need to define Lambda functions integrated to API Gateway? Need a CloudFront S3 bucket website distribution? Need CloudWatch canaries? It’s already there and ready to go along with strict compile-time checking of your syntax and properties. I pretty much never encountered a situation where my code compiled but the deployment was invalid, a vastly improved state of affairs from trying to write CloudFormation manually.

And what about Terraform? In my humble opinion if you’re going to build cloud-native software it’s a waste of effort to abstract out your cloud provider and their resources. Better to embrace the tooling and particulars of one provider and specialize instead of pursuing some idealistic cloud-agnostic setup at a great price of efficiency. Multi-cloud is the worst practice.

The one thing that I missed most from the Serverless framework was tailing my CloudWatch logs. When I had issues in my Lambda logic (not something the CDK can fix for you) I had to go into the CloudWatch console to look at the logs instead of simply being able to tail them from the command line. The upshot though is that CDK is simply code, and writing your own tooling around it using the AWS API should be straightforward enough. I expect SAM and the CDK CLI to only get more mature and user-friendly over time, so I imagine I’ll be building projects of increasing seriousness with them as time progresses.

If you want to learn more, start with the CDK docs. And if you know of any cursed websites please feel free to mash that submit button.

Serverless WebSockets

WebSockets, the standard for doing real-time bidirectional communication typically between a browser and a server, is a fair attempt to create a standard to supplant the previously employed hacky solutions and continues to evolve in terms of implementation.

The basic idea has primarily been to establish some sort of channel in which a server can “push” events to a client, rather than the client “polling” every so often to see if there is new information. This was until fairly recently a relatively obscure concept, but now any smartphone owner is extremely well-acquainted with push notifications. This real-time channel has been used for not just notifications but also services like VOIP and gaming.

In the days before the WebSocket standard various semi-clever attempts to implement push notifications were devised. The first was using <iframe>s to load an HTML document using chunked encoding, where the server would write a script tag with some new data in the form of JavaScript commands when the data became available. When the browser encountered a closing script tag it would execute the JS immediately even though the document was still streaming.

The next scheme was using XML HTTP Request (aka XHR [aka AJAX]) to do something similar but without needing an <iframe>. This was known as “long-polling”, or “comet.” This was still mostly a unidirectional channel and suffered from timeouts and reconnection issues with potential race conditions.

Now with WebSockets we have a much improved system and wide browser support. But what about the backend? What happens when a browser or other client connects to a WebSocket server?

Previously we’ve developed and hosted WebSocket servers written in Perl, Go, and Python, using PostgreSQL asynchronous events as the message passing system. Deploying WebSocket servers is not as straightforward as HTTP servers because of the long-lived connections and having to perform TCP load balancing. Depending on your hosting setup you may have to deal with internal timeouts or getting events from your message bus to the right backend via some subscription mechanism.

Architecture

Since I love not running servers I’ve been excited about the chance to use serverless WebSockets via AWS API Gateway. In this new scheme you define Lambda functions that react to events such as authentication, connect, disconnect, and user-defined events that can be read from JSON message bodies.

Infrastructure-wise the setup is extremely basic. All of the real work to handle authorization and events and done in code, which we will look at shortly. Let’s use a concrete example of a typical WebSocket use case – sending notifications from the server to the client to inform it of some data change in order for the client to update some information in real time or notify the user.

For my application I created an authorizer function that validates a JWT encoded in the WebSocket URL query parameters (there is no good way in a browser to set headers when opening a WebSocket connection). This function denies or grants access to proceed and saves the authenticated user ID in the principalId response field, which is passed along to subsequent event handlers.

Once the authorization check is successful the special $connect route is called if there is a handler defined. In this handler we have the user ID in the invocation event passed along from the authorizer response and we have a connectionId. We save this user ID and connection ID pair in our database so that we can know who is connected and have the ability to send them a notification later on using their connectionId.

The API Gateway makes a best-effort attempt to detect disconnections and invokes the special $disconnect route whereupon our handler removes the connection record from the database.

Putting all of these pieces together with actual working code required me gathering a fair bit of information from different sources and working out the proper request fields and response formats but it all worked out wonderfully in the end. I’d like to share the working code examples for the handlers and some sample client code as well.

The Code

To define your handlers and when they get invoked you need to configure API Gateway to register your authorizer handler and the assorted route handlers. Using the Serverless toolkit this is straightforward and nicely documented. My configuration looks something like:

functions:
  # websocket authorizer
  wsAuth:
    handler: notifier.ws.handler.authorizer

  # websocket $connect
  wsConnect:
    handler: notifier.ws.handler.connect
    events:
      - websocket:
          route: $connect
          authorizer:
            name: wsAuth
            identitySource:
              - route.request.querystring.token  # token query param

  # websocket $disconnect
  wsDisconnect:
    handler: notifier.ws.handler.disconnect
    events:
      - websocket:
          route: $disconnect

And the authorizer:

def authorizer(event, context):
    method_arn = event.get("methodArn")
    def deny(msg):
        return {"message": msg,
                "policyDocument": gen_policy(method_arn=method_arn, allow=False)
        }

    # get access token from query string
    query_params = event.get("queryStringParameters")
    if not query_params:
        return deny("missing queryStringParameters")
    if "token" not in query_params:
        return deny("missing token in query string")
    token = query_params["token"]
    if not token:
        return deny("empty token")

    # decode and verify JWT token
    decoded = None
    try:
        decoded = decode_token(token)
    except ExpiredSignatureError:
        return deny("Expired token")

    identity = decoded.get("identity")
    if not identity:
        raise Exception("invalid JWT; missing identity")

    # allow access
    policy = gen_policy(method_arn=method_arn, allow=True)
    context = {}  # can add more auth context info here if desired
    res = {
        "principalId": identity,
        "policyDocument": policy,
        "context": context
    }
    return res

def gen_policy(method_arn: str, allow: bool):
    effect = "Allow" if allow else "Deny"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Action": "execute-api:Invoke",
            "Effect": effect,
            "Resource": method_arn
        }],
    }

This looks for a JWT in the query string and attempts to parse and validate it. If successful then an IAM policy is returned along with the decoded identity ID. The details of the event and policy can be found in the Lambda REQUEST WebSocket authorizer documentation.

If the client is granted Invoke access to the execute-api service then API Gateway will call our $connect route next:

def connect(event, context):
    ctx = event.get("requestContext", {})
    # get user and connection id
    conn_id = ctx.get("connectionId")
    auth = ctx.get("authorizer", {})
    user_id = auth.get("principalId")

    if not user_id:
        return make_response(401, "Not authorized")

    if not conn_id:
        raise Exception("missing connectionId")

    # save the connection id/user id pair in DB
    WebsocketClient.save_connection(
        user_id=user_id,
        connection_id=conn_id,
        domain_name=ctx["domainName"],
        stage=ctx["stage"],
    )
    db.session.commit()

    return make_response(200, "ok")

def make_response(status_code, body):
    if not isinstance(body, str):
        body = json.dumps(body)
    return {"statusCode": status_code, "body": body}

The purpose of this route is to store the user ID and connection ID in the database along with the connection’s domain and stage. We will use this to send our notification to the client.

def send_ws(user_id, message):
    """Push a notification to the user if they have an active websocket connection."""
    connections = WebsocketClient \
        .query \
        .filter_by(user_id=user_id) \
        .all()

    for conn in connections:
        conn.send(message)

And conn.send():

import boto3
import json
from notifier.db import db, Model
from botocore.exceptions import ClientError

class WebsocketClient(Model):

    ...

    def send(self, message):
        """Send a message to an active connection.

        :param message: can be anything that is JSON-serializable."""
        # get APIGW management client
        apigw_mgmt_client = boto3.client(
            "apigatewaymanagementapi",
            endpoint_url=f"https://{self.domain_name}/{self.stage}",
        )
        try:
            # send message
            apigw_mgmt_client.post_to_connection(
                Data=json.dumps(message).encode("utf-8"),
                ConnectionId=self.connection_id,
            )
        except ClientError as err:
            # gracefully handle case where client is no longer connected
            code = int(err.response["Error"]["Code"])
            if code == 410:
                # client gone, cleanup
                db.session.delete(self)
                db.session.commit()
                return
            raise

This is the where the real action happens. When we want to send a message from the server to the client we do it with the PostToConnection call. We need to provide the API Gateway domain and stage for it to construct the URL needed for the API call. Boto is simply doing HTTP requests to interact with the WebSocket connection as documented here. And you can use an HTTP client directly if you like to get connection info, send a message, and close the connection.

For completeness let’s look at handling the $disconnect route:

def disconnect(event, context):
    # get connection ID
    ctx = event.get("requestContext", {})
    conn_id = ctx.get("connectionId")
    if not conn_id:
        raise Exception("no connection id found")

    # delete the connection record from our DB
    WebsocketClient.delete_connection(connection_id=conn_id)
    db.session.commit()
    return make_response(200, "ok")

Client ➞ Server Messages

But wait, there’s more!

Our application is now ready to send notifications to our client, but if we want to be able to receive messages from the client we can support this case as well. We can define custom routes that are matched based on a route key as documented here and here. In practice this means that if API Gateway receives a JSON message it looks for the route name by default in a field called "action" and decides which Lambda to call based on that value. You can also create a $default route to catch any unhandled message if you prefer to do things that way as well.

Client Code

I implemented a basic WebSocket client in TypeScript using the standard WebSocket API. The only special thing it does is append your access token (managed with axios-jwt) to the WebSocket connection URL.

import { refreshTokenIfNeeded } from 'axios-jwt'

export const WEBSOCKET_EVENT = 'onwebsocketmessage'

export class WSEvent extends Event {
  message: object

  constructor(msg: object) {
    super(WEBSOCKET_EVENT)
    this.message = msg
  }
}

export type WSEventHandler = (ev: WSEvent) => void

export default class WSClient extends EventTarget {
  ws: WebSocket | undefined
  public isConnected: boolean = false
  reconnectTime: number = 1 // time in seconds before reconnect

  // connect
  public open = async () => {
    if (this.ws) {
      if (this.ws.readyState === WebSocket.CONNECTING || this.ws.readyState === WebSocket.OPEN)
        // already open/opening
        return

      this.ws.close() // do reconnect
    }

    // config from create-react-app+dotenv
    if (!process.env.REACT_APP_WS_URL) throw new Error('REACT_APP_WS_URL missing')
    const host = new URL(process.env.REACT_APP_WS_URL)

    // make sure auth token is fresh
    // requestRefresh defined elsewhere - see axios-jwt documentation
    const accessToken = await refreshTokenIfNeeded(requestRefresh)

    // add auth token to URL
    if (accessToken) host.searchParams.set('token', accessToken)

    // create new websocket client
    if (!this.ws) {
      this.ws = new WebSocket(String(host))
      this.ws.onopen = this.handleOpen
      this.ws.onclose = this.handleClose
      this.ws.onmessage = this.handleMessage
    }
  }

  // disconnect
  public close = () => {
    if (this.ws) this.ws.close()
  }

  public reconnect() {
    if (this.ws) this.ws.close()
    this.open()
  }

  // CALLBACKS

  protected handleOpen = (ev: Event) => {
    this.isConnected = true
    this.reconnectTime = 1 // reset reconnect timer

    const ws = this.ws
    if (!ws) return
  }

  protected handleClose = (ev: Event) => {
    this.isConnected = false

    // do reconnect
    setTimeout(() => {
      this.reconnectTime *= 2 // exponential backoff

      this.open()
    }, this.reconnectTime * 1000)

    // reconnect?
    this.open()
  }

  protected handleMessage = (ev: MessageEvent) => {
    // handle message received on WS
    const data = ev.data
    if (!data) return

    // try to parse as JSON
    const msg = JSON.parse(data)

    // create new websocket event and dispatch it to listeners
    const msgEvt = new WSEvent(msg)
    this.dispatchEvent(msgEvt)
  }
}

And as a bonus here’s a React hook that lets you register an event handler for WebSocket messages:

import * as React from 'react'
import WSClient, { WEBSOCKET_EVENT, WSEvent } from './api'

// singleton
let client: WSClient

interface IUseWebSocketClientArgs {
  onEvent?: (evt: WSEvent) => void
}

const useWebSocketClient = ({ onEvent }: IUseWebSocketClientArgs) => {
  React.useEffect(() => {
    if (!client) client = new WSClient()

    // listen for events
    if (onEvent) client.addEventListener(WEBSOCKET_EVENT, onEvent as EventListener)

    // ensure client is connected
    client.open()

    // cleanup handler
    return () => {
      if (onEvent) client.removeEventListener(WEBSOCKET_EVENT, onEvent as EventListener)
    }
  })
  return { client }
}

export default useWebSocketClient

Conclusion

Like many other serverless technologies this approach is certainly not practical for every use case but it is quite reasonable for a lot of common cases. While API Gateway WebSockets kind of support binary data payloads the serverless approach is probably best suited to your application if you’re passing occasional JSON messages around and dealing with relatively low throughput and volume.

Web Application Boring Stack: 2019 Edition

At JetBridge we enjoy developing software applications with our clients that we can take pride in while expanding our areas of knowledge and expertise at the same time. Because we are frequently starting on new projects we have standardized on a harmonious and expressive set of tools and libraries and frameworks to help us rapidly lift off new applications and deliver as much value as we can with minimal repetition.

Our setup isn’t perfect or the end-all stack for every project, but it’s something we’ve evolved over years and it works quite well for us. We continue to learn about new tools and techniques and evolve our workflow so consider this more of a snapshot in time. If you aren’t reading this in July of 2019 then we have probably modified at least some parts of the stack.

Methodology

Our theory of software development is: don’t overcomplicate things.

Pragmatism and business value are the overriding concerns, not the latest and coolest and hippest frameworks or tech. We love playing with new cool stuff as much as any geek but we don’t believe in using something new just for the sake of being new or feeling unhip. Maturity and support should factor into deciding on a library or framework to base your application on, as should maintainability, community, available documentation and support, and of course what actual value it brings for us and our clients.

There is a tendency a lot of engineers have to make software more complex than it needs to be. To use non-standard tools when widely available and known tools exist that might already do the job. To try to shoehorn some neat piece of tech someone read about on Hacker News into something it isn’t really suited for. To depend on extra external services when there are already existing services that can be extended to perform the desired task. Using something too low-level when more abstraction would really simplify things, or using something too fancy and complicated when a simple system-level tool or language would accomplish things more expediently.

Simplicity is a strategy that when used wisely can greatly increase your code readability and maintainability, as well as result in easy to manage operational environments.

Frontend

By the time I am writing this all frameworks and libraries we use have likely been superseded by cool new hip JS jams and you will sneer at our unfashionable choices. Nevertheless, this is what is working well for us today:

React: Vue may have more stars on GitHub but React is still the standard and is used and supported actively by Facebook, among others. Writing apps with React hooks really feels like we are getting closer and closer to functional programming, adding a new level of composibility and code reuse that was clumsily achieved with HOCs before.
Material-UI for React is a toolkit that has almost every sort of widget and utility you might need, powerful theming and styling options, integrates CSS-in-JS very smoothly and looks solid out of the box. It is essentially an implementation of the UI paradigms promulgated by Google so working within its constraints and visual language gives you a reasonable starting point.
Create-React-App/react-scripts: This really does everything you need and configures your new React app with sane defaults. You never need to monkey around with Webpack or HMR again. We have extended CRA/r-s to spit out new frontend projects with extra ESlint and prettier options and Storybook.
Storybook: We prefer to build a component library of small and larger components implemented in isolation using mock data, rather than always coding and testing the layout and design inside the complete app. This allows UI devs to work without being blocked on completion of backend endpoints, helps to enforce the concept of reusable and self-contained components, and lets us preview the various interface states easily.
TypeScript: Everyone uses TypeScript now because it’s good and you should too. It does take some getting used to and learning how to use it properly with React and Redux requires some small amount of learning, but it’s entirely worth it. Remember: you should never need to use any. And when you think you need to use any – you probably just need to add a type argument (generic).
ESLint: ESlint works great with TypeScript now! Don’t forget to set extends: ['plugin:@typescript-eslint/recommended', 'plugin:react/recommended', 'react-app']
Prettier: Set up your editor to run Prettier on your code when you hit save. Not only does it enforce a consistent style, but it also means you can be way way lazier about formatting your code. Less typing but better formatting.
Redux: Redux is nice… I guess. You do need some central place to store your user authentication info and stuff like that, and redux-persist is super handy. In the spirit of keeping things simple though, really ask yourself if you need redux for what you’re doing. Maybe you do, or maybe you can just use a hook or state instead. Sure maybe you think at first that you want to cache some API response in redux, but if you start adding server-side filtering or search or sorting, then it really is better off just as a simple API request inside your component.
Async/await: Stop using the Promise API! Catch exceptions in your UI components where you can actually present an error to the user rather than in your API layer.
Axios: The HTTP client of choice. We use JWT for authentication and recommend our axios-jwt interceptor module for taking care of token storage, authorization headers, and refresh.
Cypress: A popular tool for writing end-to-end tests. Cypress makes it easy to mock API responses and fully test your application as an automated web browser, either headless or used interactively. Can record videos and screenshots of every state and step of your tests to review what your UI looks like and how it reacts even after automated test runs.

I don’t believe there’s anything crazy or unusual here and that’s sort of the point. Stick with what’s standard unless you have a good reason not to.

Backend

Our backend services are always designed around the 12-factor app principles and always built to be cloud-native and when appropriate, serverless.

Most projects involve setting up your typical REST API, talking to other services, and performing CRUD on a PostgreSQL DB. Our go-to stack is:

Python 3.7. Python is clean, readable, has an impressively massive repository of community modules on PyPI, active core development, and a pretty good balance of high-level dynamic features without getting too obtuse or distracting.
Type annotations and type linting with mypy. Python does have type annotations, but they are very limited, not well integrated, and not usually very useful for catching mistakes. I hope the situation improves because many errors have to be discovered at runtime in Python when compared with languages like TypeScript or Go. This is the biggest drawback to Python in my opinion, but we do our best with mypy.
Flask, a lightweight web application framework. Flask is very nicely suited to building REST APIs, providing just enough structure to your application for handling WSGI, configuration, database connections, reusable API handlers, tracing/debugging (with AWS X-Ray), logging, exception handling, authentication, and flexible URL routing. We don’t lean on Flask for much besides providing the glue to hold everything together in a coherent application without imposing too much overhead or boilerplate.
SQLAlchemy for declarative ORM. Has nice features for handling Postgres dialect features such as UPSERT and JSONB. Ability to compose mixins for model and query classes is very powerful and something we are using more and more for features like soft deletion. Polymorphic subtypes are one of the most interesting SQLAlchemy features, allowing you to define a type discriminator column and instantiate appropriate model subclasses based on its value.
Testing: subtransactions wrapping each test, pytest-factoryboy for generating fixtures from our model classes for pytest and for generating mock data for development environments. CircleCI. Pytest fixtures. Flask test client.
Flask-REST-API with Marshmallow helps succinctly define REST endpoints and serialization and validation with a minimum of boilerplate, making heavy use of decorators for a declarative feel when appropriate. As a bonus it also generates OpenAPI spec documents and comes with Swagger-UI to automatically provide documentation of every API endpoint and its arguments and response shapes without any extra effort required.
We are currently developing Flask-CRUD to further reduce boilerplate in the common cases for CRUD APIs and mandating strict data model access control checks.

In projects that require it we can use Heroku or just EC2 for hosting but all of our recent projects have been straightforward enough to build as serverless applications. You can read about our setup and the benefits this brings us in more detail in this article.

We have built a starter kit that ties together all of our backend pieces together in a powerful template to bootstrap new serverless Flask projects called sls-flask. If you’re thinking of building a database-backed REST API in Python, give it a try! You get a lot of power and flexibility in a small bundle. There isn’t anything particularly special or exotic included in it, but we believe the foundation it provides adds up to an extremely streamlined and modern development toolkit.

All of our tooling and templates are open source, and we often contribute bug reports and fixes upstream to modules that we make use of. We encourage you to try out our stack or let us know what you’re using if you’re happy with what you’re doing. Share and enjoy!

Video Encoding on AWS

Adding video encoding support to your application is relatively straightforward with Amazon’s Video On Demand encoding pipeline infrastructure template.

This CloudFormation template provides you with:

A S3 media source bucket where video files get uploaded, with an option to phase out media source files to long-term storage in Glacier.
A DynamoDB table to track the status of the encoding and store all metadata about the source and output files.
A series of Step Functions (Lambda state machines) to manage the stages of the pipeline.
MediaConvert to do the actual video encoding work.
An output S3 bucket for the encoded files and playlists, with a CloudFront CDN distribution in front.
A SNS topic which publishes events to subscribers when media ingestion begins and when it completes, as well as if there is an error.

The one deficiency in the CloudFormation template provided by AWS is that it does not include the SNS topic as a stack output, which makes it harder to tie it into other applications. JetBridge hosts a version of the stack which includes the SNS topic output at https://ext.jetbridge.com.s3.amazonaws.com/vod/video-on-demand-on-aws.template.

You can deploy the stack here:

Once the stack has finished launching, you can try uploading a video file into the source S3 bucket.

When files are added to the bucket a Lambda is automatically triggered that begins the ingestion and kicks it over to MediaConvert after generating a GUID to track the progress of the encoding.

After the encoding is complete you will have an entry in the DynamoDB table with information about the media files and the outputs, including a HLS M3U8 (HTTP Live Streaming MP3 URL UTF-8 playlist) which can be used by any web or mobile client to stream your video at adaptable bitrates.

Integrating To Your Application

The VOD encoder pipeline is a pretty nifty example of how to use ready-made stacks of infrastructure, but what if you want to integrate this pipeline into your application? Let’s look at one way you can accomplish this.

Say you are building a CMS where you want users to be able to upload videos that can be streamed by clients. You will need a user interface for performing the upload and then a way to associate the results with that object when the encoding process completes or errors.

The flow from the application’s perspective will look like this:

Register a Lambda for handling notifications from the VOD SNS topic.
Create an object in your database to store the uploaded video. A row in a video table would suffice just fine. Make up a S3 key for this row (based on the video’s ID or better, UUID) and store it in the video row as well.
Generate a pre-signed S3 PutObject request URL (Python docs) for the media source bucket.
On the browser side, upload the video file to the pre-signed S3 upload URL. Once the upload is complete the Lambda trigger will be automatically invoked, kicking off the encoding job.
Process ingestion notification received from the SNS topic. This notification includes the UUID generated by the pipeline to keep track of your job and the original S3 key of the video file that was just uploaded. Store the VOD task UUID in your video database row associated with the S3 key.
When you receive a completion or error notification from the SNS pipeline, update the video row appropriately. You now have either a HLS playlist URL associated with your video or an error message.

Registering For SNS Notifications

You can set up everything above by hand, but making reusable infrastructure is easier and more powerful. If you are using the Serverless toolkit you can use the SNS topic CloudFormation output (remember the one mentioned above that we had to add to the template?) to register a Lambda to listen for events:

functions:
  vodSnsUpdateHandler:
    handler: myapp.handler.vod_sns_update.handler
    events:
      - sns: ${cf:vod.SnsNotificationTopic}  # cloudformation output

This will invoke the function myapp.handler.vod_sns_update.handler whenever a new message is published on the SNS topic in the CloudFormation stack named vod (that’s what I called it, you can change it if you really want).

Other CloudFormation Stack Outputs

Your application will also need to know the name of the source media S3 bucket to generate the presigned upload request as well as the name of the DynamoDB table to fetch the results from. Again, this example is for Serverless:

provider:
  name: aws
  ...
  environment:
    S3_VOD_SOURCE_BUCKET: ${cf:vod.Source}
    VOD_TABLE: ${cf:vod.DynamoDBTable}

This has the effect of passing the source S3 bucket and DynamoDB table names from the VOD stack outputs into your application as environment variables.

S3 Presigned Upload

You can create a URL that you can give to a client to permit it to upload a file to a designated S3 key:

    s3 = boto3.client("s3")
    put_params = dict(Bucket=os.environ['S3_VOD_SOURCE_BUCKET'], Key=s3key)
    expire = 3600  # one hour
    url = s3.generate_presigned_url(
        ClientMethod="put_object", 
        Params=put_params, 
        ExpiresIn=expire,
    )

This URL can then be returned to a web browser which can then do a PUT to the URL with the contents of the file as the body of the request.

I recommend generating a S3 key in the form of: f"/video/{video.uuid}/media.mp4"

Processing SNS Notifications

This should be a Lambda handler that looks up the associated video entry in your database and updates it with the status published by the VOD pipeline. Some rough sample code:

import boto3
import os
import json
from myapp.db import db
from myapp.model.video import Video
from enum import Enum, unique
from typing import Optional
import logging

log = logging.getLogger(__name__)

@unique
class EncodingStatus(Enum):
    new = "new"
    ingest = "Ingest"
    complete = "Complete"
    error = "Error"

table = os.environ["VOD_TABLE"]
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(table)


def handler(event, context):
    records = event.get("Records", [])
    with app.app_context():  # if you use Flask-SQLAlchemy
        for record in records:
            log.debug(f"Processing VOD SNS event...")
            process_event_record(record)
        db.session.commit()
    return "ok"


def process_event_record(record: dict):
    assert "Sns" in record
    assert "Message" in record["Sns"]
    message = json.loads(record["Sns"]["Message"])

    # look up asset by key/bucket
    src_video = message.get("srcVideo")

    status = EncodingStatus(message.get("status", message.get("workflowStatus")))
    guid = message.get("guid")
    log.debug(f"Video: {src_video}, status={status}, guid={guid}")

    if not src_video:
        # this is missing in case of error
        if status == EncodingStatus.error:
            video = db.session.query(Video).filter_by(vod_guid=guid).one_or_none()
            if not video:
                log.warning(f"Got video GUID for unknown video {record}")
            else:
                video.encoding_status = status
        log.warning(f"Got video encoding without video src {record}")
        return None

    # look up video by S3 key
    video = Video.query.filter_by(s3key=src_video).one_or_none()
    if not video:
        log.warning(f"Could not find video {src_video}")
        return None

    # update video
    video.vod_guid = guid
    video.encoding_status = status
    video.vod_last_message = message
    video.hls_url = message.get("hlsUrl") if message.get("hlsUrl") else video.hls_url
    thumbnail_urls = message.get("thumbNailUrl", [])
    video.placeholder_url = thumbnail_urls[0] if thumbnail_urls else None
    video_data_info = get_video_data_info(guid)

    if not video_data_info:
        if status == EncodingStatus.complete:
            log.warning(f"Could not find data about encoding {record}")
        return asset

    src_media_info = video_data_info.get("srcMediainfo")
    encoding_details = json.loads(src_media_info) if src_media_info else None

    if not encoding_details:
        log.warning(f"Could not find encoding info {record} // {encoding_details}")
    video.duration = encoding_details["container"]["duration"]  # ms

    print(f"Media info: {src_media_info}")
    db.session.commit()

def get_video_data_info(guid: str) -> Optional[dict]:
    result = table.get_item(Key={"guid": guid})
    return result.get("Item")

Conclusion

And now you have a powerful media encoding pipeline integrated into your application. Some features to note are :

Thumbnail URLs are automatically generated.
Media info is output which contains everything from duration to dimensions to colorspace.
HLS, DASH, and MP4 outputs are produced.
Quality-Defined Variable Bitrate encoding is used by default.
Microsoft Smooth Streaming (MSS) and Common Media Application Format (CMAF) are also supported.

Hope that was helpful!

Lambda Function To Route Twilio Incoming Phone Calls

The absolute fastest way to create a serverless Twilio incoming call webhook Lambda:

npm install -g serverless
sls install --url https://github.com/revmischa/slspy --name twilcall
echo "twilio" > requirements.txt

handler.py:

from twilio.twiml.voice_response import VoiceResponse
from urllib.parse import parse_qsl


def hello(event, context):
    call = dict(parse_qsl(event['body']))

    resp = VoiceResponse()
    resp.say("hello world!", voice='alice')

    response = {
        "headers": {"content-type": "text/xml"},
        "statusCode": 200,
        "body": str(resp),
    }

    return response

Deploy, then set webhook handler for your phone number:

sls deploy

That’s all!

Customize your response with TwiML.

See this gist to check out more advanced handling with loading the Twilio API key from Secrets Manager, doing lookups with Twilio Add-Ons to detect spam/robocalling, and detailed caller lookup info that is output to a slack channel every call.

Making Use Of AWS Secrets Manager

One of the many new services re-invented at AWS’s re:invent conference was the storage of secrets for applications. Secrets in essence are generally things your application may need to run but you don’t really want to put in source control. Things like API keys, password salt, database connection strings and the like.

Current ways of providing secrets to applications are things like configuration files deployed separately, Heroku’s config which exposes them as environment variables, HashiCorp Vault, CloudFormation variables, and plenty of other solutions. AWS already had a dedicated mechanism for storing secrets and making them available via an API call in it’s SSM Parameter Store service, in which you could store values, with optional encryption.

AWS Secrets Manager is slightly but not very different from SSM Parameter Store; it adds secret rotation capabilities and isn’t as buried deep down inside the obliquely named “AWS Systems Manager” service. I think those are the main features. Oh it also lets you package a set of key/value pairs into one secret, whereas Param Store makes you create separate values that require individual API calls to retrieve.

Using Secrets With Serverless

To store encrypted secrets in the AWS Secrets Manager and make them available to your serverless application, you need to do the following:

Create a secret in Secrets Manager. Select “Other type of secrets” unless you are storing database connection info, in which case click one of those buttons instead.
Select an encryption key to use. Probably best to create a key per application/stage.
Create key/value pairs for your secrets.
After creating the secret, there will be sample code for different languages that shows you how to read it in in your application.

Encryption Key Access

To create encryption keys, look in IAM (click “Encryption Keys on the bottom left in IAM) and assign admins and users of the keys. I manually assign the lambda role that’s been created as a user of the encryption key it needs access to in the console. There may be a way to automate this but it feels like an appropriate step to have some small manual intervention.

Your lambda’s role will look like $service-$stage-$region-lambdaRole

IAM Role

In addition to granting your lambda role access to decrypt your secret, you also need to grant it the ability to access your secret.:

plugins:
  - serverless-python-requirements
  - serverless-pseudo-parameters

provider:
  name: aws
  runtime: python3.7
  stage: ${opt:stage, 'dev'}  # default to dev if not specified on command-line

  iamRoleStatements:
    - Effect: Allow
      Action: secretsmanager:GetSecretValue
      Resource: arn:aws:secretsmanager:#{AWS::Region}:#{AWS::AccountId}:secret:myapp/${self:provider.stage}/*

  environment:
    LOAD_SECRETS: true

This is taken from my serverless.yml file. Note the secret:myapp/dev/* bit – it’s a good idea to use prefixes in your key names so that you can restrict access to resources by stage (i.e. dev only has access to dev secrets, prod only has access to prod secrets). In general prefixing AWS resources with application/stage identifiers is a crude but readable and simple way to create IAM policies for access groups of related resources.

serverless-pseudo-parameters is a handy plugin that lets you interpolate things like AWS::Region in your configuration so that you don’t need to do an unwieldy CloudFormation Fn::Join array to generate the ARN.

Reading Secrets In Your Application

Now that your application has access to GetSecretValue and is a user of the encryption key used to encrypt your secrets, you need to access the secrets in your application.

Sample code is provided in the Secrets Manager console to read your secret. One thing to note is that if you store key/value pairs in your secret, which is most likely what you’ll want to do almost all the time, you get it back as JSON. The sample code provided doesn’t decode it. The general idea is something like:

import boto3
import base64
import json


def get_secret(secret_name):
    """Fetch secret via boto3."""
    client = boto3.client(service_name='secretsmanager')
    get_secret_value_response = client.get_secret_value(SecretId=secret_name)

    if 'SecretString' in get_secret_value_response:
        secret = get_secret_value_response['SecretString']
        return json.loads(secret)
    else:
        decoded_binary_secret = base64.b64decode(get_secret_value_response['SecretBinary'])
        return decoded_binary_secret

And then you can do as you please with the secrets.

App Configuration

I like consuming secrets directly into my application’s configuration at startup. With Flask it looks something like this:

    if os.getenv('LOAD_SECRETS'):
        # fetch config secrets from Secrets Manager
        secret_name = app.config['SECRET_NAME']  # dev, prod
        secrets = get_secret(secret_name=secret_name)
        if secrets:
            log.debug(f"{len(secrets.keys())} secrets loaded")
            app.config.update(secrets)
        else:
            log.debug("Failed to load secrets")

First, only try to load secrets if we’re executing under serverless/lambda. LOAD_SECRETS is defined as true in the serverless.yml snippet above. This should not be true when running tests, as you don’t want tests to access secrets and you don’t want to pay for all the additional accesses anyhow.
SECRET_NAME is in application config and can be different for dev and prod.
app.config.update(secrets) adds whatever you stored in Secrets Manager to your app config.
I should probably make a dumb PyPI module to do this automatically.

Serverless Python API Development

In a previous article I discussed how to interact with the serverless AWS Lambda platform using only tools provided by Amazon. This was a valuable experiment that I suggest applying to any new technology or interesting new system you’d like to learn. Start with the basics and try doing a project without too many extra tools or abstractions so that you can get an idea of how the underlying system works and what’s unpleasant or boilerplate-y or requires too much effort. Once you have an idea of how the pieces fit together you can have a much better appreciation for the abstractions that go on top because you understand how they work, what problems they are solving, and what pain they are saving you from.

AWS services are powerful but generally need to be put together in coherent ways to achieve your goals. They’re modules that provide the functionality you need but still require some glue to make a nice developer experience. Fortunately because the entire platform is scriptable, software tools and additional layers of abstraction are rapidly increasing the capabilities of software engineers on their own to manage configuration without the need for any hardware or humans in between. CloudFormation (CF) allows declaration of your infrastructure with JSON or YAML. CF templates like the Serverless and CodeStar transforms make it easier to write less CloudFormation code to describe a serverless configuration. And then tools like the Serverless toolkit add another layer of automation on top of CF and provide a really excellent developer experience. Not to be outdone, Amazon provides an even higher level toolkit called Amplify (subject of a future article) to further increase the leverage of effort to available hardware and software muscle.

Serverless Toolkit

After going through the process of building some toy applications using AWS SAM and the Serverless CF transform, I quickly saw some of the drawbacks of not using a more advanced system to automate things:

Viewing logs. Looking at CloudWatch logs in the AWS Console is not a great way to view the output of your application in real time, or in any time really.
It wasn’t clear to me how to save some pieces of a serverless application architecture for re-use in later projects. I posed a question to the Flask mailing list and IRC channel about how to make an extension based around it and didn’t get a useful response.
Defining stuff like API gateways, S3 buckets for code, and domains in CF is tedious. It can be automated further.
It would be nice to have some information readily available, such as what URL my application is deployed at.
Deployments, including to different stages.
Telling me when a deployment is finished, especially when using CodeStar.
Invoking functions for testing and via automation.
Managing dependencies.

And some other general stuff like keeping track of the correct AWS configuration profile and region.

As happens so often in the field of Computers, I’m not the first one to encounter these issues and some other people have already solved most of the problem for me.

To ensure a steady supply of confusion when discussing the relatively recent trend of serverless application architecture, there exists a collection of tools called Serverless, which resides on serverless.com. This should not be confused with serverless the adjective or the Serverless Application Model (SAM) or the AWS Serverless CF transform.

Every one of the issues mentioned above is simply handled by Serverless. I believe it’d be an unnecessary expenditure of time and effort to continue to develop serverless applications without it, based upon my recent experience trying to do so. Unless you’re just starting out and want to get a feel for the basics first, that is.

I won’t reiterate the Serverless quickstart here, go try it out yourself on their site. It takes very little effort, especially if you already have AWS credentials set up. I will instead talk about what advantages it gives you:

Logging

This is easy. You can view (and tail) the logs for any function with

sls logs -f myfunction -t

Reusability

# immediately create a Flask app based on my template
sls install --url https://github.com/revmischa/serverless-flask --name myapp

Some of what people have been doing is going the same route as Create-React-App and creating templates for Serverless projects that can be accessed with “sls install.” On the one hand this does make it very easy to create and share reusable setups and allows for divergence as templates evolve, but it makes it much harder for projects started with older templates to incorporate new refinements. In the realm of Flask and Python, I don’t feel this problem is solved just by templates and some sort of python module that can co-evolve is needed. Something analogous to the react-scripts package that goes along with Create-React-App would likely be the way to go.

Configuration And CloudFormation

Now you declare your resources and functions in the serverless.yml configuration file, along with lots of other useful stuff.

Nearly all of the boilerplate CF needed for serverless like a S3 bucket for code, IAM permissions for invoke and CloudWatch, API Gateway, etc are totally hidden from you and you never need to care about them. Only the minimum configuration and CF needed to describe what’s unique about your setup is required from you. On a scale of sendmail.conf to .emacs, serverless.yml is fairly high on the configuration file sublimity scale.

Info

This is easy. Where’d I park my domain again?

$ sls info
Service Information
service: myapp
stage: dev
region: eu-central-1
stack: myapp-dev
api keys:
  None
endpoints:
  ANY - https://di1baaanvc.execute-api.eu-central-1.amazonaws.com/dev
  ANY - https://di1baaanvc.execute-api.eu-central-1.amazonaws.com/dev/{proxy+}
  GET - https://di1baaanvc.execute-api.eu-central-1.amazonaws.com/dev/openapi
functions:
  app: myapp-dev-app
  openapi: myapp-dev-openapi
Serverless Domain Manager Summary
Domain Name
  myappmyapp.net
Distribution Domain Name
  dcwyw3gslhqw1.cloudfront.net

Deployment

This is easy too! Too easy!

$ sls deploy
$ sls deploy -s prod  # specify stage

This bundles requirements if needed, packages the service, uploads to S3, and kicks off a CloudFormation stack update.

Notice that sweet Serverless Domain Manager Summary section?
That, my friend, is the serverless-domain-manager plugin. If you want your endpoints to be deployed under a domain name you already have in a Route53 zone (and hopefully have an ACM certificate in us-west-1 to go with it) you can have Serverless automatically fire up the domain or subdomain for you along with a CloudFront distribution and API Gateway domain mapping.

I discovered an issue with the domain manager plugin selecting the ACM certificate for your domain at random among a list of matching domain names. This was picking an expired previous certificate, so I fixed it to filter out any unusable certificates. My PR was quickly and politely merged. Always a positive sign.

Waiting / Notifications

The aforementioned deploy command tells you when it’s done. Then you can test it out right away. You can speed it by only deploying a specific function, or using the S3 accelerate option to speed up uploading of your artifacts. Don’t waste time deploying stuff you don’t need or watching the CodeStar web UI.

Invoking Functions

AWS SAM is pretty easy, and so is Serverless. If developing a python webapp with the serverless-wsgi plugin, you can also serve your app up locally.

Managing Dependencies

(This part is python-specific)

How to manage dependencies for your python lambda? Well, just stick them in requirements.txt. Duh, right? With Serverless, more or less right. Remember that any dependencies have to be bundled in your lambda’s zip file. Need to build binary dependencies and not on a linux amd64 platform? Just add “dockerizePip: true” to the serverless-python-requirements plugin configuration in serverless.yml and you’re good to go.

Note that if invoking functions locally or starting the WSGI server, you still need a local virtualenv. One wacky non-Serverless template I looked at used pipenv instead to manage both local and lambda dependencies, but I couldn’t advise it; it’s pretty weaksauce.

Extending Serverless

Mostly what I’ve been doing with AWS Lambda is making small web API services using Python and the Flask microframework. With serverless providing exactly the tooling I need, I also want to be able to start new projects with a minimum of effort and have some pieces already in place that I can build on for my application.

I forked a serverless-flask template I found and started building on top of it. I made it not ask if you want to use python 2 or 3 (why not ask if I want UTF-8 or EBCDIC while you’re at it?) and defaulted dockerizing pip to false.

If building an API server in Flask, your life can be made much nicer with the addition of marshmallow to handle serializing and deserializing requests, flask-apispec to integrate marshmallow with OpenAPI (“swagger”) and Flask, and CORS. My version of the template includes all of this to make it as easy as humanly possible to make a documented serverless python REST API with the absolute minimal amount of effort and typing. And as a bonus it generates client libraries for your API from the OpenAPI definition in any language you desire.

Instructions for using the template and getting started quickly can be found here.

Serverless? Why Not

This article is a marker in the path our journey so far has taken us. Improving how we build applications and services is an ongoing process. Our previous milestone was unassisted AWS services, this present adventure was improved tooling for those services, and the next level to up may be AWS Amplify and GraphQL. Or maybe not. Stay tuned.