Русский – Изучать Как Иностранный Язык

Click here to read the English version of this article.

Изучать русский язык – изучать исключения и ненужную сложность.

Когда студент начинает учить язык, первое слово – «Здравствуйте!»

И сразу же он знает, что не будет легко. Чтобы просто сказать “hello” нужно пять слогов, и постоянно хочется найти недостающие гласные, как будто их не хватает. Как произносить “здр-”? Это очень пугающее начало, и дальше только труднее. 

Например, когда человек хочет сказать о идти куда-то, ему нужно учитывать или “идти” это одно место или много мест – ”идти” или ходить – пешком или на колесном транспорте (ехать/ездить), “в” или “под” или “у” или “из” (уйти, выйти, прийти, …), и если совершенный или несовершенный вид (“пойти”), или на самолёте или на корабле.

После того как студент выучил “идти”, он может выучить как использовать существительные “на каждый день”. В отличие от английского, у русских существительных есть падежи. Есть простые правила, как например винительный падеж – ”я читаю книгу.”  Окей, есть три рода, и это не так трудно, например окончания слов в мужском и среднем родах похожи, но когда студент изучает дальше, у него возникают сложности побольше .

Родительный падеж (кого?/чего?) – проклятие каждого изучающего русский язык. Родительный падеж был аблатив падежом давным давно, но сейчас они оба – только родительный. Поэтому, он используется  во многих случаях. Когда чего-то нет, родительный падеж – ”нет книги.” Когда считаете, родительный падеж – ”12 стульев.”  Образовывать множественные существительные в родительном падеже – возможно самое сложное в русском языке. Студенты русского практикуют это часто.

Форма родительного падежа тоже появляется когда используется одушевлённый объект в винительном падеже (только если мужской) – ”Я вижу коня”, а когда неодушевлённый объект – “Я вижу дом”, и тоже самое с числами два-четыре – ”Я вижу три вилки.”

Использовать числа тоже очень сложно. Они склоняются конечно, а также сохраняют свои архаичные двойсвестние формы из старославянского языка для чисел 2-4. Когда несчастный студент хочет сказать о количествах, ему обязательно нужно думать или число заканчивается на один, два, три, четыре, или больше.

Один стул – (форма именительного падежа)

Два/три/четыре стула – (форма родительного падежа – двойсвестный)

Пять и т.д. стульев – (форм множественного родительного)

Дальше, «двенадцать стульев» но «двадцать один стул». Ой вей.

И если бы хотел говорить о собирательных числительных, также используют  множественный родительный даже для чисел 2-4: «двое друзей». Смотрите ещё  больше сложностей здесь

Русская орфография и фонология менее трудна, немного исключений для носителей английского языка. Буквы <Ы> (делается из «ъ + і») и её звука /ɨ/ нет в английском языке. Трудно произносить, и слышать разницу между <И> и <Ы>, так же как и между <Ш> и <Щ>. Мягкий знак <Ь> и твёрдый знак <Ъ> просто сбивают с толку. А буква <Ё> часто пишется  как <Е>, интересно, что людей которые пишут её правильно называют ёфикаторами.

Кириллица – алфавит назван в честь Кирилла и Мефодия главным образом состоит из Греческого алфавита, с немногими символами из Глаголицы «Аз Буки Веди» представляющими звуки, которых нет в Греческом алфавите (как <Ш>). Носители русского языка не верят мне, когда я говорю русский алфавит – это почти всё из Греческого алфавита. Ещё я покажу эту таблицу.

Некоторые вещи легче в русском языке чем у большинства индоевропейских языков. Например глаголы. Спряжение глагола обычно регулярное, и только одна основная форма, это легко запомнить.

Так же глаголов только три времени (прошлое, настоящее, будущее), реже используемые деепричастия, а наиболее совершенные глаголы формируются  префиксами (писать ⇾ написать).

А много слов таких же  как и в романском, немецком, латинском и английском языках. Немного примеров:

Есть – как “est” на латином (to be)

Ярмарка – как “jahr” на немецком

Проект – как “projekt” на немецком

Бровь – “brow”

Бить – “beat”

Быть – “be” 

Ложь – “lie”

Носок – “sock”

Щека – “cheek”

Пламя, плыть, плот, полёт – (flame, fleet, float, flight) – общий индоевропейский сдвиг из /ф/ от /п/ (c.f. Lat. “pater” ⇾ “father”)

О мате, я знаю мало. Не понимаю русские ругательства – это так сложно. В английском языке ругательства не очень сильные или оскорбительные, но люди советуют мне не говорить эти слова на русском. На самом деле в английской Википедии – ”In modern Russia, the use of mat is censored in the media and the use of mat in public constitutes petty hooliganism, a form of disorderly conduct, punishable under article 20.1.1 of the Offences Code of Russia.” Мало информации в английской Википедии о мате, но в русской Википедии – длинная статья, которую я не могу понять и Гугл Перевод не помогает.

Изучать русский язык – это приключение и довольно не быстро. С практикой, возможно обращаться к людям, но сложнее понимать когда они отвечают. Если я читаю со словарём  или кто-то говорит медленно, я могу понять много. Говори со мной и помогай мне учить!

Studying Russian as a Foreign Language

Click here to read the Russian version of this article.

I started studying the Russian language in university for a short time, then picked up formally studying it again since living and spending a lot of time in Eastern Europe. I love languages and find them super fascinating, especially syntax and orthography. For whatever reason these things are greatly interesting to me and fun to learn about. Like all nice theoretical studies, actual practical applications and reality dampen the fun somewhat.

To study practical everyday Russian is to study exceptions and unnecessary complexity. The language it most resembles for me is Latin both grammatically and in terms of the extra work you have to do that just seems somehow extra or not adding much.

When a student begins to study the Russian language, the first word they encounter as in every other language study is “hello!” In Russian this word is “Здравствуйте!” (Zdrastvuyte!) Which is five or six syllables depending how you count that preliminary imposing consonant cluster with a rolled “R”. The student immediately gets the sense that this isn’t going to be so easy. A first impression is that some key vowels seem to be missing from most words. How does one say “zdr”? From here it goes downhill.

When you want to say you are going somewhere in Russian it’s not enough to express the concept of travel, but it’s almost required to be accompanied by information about the mode of travel. There are words for specifying walking, going by means of wheeled conveyance, sailing, flying. Unlike many other languages, not only should the mode be specified but each verb consists of a unidirectional/multidirectional pair that must be discriminated (“I went to the store on foot end of story” vs “I went on foot to the store and then somewhere else”) as well as an imperfective/perfective pair that forces the speaker to consider if the action was completed or ongoing (“I walked” vs “I was walking”). On top of having to pick the right verb for the transport modality/perfective aspect/directionality combination one also should often prefix the verb to indicate if the motion is into, around, out of, on top of, under, through, out from, up to but not inside, and so on.

Along with learning the verbs of motion another fundamental aspect of the language is the noun inflection system. There are three genders, two numbers, and six main cases. Being an Indo-European language this system roughly similar to Latin, German, Old English and lots of other languages in this family.

While Russian’s case system is not particularly unique there is the case of the particularly unpleasant Russian genitive. There are a couple of historical reasons for this; the Proto-Indo-European ablative case was folded into the modern genitive case, and the genitive is used for expressions of quantity which still retain an archaic dual number form used for numbers of things between two and four. The abessive construction (“there is no book”) also uses the genitive form. The formation of the genitive plural is famously complicated and difficult and as a student of Russian you devote considerable time and practice to this very common and challenging puzzle.

As if that wasn’t enough to keep straight, the genitive form (but not construction) comes into play when a direct object has a soul, though only if it is a masculine noun. When declining a masculine animate direct object one uses the genitive form not the accusative.

The declension of numerals in Russian is its own special challenge which is hard to convey in English but involves a lot of genitive cases. You can look at the Wikipedia section to get a little taste though.

Most numbers ending with “1” (in any gender: оди́н, одна́, одно́) require Nominative singular for a noun: два́дцать одна́ маши́на (21 cars), сто пятьдеся́т оди́н челове́к (151 people). Most numbers ending with “2”, “3”, “4” (два/две, три, четы́ре) require Genitive singular: три соба́ки (3 dogs), со́рок два окна́ (42 windows). All other numbers (including 0 and those ending with it) require Genitive plural: пять я́блок (5 apples), де́сять рубле́й (10 rubles). Genitive plural is also used for numbers ending with 11 to 14 and with inexact numerals: сто оди́ннадцать ме́тров (111 meters); мно́го домо́в (many houses). Nominative plural is used only without numerals: э́ти дома́ (these houses); cf. три до́ма (3 houses; G. sg.). These rules apply only for integer numbers. For rational numbers see below.

https://en.wikipedia.org/wiki/Russian_declension#Numerals

The Russian writing and sound system isn’t so hard in my opinion for native English speakers, once you get used to the Cyrillic alphabet. There are some things that stand out as distinct compared to English however.

The first is the letter <Ы> which one pronounces a bit like the “i” in “bill”, but pronounced much lower in your throat. There is a distinction made between this sound and “И” (sounds like “ee”) although when the two sounds are spoken it’s damn hard to hear the difference normally. Also important distinctions are made between voiced and unvoiced “sh” and palletized consonants, which again I think are extremely difficult for a native English speaker to pick up on or enunciate without a great deal of practice. There is a letter “Ё” (“yo”) which is often just written as “E” (“ye”) which a Russian student just has to accept and assume they are being tricked sometimes. Finally there are the unspoken letters “Ь” and “Ъ” which are called soft sign and hard sign. The soft sign palletizes the letter in front of it and the hard sign mostly doesn’t do anything (occasionally depalletizing or demarcating a stop), and used to be extremely common in writing until the Bolsheviks ruthlessly purged it almost entirely from the language.

The Cyrillic alphabet is named after St. Cyrill who did not invent it but did invent an alphabet called Glagolitic (“speaking”) of which a few letters were taken for the Cyrillic alphabet, though aside from those characters Cyrillic is mostly cribbed from the Greek alphabet.

One of the easiest things to learn in Russian is the conjugation of verbs. In fact there is only really one set of personal endings you have to learn (well, two conjugations but they’re the same pattern). On top of this there are only three tenses to contend with (past, present, and future). The past is formed by adding gender markers (a standard masculine/feminine/neuter or one shared plural suffix) to the root, which is refreshingly simple although it seems strange that the gender is relevant for the conjugation of verbs and only in the past. Participles do exist but as far as I can tell they are basically never encountered and only as sort of a final footnote in most textbooks.

The only real complication with verbs is the previously mentioned perfective/imperfective aspect and prefixes. Verbs can have things stuck on the front to indicate if it is perfect or imperfect, and a future tense is simply formed by using a perfective form in what would normally look like the present, but since you can’t have a perfect action in the present, it means it’s the future instead. A bit odd but does make a certain kind of sense when you think about it.

Finally one last notable difference between English and Russian is swearing. The level of expressivity and creativity that appears to go into swearing in Russian is on another level, and probably beyond any sort of deep comprehension to a non-native speaker. Swear words also have vastly more force in Russian, and would almost never be uttered in any sort of semi-polite company and some probably would get you ejected from respectable society.

I am unable to really grasp the subtleties and complexity of the different prefixed and suffixed versions of “fuck” in Russian, and Google Translate doesn’t even try.

Studying Russian is a challenge, but getting all of the grammar right is really not the most important thing for daily usage. Even if you screw up most of the inflections people can still understand your meaning, and even native speakers get things wrong frequently. For me the hardest part is not speaking Russian but understanding what people say, mostly due to my unimpressive vocabulary and the speed in which people speak, but that can be true in any language and isn’t specific at all to Russian.

Lambda Function To Route Twilio Incoming Phone Calls

The absolute fastest way to create a serverless Twilio incoming call webhook Lambda:

npm install -g serverless
sls install --url https://github.com/revmischa/slspy --name twilcall
echo "twilio" > requirements.txt

handler.py:

from twilio.twiml.voice_response import VoiceResponse
from urllib.parse import parse_qsl


def hello(event, context):
    call = dict(parse_qsl(event['body']))

    resp = VoiceResponse()
    resp.say("hello world!", voice='alice')

    response = {
        "headers": {"content-type": "text/xml"},
        "statusCode": 200,
        "body": str(resp),
    }

    return response

Deploy, then set webhook handler for your phone number:

sls deploy

That’s all!

Customize your response with TwiML.

See this gist to check out more advanced handling with loading the Twilio API key from Secrets Manager, doing lookups with Twilio Add-Ons to detect spam/robocalling, and detailed caller lookup info that is output to a slack channel every call.

Making Use Of AWS Secrets Manager

One of the many new services re-invented at AWS’s re:invent conference was the storage of secrets for applications. Secrets in essence are generally things your application may need to run but you don’t really want to put in source control. Things like API keys, password salt, database connection strings and the like.

Current ways of providing secrets to applications are things like configuration files deployed separately, Heroku’s config which exposes them as environment variables, HashiCorp Vault, CloudFormation variables, and plenty of other solutions. AWS already had a dedicated mechanism for storing secrets and making them available via an API call in it’s SSM Parameter Store service, in which you could store values, with optional encryption.

AWS Secrets Manager is slightly but not very different from SSM Parameter Store; it adds secret rotation capabilities and isn’t as buried deep down inside the obliquely named “AWS Systems Manager” service. I think those are the main features. Oh it also lets you package a set of key/value pairs into one secret, whereas Param Store makes you create separate values that require individual API calls to retrieve.


Using Secrets With Serverless

To store encrypted secrets in the AWS Secrets Manager and make them available to your serverless application, you need to do the following:

  • Create a secret in Secrets Manager.  Select “Other type of secrets” unless you are storing database connection info, in which case click one of those buttons instead.
  • Select an encryption key to use. Probably best to create a key per application/stage.
  • Create key/value pairs for your secrets.
  • After creating the secret, there will be sample code for different languages that shows you how to read it in in your application.

Encryption Key Access

To create encryption keys, look in IAM (click “Encryption Keys on the bottom left in IAM) and assign admins and users of the keys. I manually assign the lambda role that’s been created as a user of the encryption key it needs access to in the console. There may be a way to automate this but it feels like an appropriate step to have some small manual intervention.

Your lambda’s role will look like $service-$stage-$region-lambdaRole

IAM Role

In addition to granting your lambda role access to decrypt your secret, you also need to grant it the ability to access your secret.:

plugins:
  - serverless-python-requirements
  - serverless-pseudo-parameters

provider:
  name: aws
  runtime: python3.7
  stage: ${opt:stage, 'dev'}  # default to dev if not specified on command-line

  iamRoleStatements:
    - Effect: Allow
      Action: secretsmanager:GetSecretValue
      Resource: arn:aws:secretsmanager:#{AWS::Region}:#{AWS::AccountId}:secret:myapp/${self:provider.stage}/*

  environment:
    LOAD_SECRETS: true

This is taken from my serverless.yml file. Note the secret:myapp/dev/* bit – it’s a good idea to use prefixes in your key names so that you can restrict access to resources by stage (i.e. dev only has access to dev secrets, prod only has access to prod secrets). In general prefixing AWS resources with application/stage identifiers is a crude but readable and simple way to create IAM policies for access groups of related resources.

serverless-pseudo-parameters is a handy plugin that lets you interpolate things like AWS::Region in your configuration so that you don’t need to do an unwieldy CloudFormation Fn::Join array to generate the ARN.

Reading Secrets In Your Application

Now that your application has access to GetSecretValue and is a user of the encryption key used to encrypt your secrets, you need to access the secrets in your application.

Sample code is provided in the Secrets Manager console to read your secret. One thing to note is that if you store key/value pairs in your secret, which is most likely what you’ll want to do almost all the time, you get it back as JSON. The sample code provided doesn’t decode it. The general idea is something like:

import boto3
import base64
import json


def get_secret(secret_name):
    """Fetch secret via boto3."""
    client = boto3.client(service_name='secretsmanager')
    get_secret_value_response = client.get_secret_value(SecretId=secret_name)

    if 'SecretString' in get_secret_value_response:
        secret = get_secret_value_response['SecretString']
        return json.loads(secret)
    else:
        decoded_binary_secret = base64.b64decode(get_secret_value_response['SecretBinary'])
        return decoded_binary_secret

And then you can do as you please with the secrets.

App Configuration

I like consuming secrets directly into my application’s configuration at startup. With Flask it looks something like this:

    if os.getenv('LOAD_SECRETS'):
        # fetch config secrets from Secrets Manager
        secret_name = app.config['SECRET_NAME']  # dev, prod
        secrets = get_secret(secret_name=secret_name)
        if secrets:
            log.debug(f"{len(secrets.keys())} secrets loaded")
            app.config.update(secrets)
        else:
            log.debug("Failed to load secrets")
  • First, only try to load secrets if we’re executing under serverless/lambda. LOAD_SECRETS is defined as true in the serverless.yml snippet above. This should not be true when running tests, as you don’t want tests to access secrets and you don’t want to pay for all the additional accesses anyhow.
  • SECRET_NAME is in application config and can be different for dev and prod.
  • app.config.update(secrets) adds whatever you stored in Secrets Manager to your app config.
  • I should probably make a dumb PyPI module to do this automatically.

Serverless Python API Development

In a previous article I discussed how to interact with the serverless AWS Lambda platform using only tools provided by Amazon. This was a valuable experiment that I suggest applying to any new technology or interesting new system you’d like to learn. Start with the basics and try doing a project without too many extra tools or abstractions so that you can get an idea of how the underlying system works and what’s unpleasant or boilerplate-y or requires too much effort. Once you have an idea of how the pieces fit together you can have a much better appreciation for the abstractions that go on top because you understand how they work, what problems they are solving, and what pain they are saving you from.

AWS services are powerful but generally need to be put together in coherent ways to achieve your goals. They’re modules that provide the functionality you need but still require some glue to make a nice developer experience. Fortunately because the entire platform is scriptable, software tools and additional layers of abstraction are rapidly increasing the capabilities of software engineers on their own to manage configuration without the need for any hardware or humans in between. CloudFormation (CF) allows declaration of your infrastructure with JSON or YAML. CF templates like the Serverless and CodeStar transforms make it easier to write less CloudFormation code to describe a serverless configuration. And then tools like the Serverless toolkit add another layer of automation on top of CF and provide a really excellent developer experience. Not to be outdone, Amazon provides an even higher level toolkit called Amplify (subject of a future article) to further increase the leverage of effort to available hardware and software muscle.

Serverless Toolkit

After going through the process of building some toy applications using AWS SAM and the Serverless CF transform, I quickly saw some of the drawbacks of not using a more advanced system to automate things:

  1. Viewing logs. Looking at CloudWatch logs in the AWS Console is not a great way to view the output of your application in real time, or in any time really. 
  2. It wasn’t clear to me how to save some pieces of a serverless application architecture for re-use in later projects. I posed a question to the Flask mailing list and IRC channel about how to make an extension based around it and didn’t get a useful response.
  3. Defining stuff like API gateways, S3 buckets for code, and domains in CF is tedious. It can be automated further.
  4. It would be nice to have some information readily available, such as what URL my application is deployed at.
  5. Deployments, including to different stages.
  6. Telling me when a deployment is finished, especially when using CodeStar.
  7. Invoking functions for testing and via automation.
  8. Managing dependencies.

And some other general stuff like keeping track of the correct AWS configuration profile and region. 

As happens so often in the field of Computers, I’m not the first one to encounter these issues and some other people have already solved most of the problem for me. 

To ensure a steady supply of confusion when discussing the relatively recent trend of serverless application architecture, there exists a collection of tools called Serverless, which resides on serverless.com. This should not be confused with serverless the adjective or the Serverless Application Model (SAM) or the AWS Serverless CF transform.

Every one of the issues mentioned above is simply handled by Serverless. I believe it’d be an unnecessary expenditure of time and effort to continue to develop serverless applications without it, based upon my recent experience trying to do so. Unless you’re just starting out and want to get a feel for the basics first, that is.

I won’t reiterate the Serverless quickstart here, go try it out yourself on their site. It takes very little effort, especially if you already have AWS credentials set up. I will instead talk about what advantages it gives you:

Logging

This is easy.  You can view (and tail) the logs for any function with

sls logs -f myfunction -t

Reusability

# immediately create a Flask app based on my template
sls install --url https://github.com/revmischa/serverless-flask --name myapp

Some of what people have been doing is going the same route as Create-React-App and creating templates for Serverless projects that can be accessed with “sls install.” On the one hand this does make it very easy to create and share reusable setups and allows for divergence as templates evolve, but it makes it much harder for projects started with older templates to incorporate new refinements. In the realm of Flask and Python, I don’t feel this problem is solved just by templates and some sort of python module that can co-evolve is needed. Something analogous to the react-scripts package that goes along with Create-React-App would likely be the way to go.

Configuration And CloudFormation

Now you declare your resources and functions in the serverless.yml configuration file, along with lots of other useful stuff

Nearly all of the boilerplate CF needed for serverless like a S3 bucket for code, IAM permissions for invoke and CloudWatch, API Gateway, etc are totally hidden from you and you never need to care about them. Only the minimum configuration and CF needed to describe what’s unique about your setup is required from you. On a scale of sendmail.conf to .emacs, serverless.yml is fairly high on the configuration file sublimity scale.

Info

This is easy. Where’d I park my domain again?

$ sls info
Service Information
service: myapp
stage: dev
region: eu-central-1
stack: myapp-dev
api keys:
  None
endpoints:
  ANY - https://di1baaanvc.execute-api.eu-central-1.amazonaws.com/dev
  ANY - https://di1baaanvc.execute-api.eu-central-1.amazonaws.com/dev/{proxy+}
  GET - https://di1baaanvc.execute-api.eu-central-1.amazonaws.com/dev/openapi
functions:
  app: myapp-dev-app
  openapi: myapp-dev-openapi
Serverless Domain Manager Summary
Domain Name
  myappmyapp.net
Distribution Domain Name
  dcwyw3gslhqw1.cloudfront.net

Deployment

This is easy too! Too easy!

$ sls deploy
$ sls deploy -s prod # specify stage

This bundles requirements if needed, packages the service, uploads to S3, and kicks off a CloudFormation stack update. 

Notice that sweet Serverless Domain Manager Summary section?
That, my friend, is the serverless-domain-manager plugin. If you want your endpoints to be deployed under a domain name you already have in a Route53 zone (and hopefully have an ACM certificate in us-west-1 to go with it) you can have Serverless automatically fire up the domain or subdomain for you along with a CloudFront distribution and API Gateway domain mapping.

I discovered an issue with the domain manager plugin selecting the ACM certificate for your domain at random among a list of matching domain names. This was picking an expired previous certificate, so I fixed it to filter out any unusable certificates. My PR was quickly and politely merged. Always a positive sign.

Waiting / Notifications

The aforementioned deploy command tells you when it’s done. Then you can test it out right away. You can speed it by only deploying a specific function, or using the S3 accelerate option to speed up uploading of your artifacts. Don’t waste time deploying stuff you don’t need or watching the CodeStar web UI.

Invoking Functions

AWS SAM is pretty easy, and so is Serverless. If developing a python webapp with the serverless-wsgi plugin, you can also serve your app up locally.

Managing Dependencies

(This part is python-specific)

How to manage dependencies for your python lambda? Well, just stick them in requirements.txt. Duh, right? With Serverless, more or less right. Remember that any dependencies have to be bundled in your lambda’s zip file. Need to build binary dependencies and not on a linux amd64 platform? Just add “dockerizePip: true” to the serverless-python-requirements plugin configuration in serverless.yml and you’re good to go.

Note that if invoking functions locally or starting the WSGI server, you still need a local virtualenv. One wacky non-Serverless template I looked at used pipenv instead to manage both local and lambda dependencies, but I couldn’t advise it; it’s pretty weaksauce.

Extending Serverless

Mostly what I’ve been doing with AWS Lambda is making small web API services using Python and the Flask microframework. With serverless providing exactly the tooling I need, I also want to be able to start new projects with a minimum of effort and have some pieces already in place that I can build on for my application.

I forked a serverless-flask template I found and started building on top of it. I made it not ask if you want to use python 2 or 3 (why not ask if I want UTF-8 or EBCDIC while you’re at it?) and defaulted dockerizing pip to false.

If building an API server in Flask, your life can be made much nicer with the addition of marshmallow to handle serializing and deserializing requests, flask-apispec to integrate marshmallow with OpenAPI (“swagger”) and Flask, and CORS. My version of the template includes all of this to make it as easy as humanly possible to make a documented serverless python REST API with the absolute minimal amount of effort and typing. And as a bonus it generates client libraries for your API from the OpenAPI definition in any language you desire.

Instructions for using the template and getting started quickly can be found here.

Serverless? Why Not

This article is a marker in the path our journey so far has taken us. Improving how we build applications and services is an ongoing process. Our previous milestone was unassisted AWS services, this present adventure was improved tooling for those services, and the next level to up may be AWS Amplify and GraphQL. Or maybe not. Stay tuned.

Serverless Python Web Applications With AWS Lambda and Flask

This is the first part of a series on serverless development with Python. Next part.

Serverless applications are great from the perspective of a developer – no infrastructure to manage, automatically scaling to meet requests without ever having to think about it, pay by the RAM gigabyte/second, and the ability to deploy via code however you desire. Logging comes free. For DevOps folks it’s a nightmare, as it represents the rapidly approaching obsolescence of their skills involving setting up web servers, load balancing, monitoring, logging, hanging out in datacenters, and other such quaint aspects of deploying web applications. Making a traditional web application run on AWS Lambda is not quite trivial yet, but is well worth understanding and considering next time you need a web service somewhere and it will surely get smoother and easier with time.

Oh yeah, what’s serverless mean here? It means you don’t manage any servers or long-lived processes. You put code in the cloud, and you run it whenever you want, however much you want, and don’t have to worry about the scaling part, while paying for only the CPU, memory, network traffic, and other services you consume. It’s a completely different way of deploying an application compared to managing daemons and servers and infrastructure and load balancing and all fun stuff that has very little to do with the code you want to write and deploy.

Python And Flask

There’s no shortage of options for web frameworks, and you can do a lot worse than Python 3 with Flask. Flask is a nice mixture of being able to create a simple web service with very little boilerplate, but also can also be used as a component for building more complex web applications, with the caveat that it’s more suited to APIs rather than server-side rendered apps when compared with something like Django. You should probably be writing single-page apps these days anyway.

Python is a very well-supported, readable, maintainable language with a vast amount of libraries available on the python module index. If you take the time to set up lint with mypy and flake8 you can even have some up-front checking of types and common mistakes. And it’s supported by AWS Lambda.

What A Serverless Flask Application Looks Like

To build a serverless Lambda application you should have a CloudFormation configuration template that describes your architecture.

The Lambda itself is one piece of this architecture; it is a function that can be invoked and can return a result. To use it as a web service, there needs to be a way to reach it from the web. AWS provides the API Gateway service which can listen for HTTP(S) requests on an endpoint and do something when requested. APIGW can either have an entry for each endpoint you desire (POST /api/foo,  GET /api/bar, etc…) or it can proxy any request at a given host and path prefix to a Lambda and then interpret the response as an HTTP response to send back to the requestor. In this way it works like CGI or WSGI, and as long as your web framework knows how to deserialize an API Gateway proxy request and then serialize the response back into the format the APIGW expects, it can appear as any other web application “container” to your app.

There is a very simple Flask extension for this – AWSGI. The example there shows everything you need to do to make a web application that runs as a serverless app:

Screen Shot 2018-08-29 at 21.54.26.png

The only thing that is really Lambda-specific is the lambda_handler, which gives Flask the WSGI request object it expects and then translates the response appropriately into the format AWSGI expects.

So between the Lambda hosting your application code and the API Gateway acting as its reverse proxy, your infrastructure is pretty clear at this point. There are some associated bits needed like IAM permissions and the like, and maybe a database or S3 bucket or whatever else your app requires. This can all be specified in the CloudFormation template.

AWS has CloudFormation “transforms” that simplify this configuration by providing templated resources for your template to use. Templates on templates is a truly auspicious way to declare configuration. You will harness more slack than you had previously dreamed possible.

Resources:
    HelloWorldFunction:
        Type: AWS::Serverless::Function
        Properties:
            CodeUri: hello_world/build/
            Handler: app.lambda_handler
            Runtime: python3.6
            Events:
                HelloWorld:
                    Type: Api
                    Properties:
                        Path: /hello
                        Method: get

This gives you a Lambda that runs code uploaded to a S3 bucket when invoked at /hello.

Deployment

Deployment is not yet quite as smooth as say, deploying your application to Heroku, but it is getting there. There do exist some popular tools for doing serverless orchestration like Serverless Framework but I’ve been trying to see how far I can get just using AWS’s native tooling.

AWS has a service that I think no one but myself has actually used called CodeStar. This sets up a serverless deployment pipeline for you automatically, configuring CodePipeline, CodeBuild, and CloudFormation to give you an entire CI/CD system. You can easily configure it to run a build every time you check something into a GitHub repo, and update a CloudFormation deployment automatically. In addition, it even has another level of template laziness in the form of the CodeStar CloudFormation transform.

The documentation on CodeStar and the transform is more or less non-existent, which makes it actually kind of a special challenge to use. However it does appear to take care of the step of uploading your code to an S3 bucket and providing some roles. Your CodeStar template.yml may look something like:

  Flask:
    Type: AWS::Serverless::Function
    Properties:
      Timeout: 10
      Handler: myapp/index.lambda_handler
      Runtime: python3.6
      Role:
        Fn::ImportValue:
          !Join ['-', [!Ref 'ProjectId', !Ref 'AWS::Region', 'LambdaTrustRole']]

If you create a repository with the file myapp/index.py, with a lambda_handler function like the AWSGI handler previously shown, then you now have a serverless web application that will be updated every time you push to your GitHub repo and CodeBuild passes.

Note that CodeStar may have the potential to make developing serverless applications as easy as Heroku with the power of everything you get from AWS and CloudFormation, but it definitely feels today like something no one has actually tried to use. I attempted to add an IAM role to my Lambda function and I managed to stump AWS support. They finally admitted to me that there is no documented way to customize the IAM role:

“After some further testing and research into this I found that you cannot currently customize the serverless Lambda function’s IAM Role from the template.yml.

Instead, you will need to manually add the desired permissions to the IAM Role directly. As you pointed out though, the Lambda function’s role is created automatically and you do not have insight into how to identify the IAM Role being used.”

There is a workaround involving deducing how the role name is derived in the CodeStar transform (which is a rather opaque and mystical bit of machinery) and adding permissions to it, but as far as I’m aware none of this is really supported or documented. They said they plan to fix it though.

You absolutely do not have to use CodeStar for deploying Lambdas, but if you relish a little bit of a challenge with some of the more esoteric AWS services it can be a valuable tool depending on your needs. There’s always some satisfaction at watching unloved AWS services grow and mature over the years (CodeDeploy…) and you can tell war stories about arguing with AWS support reps for months on a single ticket.

Speaking of which, when I tried using CodeBuild for running tests for my project they didn’t have a python 3.6 image, making that tool completely useless. Looks like they have one now though, so maybe not useless anymore.

Other Deployment Options

If you really just want to whip up a quick Lambda, you really don’t need a whole CI/CD system or CloudFormation of course. If you just want to write some code and run it and then not worry about it anymore, you can set it all up manually as a one-off function. I do this for some things like Slack bots and random little web services. I use Sublime Text 3 with an AWS Lambda editor plugin that I made. It lets you edit a Lambda directly from within Sublime, and uploads a new version whenever you hit save. It lets you invoke it and view the output within Sublime, and it has a handy shortcut for adding dependencies via pip to your project. It’s incredibly simple and vastly superior to using the web-based function editor, or unzipping and rezipping your bundle every time you want to modify the code.

Dependencies

Screen Shot 2018-08-29 at 23.15.50

Your Lambda is actually just distributed as a zip file of a directory. Into this directory goes your application code as well as any data files or dependencies it needs, along with your hopes and dreams. If you depend on other libraries (besides boto3, which Lambda already has for your convenience), you need to include them.

For a simple deployment, you can copy in libraries installed into a virtualenv for your project. If the libraries include native code, you must compile it on a linux amd64 machine because that’s what Lambda runs on. Some tools automate this with docker.

If you want something a little more friendly to use, you can set up a directory to stuff things in.

For my project, I made a dumb little “local pip” script that I can use to install packages with pip into a directory (“vendor/”). It’s nothing special or fancy. Just runs “pip install -t …” and deletes some unnecessary files afterwards.

In my application’s __init__.py file at the top I add vendor and the root path to PYTHONPATH, sort of giving me a mini-venv where I can just use any module installed to vendor/:

import os
import sys
vendor_path = os.path.abspath(os.path.join(__file__, '..', '..', 'vendor'))
lib_path = os.path.abspath(os.path.join(__file__, '..', '..'))
sys.path.append(lib_path)
sys.path.append(vendor_path)
from flask import Flask
...

And then any dependencies I package up can be imported easily.

Putting It Together

To experiment with CodeStar and serverless Flask I made a simple web application. It lets people ask a question or answer a question. It was created initially as a Slack app, although that didn’t quite go as I hoped.

An Aside: My goal was to allow anyone on Slack to receive the questions and answer them, but as I had it only hooked up to a Slack team full of trolls and degenerates the Slack app reviewer was extremely unimpressed with the quality and thoughtfulness of the responses he got when testing it. Which is entirely my fault, but whatever. It’s still available on the Slack app repository but since they wouldn’t permit it to work across teams (something about not being “appropriate for the workplace”?) the Slack interface is of limited usefulness.

Anywhoozlebee, the application is a simple one: allow users to ask questions, or respond to questions. It was implemented first as a web service compatible with the Slack webhooks and Slashcommand HTTP APIs, and then later as a REST API for the web.

If this was a serious project I would use PostgreSQL for a database, but in addition to trying to teach myself how to best design a serverless Flask application, I also wanted to spend as little money as possible hosting it. Unfortunately PostgreSQL is not exactly serverless at this point in time, and you can expect to spend at least tens of dollars a month on AWS if you want a PostgreSQL server on anything except a free tier micro EC2 instance. So I decided to try using AWS’s DynamoDB nosql… thing. It’s a pretty unpleasant key-value store and the boto3 documentation is written by a sadist, but it is cheap and can also scale a lot without having to care much. In theory anyway. Though apparently it sucks?

DynamoDB costs a few bucks a month for tables and indexes, although you can probably get by with one or two if you don’t try to do things like you would in a relational database. Between that and a million requests and 400,000 GB-seconds of compute time a month for free for Lambda, you can have some code run and a place to store data for peanuts. And it should scale horizontally without any effort or thought. I’m sure it’s not that simple in reality, but it’s nice to imagine. At least I never have to configure a webserver or administer a machine just to deploy a web application, and can do it on the (hella) cheap. One of the real values of serverless applications is the ability to just set something up once and then never worry about it again. If it works the first time, it’ll keep working. You don’t need to worry about disks dying or backups or dealing with traffic spikes or downtime. Sure AWS isn’t absolutely perfect but I sure trust them to keep my lambdas running day and night more than I trust most people, including myself. Especially including myself.

Secrets

With any application deployment, you will likely need to store some secrets. You don’t need to give your Lambda an AWS API key as it is invoked with an IAM role that you can grant access to the services it needs. For external services, you can use the AWS SSM Parameter Store. It just lets you store secrets and retrieve them if your role or user is granted permissions to read them. It’s a great place to store things like API keys and tokens.

Screen Shot 2018-08-29 at 23.47.30

Since we’re using Flask, we can easily integrate SSM Parameter Store with the Flask config.py:

import boto3
ssm = boto3.client('ssm')

def get_ssm_param(param_name: str, required: bool = True) -> str:
    """Get an encrypted AWS Systems Manger secret."""
    response = ssm.get_parameters(
        Names=[param_name],
        WithDecryption=True,
    )
    if not response['Parameters'] or not response['Parameters'][0] or not response['Parameters'][0]['Value']:
        if not required:
            return None
        raise Exception(
            f"Configuration error: missing AWS SSM parameter: {param_name}")
    return response['Parameters'][0]['Value']

TWILIO_API_SID = get_ssm_param('qanda_twilio_account_sid')
TWILIO_API_SECRET = get_ssm_param('qanda_twilio_account_secret')
SLACK_OAUTH_CLIENT_ID = get_ssm_param('qa_slack_oauth_client_id')
SLACK_OAUTH_CLIENT_SECRET = get_ssm_param('qa_slack_oauth_client_secret')
SLACK_VERIFICATION_TOKEN = get_ssm_param('qanda_slack_verification_token')
SLACK_LOG_ENDPOINT = get_ssm_param('qanda_slack_log_webhook', required=False)

Secrets status: secreted.

Running Locally

Because Lambdas run inside of AWS, you might think that it would be very cumbersome to have to deploy and test every code change you make using AWS. And that would suck, if you actually had to do that. There’s an AWS project called SAM-CLI – Serverless Application Model Command Line Interface. Using docker Lambda images, you can invoke your application within the same environment it would be running under on Lambda. You can either feed it a JSON file describing a Lambda request and view the response, or you can start it up as a server that you can connect to like any other local development webserver. You do have to provide an AWS API key though if you want your app to make use of AWS services, as it’s running on your local machine and not under the auspices of an instance role in AWS.

Further Examples

In summary the above are considerations that are necessary for creating and deploying a serverless web application. I’m pretty pleased with the way everything fit together in my learning project QandA and I invite you to look at the project structure and source code for a complete working example. There are some more details I could go into about how I structured the Flask application, but they aren’t really Lambda- or serverless-specific and if you’re interested, really just check out the code.

Serverless API

Lambda is still in the early days and far from mature, and not yet as easy to work with as Heroku. But there is a high upside to being able to just have “code running in the cloud” without having to think about it or manage any server, and for basically free. Once you’ve taken the small amount of effort to set up a serverless application, you’re rewarded with an easy way to run code on the internet without having to worry about anything below the level of “request -> application code -> response”. I prefer worrying about code and configuration files over managing infrastructure and servers.

projectM: OpenGL and Shader Modernization

I wrote recently about the interesting work being done on the open source music visualizer projectM, and that progress is being made on modernizing the graphics capabilities of this library. Since it was first created, there have been changes made in the way OpenGL is to be used for widespread compatibility and support for GPU shader programs.

Now the efforts are in the home stretch; OpenGL ES support is close to complete, and there is support for handling a decent percentage of the shader programs found in the visualization presets. The vast majority of the work has been done (by @deltaoscarmike), and mostly bug fixing remains.

After setting up vertex buffers and attributes and reworking the texturing and sampling code, the final step for full modern compatibility was dealing with the HLSL shaders. MilkDrop is the visualizer for Milkdrop that projectM is built for compatibility with. It was made with DirectX and the shader language that it supported in presets is HLSL, the language used by DirectX. Since projectM is designed to be run on other systems besides Windows, some form of support for HLSL was needed. Previously it was handled by the Cg toolkit from nVIDIA, but this project has been long abandoned and it was an unpleasant and sizeable dependency. The only other solution was to convert the HLSL code into GLSL, the OpenGL shader language. Use is now being made of hlsltranslator to transpile HLSL to GLSL, which is no small feat. Additional support for a #define preprocessor directive needed to be added due to its liberal use in previous code and presets.

Default warp and composite fragment shaders were created, switching to the preset versions if they are present and compile without problems. The blur shader was reimplemented in GLSL, and some helper routines recreated in GLSL for the presets that expect things like computing angle or texture sampling. The combination of running preset shader code on the GPU with the recent intel SSE optimizations for CPU per-pixel math brings performance improvements as well.

The real amazing news is that projectM now builds on a raspberry pi. This is notable not just because of the popularity of them, including running media centers, but because if it builds on a raspberry pi it should (with some effort of course) eventually run anywhere. One of the biggest issues was that projectM could only be run on a desktop or laptop computer. But now by using modern graphics operations it can run on accelerated hardware on mobile and embedded devices via OpenGLES, opening up many new opportunities, and potentially making it much easier to re-integrate into downstream projects like Kodi and VLC. Running it on the web via Emscripten, compiling to wasm and using WebGL is doable again. It’s amazing what possibilities modernizing software can bring.

It should be mentioned that the vast majority of the OpenGL work was done by superstar deltaoscarmike who apparently really knows what he’s doing.

Some screenshots of the glsl branch: