This week the 2nd conference was held in Nice, France. The topic: the art, science, and engineering of programming.
It was my first brush with the world of serious academic computer scientists. I don’t have a degree myself, though I did study CS for a couple of years at USF before deciding work was more fun and interesting and lucrative. I was by far the least qualified person at the conference; most were Ph.D students or professors or postdocs. They all came from a world whose details I was hazy on, at best. It turned out that my world, the world of “industry” as they term it, also is something of a faraway land to them.
Throughout the workshops and presentations I was struck by a few things: how well-read and smart and knowledgeable the participants were, the fun, abstract, and interesting nature of the problems they were solving, and the complete lack of, and disregard for, any (at least to me) practical applications of their work and research.
Partly due to being in an unfamiliar realm out of my depth on many topics, and maybe partly with a little bit of jealousy of the intellectual playground they get to spend their days in, I tried to keep an open mind about the talks and learn what I could. I did really want to know if there were applications of the problems they were solving outside the world of academic research into solving problems of other academic researchers, but I felt like it’d be improper to inquire. Let me give some concrete examples.
Researchers at Samsung built a prototype for sending code to be executed on other devices and making use of their resources. One demo was showing a game being played on a phone, and then having the game display seamlessly transfer to another device (that could be a smart TV, in theory). Pretty neat!
Well for one, this was for Tizen, an operating system that exists purely as a bargaining chip for Samsung and a backup strategy to not be solely dependent on Android. So there is no real-world application for this to run on any real devices. Furthermore, giving other devices the ability to make use of Tizen device resources is a huge avenue for security problems, as the presenter readily acknowledged. When combined with the fact that Tizen has more holes than Swiss cheese this is doubly worrying. Additionally, one of the huge unsolved problems with IoT (besides security) is interoperability between devices of different manufacturers. When I asked if there was any interest or plan to submit their work to a standards track, the presenter and host of the session got very confused.
Another talk was a study of running safe C/C++/Fortran code. It included implementations to provide safety of memory management, bounds checking, variadic arguments length checking, use-after-free, and double-free errors. Awesome! Fantastic! Just one catch – it’s only for running said code on the JVM. Since people don’t usually run C++ code on the JVM, this is of limited use, except possibly for tooling for people running Ruby on the JVM, one attendee told me. The talk and the paper had nothing to say on the performance overhead. Research was sponsored by Oracle Labs.
Actually, a vast number of the research topics related to Java and the JVM, a situation I found scandalous. I had hoped and even assumed that academics would be proponents of Free Software, because of the massive contributions to learning, understanding, and implementation, while being unencumbered by profit-driven abuses of the legal system to the detriment of progress. And yet they all live and breathe Java! Java is of course NOT free (not as in free beer, but as in libre – a French word meaning “Richard Stallman”), as was recently affirmed by a US federal appeals court which found Google liable for potentially 8 or 9 billion dollars for writing header files mimicking an interface of Java. Java is probably the least free language out there now.
In the first workshop I went to there was a session where we would all get to play with a new attribute-based grammar to compose a basic C language parser. Cool! But it was all done in Java. I said “sorry, I don’t have a JDK” and the entire room burst out laughing. “Who the fuck uses JAVA?” I asked, incredulous. “Uh, everybody!” came the smart-ass reply. Since there was no functioning internet at the conference venue, I couldn’t download the JDK, so I left. What a sad state of affairs for academia, to be so beholden to the most evil corporation in software today.
Research was presented on improving the efficiency of parsing ambiguities resulting from deep priority conflicts. An interesting and thoughtful study of helping compilers do a better job of catching a certain type of ambiguity and resolving it in an optimal fashion. They applied their analysis to 10,000 Java files on GitHub and 3,000 OCaml files, and found three conflicts in two Java files, but a great many in the OCaml source files.
So for all the folks out there doing serious work with OCaml, you’re in luck!
My favorite talk and the winner of an award at the conference was simply about Lisp, Jazz, and Aikido. And how they’re all cool and similar.
One of the student research projects sounded at first like it might be getting dangerously close to some sort of potentially useful application one day. One student talked about his system for dynamic access control for database applications. Unfortunately it requires using a contract language of his own devising in a lisp-lisp environment.
Don’t get me wrong, I enjoyed the conference and was frankly intimidated by all the super smart folks there. But it left me with a feeling that so much talent, brains and time were being spent solving problems purely for the benefit and respect of other academics, instead of trying to solve serious problems facing us vulgārēs who have deadlines and business objectives and real-world problems to solve. The best part of the conference by far was just talking to the people there. I had a lot of interesting and thoughtful conversations. The research, eh.
As I’ve ended up with de facto maintainership of the illustrious projectM open source music visualizer I’ve seen a fair bit of interest in the project. I think I at least owe a blog post to update folks on where it’s at, what needs working on, and how to help make it better.
What is projectM?
projectM is a music visualizer program. In short it makes cool animations that are synchronized and reactive to any music input. I say music and not audio because it includes beat detection for making interesting things happen on the beat.
Some of you may remember the old Windows MP3 player WinAmp. It contained a supremely amazing and innovative music visualizer called Milkdrop written by a gentleman from nVidia named Ryan Geiss, known just as Geiss. The visualizer was not a single set of rules for visualizing audio but rather a mathematical interpreter that would read in “preset” files which were sets of equations. You can read the very illuminating description here of how the files are defined if you’re interested. In short there is a set of per-frame equations describing colors and FFT waveforms and simple transformations, and there is a set of per-vertex equations for more detailed transformations and deformations.
Due to the popularity of WinAmp and Milkdrop there have been many thousands of presets authored and shared with really stunning and innovative visual effects ranging from animated fractals to dancing stick figures to bizarre abstract soups. The files are often named things like:
shifter – cellular_Phat_YAK_Infusion_v2.milk
[dylan] cube in a room -no effects – code is very messy nz+ finally some serious stfu (loavthe).milk
NeW Adam Master Mashup FX 2 Zylot – In death there is life (Dancing Lights mix)+ Tumbling Cubes 3d.milk
suksma + aderassi geiss – the sick assumptions you make about my car [shifter’s esc shader] nz+.milk
flexi + cope – i blew you a soap bubble now what – feel the projection you are, connected to it all nz+ wrepwrimindloss w8.milk
And so on.
As I understand it, possibly incorrectly, there were two major problems with Milkdrop. First that it was implemented with DirectX, win32 APIs and assembler, and secondly that it was not open source (though it was made open source fairly recently). So some enterprising folks in 2003 created projectM as an open source reimplementation that would be Milkdrop preset-compatible.
I didn’t work on projectM originally and I am not responsible for the vast majority of it. However the previous authors and contributors have for whatever reason mostly abandoned the project so it was left to random people to make it work. The code is quite old although the core Milkdrop preset parsing, beat detection, most of the OpenGL (more on that later) calls, and rendering is in fine shape. projectM is really just a library though, designed to be used by applications. In the past there have been XMMS and VLC plugins, a Qt application, pulseaudio and jack-based applications, and more.
OSX iTunes Plugin
Not really having a good solution for OSX I went ahead and ported the ancient iTunes visualizer code to work on a then-modern version of iTunes and voila! projectM on OSX. Though I did have to deal with the very unfortunate Objective-C++ “language” to make it work. Not Objective-C, Objective-C++. No I didn’t know that existed either.
I tried to submit the plugin to the Mac App store as a free download. Not to make money or anything, just to make it easy for people to get it. The unpleasantness of this experience with Apple and their rejection is actually what spurred me to start this blog so I could complain about it.
I decided that what would be better is a cross-platform standalone application that simply listens to audio input and visualizes it. This dream was made possible by a very recent addition to the venerable cross-platform libsdl2 media library adding support for audio capture. I quickly hacked together a passable but very basic SDL2-based application that runs on Linux and macOS and in theory Windows and other platforms as well. Some work needs to be done to add key commands, text overlays (preset name, help, etc), better fullscreen support and easy selection of which audio input device to use.
The main application code demonstrates how simple libprojectM is to use. All one must do is set up an OpenGL rendering context, set some configuration settings, and start feeding in audio PCM data to the projectM instance. It automatically performs beat detection and drawing to the current OpenGL context. It’s really ideal for being integrated into other applications and I hope people continue to do so.
You can obtain source, OSX and Linux builds from the releases page. This is super crappy and experimental and needs some configuration tuning to make it look good, and you need to drop the presets folder in. But it’s a start.
In their infinite wisdom the original authors chose the cmake build system. After wasting many hours of my life I will not get back and almost giving up on the software profession altogether I decided it would be easier to switch to GNU autotools, the same build system almost all other open source projects use, than to deal with cmake’s bullshit. So now it uses autotools (aka the “./configure && make && make install” system everyone knows and loves).
This is where you come in. If you like music visualizers and want to help the software achieve greater things there is some work to be done modernizing it.
The most important task by far is getting rid of the OpenGL immediate-mode calls and replacing them with vertex buffer object instructions. VBO is a “new” (not new at all) way of doing things that involves creating a chunk of memory containing vertices and pushing it to the GPU so it can decide how and when to render your triangles. The old-school way was “immediate mode” where you would tell OpenGL things like glBegin(GL_QUADS) (“I’m going to give you a sequence of vertices for quadrilaterals”) and give it vertices one at a time. This is tremendously inefficient and slow, so it isn’t supported on the newer OpenGL ES, which is what any embedded device (like a phone or Raspberry Pi) supports, as well as WebGL.
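To make the “chunk of memory containing vertices” idea concrete without needing a GPU, here is a minimal Python sketch of the CPU-side half of the VBO approach: flattening a quad into one contiguous block of 32-bit floats. The vertex layout (x, y, then r, g, b) and every name here are assumptions for illustration, not projectM’s actual vertex format. A buffer like this is what would be handed to OpenGL in a single upload call instead of one vertex at a time.

```python
import struct

# Hypothetical vertex layout for illustration: position (x, y) then color (r, g, b).
FLOATS_PER_VERTEX = 5
BYTES_PER_FLOAT = 4

# GL_QUADS does not exist in OpenGL ES, so the quad is written as two triangles.
quad_vertices = [
    # x,    y,    r,   g,   b
    (-1.0, -1.0, 1.0, 0.0, 0.0),
    (1.0, -1.0, 0.0, 1.0, 0.0),
    (1.0, 1.0, 0.0, 0.0, 1.0),
    (-1.0, -1.0, 1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0, 0.0, 1.0),
    (-1.0, 1.0, 1.0, 1.0, 1.0),
]


def pack_vertices(vertices):
    """Flatten the vertices into one contiguous buffer of little-endian float32s."""
    flat = [component for vertex in vertices for component in vertex]
    return struct.pack("<%df" % len(flat), *flat)


vertex_buffer = pack_vertices(quad_vertices)
stride = FLOATS_PER_VERTEX * BYTES_PER_FLOAT  # bytes from one vertex to the next
print(len(vertex_buffer), "bytes holding", len(vertex_buffer) // stride, "vertices")
```

The stride is the piece of bookkeeping that immediate mode hides: with a single buffer, the GPU has to be told how many bytes separate one vertex from the next.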
Astute readers may note that there already are iOS and Android projectM apps. They are made by one of the old developers who has made the decision to not share his modern OpenGL modifications with the project because he makes money off of them.
Another similar effort is to replace the very old dependency on the nVidia Cg framework for enabling shaders. Cg was used because it matches DirectX’s shader syntax. GLSL, the standard OpenGL shader language, is not the same, and requires manual conversion of the shaders in each preset.
The Cg framework has been deprecated and unsupported for many years and work needs to be done to use the built-in GLSL compilation calls instead of Cg and convert the preset shaders. I already did some work on this but it’s far from finished.
The reason I’m writing this blog post is because of the community interest in the project. People do send pull requests and file issues, and we definitely could use more folks involved. I am busy with work and can’t spend time on it right now but I’m more than happy to guide and help out anyone wishing to contribute. We got an official IRC channel on irc.freenode.net #projectm so feel free to hang around there and ask any questions you have. Or just start making changes and send PRs.
At this year’s FOSDEM in Brussels, Jan Tobias Mühlberg gave a talk on the latest work on Sancus, a project that was originally presented at the USENIX Security Symposium in 2013. The project is a fully open-source hardware platform to support “trusted computing” and other security functionality. It is designed to be used for internet of things (IoT) devices, automotive applications, critical infrastructure, and other embedded devices where trusted code is expected to be run.
A common security practice for some time now has been to sign executables to ensure that only the expected code is running on a system and to prevent software that is not trusted from being loaded and executed. Sancus is an architecture for trusted embedded computing that enables local and remote attestation of signed software, safe and secure storage of secrets such as encryption keys and certificates, and isolation of memory regions between software modules. In addition to the technical specification [PDF], the project also has a working implementation of code and hardware consisting of compiler modifications, additions to the hardware description language for a microcontroller to add functionality to the processor, a simulator, header files, and assorted tools to tie everything together.
Many people are already familiar with code signing; by default, smartphones won’t install apps that haven’t been approved by the vendor (i.e. Apple or Google) because each app must be submitted for approval and then signed using a key that is shipped pre-installed on every phone. Similarly, many computers support mechanisms like ARM TrustZone or UEFI Secure Boot that are designed to prevent hardware rootkits at the bootloader level. In practice, some of those technologies have been used to restrict computers to boot only Microsoft Windows or Google Chrome OS, though there are ways to disable the enforcement for most hardware.
In somewhat of a contrast to more proprietary schemes that some argue restrict the freedom of end-users, the Sancus project is a completely open-source design built explicitly on open-source hardware, libraries, operating systems, crypto, and compilers. It can be used, if desired, in specialized contexts where it is of critical importance that trusted code runs in isolation, on say an automobile braking actuator attached to a controller area network bus, or a smart grid system such as the type that was hacked in Ukraine during the attack by Russia. These are the opposite of general-purpose devices; instead, one specific function must be performed and integrity and isolation are critical.
The problem is that many medical devices, automotive controllers, industrial controllers, and similar sensitive embedded systems are made up of limited microcontrollers that may have software modules from different vendors. Misbehaving or malicious software can interfere in the operation of those other modules, expose or steal secrets, and compromise the integrity of the system. Integrity checks based in software are bypassed relatively easily compared to gate-level hardware checking; those checks also add considerable overhead and non-deterministic performance behavior.
Sancus 2.0 extends the openMSP430 16-bit microcontroller with a small and efficient set of strong security primitives, weighing in at under 1,500 lines of Verilog code and increasing power consumption by about 6%, according to Mühlberg. It can disallow jumps to undeclared entry points, isolate memory between software modules, and provide attestation for them.
Besides providing a key hierarchy and chain of trust for loading software modules, Sancus has a simple metadata descriptor for each module that stores the .text and .data ranges in memory; it then ensures that a .data section is inaccessible unless the program counter is in the .text range of the appropriate module. This is a simple but effective process isolation mechanism to ensure that secrets are not accessible from other software modules and that one module cannot disturb the memory of other modules.
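The access rule described above can be modeled in a few lines of software. This is only a sketch of the rule as stated here, with made-up module names and addresses; in Sancus the check is done in hardware, and this model ignores entry points, keys, and attestation entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Per-module metadata: the .text and .data ranges (end addresses exclusive)."""
    name: str
    text_start: int
    text_end: int
    data_start: int
    data_end: int

    def owns_code(self, pc):
        return self.text_start <= pc < self.text_end

    def owns_data(self, addr):
        return self.data_start <= addr < self.data_end


def data_access_allowed(modules, pc, addr):
    """Allow access to a protected .data range only from its own module's .text."""
    for module in modules:
        if module.owns_data(addr):
            return module.owns_code(pc)
    return True  # address is not in any protected .data section


# Hypothetical memory layout, for illustration only.
brake = Module("brake", text_start=0x4000, text_end=0x4400,
               data_start=0x0200, data_end=0x0280)
radio = Module("radio", text_start=0x5000, text_end=0x5800,
               data_start=0x0300, data_end=0x0340)
modules = [brake, radio]

print(data_access_allowed(modules, pc=0x4010, addr=0x0204))  # brake reads its own data
print(data_access_allowed(modules, pc=0x5010, addr=0x0204))  # radio reads brake's data
```

The first call is permitted and the second is denied, which is the whole mechanism: the program counter acts as the caller’s identity.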
Mühlberg mentioned that there is ongoing work on creating secure paths between peripherals for secure I/O, integration with common existing hardware solutions such as ARM TrustZone or Intel SGX, formal verification, and ensuring suitability for realtime applications.
To give a feel for the system in action, Mühlberg showed a demonstration video comparing two simulated automotive controller networks with malicious code running on a node. One can see the unsecured system behave erratically when receiving invalid messages, whereas the Sancus system gracefully slows down and safely disengages.
Much has been written about the upcoming IoTpocalypse: the lack of security in critical infrastructure and general despair about the dismal state of easily exploitable embedded systems as they multiply and get connected to the internet. A project based on open-source building blocks and free-software ethos that attempts to provide a layer of integrity and deterministic behavior to microcontrollers should be lauded and considered by anyone building hardware applications where security and reliability are strong requirements.
Budapest is one of my most favorite places of all. The architecture is some of the best in the world, there’s a mix of cultures from Austria-Hungary to Turks to Magyars. Everything I ate there was delicious. Great public transit. Turkish-style baths everywhere. Fun boat tour on the Danube river, separating Buda and Pest.
Good to visit as an American, too. At immigration most foreigners were being fingerprinted and questioned for long periods of time. When they saw my US passport they just waved me through.
Only real complaint is that all written words in Budapest are in Hungarian, a language I have absolutely zero chance of ever learning or understanding. It belongs to the Uralic family along with Finnish and Estonian, bearing basically no relation to any Greek, Latin, Romance, or Slavic tongues or words. Besides the unfortunate language it’s a great place.
Most people enjoy traveling, myself included. A relatively recent trend gaining popularity however is turning travel from a vacation- or business-oriented experience to a general modus vivendi.
Having lived all my life in the San Francisco Bay Area I decided I should expand my horizons and try living elsewhere for a change. Anywhere. Because my work making software involves computers and the internet, I can work from anywhere as long as I have a computer and the internet, so why not take advantage of that?
Turns out it’s really not so hard. Traveling light is the key, really. All I’ve taken with me is one carry-on sized backpack with some clothes and a laptop, and mostly just stayed at Airbnbs. It’s that simple.
Oddly enough many of my friends, themselves often computer typers and Linux janitors living in the Bay Area, express a desire to travel around as well. They say “oh I could never afford that though.” To which I respond “fool, I can’t afford to live in fucking San Francisco, how the fuck can I afford not to travel?” I spend a lot less money seeing the world than I did suffering through the bleak dystopian dysfunctional morass that is modern-day San Francisco.
SF was a good deal more functional and cooler when I moved there in 2005 but now it’s beyond repair and hardly worth the astronomical cost of living there. Maybe the subject for another post sometime.
Because I’m not on vacation, I don’t do a ton of sightseeing. I try to hit one or two famous things in each city I go to but really I’m working most of the time. It’s like if I were at home, except that I’m not at home. Just working from different cities all the time.
Turns out plane travel can be extremely reasonable when you can be flexible about dates and destinations and can plan a couple months in advance. Plane tickets around the globe can be had for a song sometimes, and intra-Europe flight costs can be in the low double digits if flying by Wizz or Ryanair. I wrote before about flying on the cheap, including my $116 one-way ticket from SFO to Amsterdam. And I haven’t even gotten in on the rewards cards and miles redemption schemes that are out there.
If you’re self-employed attending conferences is a solid plan for a few reasons. For one, if they are related to your work, you can claim conference-related expenses as deductible business expenses. Also it gives you a good reason to go to new destinations, meet new people, write articles covering the talks, and of course learn new stuff.
A few websites make the itinerant lifestyle much simpler. One is the Nomad List community, which has the most relevant list of destinations, measuring things like air quality, internet speed, safety, weather, friendliness and more. Also there’s a wonderful Slack chat associated with it where you can ask anything at all that’s travel-related, and even meet up with other people doing the same thing you are in basically any major city in the world. A couple weeks ago I met up with some very nice folks from there in Chiang Mai, Thailand, which is something like the digital nomad capital of the world if such a place can be said to exist.
Wifi isn’t always the greatest, but I’ve had fantastic luck getting local SIM cards wherever I travel. They almost always provide good speeds and decent latency, at a decent price. There’s even a helpful wiki that has everything you could ever want to know about data SIM cards anywhere in the world. One important thing to know though: even if you have an awesome free roaming plan like, say, T-Mobile’s, your normal SIM will be slooooow if you leave the continent. I learned by way of a telecom engineer at the IETF conference that your carrier tunnels your IP traffic through their network when you’re roaming. Meaning that all your traffic goes to the USA (or wherever your carrier is) and back. Get a local data SIM.
Airbnb, well, it’s just awesome. I’ve stayed at 40 of them so far, and it’s been mostly problem-free. The worst case has been a couple of times when I got canceled on last-minute, something a bit annoying but hardly the end of the world. You just take it in stride and get a new place to stay. I always make sure to get a place with a washing machine, so laundry isn’t a big deal. Sadly most countries aren’t into dryers like the USA, but you learn to live with these setbacks.
Actually, come to think of it, dealing with foreign washing machines is extremely challenging sometimes.
I think taking things in stride is really key to exploring the world. Maybe some places have traffic lights that change every ten minutes or so (I don’t know what’s up with that in Thailand), or people for whom communication, verbal or otherwise is an impossibility, or you leave a bank card in an ATM, or a million other things that can go wrong. I’ve had the good fortune to not encounter any catastrophes, and anything else can be dealt with by a generous application of calm and just asking yourself “okay, well what should I do now?” It all works out fine.
Above are the places in Europe that I’ve been in the past few years. I’ll write more about them in subsequent posts. (I wrote about my previous travels in Poland and Ukraine back in 2015 here).
Summary of Europe: Budapest is probably my favorite. Lviv and Kyiv in Ukraine are excellent and quite easy on the pocketbook. Berlin and Amsterdam are also great but definitely far on the pricier end. Serbia sucks don’t go there. Paris is Paris. Dublin has shit weather and bad food. Wasn’t that into Northern Italy but still want to visit Southern Italy pretty badly. Skip Warsaw but check out Wrocław, Prague and “Bohemian Switzerland” in Czech Republic. Brussels is boring and a bit too frenchy for my tastes but they got good beer and fries so can’t hate too much. Barcelona is hot. Croatia is a nice getaway that isn’t in Schengen if you’re running up against the limit of days you can spend visa-free in that part of Europe.
At the end of 2017 I veered off to new waters – Asia and Oceania.
Sydney and New Zealand are great places to go in December if you’re like me and hate the cold; ’cause it’s summertime in the Southern Hemisphere. Also they’re pretty great places and people speak English, though the timezone difference really complicates things if you’re working with other people or trying to keep in touch with friends. New Zealand is UTC+12, which puts you on the exact opposite of Western Europe, though it’s damn pretty there. Japan I wasn’t into so much; serious language barrier, hella cold, wack food, complicated getting around, expensive, tiny apartments. Hong Kong on the other hand is a fantastic place I plan on returning to as soon as I get the chance. I stayed in six different cities in Thailand and they were mostly very agreeable. Great food obviously, as well as a warm and pleasant climate, prices and exchange rate comparable to Ukraine, lots of smiling, friendly people, some of what’s probably in the top tier SCUBA diving in the world at Ko Tao, massive expat community in Chiang Mai, and plenty of great nature, temples, night markets, and things to see. Just be sure to not throw shade on the monarchy or the junta running the country while you’re there and you’ll be fine.
I’ve come to the end of this leg of my travels and will be heading back to Europe shortly. As you can see from Google Maps I’m writing this from the most perfect civilization ever created by man – Singapore. More on that later.
A concept about modern computing that often confuses people is the difference between some piece of data and the encoding, or representation of that data.
Everyone knows computers use binary. They use 1s and 0s to store and manipulate information. Do they use binary numbers?
Computers can only store information as patterns of electrical switches, set in the “on” or “off” position. There is no such thing as a “binary” number, only a number that is encoded as a binary pattern. Numbers are information, and they don’t actually exist. We can write down Arabic numerals like “42”, or write it in base-2 as “101010”, but these are merely different ways of encoding the same number. It’s up to us to come up with a scheme of encoding information using whatever is available.
Most human cultures have used base-10 numbering systems throughout history because we have ten fingers. In Roman times people used Roman numerals, which were pretty clumsy and not especially well-suited for arithmetic or algebra. Later, Europeans switched to Arabic numerals (0-9) while keeping the Latin writing system (A-Z).
So the number 42 is still the same number whether it’s written as XLII, “forty-two”, 4️⃣2️⃣, 0x2A, etc. All represent the same number, just encoded different ways. It’s up to the person interpreting the encoding using a particular scheme to translate it from the written-down form into useful information.
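A short Python sketch of the same idea: one number, several written forms, all decoding back to the same value. The helper name to_roman is just something made up for this illustration.

```python
def to_roman(n):
    """Encode a positive integer as a Roman numeral."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


n = 42

# Four encodings of the same number.
encodings = {
    "decimal": str(n),
    "binary": format(n, "b"),
    "hex": "0x%X" % n,
    "roman": to_roman(n),
}
for scheme, written in encodings.items():
    print(scheme, written)

# Decoding each written form with the right scheme recovers the same number.
assert int("42") == int("101010", 2) == int("2A", 16) == n
```

Decoding with the wrong scheme is where things go sideways: int("101010") read as decimal is one hundred and one thousand and ten, not forty-two.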
This doesn’t apply to only numbers but text, audio, video, web pages, hard disks, subtitles, and anything else one may want to be able to store in some hard copy form and represent digitally. Assuming lossless encoding, a FLAC of a song is the same information as an AIFF of a song is the same information as a zipped WAV of a song. They all represent the same PCM audio data just in different formats.
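The “zipped WAV” case is easy to check with nothing but the Python standard library. This sketch generates a made-up one-tenth-of-a-second tone, wraps it in a WAV container, compresses that with zlib (standing in for zip here), and confirms the PCM data that comes back out is byte-for-byte identical.

```python
import io
import math
import struct
import wave
import zlib

# A made-up test signal: 0.1 s of a 440 Hz sine wave, 16-bit mono at 8 kHz.
RATE = 8000
samples = [int(32767 * math.sin(2 * math.pi * 440 * i / RATE)) for i in range(RATE // 10)]
pcm = struct.pack("<%dh" % len(samples), *samples)

# Encoding 1: the PCM data inside a WAV container.
wav_file = io.BytesIO()
with wave.open(wav_file, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(RATE)
    writer.writeframes(pcm)
wav_bytes = wav_file.getvalue()

# Encoding 2: that WAV file, losslessly compressed.
zipped = zlib.compress(wav_bytes)

# Decode all the way back down and compare with the original PCM data.
with wave.open(io.BytesIO(zlib.decompress(zipped)), "rb") as reader:
    recovered = reader.readframes(reader.getnframes())

print(len(pcm), len(wav_bytes), len(zipped))
print(recovered == pcm)
```

Three different byte sequences of three different lengths, one piece of information.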
This blog post is a bunch of dumb words that anyone who understands English can make some sense of, but it’s stored as a sequence of bytes using the UTF-8 encoding standard, which is a way of storing Unicode characters as a sequence of bytes (byte = 8 bits, hence “UTF-8”). Unicode is a mapping of codepoints (numbers) to characters, with some fancy rules about combining characters and things. Unicode is not a format; there are different ways to encode the codepoints into a machine-processable format.
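To see the split between codepoints and their byte encodings, here is a small Python sketch using an arbitrary sample string. It prints each codepoint next to its UTF-8 bytes, then encodes the same text as UTF-16 to show that the codepoints stay put while the bytes change.

```python
# Arbitrary sample text: "naive" with a diaeresis, a space, and a hot beverage symbol.
text = "na\u00efve \u2615"

# One codepoint per character, each turned into one or more UTF-8 bytes.
for ch in text:
    print("U+%04X" % ord(ch), ch.encode("utf-8").hex())

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
print(len(text), "codepoints,", len(utf8), "bytes as UTF-8,", len(utf16), "bytes as UTF-16")

# Different byte sequences, same text once decoded with the matching scheme.
assert utf8 != utf16
assert utf8.decode("utf-8") == utf16.decode("utf-16-le") == text
```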
As far as computers are concerned you can only deal with bits, grouped into bytes. The most convenient way to store and retrieve any data from RAM or storage or over a network is a stream of bytes. If you want to represent some information in a computer, you need some encoding scheme to translate it to and from a stream of bytes. How you want to accomplish this can be entirely up to you. The information only has the meaning you choose to imbue it with.
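The flip side is that a given run of bytes means nothing until a scheme is chosen. A quick Python sketch with four arbitrarily chosen bytes, read four different ways:

```python
import struct

# Four arbitrary bytes. Nothing about them says what they "are".
raw = bytes([0x42, 0x2A, 0x00, 0x00])

as_text = raw[:2].decode("ascii")               # first two bytes read as ASCII characters
as_le_uint32 = struct.unpack("<I", raw)[0]      # little-endian unsigned 32-bit integer
as_be_uint32 = struct.unpack(">I", raw)[0]      # big-endian unsigned 32-bit integer
as_be_float32 = struct.unpack(">f", raw)[0]     # big-endian IEEE 754 single-precision float

print(repr(as_text))
print(as_le_uint32)
print(as_be_uint32)
print(as_be_float32)
```

Same bits every time; only the decoding scheme differs.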
If you use heroku and AWS and want to customize your heroku application logging, you can hook Logplex up to AWS Lambda.
When a heroku application emits things to stdout or stderr they get shuttled to the magical world of Logplex. The logs enter as syslog messages, containing information like facility, priority, etc. Not only logs from your application but logs from heroku’s build and deploy systems, postgresql, and other add-ons as well. Shortly after arrival these logs are dispatched to whatever sinks your heroku app has configured which can go to add-ons like PaperTrail, and also to custom log sink URLs. The sink destinations can be syslog(+TLS) or syslog-over-HTTPS using octet counting framing.
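Octet counting just means each syslog message is prefixed with its length in bytes and a space, so several messages can be packed into one request body and pulled apart again without guessing at delimiters. A minimal Python sketch of both directions follows; the sample log lines are made up, and the helper names are mine rather than anything from Heroku.

```python
def frame(message):
    """Prefix a message with its byte length and a space (octet counting)."""
    data = message.encode("utf-8")
    return str(len(data)).encode("ascii") + b" " + data


def split_frames(body):
    """Split an octet-counted body back into individual syslog messages."""
    messages = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)    # the length prefix ends at the first space
        length = int(body[pos:space])    # number of bytes in the message that follows
        start = space + 1
        end = start + length
        if end > len(body):
            raise ValueError("truncated frame")
        messages.append(body[start:end].decode("utf-8"))
        pos = end
    return messages


# Made-up sample messages for illustration.
sample = [
    "<190>1 2018-03-01T12:00:00+00:00 host app web.1 - hello from the app\n",
    "<158>1 2018-03-01T12:00:01+00:00 host heroku router - at=info status=200\n",
]
body = b"".join(frame(m) for m in sample)

for message in split_frames(body):
    print(message, end="")
```

Note that the count is in bytes, not characters, which matters as soon as a log line contains anything outside ASCII.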
One advantage of this setup is that you can have your application emit logs with a minimum of blocking. At one point I had my application sending logs to Slack directly but this caused latency in the application any time I logged anything. By sending to Logplex on the other hand, I can process the application messages asynchronously without doing anything remotely fancy in my application. Another benefit is that you can handle your application, database, build, and deploy logs all in the same unified fashion.
Using AWS API Gateway and Lambda you can set up your own Logplex sink and can do whatever you desire with the logs coming out of Logplex. This includes your application’s output as well as add-ons and heroku platform messages. You can then send them into CloudWatch Logs, or even Slack as in this example:
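A minimal sketch of what such a sink could look like, not necessarily the setup described here. It assumes an API Gateway proxy integration (so the request arrives as an event with headers, body, and isBase64Encoded), a Slack incoming webhook URL supplied in a SLACK_WEBHOOK_URL environment variable (a name invented for this sketch), and that Logplex sends a Logplex-Msg-Count header with each batch. It forwards the raw batch as-is rather than parsing individual messages.

```python
import base64
import json
import os
import urllib.request

MAX_TEXT = 3000  # arbitrary cap to keep the Slack message a reasonable size


def build_slack_payload(event):
    """Turn an API Gateway proxy event carrying a Logplex batch into a Slack message."""
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8", errors="replace")
    count = headers.get("logplex-msg-count", "?")  # assumed header, see note above
    return {"text": "%s log message(s) from Logplex:\n%s" % (count, body[:MAX_TEXT])}


def handler(event, context):
    """Lambda entry point: forward the batch to Slack, then acknowledge receipt."""
    payload = build_slack_payload(event)
    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")  # hypothetical setting
    if webhook_url:
        request = urllib.request.Request(
            webhook_url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
    # Answer with a 2xx so the batch counts as delivered.
    return {"statusCode": 200, "body": ""}
```

The application never waits on Slack in this arrangement; the only thing that does is the Lambda invocation.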
There is one major deficiency in this system that is worth noting: there is no way for your application to alter the log message’s syslog fields. So even if your application logger knows a particular message is debug, or warn, or error, it all comes across as severity level 6 (info). Logs from other components such as postgresql preserve their log severities but your application is a second-class citizen and there is no mechanism to send actual syslog messages to Logplex even though add-ons and internal heroku machinery clearly do. I filed a ticket about this and complained at length and they told me they have no plans to allow users to send syslog-formatted messages to Logplex, and everyone is stuck with only stdout/stderr. This means if you wish to treat messages of differing severities differently in your Logplex sink you can’t, at least not with the existing out-of-band syslog data that your sink receives. As far as the sink can tell, your application’s debug logs and error logs all look the same, which is frankly an impossible situation when it comes to logging. Hopefully they fix this some day.
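For reference, the severity a sink sees is packed into the number in angle brackets at the start of each syslog message: that value is facility × 8 + severity. A small Python sketch of pulling it apart, using made-up sample values:

```python
# Syslog severity names, indexed by numeric level 0 through 7.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(pri):
    """Split a syslog PRI value into (facility number, severity name)."""
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


def parse_pri(message):
    """Read the <PRI> prefix off the front of a syslog message."""
    if not message.startswith("<"):
        raise ValueError("no PRI field")
    return decode_pri(int(message[1:message.index(">")]))


# Made-up sample: 190 = 23 * 8 + 6, i.e. facility 23 at severity 6 (info).
print(parse_pri("<190>1 2018-03-01T12:00:00+00:00 host app web.1 - ERROR: it broke"))
```

The sample message says ERROR in its text while the PRI field says info, which is exactly the mismatch described above: a sink that trusts the syslog fields never sees the difference.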
The Internet Engineering Task Force (IETF) is an organization dedicated to stewardship of an ever-expanding body of technical standards to facilitate interoperation of machines and software connected to the internet. Pretty much everything you can do on the internet, including the functioning of the internet itself, is governed by the IETF “Request For Comments” documents known as RFCs. Some standards defined in the RFCs include TCP/IP (internet), SMTP (email), IRC (chat), XMPP (jabber), emergency telephone call information, live video streaming and multitudes more.
The Internet Society facilitates much of the IETF’s work by providing administrative and organizational resources. There is no formal membership roster or special recognition given to governments or corporations. While most of the roughly 1,200 IETF attendees (except for your correspondent) were sent on trips with all expenses paid by their employer or through the IETF fellowship program, there is a strong understanding that everyone there is representing themselves in technical matters. They are all expected to only state opinions they are personally willing to stand behind. The criteria for moving IETF drafts forward are “rough consensus and running code,” though the “running code” part is less of a thing these days than it used to be. To get involved in the process all you have to do is join a working group (WG) mailing list. Anyone can attend the meetings, which are held three times a year, usually in North America, Europe, and Asia.
Everything at the meetings, including WG notes, audience questions, and meeting materials, is recorded and made publicly available, along with a live video stream with remote participation.
One of this year’s meetings was held in Prague, a frequent location for the Europe area. It was held at the Prague Hilton, and as part of the event contract the IETF replaced the hotel’s network with its own, setting up its own BGP ASN and a multitude of wireless networks with 802.1X, IPv6-only and NAT64 experimental options, and a DHCP server handing out globally routable addresses with no firewall. As one should expect, the IETF doesn’t screw around when it comes to the meeting network.
The work of the IETF is divided into subject “areas” which are made up of many working groups related to the area. The areas are the internet, operational issues and network management, routing, security, transport, applications and real-time, and general for more meta work.
Each working group in an area has a well-defined charter describing its purpose, and background materials to help frame the discussion. The work done by a WG almost all happens on its dedicated mailing list, with updates and discussion that is much easier to do face-to-face taking place at the meetings in person or via remote video conference.
In addition to the WGs, there are BoF sessions. A BoF (pronounced boff) is a “birds of a feather” group where people who are interested in a topic can come discuss ideas and gauge interest and see if there is IETF-related work to be done on the topic. If so, a working group may emerge from the BoF.
And finally there are research groups which are set up for long-term collaboration on research topics. They have a less focused charter and pursue and share research about a particular topic instead of working towards explicit RFC publication deliverables.
The RGs’ mailing lists are great places to learn about new developments and work being done by academics and in-the-field engineers in subjects of interest. Just this morning I got an email on the GAIA list from the president of the Internet Society of Togo stating that they are experiencing internet shutdowns in the country today.
Attending the IETF meeting in person I was able to see the working groups, research groups and a BoF in action. Allow me to share my first impressions and experiences as a total clueless newcomer.
The Internet Video Codec working group is attempting to subjectively and objectively test and compare several candidate video codecs for use on the internet. netvc is a follow-on to the remarkably successful work the IETF codec WG did on audio codecs, in particular the royalty-free, high-quality and efficient Opus codec.
The topic of non-proprietary codecs is near and dear to my heart and more important than most people realize. Right now if you want to put a video on the web and have it work in all browsers you have but one option: the h.264 video codec, licensed by the MPEG Licensing Association patent cartel. This codec is covered by many patents and is not free in any sense of the word. Mozilla and Google have support for more open and less patent-encumbered video codecs (Ogg Theora, VP8) in their browsers, with Google going far far out of their way to purchase the VP8 codec and release all patent claims in the hope of having an unrestricted and open codec for everyone on the internet to use without having to pay royalties or fear of getting sued. This didn’t work out quite as planned for two reasons, one being that Google wouldn’t indemnify codec users (and couldn’t reasonably do so under extremely perilous and burdensome US patent laws), and the other reason being that Microsoft and Apple refused to include support for this codec in their browsers. Not that it would have been a great amount of effort, as the code is freely available with open source implementations. Having a video format that would only work in some browsers doesn’t really cut it for content publishers so everyone is forced to use h.264 instead. Also by some unrelated weird coincidence Microsoft and Apple happen to belong to the MPEG-LA and get a share of royalties from encoder licenses.
This is a rather long-winded way of saying the standardization of a (relatively speaking) patent-unencumbered free codec is actually quite crucial in keeping basic modern internet functionality out of the greedy hands of a small number of corporations. This is the kind of hard work and constant battle to keep the internet free and open to everyone that organizations like the IETF and Internet Society are always engaged in.
As a point of comparison between VP8 and the result of the netvc video codec selection, users will still unfortunately be in the exact same position with regards to patent indemnification. The IETF cannot guarantee to defend all users from patent trolls. Despite Google’s promotion of VP8/VP9 as an open standard for internet video many people have treated them as proprietary codecs and desired a non-proprietary alternative.
The netvc working group is evaluating the codecs AV1, VP9 and Thor. Part of the work of the group has been to establish requirements for comparing the codecs on metrics such as high- and low-latency performance (offline vs live encoding), decoder complexity (to optimize CPU/power consumption and hardware acceleration), perceptual quality, error resilience, and Weissman Score (just kidding about that last one).
The general requirements for the internet video codec are that it should be suitable for video calls, broadcast media, conferencing, telepresence, teleoperation, screencasting, and video storage. They are basically aiming to equal or best h.265 (successor to h.264) as far as quality and complexity.
There are double-blind tests that anyone can participate in to subjectively judge video and frame encoding quality in a split view. They test one quantization parameter at a time in both high- and low-latency modes. The gentleman presenting on subjective testing claimed that Mozilla has a 4k projector in the break room they make the interns do tests on for cookies, though I wasn’t super sure if he was serious or not. Approximately 12 viewers are required for each test to be statistically significant. Some of the test corpora include Minecraft Twitch videos, “netflix crosswalk” and “netflix tunnelflag”.
The codecs being compared are works in progress; AV1 has gained about 20% compression over the past year, most of that in the past three months, though with about a 1000% increase in complexity.
AV1 complexity is best vs Thor and VP9. Thor and VP9 have similar profiles for complexity/speed tradeoff for mixed content. Thor measured better than VP9 for video conferences but not quite as good as AV1. They believe it’s possible to get Thor to perform roughly as well as AV1 but with a fraction of the tools and added complexity.
Error resiliency was discussed quite a bit. Since video is often streamed at someone and decoded in near-realtime, the ability to gracefully recover from packet loss is an important consideration. This is a complex problem involving careful trade-offs because a packet does not represent a frame that can be easily dropped. Most of the time the packets contain backwards-looking prediction information: estimated motion vectors computed from previous frames and against reference frames that the decoder may or may not have received or decoded successfully. There is a certain amount of redundant information that can be part of the packetized payload but this is a tradeoff between resiliency and the amount of video information that can be packed into a certain bitrate. VP9 can reference frame dependencies implicitly or explicitly (with RTP picture ID mappings); there’s no way to know from an RTP header if a dependent frame is available without parsing actual RTP packets. AV1 explicitly signals and codes frame IDs in the codec payload; there is a proposal to move to motion predictions from the most recent reference frame.
As far as color information in AV1, a technique is being adopted from Daala (a Xiph codec converging with Thor) called CfL – Chroma from Luma. There is a correlation between luma (brightness) and chroma (color) that can be used to predict chroma coefficients directly. It was reported that doing this in the frequency domain sucked, and they are currently proposing to do this in the spatial domain instead.
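To give a feel for the spatial-domain idea, here is a toy sketch, not the AV1 algorithm itself. It assumes a block is just a flat list of pixel values, predicts each chroma pixel as an average chroma value plus a scale factor times the luma pixel's deviation from the block's mean luma, and picks the scale factor with a plain least-squares fit. The function names, the fit, and the sample numbers are all my own illustration; a real codec also deals with subsampling, quantization and signaling.

```python
def fit_alpha(luma: list[float], chroma: list[float]) -> float:
    """Least-squares scale factor relating zero-mean luma to zero-mean chroma."""
    mean_l = sum(luma) / len(luma)
    mean_c = sum(chroma) / len(chroma)
    num = sum((y - mean_l) * (c - mean_c) for y, c in zip(luma, chroma))
    den = sum((y - mean_l) ** 2 for y in luma)
    return num / den if den else 0.0


def cfl_predict(luma: list[float], chroma_dc: float, alpha: float) -> list[float]:
    """Predict chroma pixels from luma: DC value plus scaled luma variation."""
    mean_l = sum(luma) / len(luma)
    return [chroma_dc + alpha * (y - mean_l) for y in luma]


# Made-up 4-pixel block where chroma follows luma at half the contrast.
luma = [100.0, 110.0, 120.0, 130.0]
chroma = [60.0, 65.0, 70.0, 75.0]
alpha = fit_alpha(luma, chroma)
print(alpha, cfl_predict(luma, sum(chroma) / len(chroma), alpha))
```

When the correlation is strong, the encoder only has to send the scale factor and a small residual instead of the full chroma block.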
A notable thing about the netvc work has been the virtuous cycle of development it has brought. Simultaneous open development of AV1, Thor, VP9 and previously Daala, with non-proprietary code and openly published test results, has highlighted the ease and power of open-source collaborative development. Each project takes ideas from the others, improves upon them, and the improvements are fed back into the original project, in a cyclical fashion, with the work and results immediately available to everyone.
Overheard at IETF99: “The ‘S’ in IoT stands for ‘Security’”.
The Thing To Thing Research Group (t2trg) highlighted security and interoperability issues with Internet of Things (IoT) devices.
Will IoT networks be friendly to each other? Some concerns exist about interference between vendors in terms of wireless spectrum usage, IP networks (imagine buying devices from different vendors that both want to be DHCP servers), multicast issues, and sharing resources like an external IP address. “Every device vendor sees the network they operate on as a wide, big, empty road on which they are the only driver.”
Like UNIX, IoT is awesome because there are so many standards to choose from! There are different areas that different bodies focus on, but with a lot of overlap between schema.org, W3C, LwM2M, IPSO semantics and more.
Data interoperability is an issue too. Some data models have license terms that are opaque and hard to find out. I would suggest that any vendor trying to license their data models should just… not, but that is just my opinion.
A long-standing question has been service and resource discovery on the network. Imagine if you have a smoke detector from one vendor that wants to flash lights or play an alarm on speakers from other vendors. Multicast DNS is pretty accepted for this but it is fairly limited semantically. We really could use a standard for machine-readable resource enumeration and metadata. Part of the problem here is the difficulty of agreeing on a shared definition of what “metadata” is (just ask the NSA); it took the IETF four attempts to define metadata for security management. There are privacy concerns about announcing what resources a network has. You probably don’t want your pacemaker advertising control capabilities to anyone on the network. Some common infrastructure would be helpful, like a centralized IoT identifier registry. Right now most of the work the RG is doing is stored on repositories and wikis on GitHub.
There is an as-yet unsolved problem: if you buy an internet-connected device, how do you bootstrap security identifiers and credentials for your network and cloud services? How do you connect something to your wireless network that has no screen, or keyboard?
Research and a reference implementation were also presented about one solution for authorizing network access for IoT devices. The proposal, called EAP-NOOB (really), utilizes out-of-band (OOB) communication for network authorization and user account setup. Examples they gave were a smart TV that displays a QR code the user scans with their phone, or a camera taking a picture of a QR code presented on a phone. They suggested other OOB mechanisms such as an audio cable or NFC NDEF message.
I attended the Privacy-Enhanced RTP conferencing WG.
The hard problem that the perc group is trying to solve is how to enable centralized Secure Real-time Transport Protocol (SRTP) conferencing where the central device distributing the media is not required to be trusted with the keys to decrypt the participants’ media.
At the meeting they discussed obscure (to me) technical details regarding best ways to maintain and re-key Secure RTP communications for conferencing involving double-encrypting tunnel components and allowing RTP packet repair by media distributors.
There was an interesting presentation about RED – redundant encoding. This was in a similar vein to the netvc error resilience discussion, evaluating tradeoffs between less bandwidth efficiency and better handling of dropped packets. In the RED scheme, each RTP packet contains an alternative (low-quality) version of the previous frame for repair purposes, mostly for audio. The main idea being that if packet loss is detected in a poor quality conference, you could reduce some of the bandwidth used for video and instead allocate that to audio packet repair so that at least audio quality suffers less. Double-audio packets could even be handled by media distributors instead of the streaming source endpoint, which would be a very nice feature for CDNs, distributed networks and robust media servers.
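Here is a toy sketch of the redundancy idea as I understood it, not the actual RTP payload format. It assumes each packet carries its own frame plus a lower-quality copy of the previous one, and that the receiver uses that copy only when the earlier packet never arrived. The dictionary layout and the `low_quality` stand-in are hypothetical.

```python
def low_quality(frame: str) -> str:
    """Stand-in for re-encoding a frame at a lower bitrate."""
    return frame + "-lo"


def red_packetize(frames: list[str]) -> list[dict]:
    """Attach a low-quality copy of the previous frame to every packet."""
    packets = []
    previous = None
    for seq, frame in enumerate(frames):
        redundant = low_quality(previous) if previous is not None else None
        packets.append({"seq": seq, "primary": frame, "redundant": redundant})
        previous = frame
    return packets


def red_receive(packets: list[dict], total: int) -> list:
    """Rebuild the frame sequence, repairing single losses from redundancy."""
    frames = [None] * total
    for packet in packets:
        seq = packet["seq"]
        frames[seq] = packet["primary"]
        # Repair the previous slot only if its own packet never arrived.
        if seq > 0 and frames[seq - 1] is None and packet["redundant"] is not None:
            frames[seq - 1] = packet["redundant"]
    return frames


sent = red_packetize(["A0", "A1", "A2", "A3"])
arrived = [p for p in sent if p["seq"] != 1]   # simulate losing packet 1
print(red_receive(arrived, total=4))
```

One level of redundancy repairs isolated losses only; if two packets in a row go missing, the first of them stays lost, which is the bandwidth-versus-resilience tradeoff being discussed.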
Some other topics about TLS-IDs in SDP and FlexFEC were discussed but I had no idea what they were talking about.
The findings of a paper on non-volatile main memory (NVMM) by NEC Labs Europe were presented at the Transport Area Open Meeting.
NVMM is a far-along technology coming to mobile devices soon. Computers going back many decades have used volatile main memory, meaning the contents of RAM are lost when the power is turned off. There exists a major practical and abstract barrier between main memory (RAM) and persistent storage (SSD, disks) because of the differences in volatility, speed and capacity. With NVMM, main memory can be used as persistent storage. Of course it’s not quite that simple; NVMM costs are higher than RAM and much higher than mass storage devices, and it is not yet faster than typical DRAM. But it is an area with potential applications for accelerating certain tasks.
The researchers investigated the implications for networking, focusing on the use case of downloading a file over a network.
Right now when your computer is downloading a file the data follows a path from the Network Interface Card (NIC) to DRAM (using DMA I believe), then is read from DRAM by the OS networking stack, followed by a read() by the application doing the downloading, then a write() to the storage stack, which is buffered into DRAM and then flushed to disk. This process was measured to have a latency of about 2000µs. By simply replacing the last bit with a copy from DRAM to NVMM, the latency was reduced to about 40µs, showing that the disk flush was extremely significant, as were the cache misses caused by the fact that the area of DRAM being read from was an ever-advancing pointer.
Part of their solution was to maintain a static ring buffer of packets and a small set of metadata entries containing offset/length indexing information of the packets in the buffer. This helped prevent cache misses as the region of memory for the packets remained fixed. The other change was to DMA packets to L3 cache instead of main memory, and only if packets needed to be stored was the cache flushed to DIMM. They said a 10-88% increase in throughput was obtained and a 9-46% reduction in latency, and the improvements scaled linearly with cores.
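A rough sketch of the data layout described, as I understood it: one buffer allocated up front and reused forever, plus a small list of (offset, length) entries saying where each packet lives. Python can't show the cache behavior, so this only illustrates the bookkeeping. The class name, the sizes, and the wrap-instead-of-split policy are my assumptions, not details from the talk.

```python
from collections import deque


class PacketRing:
    """Fixed buffer of packet bytes plus (offset, length) metadata entries."""

    def __init__(self, capacity: int, max_entries: int):
        self.buf = bytearray(capacity)           # allocated once, never moved
        self.capacity = capacity
        self.head = 0                            # next write offset
        self.index = deque(maxlen=max_entries)   # oldest entries fall off

    def push(self, packet: bytes) -> None:
        size = len(packet)
        if size > self.capacity:
            raise ValueError("packet larger than ring")
        if self.head + size > self.capacity:
            self.head = 0                        # wrap rather than split a packet
        start, end = self.head, self.head + size
        # Forget metadata for any older packets this write overlaps.
        survivors = [(o, n) for o, n in self.index if o + n <= start or o >= end]
        self.index = deque(survivors, maxlen=self.index.maxlen)
        self.buf[start:end] = packet
        self.index.append((start, size))
        self.head = end

    def packets(self) -> list[bytes]:
        return [bytes(self.buf[o:o + n]) for o, n in self.index]


ring = PacketRing(capacity=16, max_entries=4)
for payload in (b"aaaaaa", b"bbbbbb", b"cccccc"):
    ring.push(payload)
print(ring.packets(), list(ring.index))
```

The third packet wraps to offset 0 and overwrites the first, so the same 16 bytes of memory keep getting reused instead of the write position marching through fresh memory.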
The researcher suggested that similar types of optimizations which change assumptions about the persistence of main memory storage can pay large dividends and that there are likely many such areas for taking advantage of NVMM capabilities. Exciting!
I attended a BoF session for IDentity-EnAbled networks.
From the very cursory glance I gave the BoF it superficially resembled a topic I’ve long been interested in: the concept of a universal mechanism for identity on the internet. I’ve long thought it would be a massive step forward if internet services could make a basic assumption about the requestor, such as every request containing a public key. Say every request made to a website contained such a public key; you wouldn’t need to register a separate username and password at every site you visit. You could have one universal identity or generate new ones on the fly as desired. It would strongly prove that the requestor is in possession of an unfalsifiable key but also provide pure anonymity at the same time. All data could be end-to-end encrypted and stored securely such that only the owner of that identity could read it, and so much more, all with a very simple change. I even wrote a ton of code for a project for a new application layer based on this concept about ten years ago but I got a little too carried away on the scope of it and there was no possible way I was going to do it by myself.
So I was excited that maybe there would be efforts towards standardizing this simple but powerful idea at the IETF. Part of the agenda was a system that even had the same name as my project! Imagine my disappointment when I learned that their plans were impenetrable soups of acronyms and incredibly complex and confusing academic-speak.
Much of the blame lies with me for not reading through the materials ahead of time, to be sure. The IETF meetings assume everyone is up to speed on all the drafts and documents and mailing list traffic. As a newcomer trying to sample many different projects I simply didn’t have hours and hours to read over all the drafts before going to the different meetings. However at all the other sessions I attended I mostly got the gist of everything even if I was not intimately familiar with every detail and issue of conflict at the WG. The IDEAS session was very different.
The session discussed the definition of an identity-identifier split, defining an identifier as something similar to but not quite a location identifier, which could be a “valid but often non-routable v4/v6 address” and could “be truncated but managed within a domain of use”. An identity belonged to a machine, not a person. A concept of HIT (host identity tag) for the HIP (host identity protocol?) was a ‘flat’ namespace of identity tags which were v6 address looking things. They wanted to separate identifiers from locations, as “IP addresses have overloaded semantics going back to 1993”.
While I should mention again I didn’t do the reading before class, I do have a considerable background in related topics and I didn’t understand the point of their discussion at all and everything seemed mind-bogglingly complex and there were dozens of acronyms tossed around that I’d never heard of. Their solution required complex service topologies with lots of arrows and diagrams, considerable infrastructure, and even a design for HIP that “requires changes in the IP stack.”
The ideas presented at IDEAS were so dense, complex and impenetrable that I simply can’t imagine any kind of widespread adoption of whatever it is they were pitching. As someone who designs and builds complex systems for software services I have a bad reaction to obviously over-engineered systems and generally prefer simpler and easier to understand, if less powerful solutions. The technical sophistication of a system must be balanced with actual human concerns about ease of adoption, ability to communicate the design in a clear and concise way to other humans, and make the benefits and trade-offs clear so other humans can make informed choices about your system. This was the only session I attended that felt utterly doomed and depressing and I couldn’t sit through the end. In fact it bothered me so much that I did something I was not supposed to: got up and asked a question without reading all of the materials ahead of time. I paid to be here, might as well get my money’s worth.
“I have a stupid question…” I said to the presenter.
Speaker: “There are no stupid –”
Me: “This all seems incredibly complicated and dense and difficult to grasp. Why not use a public key as an identifier?”
Speaker: “Which format of public key and what algorithm? (is this ID_KEY_ID??)” [language from official meeting notes]
Me: “OpenSSH key format.”
Speaker: CLEARLY you did NOT read the drafts and YEARS of hard ACADEMIC RESEARCH and [your question is stupid].
The Codec Encoding for LossLess Archiving and Realtime transmission WG was full of great progress and news. Its charter is related in a fashion to the internet video codec WG in that both are standardizing free and open formats for multimedia in an effort to not get the entire world stuck in a trap of being burdened with de facto standards of proprietary and royalty-encumbered audio and video formats. cellar is focused on lossless archiving of multimedia, as in the United States’ Library of Congress as one example. If digital multimedia is to survive many years of technology changes and new formats it must be encoded in a well-defined standard and not lose any quality.
From the charter:
“The preservation of audiovisual materials faces challenges from technological obsolescence, analog media deterioration, and the use of proprietary formats that lack formal open standards. While obsolescence and material degradation are widely addressed, the standardization of open, transparent, self-descriptive, lossless formats remains an important mission to be undertaken by the open source community.”
In a nutshell (or Matroska), the group is defining normative guidelines for an official format to be used for representing lossless audio and video data and containing them. The choice has been made of Matroska (.MKV) for the container, FFV1 for video, and FLAC for audio. FFV1 is already specified for archival use by the US Library of Congress, and FLAC is widely used by audiophile pirates.
Issues discussed were problems with the existing specifications vs. the reference encoder, which has some known issues like integer overflows and incorrect colors, which are supported by the reference decoder. The next milestone and format version is removing these documented exceptions and “documenting reality” instead.
The illustrious open-source media codec library ffmpeg supports Matroska binding V_FFV1 CodecIDs without a compatibility layer but doesn’t write out the codec ID by default, to preserve compatibility with older versions of ffmpeg. They are ready for the future with a native FFV1 codec ID.
The FFV1 coder is described except for the single-pixel Pixel() function. Much is already written in plain English but a normative C-like description should be given.
FFV1 v4 should support more pixel formats and add native metadata, not relying on the container (MKV) for metadata. FFV1 can transport its own metadata as well.
A description of Matroska was given live via remote video feed (naturally) along with some historical context. It was started in 2002 to store live TV captures because existing containers were unsuitable for them. It was forked into its own project due to disagreements with the community. It borrowed ideas from AVI, Ogg, XML and semantic web ideas. Support for the codecs H264, H265, VP8, VP9, AC3, DTS, and Opus came later. It was adopted by Google and Mozilla for their standardized “WebM” format, designed to be a free and open multimedia format for the web, consisting of VP8 or VP9 for video and Vorbis or Opus for audio. It is used and supported today but not well-supported by Apple and Internet Explorer due to evilness and greed (see netvc above).
Matroska/WebM is widely supported by open source software players, Windows 10, Blu-ray, smart TVs, Netflix, Nintendo, YouTube. Recently 360° video and HDR metadata support was added.
Question: “What is the plan for documenting WebM? Will that be a part of the cellar specification?”
Speaker: “WebM is basically the Matroska specification online, WebM doesn’t have anything not in Matroska. Matroska all applies to WebM and the spec says if it applies. They are the same format. I wish Google would help us work on this spec. Mozilla and Google people are on the mailing list but aren’t helping with the spec.”
The cellar working group’s IETF documents are generated from Markdown and EBML-defined XML files. The XML semantics defining EBML can be used to generate code, including all parts of WebM. The Matroska v3 spec was submitted July 2017, and in September the v4 spec is due to be submitted. The specification is a huge task comprising 243 elements, 33 of which are deprecated. There are seven pending pull requests, covering text clarifications and codec definitions, and 22 known issues remain, mostly text clarifications along with some format additions, formatting changes and codec definitions.
The Security Area Advisory Group met to listen to some invited talks on security-related topics relevant to the work of the IETF.
A long and fascinating talk (slides; recommended reading) was given by Kenny Paterson about post-quantum cryptography. PQC is one of those concerns that (as far as is publicly known) is not an immediate problem but something people should be thinking about and planning for well before the time it actually becomes a crisis, if indeed quantum computing ever reaches a point where it can break most classical encryption schemes currently in use today. There’s even an obscure film about this scenario called The Traveling Salesman.
For context, the timeline of a weakness of the hash algorithm SHA-1 was given:
The point being that there were many years between the discovery of a theoretical weakness and an actual successful attack, with a standards organization (NIST) trying to promote an improved version, and resistance by the complacent commercial certificate authorities. That is, until they had a chance to replace their certificates with SHA-2 after mass revocations due to the OpenSSL “heartbleed” vulnerability.
So a sane route might be to continue research today to potentially protect against future quantum computing attacks on classical cryptographic methods or at the very least explore and document interesting alternatives to prime factorization and elliptic curve crypto. Some of these include lattice-based, code-based, non-linear, and ECC-isogenies and I haven’t the foggiest notion what those are.
Is significant quantum computing on the horizon? People have been saying QC is “a decade away”, for several decades now. Also the quote “In terms of fundamental physics …. we’re pretty close to what we need. There’s just tonnes of engineering work…” was mentioned, to the laughter of the engineers in the audience. The speaker said quantum physics laws have been verified to around ten decimal places, which isn’t all that great. Some relevant questions are: “is quantum computing solid against advances in physics?” versus “is public-key crypto vulnerable to algorithmic advances in conventional algorithms for factoring, discrete logs, etc.”
There exists a company, D-Wave, which produces machines kept at temperatures near absolute zero for “quantum annealing”, with some notable customers. Quantum annealing is a quantum version of simulated annealing, a common optimization technique in which the “temperature” of a system is gradually lowered so that it settles into a low-energy state, ideally the global minimum rather than a local one.
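For anyone who hasn't run into it, the classical (non-quantum) version fits in a few lines. This is a minimal sketch of plain simulated annealing on a one-dimensional function, and says nothing about how D-Wave's hardware works. The step size, cooling rate, and the habit of remembering the best point seen are my own choices for the illustration.

```python
import math
import random


def simulated_annealing(energy, start, steps=5000, temp=10.0, cooling=0.999, seed=0):
    """Minimize energy(x) by random moves, accepting uphill moves less often over time."""
    rng = random.Random(seed)
    x = best = start
    for _ in range(steps):
        candidate = x + rng.uniform(-1.0, 1.0)
        delta = energy(candidate) - energy(x)
        # Always accept improvements; accept worse moves with probability
        # exp(-delta / temp), which shrinks as the temperature drops.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            x = candidate
            if energy(x) < energy(best):
                best = x
        temp *= cooling
    return best


# Toy energy function with its minimum at x = 3.
print(simulated_annealing(lambda x: (x - 3.0) ** 2, start=-8.0))
```

Early on, the high temperature lets the search wander out of local dips; as it cools, the search behaves more and more like plain downhill descent.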
There have been publicized advances in quantum key distribution, such as a recent experiment using QKD over long distances by China with mainstream media headlines like “unhackable encryption” and “the future of security”. It should go without saying that such reports are dubious. For one, QKD isn’t really distribution – it expands existing keys. This can already be done with key derivation functions (e.g. PBKDF2) with classic cryptography. The problem with QKD is that it doesn’t work for any great length; there must be signal boosting components which decode and then re-encode the transmission stream to send it over long distances, preventing end-to-end encryption over distances. The UK’s NCSC (part of GCHQ) took the unusual step of publishing a white paper bagging on QKD and describing its infeasibility.
The IETF is developing two drafts for hash-based signatures which are considered mature. Other PKC schemes are being researched but not anywhere near standardization. The suggestion was made that IETF should not lead the standardization effort for PKC but instead follow the lead of the US NIST, and for the present the IETF should take care not to bake in any assumptions yet, such as too-small maximum field sizes.
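The "expanding an existing key" part really is a one-liner with classical tools. A minimal sketch using PBKDF2 from Python's standard library, stretching a short shared secret into as many key bytes as you like. The secret, salt, iteration count and output length here are placeholder values I picked, not recommendations.

```python
import hashlib


def expand_key(secret: bytes, salt: bytes, length: int = 64, iterations: int = 100_000) -> bytes:
    """Derive `length` bytes of key material from a shorter shared secret."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations, dklen=length)


# Placeholder inputs for illustration only.
key = expand_key(b"short shared secret", b"per-session salt")
print(len(key), key.hex()[:16])
```

Both sides holding the same secret and salt derive the same bytes, with no exotic hardware involved.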
Participant: “current estimates for key sizes are going to be an order of magnitude larger… so like 50k-bit key sizes. If you have a protocol like UDP where everything fits in one packet, you’re going to have a bad time.”
Participant: “I do have a PhD in nuclear physics and I don’t think QKD is going to work because the engineering parameters are too hard. .. We need a deployment plan for this now, before we have any crypto.”
Another (brief) talk was given on the p≡p (pretty easy privacy) project, a software engineering effort to improve interoperability of privacy and cryptography between instant messaging and email applications, in the vein of S/MIME and OpenPGP. The speaker said that the IETF could help with MIME-based message formats, key synchronization, base protocol mapping for email, Jabber, URI schemes for missing message addressing such as GNUnet, Signal and so on. They said they had a library available with adaptors for Java, C#, Python, Obj-C, Swift and more, with actual software written for Android, Enigmail, Outlook, iOS and Email/p≡p. It sounded like a great project and opportunity for IETF standardization and real engineering effort to come together in a standards-based effort to increase privacy, trust and interoperability all at the same time.
All in all the meeting was a great way to not only learn about lots of intricacies and interesting technical problems that smart people were trying to solve, but to see the process of creating and implementing standards crucial to the openness and freedom of the internet. This work is something that so many people take for granted and they don’t appreciate the constant ongoing difficult effort that thousands of people do to prevent corporations or governments from monopolizing the function and operation of the internet.
The IETF is distinct from other standards bodies such as the government-influenced ITU or the vendor/carrier-driven 3GPP group for wireless network standards. Without work being done in the open and distributed through a community of volunteers, nefarious actors can and do try to dictate their proprietary solutions for technology, often for their own financial benefit and not necessarily in the interest of the greater good.
Nobody forces the IETF standards on anyone; they are implemented voluntarily by engineers working on internet-related technology to promote interoperability and ensure the underlying protocols, transports, networks and formats remain free and open. Everyone chooses to implement the IETF standards because of Metcalfe’s Law: the value of a telecommunications network is proportional to the square of the number of connected users of the system.
Recognition and support should be given to the work the IETF does to promote freedom and privacy around the world, and I encourage anyone to get involved and join the mailing lists and discussions of any working groups related to their interests.