Like many others before and since, my introduction to the world of databases was via MySQL. It was a good enough database that did what I wanted, so I was content to use it without realizing what I was missing out on.
Comparing MySQL to PostgreSQL is a bit like comparing notepad.exe to emacs. While they both let you type in some text and save it to a file, that doesn’t quite tell the whole story. They are both RDBMSes that allow for the execution of queries via a client connection, but the philosophies, capabilities and sophistication of each are hardly even on the same plane of existence.
The way I think about it is that MySQL is a simple database program. Technically it does have pluggable engines; you can choose between a terrifically useless engine and one that at least knows what a transaction is. I consider Pg more of a “database framework”: an incredibly sophisticated and flexible program that provides a standard frontend interface, much as emacs ties everything together through “modes” and elisp, Pg has a C client library and speaks SQL.
To give you a more concrete example of what I mean, in Pg you can create “foreign data wrappers”, where the actual implementation of a query can be defined by a plugin. One example of a foreign wrapper is “MySQL” – you connect to Pg like normal through your client library and run a SQL query, but it’s actually executed on a remote MySQL server and presented as a table view in Pg. Or perhaps your query is auto-parallelized and distributed among many mongoDB nodes instead. No big deal.
Pg also has an astonishingly powerful set of extensions. My absolute favorite is PostGIS, a geospatial extension that adds geographic and geometric types, operators, indexed storage and much much more. Seeing a demo of it at the SFPUG meetups blew my mind, and I’ve been an avid user of it ever since.
Did I mention it also has a full-text search capability every bit as useful as Solr or ElasticSearch, with stemming and your choice of GIN or GiST indexes? My life is so much better now that I can simply use a trigger to keep my tsearch columns updated instead of application-level logic to keep Solr indexes in sync.
Pg is chock-full of awesomesauce that I’ve used to replace many moving parts of my application. I chucked out Solr and replaced it with PostGIS and tsearch. Also I ditched ZeroMQ (nothing against ZMQ – it’s a great project) and just use Pg’s built-in async message queue instead. Oh, you didn’t know it had a message queueing system? As a matter of fact it does, and I gave a talk on it for SFPUG at UC Berkeley, coincidentally a few feet from where the precursor to Postgres was first written. In my talk I showed how to construct a location-based storage engine using PostGIS and a single trigger, which would fire off an async notif of JSON-encoded lat/lng updates whenever a row was inserted or updated. Add in a WebSocket<->Pg client/server (such as the esteemed WSNotify) and you have a real-time event push system that notifies a JavaScript browser client whenever a location field on a row is changed, all without a single line of application code (other than the SQL trigger). Let’s see MySQL do that. (Slides and example code are here: https://github.com/revmischa/pgnotify-demos)
We all love fancy features and reducing the number of moving parts in our infrastructure, but I actually think the most compelling argument for Pg is that it is an community-driven project and is not owned by Oracle. MySQL is not an open project, and it is owned by Oracle. One does not need to look far for bone-chilling examples of Oracle’s management of open-source database projects. Let’s talk about BDB.
BDB, the Berkeley DataBase, was a nice little embedded database engine used by many pieces of software dating all the way back to 1994. Any of the maintainers of software using BDB might be in for a bit of a nasty shock should they decide to upgrade to the latest version, now that Oracle’s acquired the software and changed the terms of the license. Now they have two options which are helpfully explained on Oracle’s website.
You can choose from option A – the “open source” version, or you can choose option B – the “pay Oracle money” version. That webpage does seem to leave out one minor little detail though, the “open source” version is actually the Affero GPL. The AGPL sounds a lot like the GPL but with an important difference – not only does it require any programs you distribute to provide the source code, but also covers server software as well (N.B. I am totally not a lawyer and probably don’t know what I’m talking about).
Many projects sort of skirt around the requirements of sharing their code even though they use GPL software because they don’t actually distribute binaries, instead they just run a server and let it communicate to clients. The AGPL was designed to close that rather sizable loophole. What this means in practice is that the thousands of existing commercial products that use BDB, or use code that uses BDB, are all going to be prevented from upgrading to the latest BDB unless they make their product open source… or choose option B (cough up the dough).
You gotta give Larry props for that one. I respect him and am very pleased with the fact that at least someone in this world is willing to go full James Bond villain, complete with tropical island fortress. However I’ll stick with the community-run project for my database system.
Last point to make here: there is clearly a hard limit on how good MySQL is going to be able to get. If it was truly awesome and powerful, why would anyone need to buy Oracle DB?
Well at this point you might be saying “gee, that PostgreSQL option does sound pretty nifty, but how on Earth am I going to switch my existing application to Postgres?” Worry not, friend. There is a very handy DB dump conversion tool (mysql2pgsql) which does the job for you. I used it myself and had a bit of trouble with converting BLOBs and some ordering of foreign keys, but I was able to patch those problems up quite easily and get my changes upstream, so no big deal. I switched in 2012 or so and haven’t looked back since.
Finally, the PostgreSQL community is wonderful. I have always had a great time at the SF postgres user group and seen some amazing stuff people do with it that you would never imagine could be done with your database server if you’re stuck in the tiny MySQL world. Go check it out.
I watch this video every day: