We recently completed a big project migrating our primary Rails app from Heroku to AWS. We’re planning to write a series of posts over the next few weeks discussing AWS ECS, Terraform, Docker, and other tools we now rely on. I think we’ve figured out a number of tricks (and sometimes ugly hacks) along the way that might be useful to others.
But to start it off: why did we make the move? There were a number of small reasons that together convinced us it was worth switching:
We use Terraform pretty heavily to manage our AWS environment, but on Heroku everything is set manually via the web and command-line UIs. Starting out is really easy on Heroku, and a web UI can make for a sleek demo. But when you want to test an infrastructure change in staging and then roll out exactly the same thing to production, roll back to a previous configuration, keep a history of what you changed when (with commit messages describing why), or launch a new service that reuses the same patterns and plugins, it’s a lot harder in a web UI: you have to make sure you manually follow all the same steps.
Terraform does have some support for Heroku now, but from looking through it, it doesn’t give you nearly the same control that AWS does over the various parameters of the services (plugins) you’re connecting to, fine-grained permissions, etc.
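To make the staging-then-production point concrete, here’s a minimal sketch of the kind of layout Terraform encourages. The module and variable names are made up for illustration, not our actual config: the idea is that both environments are built from the same committed code, with only a couple of variables differing.

```hcl
# Hypothetical layout: one shared module, instantiated per environment.
# "app_cluster" and the variable names are illustrative only.

variable "environment" {
  type = string # e.g. "staging" or "production"
}

variable "web_instance_count" {
  type    = number
  default = 2
}

module "app_cluster" {
  source         = "../modules/app_cluster"
  environment    = var.environment
  instance_count = var.web_instance_count
}
```

You run `terraform plan` against staging, review exactly what will change, apply it, and then apply the same reviewed commit to production. Rolling back is checking out the previous commit and applying again.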
I’m willing to trust that the Heroku platform itself is probably pretty secure, but a lot of security best practices are unavailable to you on Heroku. Some of the big ones we’ve come across:
- No private networking within your environment, for example to take your database port off of the public internet or to communicate between multiple internal apps/services. (They do now have this feature in Enterprise accounts, called Private Spaces, but it is not enabled by default and is still very new. For example, they didn’t even support a release phase, which we use heavily to run DB migrations, until a few months ago.)
- There’s an “Activity” log of deploys, but you don’t have a permanent audit log of other changes (such as CloudTrail in AWS).
- You don’t have a log of sessions and commands that have been run via heroku run, etc.
- They don’t support 2FA for the command line tools.
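The private-networking point above is the kind of thing that is a few lines of Terraform on AWS. Here’s an illustrative sketch (resource names, ports, and instance sizes are made up, and it assumes a VPC, an app security group, and a private subnet group defined elsewhere) of keeping a Postgres database off the public internet entirely:

```hcl
# Illustrative sketch: database reachable only from the app tier's security
# group, with no public IP. Names like "aws_security_group.app" and
# "aws_db_subnet_group.private" are assumed to be defined elsewhere.

resource "aws_security_group" "db" {
  name   = "db"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id] # app tier only, not 0.0.0.0/0
  }
}

resource "aws_db_instance" "primary" {
  engine                 = "postgres"
  instance_class         = "db.t2.medium"
  allocated_storage      = 100
  publicly_accessible    = false # no public endpoint at all
  db_subnet_group_name   = aws_db_subnet_group.private.name
  vpc_security_group_ids = [aws_security_group.db.id]
}
```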
Heroku only really natively supports a basic three-tier web app architecture (load balancer, web dynos, database). For anything else you have to run it externally or use third-party plugins. These can be of varying quality, and it’s up to each individual plugin to implement its own authentication, monitoring, etc.; there is no real core framework in Heroku for them to hook into. Our experience was that this encouraged us to shoe-horn everything we could into our primary database, which often led to performance problems. We only had a hammer, so everything looked like a nail.
The big draw of AWS is that a wide array of services are just one terraform plan and apply away: the SQS message queue, Kinesis, Elasticache, Elasticsearch, DynamoDB for large sharded OLTP data sets, and Redshift, Athena, Batch, EMR, and others for OLAP processing. They all use the same IAM paradigm for authenticating between them, and they have operational logs and metrics in the console, in CloudWatch, or in S3. It’s a huge win to always have the right tool available for a given job in a way that is fully managed, secure, and easy to hook into.
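As a sketch of the “one plan and apply away” and shared-IAM points together: an SQS queue, plus an IAM policy letting the app’s task role use it. All names here are hypothetical (including the aws_iam_role.app_task the policy attaches to), and the exact actions you grant would depend on your worker:

```hcl
# Hypothetical example: a queue plus least-privilege access for the app's role.

resource "aws_sqs_queue" "jobs" {
  name                       = "background-jobs"
  visibility_timeout_seconds = 300
}

data "aws_iam_policy_document" "jobs_queue" {
  statement {
    actions   = ["sqs:SendMessage", "sqs:ReceiveMessage", "sqs:DeleteMessage"]
    resources = [aws_sqs_queue.jobs.arn] # this queue only
  }
}

resource "aws_iam_role_policy" "app_jobs_queue" {
  role   = aws_iam_role.app_task.id # assumed to be defined elsewhere
  policy = data.aws_iam_policy_document.jobs_queue.json
}
```

The same pattern (a resource plus an IAM policy scoped to its ARN) repeats for Kinesis, DynamoDB, S3, and the rest, which is what makes wiring services together feel uniform.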
To be fair, the grass is not all green in AWS land. For example, we’ve been using their Elasticsearch service for the past year and have come very close in that time to giving up on it and either hosting it ourselves or using the elastic.co managed solution (it had some kinks until very recently, like not allowing scripting, and it went two years without a single version update). You can never tell whether an AWS service is effectively dead or a new awesome feature is just around the corner. I’m sure at some point with some service we’ll hit a brick wall and have to host our own open-source version or use a hosted one. But at least so far in my experience, the number of reliable (if sometimes clunky) building blocks that AWS provides outweighs the occasional misses.
Other Basic Services
Over time we’ve built up some helper tools that don’t really fit into our app itself but don’t really make sense as their own full web app either. For example, we have a “deploybot” tool which monitors GitHub for PRs that are ready to ship, and then merges them and deploys. We also have some servers running Jenkins for CI tests. These both run on EC2 instances rather than in Heroku, but it hadn’t been worth investing in tooling to manage and monitor them well, since they were a totally different environment from our primary app. That meant it was always kind of annoying to work with them when an instance would become unresponsive or the app would go down.
As good as Heroku is, and as useful as the rest of the AWS suite of services is, at some point we’ll need to run software on a basic server or VM running Linux or similar. Now that our app is running in AWS (and on ECS, which runs on top of EC2 instances that you manage yourself), we’re building up useful tooling for this environment that should extend to pretty much anything we may run in the future.
Overall Heroku was a great platform for us to start on, and I credit it with helping us evolve a great culture of continuous deployment and shipping often. But we’re excited to now be in a place where we have more control over our infrastructure, and can customize it to make our engineers even more effective going forward.
— Danny Cosson