Ubuntu has had a great way to automate the building of your Amazon EC2 server through something called CloudInit. It lets you pass in, during the creation of a new instance, a variety of data needed to set up your server and application. Normally what is passed is a script, but there are several other options. The script can be written in bash, perl, python or awk. It normally installs any packages needed by your app, configures the various services, loads any startup data and finally installs your application. By scripting the setup, you ensure that your server is built exactly the same way each time you create a new one. As of yesterday, CloudInit can also be used by those of you more comfortable with Red Hat/CentOS, as Amazon has announced its own CentOS-based Linux AMI that includes CloudInit. So now there is a standard way to automate the building of your server, no matter what flavour of Linux you use.
I’ll talk more about CloudInit in a minute, but first I want to review some of the other options people have for setting up their server instance and why automating with CloudInit is most likely the right choice.
- manual setup. This involves SSH’ing into the instance once it is up and running and manually entering the commands to install your application and its requirements. While this is acceptable as a starting point while you are in development, no application should be deployed into production on a server built this way. If your server ever goes down, you are in for a lot of pain (and stress) when you have to recreate it at a moment’s notice during an outage. I’ve seen a fair number of startups using servers built this way. They start with a ‘dev’ server that was hand built and somehow that ends up being the production box. It’s really important that teams take the time to rebuild the production server cleanly before launching.
- using a pre-built AMI. This is where you manually set up your server and then create a new AMI image from it. Or, more likely, you are using someone else’s AMI. Ec2onrails is a perfect example of this. The advantage of using pre-built AMIs is that the server always comes up in a known (good) state. This is a big step forward from manually setting up the server. The downside is that if you want to make any changes to the setup, you need to save a new AMI, which is a slow process. And if you are using someone else’s AMI, you may not be able to do so. In this age of agile development, this can be a handicap.
- capistrano. This is a build tool from the Ruby world. It can be used to deploy non-Ruby apps as well, but the scripts must be in Ruby (this may or may not be an issue for you). Overall, there is a lot to recommend about capistrano in that it is also a scripted solution. Everything is normally done through your Capfile, which is where you script the setup of your application.
The way that capistrano works is by SSH’ing into the server instance and running commands. This happens after the server is up and running. Normally, you start the server instance manually and then plug the IP address or server hostname into your Capfile. The only downside to capistrano is that it runs from the developer’s desktop, which may be fine for smaller teams. The minute you have a NetOps team, you probably want something that is not tied to a single developer’s workstation.
Instead of the above, you should take a look at using CloudInit. It lets you pass a script to the server instance that is run during the boot process. So how does CloudInit work and what are the key options? CloudInit allows you to use the ‘user-data’ field of the ec2-run-instances command to pass in a variety of information that CloudInit will use to set up your new server instance. The options include:
- Cloud Config Data. This is the simplest use and covers common tasks like updating Linux, installing packages, setting the hostname and loading ssh keys. The full set of options is described here. Config data will handle some core items but will not be enough to bring up your app.
- Run a script. If you can’t do what you need with the config data, you can create a script that will handle the extra items. The script can be in bash, python or any other language installed. For my servers, I use this to set up the database, load the application from git and register the server with my dynamic DNS provider. In fact, if you prefer, you don’t need to use the config data at all and can put everything you need in a script. Note that the script is run late in the boot process. This is normally a good thing, but if you need to run something earlier, take a look at Boothooks.
- upstart job. If you need to, you can provide a single upstart script that will be installed to /etc/init.
- a combination of the above. Finally, it is possible to combine all of the above. CloudInit supports creating a MIME multi-part file that is a combination of any of the above items. There is a helper tool called write-mime-multipart that will take a set of inputs and generate the MIME-encoded data to pass to user-data. Note that the maximum total size that can be passed to user-data is 16K. If you are above that limit, you can gzip the data, or you can move the items to files accessible through a URL and pass the URLs to user-data.
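To make the combined option more concrete, here is a minimal Python sketch of roughly what a tool like write-mime-multipart produces: a MIME multi-part message with one cloud-config part and one script part, gzipped only if it goes over the 16K limit. It is an illustration under assumptions, not the tool's actual implementation. The package names, hostname, log path, file names and the helper functions build_user_data and fit_user_data are all hypothetical, and the exact cloud-config keys available depend on your CloudInit version.

```python
import gzip
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical cloud-config part: update packages, install two, set hostname.
CLOUD_CONFIG = """#cloud-config
package_update: true
packages:
  - git
  - nginx
hostname: app-server-1
"""

# Hypothetical script part, run late in the boot process.
USER_SCRIPT = """#!/bin/bash
echo "application setup would run here" >> /var/log/app-setup.log
"""

# The 16K user-data limit mentioned above.
MAX_USER_DATA_BYTES = 16 * 1024


def build_user_data(parts):
    """Combine (content, mime_subtype, filename) tuples into one MIME message."""
    message = MIMEMultipart()
    for content, subtype, filename in parts:
        # The subtype tells CloudInit how to treat the part,
        # e.g. text/cloud-config or text/x-shellscript.
        part = MIMEText(content, subtype, "us-ascii")
        part.add_header("Content-Disposition", "attachment", filename=filename)
        message.attach(part)
    return message.as_string().encode("ascii")


def fit_user_data(payload):
    """Return the payload unchanged if it fits, otherwise a gzipped copy."""
    if len(payload) <= MAX_USER_DATA_BYTES:
        return payload
    compressed = gzip.compress(payload)
    if len(compressed) > MAX_USER_DATA_BYTES:
        raise ValueError("user-data is over 16K even after gzip; host it at a URL instead")
    return compressed


payload = build_user_data([
    (CLOUD_CONFIG, "cloud-config", "cloud-config.txt"),
    (USER_SCRIPT, "x-shellscript", "setup.sh"),
])
user_data = fit_user_data(payload)
```

The resulting bytes in user_data are what you would hand to the user-data field when launching the instance. Keeping the size check in a separate function means you find out on your own machine, before launch, whether the data fits.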
As you can see, CloudInit is very flexible and should allow you to fully automate the building of your servers. Finally, I’ll note that CloudInit is not limited to EC2 but will also work with Ubuntu Enterprise Cloud (UEC).