Cloud, Software Development, System Administration

Using CloudInit to automate Linux EC2 setup

Ubuntu has long had a great way to automate the building of your Amazon EC2 servers through something called CloudInit. During the creation of a new instance, you can pass in a variety of data needed to set up your server and application. Normally what is passed is a script, but there are several other options. The script can be written in bash, perl, python or awk. It typically installs any packages needed by your app, configures the various services, loads any startup data and finally installs your application. By scripting the setup, you are assured that your server is built exactly the same way each time you create a new one. As of yesterday, CloudInit can also be used by those of you more comfortable with Redhat/CentOS, as Amazon has announced their own CentOS-based Linux AMI that includes CloudInit. So now there is a standard way to automate the building of your server, no matter what flavour of Linux you use.

I’ll talk more about CloudInit in a minute, but first I wanted to review some of the other options people have for setting up a server instance and why automating with CloudInit is most likely the right tool to use.

  • manual setup. This involves SSH’ing into the instance once it is up and running and manually entering the commands to install your application and its requirements. While this is acceptable as a starting point during development, no application should be deployed into production on a server built this way. If your server ever goes down, you are in for a lot of pain (and stress) when you have to recreate it on a moment’s notice during an outage. I’ve seen a fair number of startups using servers built this way. They start with a ‘dev’ server that was hand built and somehow that ends up being the production box. It’s really important that teams take the time to rebuild the production server cleanly before launching.
  • using a pre-built AMI. This is where you manually set up your server and then create a new AMI image from it. Or, more likely, you are using someone else’s AMI; Ec2onrails is a perfect example of this. The advantage of using pre-built AMIs is that the server always comes up in a known (good) state. This is a big step forward from manually setting up the server. The downside is that if you want to make any changes to the setup, you need to save a new AMI, which is a slow process. And if you are using someone else’s AMI, you may not be able to do so at all. In this age of agile development, this can be a handicap.
  • capistrano. This is a build tool from the Ruby world. It can be used to deploy non-Ruby apps as well, but the scripts must be in Ruby (this may or may not be an issue for you). Overall, there is a lot to recommend about capistrano in that it is also a scripted solution. Everything is done through your Capfile, which is where you script the setup of your application.
    The way capistrano works is by SSH’ing into the server instance and running commands. This happens after the server is up and running. Normally, you start the server instance manually and then plug the IP address or server hostname into your Capfile. The only downside to capistrano is that it runs from the developer’s desktop, which may be fine for smaller teams. The minute you have a NetOps team, you probably want something that is not tied to a single developer’s workstation.

Instead of the above, you should take a look at using CloudInit. What it lets you do is pass a script to the server instance that is run during the boot process. So how does CloudInit work, and what are the key options? CloudInit allows you to use the ‘user-data’ field of the ec2-run-instances command to pass in a variety of information that CloudInit will use to set up your new server instance. The options include:

  • Cloud Config Data. This is the simplest use and covers common tasks like updating linux, installing packages, setting the hostname and loading ssh keys. The full set of options is described here. Using config data will take care of some core items but will not be enough to bring up your app.
  • Run a script. If you can’t do what you need with the config data, you can create a script to handle the extra items. The script can be in bash, python or any other language installed on the image. For my servers, I use this to set up the database, load the application from git and register the server with my dynamic DNS provider. In fact, if you prefer, you can skip the config data entirely and put everything you need in a script. Note that the script is run late in the boot process. This is normally a good thing, but if you need to run something earlier, take a look at Boothooks.
  • upstart job. If you need to, you can provide a single upstart script that will be installed into /etc/init.
  • a combination of the above. Finally, it is possible to combine all of the above. CloudInit supports creating a MIME multi-part file that is a combination of any of the above items. There is a helper tool called write-mime-multipart that will take a set of inputs and generate the MIME-encoded data to pass to user-data. Note that the maximum total size that can be passed to user-data is 16K. If you are above that limit, you can gzip the data, or you can move the items to files accessible through a URL and pass the URLs in user-data.
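To make the options above concrete, here is a sketch of building a combined user-data payload. The hostname, package names and file names are examples of mine, not from CloudInit itself, and the write-mime-multipart / ec2-run-instances invocations are shown commented out since they require the cloud-utils and EC2 API tools to be installed:

```shell
# A cloud-config part handling the core items (example values).
cat > cloud-config.txt <<'EOF'
#cloud-config
hostname: app-server
apt_update: true
packages:
  - git
  - nginx
EOF

# A script part for the app-specific items.
cat > setup.sh <<'EOF'
#!/bin/bash
# Database setup, git checkout, DDNS registration would go here.
echo "setup complete"
EOF

# Combine the parts and launch (requires cloud-utils / EC2 API tools):
# write-mime-multipart --output=user-data.txt \
#     cloud-config.txt:text/cloud-config setup.sh:text/x-shellscript
# gzip user-data.txt          # only needed if you are over the 16K limit
# ec2-run-instances ami-xxxxxxxx --instance-type m1.small \
#     --key my-keypair --user-data-file user-data.txt.gz
```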

As you can see, CloudInit is very flexible and should allow you to fully automate the building of your servers. Finally, I’ll note that CloudInit is not limited to EC2; it also works with Ubuntu Enterprise Cloud (UEC).

Happy scripting!!

Software Development, System Administration

Finally Amazon adds a micro instance. No more need for rackspace / slicehost

Amazon is clearly the right answer for most people’s cloud services. But when you are developing software and just need a small server to do some testing, their smallest instance was about six times more expensive than competitive offerings. As a result, a lot of developers also had a Rackspace or Slicehost account. Now that AWS has announced their new ‘micro’ instance, most of us can get back to the simplicity of using a single cloud.

The new ‘micro’ instance actually is not that small. It has a decent amount of RAM at 613MB. That’s much more than most small VPSs from other vendors, which only have 128MB or 256MB. Also, while it has one ECU (compute unit), it can burst up to two ECUs. Again, not bad compared to other vendors’ basic offerings. Finally, one difference is that the ‘micro’ does not include much hard disk space, so you will need to add an EBS volume. If you can live with a small 10GB EBS volume, that will only add another $1 per month (remember, we are talking about test / development servers).

So for now, I’m back to using AWS for my test servers. It will be interesting to see whether Rackspace and others respond to Amazon’s aggressive move.

System Administration

not all dynamic DNS services are equal

In the tech world, once you have something working, you tend to stick with it for a long time. I’ve been using Zoneedit to host my various DNS domains for almost 10 years. But today, I’m switching to DnsMadeEasy.

This is being driven by Zoneedit’s dynamic DNS limitations. Specifically, when you update the IP associated with one of your hosts, it can take a while before Zoneedit starts publishing the new address. In my tests, it was taking over an hour. On top of that, you cannot set the Time-To-Live (TTL) on the (A)ddress records. For DDNS, both of these are non-starters. The whole idea with DDNS is that you know your IP address will change periodically. And when it does, you want the world to have the new IP address as soon as possible.

Several months ago, for a service I was creating, I had the chance to use DnsMadeEasy. At a high level, there is not a lot of difference between DNS hosting companies. But sometimes the small differences are important. DnsMadeEasy has a much better web interface that gives you fine-grained control of your DNS records. And equally important, changes to your zone go live almost immediately. Finally, rather than hosting your domain on just two DNS servers, DnsMadeEasy gives you six.

I think I would be more accommodating if I had been using ZoneEdit’s free version, but I have been paying for the service. So while I have not been unhappy with ZoneEdit over the years, it’s time to move to something better.

System Administration

Fully automated ubuntu server setups using preseed

When you are building a server on a cloud like Amazon’s EC2, it’s quite typical to create a shell script to automate the process. This saves you time when you need to create a test server or, god forbid, your production server dies. But one of the challenges in creating these scripts is that some packages display a UI asking for user input. Two common examples of this are mysql (root password) and postfix (server type, root email, etc.).

The solution to this is to use preseeding. This is where you tell the debian installer, in advance, the response to each of the questions it would normally ask when it installs the package. The command-line tools used to preseed are part of the debconf-utils package, so the first step is to make sure they are installed.

sudo apt-get -y install debconf-utils

Now you can go ahead and figure out which settings need to be set for each package. Let’s use mysql as an example. The easiest way to do this is to install the package manually and then use the debconf-get-selections command to query the list of settings.

# sudo apt-get -y install mysql-server
# sudo debconf-get-selections | grep mysql
mysql-server-5.1 mysql-server/root_password_again password
mysql-server-5.1 mysql-server/root_password password
mysql-server-5.1 mysql-server/start_on_boot boolean true

So what we need to do now is create a preseed file with the above settings and then load it into the debconf database using the debconf-set-selections command.

# echo "mysql-server-5.1 mysql-server/root_password password $MYSQL_ROOT_PWD" > mysql.preseed
# echo "mysql-server-5.1 mysql-server/root_password_again password $MYSQL_ROOT_PWD" >> mysql.preseed
# echo "mysql-server-5.1 mysql-server/start_on_boot boolean true" >> mysql.preseed
# cat mysql.preseed | sudo debconf-set-selections
# sudo apt-get -y install mysql-server

Once the values have been set, you can run apt-get to install the package without prompting for any inputs.
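The same approach covers postfix, the other prompting package mentioned earlier. As a sketch: the selection names and values below follow the pattern shown for mysql, but you should verify them with debconf-get-selections on a manually installed box before relying on them (the actual debconf commands are commented out since they need root and the debconf tools):

```shell
# Build a preseed file for postfix (example values; verify the
# selection names on your own system with debconf-get-selections).
echo "postfix postfix/main_mailer_type select Internet Site" > postfix.preseed
echo "postfix postfix/mailname string mail.example.com" >> postfix.preseed

# Then load it and install, as with mysql above:
# sudo debconf-set-selections < postfix.preseed
# sudo apt-get -y install postfix
```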

One final note, if you have a lot of servers or rebuild them on a regular basis, you should probably be looking at Puppet or Chef.

System Administration

how a geek spends his free time

One of the hassles of being a geek is that you actually get your hands dirty with technology.  Usually that’s a good thing.  But sometimes it isn’t.  Yesterday at 5am, I started getting SMS messages from our mail server indicating that services were failing.   I had no idea why this was happening.

Now if I wasn’t a geek, I would not have a server to manage and whoever provides our web/email would take care of the problem. But I am a geek and want the flexibility to do things that don’t come in a standard solution from a web/email provider. So that means I had to be the one to figure out why I was getting SMSs at 5am.

It turns out an email account got compromised and spammers were hammering our server with the SPAM they were sending. Thank goodness my hosting company has great support. They were able to identify the problem and help me fix it.

Now I’m not really sure how the spammers got the password for one of our email accounts.  We use IMAP and SMTP AUTH, both of which send the password unencrypted, but only a sniffer at the ISP would be able to grab it.  And I assume most (if not all) ISPs protect against this.  Anyways, I decided that we needed to get all email clients using TLS and SSL.  It turns out TLS was already enabled on the server and all I had to do was add an SSL cert for IMAP.  So, hopefully we are protected now.

Along the way, someone suggested I should also look at SPF and domainkeys.  They really don’t have anything to do with the issue, but they are good things to implement anyway.  I had already added SPF records to our DNS but was not familiar with domainkeys.  So I spent Saturday morning tackling this.
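For reference, an SPF record is just a TXT entry in your zone. A sketch of a simple one (the domain and TTL are placeholders, and you would tailor the mechanisms to your own mail setup):

```
; "v=spf1 a mx -all" says mail may come from the hosts in the
; domain's A and MX records, and everything else should be rejected.
example.com.  3600  IN  TXT  "v=spf1 a mx -all"
```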

Turns out domainkeys is not that hard to implement, since I’m on a host with cPanel, which supports domainkeys.  The only complication is that I have our DNS hosted at a DNS hosting provider (so I can get redundancy).  So on our hosted server, I just used /usr/local/cpanel/bin/domain_keys_install account to generate a private/public key pair and make it available to exim.  Then I took the entry added to the DNS file in /var/named and added it to our external DNS provider.  So now, between SPF and domainkeys, we should not have much, if any, email rejected.

All in all a satisfying couple of hours.
