Under The Hood of Cloud Computing: October 2013

One of the things many people are asking for in OpenShift is alternate ways of authenticating SSH and git interactions with the applications gears. Since I'm doing my development work in EC2, I thought that was surely the right place to try it out. Well as usual, it didn't work out quite as simple as I'd planned.

This post isn't about OpenShift directly. It addresses what I found when I tried to implement FreeIPA in EC2 so that I could develop code to allow Kerberos authentication in OpenShift.

Kerberos in way too few words

Kerberos is an authentication protocol and service defined originally at MIT as part of Project Athena (along with things like the X Windows System and Zephyr, a predecessor to modern IM services). It is meant to provide authenticated on unencrypted and even untrusted networks. Perfect right? Well Kerberos has some quirks.

First, different people can run their own Kerberos services. To avoid conflicts, each service is given an identifier string known as a realm. By convention the realm string is the same as the enterprise DNS domain name. That is, if a company has DNS domain example.com then the Kerberos realm would be EXAMPLE.COM. Unlike DNS domain names, Kerberos realms are case sensitive.

Each participating host must be registered with the Kerberos server and each user must be added to the user list on the server as well. Hosts and users (and any other manageable entity) is identified with a principal. This is basically a name which is unique for each resource, err user, umm host.... thing. The important thing is that the host is identified by a string which is derived from its hostname.

Now wars have been fought over whether a hostname should be the Fully Qualified Domain Name (FQDN) or just the host portion. For Kerberos there is only one answer: FQDN.

The host principal for a given host is composed of the hostname, and the realm. When a client tries to log in, it needs to know the correct principal to request from the server. This is why the FQDN must be the hostname. When the user attempts to log in he must provide both his own principal and the host principal for the destination. The only way to know the destination's host principal is if it is related to the hostname as viewed from the client host.

This is where life gets interesting in AWS EC2.

You see, AWS uses RFC 1918 and something like Network Address Translation to create a private network for the virtual machines which make up the EC2 service. AWS also uses an internal DNS service to identify each virtual machine. This means that from the view of a host inside the private network, the destination host has a different IP address and a different hostname than when viewed from outside the private network. The upshot is that, to use Kerberos with EC2 I need some way to make sure that the user can determine a valid host principal to request regardless of where the user is located.

A Word about IPA and AWS

IPA (and FreeIPA) is not a single service. It's a collection of services configured so that they work in concert to provide secure user and host access over untrusted networks. Kerberos is only one of the services, though it is probably the core one. LDAP, NTP and DNS are all support services which make the operation of Kerberos work. IPA wraps these services in such a way so that mere mortals don't necessarily need to know how the bindings work merely to get the service running.

In this post I'm dealing almost entirely with the Kerberos service within IPA and I'll refer to that component by name. Where I mention FreeIPA it will be in reference to the specific tools that FreeIPA provides to set up and manage the conglomerate service.

AWS (Amazon Web Services) is also a suite of services. The core of that is the EC2 virtual host service. Again, all of the AWS services generally work together, but I'm only dealing with EC2 instances in this post so I'll refer to EC2 specifically unless I'm referring to the full suite.

UPDATE: 2013-11-07 - AWS TOS do not permit open DNS recursion.

One other thing to be aware of when running IPA in AWS. Amazon terms of service do not allow users to create open recursive DNS services within AWS on the grounds that they can be abused.

When setting up your AWS security policies and the named service on your IPA hosts, be sure to disable recursion and/or limit access to appropriate IP ranges for your DNS clients or you'll get a polite nastygram from Amazon.

Kerberos, Linux and SSH

I want to use Kerberos with SSH so that I can avoid using SSH authorized_keys when pushing git updates to my applications on OpenShift. (mostly ignoring EC2 for now). To do that I need several things set up:

A Kerberos (FreeIPA) server - IPA installed, configured
A set of users configured into the FreeIPA service
A target host (OpenShift node)

For SSH the most important things are that the Kerberos and LDAP configurations are set up properly. This includes configuring sssd, and the /etc/nsswitch.conf settings. Luckly the FreeIPA ipa-client-install script (with the right inputs) will do all of that for me. I think there are ways to get it to tell me precisely what changes it's making but I haven't learned how yet. I do know that I can find the results in the /var/log/ipaclient-install.log.

The other thing I need to do is to make sure that the SSH client and server both will at least try to use the GSSAPI protocol for managing the authentication process. On the server this means making sure that the GSSAPIAuthentication is enabled.

On the client side, I may need to specify that I want to use the gssapi-with-mic authentication method. I may also need to specify the host principal to use to access the destination (as distinct from the hostname from the client's vantage point). More on these later.

EC2 , cloud-init and resisting dynamic naming

The network interface numbering and naming in EC2 are dynamic by design, both on the internal and external interfaces. EC2 does offer "elastic IP" which is really "static IP" for an instance and since I own a DNS zone I can assign a name to the address. Unfortunately this only offers control of the external IP address assigned to an instance. I have to find ways to manage the internal naming myself.

When a host registers with a Kerberos service it generally uses its own hostname as the identifier for the host principal. If this is the same as the DNS name associated with one or more if its IP addresses, this is just by convention. That is, Kerberos doesn't maintain the mapping. So if a host changes its hostname but no changes are made to the Kerberos database, the host can no longer identify itself by it's principal. Also, if the name by which it is known from the outside changes (because the IP address and/or DNS name changed) then clients will no longer know what principal to use to request an access ticket.

There are two factors here: Making sure the host knows its own name, and making sure that users coming from remote hosts can determine the (a?) valid principal (based on the hostname) to request a ticket for.

Maintaining Host Identity

For Kerberos, the hostname is the anchor for a host principal. If the hostname changes on a registered host, it will no longer be able to properly communicate with the Kerberos server and clients. Luckly the Fedora and RHEL images in EC2 use cloud-init to initialize potentially dynamic information on startup.

Cloud-init is software which, when installed on a host, can take input from the cloud environment and customize the host to integrate it into the environment. It can do things like.. oh, say, set the IP address of network interfaces and hostnames, install SSH host keys, set device mount points and the like. It will also allow me to tell it not to update the hostname on each reboot.

The main configuration for cloud-init is /etc/cloud/cloud.cfg. I just need to add a line containing 'preserve_hostname: 1' and set the hostname I want in /etc/hostname. From then on, restarts or reboots will keep the hostname I set. Given that value I have my anchor for registering the host with the kerberos server and maintaining the host/principal mapping.

The host now always knows its own name: part one solved.

The view from Inside/Outside

You do learn something every day. In talking with some of the FreeIPA developer folks I learned something I hadn't known about how the Kerberos protocol works. ;Here's the important bit.

When client wants to gain access to some resource, it sends a message to the kerberos server saying "I am this principal and I want access to that one over there, ok?" The Kerberos server sends back a signed/encrypted ticket with both names (principals) wrapped inside it. The client then sends the ticket in an authentication request to the destination host, who verifies "yep, that's me, and I can see that that's you, let me check are you allowed?" and if the answer is "yes" the client request is granted.

What this means is that the client must know the name (principal) of the destination resource before attempting to connect to the resource. It must know a name that both the kerberos server and the resource host itself will recognize. When everyone uses DNS FQDNs to identify hosts and they have the same view of DNS, this works nicely. Accessing private network resources from a public network creates some issues.

Most tools, SSH included, assume that they can compose a host principal from the hostname given by the user. So if a client was using realm EXAMPLE.COM and tried to reach a remote host with FQDN 'destination.example.com' the principal would be host/destination.example.com@EXAMPLE.COM. But since the EC2 hosts have (not one but two) random hostnames assigned when they boot, it's impossible to know from the hostname alone what the principal of the destination is.

If I happen to know the mapping (ie, what principal is associated with the destination host) then SSH allows me to specify that with -oGSSAPIServerIdentity=<principal> on the CLI or in a Host entry in my .ssh/config file. From the illustration above, to properly authenticate with the Kerberos Host I could do this:

ssh -oPreferredAuthentications=gssapi-with-mic -oGSSAPIServerIdentity=host1.example.com random2.external

(this also assumes that my local hostname and remote one are the same and that I've got a ticket-granting-ticket for the EXAMPLE.COM realm using kinit.)

What this says is to log into a host who's name (from this view) is random2.internal, and who's principal is host1.example.com. With that the local client can send a query to the Kerberos server and get the right ticket back to hand to the destination host. It can say "yep, that's me and yep you're you, and yep you're allowed"

The Many Faces of Kerberos

It's totally coincidence that Cerberos is the 3-headed dog that guards the landing in Hades on the river Styx and I'm going to add two "faces" to my kerberos clients. Totally.

I think that in the discussion above I've been careful to make it clear that a Kerberos principal is an identifier. That is, it is a handle which is used to refer to an object in the Kerberos database which corresponds to an object in reality. I have nick names. Hosts in Kerberos can have them too, and this is going to solve my identity problem with random dynamic names and IP addresses.

I've managed to give each host a fixed hostname, thanks to cloud-init. Once I know the dynamic names both public and internal I should be able to inform the Kerberos server of both of the aliases.

If this works, here's what will happen when I try to log in either from a host inside or outside the private network, my SSH client will form a principal from the (DNS) name I offer. My client will send that to the Kerberos server and request an access ticket to the remote host using the alias principal. And the Kerberos server will know which host that means. It will create an access ticket which will grant me access to the destination host, which will examine it and on finding everything in order, will allow my SSH connection.

It turns out that FreeIPA doesn't yet have a nice Web or CLI user interface to add principals to a registered host record, but the Kerberos database is stored in an LDAP server on the Kerberos master host. For now I (or a friend actually) can craft an LDAP query which will add the principals I need to the host record. This is assumed to be run on

kerberos# ldapmodify -h localhost -x -D "cn=Directory Manager" -W @lt;@lt;EOF
dn: fqdn=host1.example.com,cn=computers,cn=accounts,dc=example,dc=com
changetype: modify
add: krbprincipalname
krbprincipalname: host/random2.external@EXAMPLE.COM
krbprincipalname: host/random2.internal@EXAMPLE.COM
EOF

The invocation above will request the password of the admin user for the FreeIPA LDAP service. I'm sure there's a way to do it with Kerberos/GSSAPI, but I haven't got it yet.

What that change does is add two Kerberos principal names to the host entry for host1.example.com. The principal names match what an SSH client would construct using the DNS name (internal or external) to reach the target host. Now when the Kerberos server gets a ticket request from clients either inside or outside the private network, the principal in the ticket request will be associated with a known host.

The Devil's in the Dynamics

This is all fine so long as host1.example.com doesn't reboot. When it does, AWS will assign it a new internal and external IP address and new DNS names. It would be really nice if the host, when it boots could inform the Kerberos service what its new internal and external principal names are.

I don't currently know how to do this, but I suspect that I could add a module to cloud-init to do the job. The client is already configured to use the LDAP service on the Kerberos (FreeIPA) server. Once the server knows that all three principals refer to the same host life should be good.

Now to learn some cloud-init finagling and enough Kerberos so that I can have the host update itself on reboot.

What does this mean for OpenShift?

If you want to run an OpenShift service in AWS and you want to offer Kerberos authentication for SSH/git to the application gears, you'll have to do a little LDAP tweaking of the Kerberos principals associated with each host so that the Kerberos service will know which host you mean regardless of your view of the destination host.

The first round of Kerberos integration code is going into OpenShift Origin as I write this (the pull requst is submitted and getting commentary). By the next release it should be possible to manage developer access to gears with Kerberos and FreeIPA. Additional use cases will be added over time.

Summary

Cloud services like AWS and corporate networks often rely on private network spaces and Network Address Translation to manage dynamic hosts.
Cloud Init usually updates the hostname on each boot but this can be suppressed.
For a client trying to reach a host for SSH this poses a problem because the view of the destination from the client differs based on where the client sits in relation to the network boundary.
Kerberos can assign multiple principals to a single host, which allows authentication to work.

References

FreeIPA - A component based single-sign-on service
Kerberos - The authentication component of FreeIPA and MIT Project Athena
GSSAPI - A standardized generic authentication and access control protocol
Project Athena - 1980s MIT/DEC/IBM project to design network services and protocols
RFC 1918 - Private non-routable IP address space reservations
Network Address Translation - Private network boundary system
AWS Elastic IP - AWS static IP addresses for dynamic hosts
Cloud Init - A service for customizing host configuration on reboot

Under The Hood of Cloud Computing

Friday, October 11, 2013

Diversion: Kerberos (FreeIPA) in AWS EC2