Wednesday, 9 December 2009

Configuration management with Scala

Configuration management tools are essential if your role is to manage large-scale distributed IT systems. Very good free and open source solutions such as Puppet or SmartFrog exist for those familiar with Ruby or Java. It is not my intention to implement yet another configuration management tool in Scala just for the sake of it. I know for a fact, having worked closely with the SmartFrog team in HP Labs Bristol, that such tools are rather complicated to get right.

But that said...

The other day, I got thinking about Scala traits and the potential they offer for creating composable, refinable components that express configuration information and configuration logic. Other Scala features such as its type-safe nature and its package system could enable an elegant and simple way to represent configuration data and logic.

Let's start with a simple example that we will gradually extend. Let's assume I want to set up a cluster of nodes where each node is configured with an admin user account.

I start off by modelling a user with a User trait, a collection of properties such as the user's name, uid, etc. The trait also contains (mock, in this case) logic that can be executed to add or remove users on a computer. Notice how the User trait is itself composed from other traits.
trait Deployable{
  def deploy
  def undeploy
}

trait Named{
  var name : String = _
}

trait User extends Named with Deployable{
  var uid  : Short = _
  override def deploy = println ("useradd -u " + uid + " " + name)
  override def undeploy = println ("userdel -r " + name)
}

Users are to be deployed on nodes that I've modelled with a node trait (and once again that trait itself is composed from others). A node has an IP address and provides a convenience method to add things (such as a user) to it. Things added to the node will be deployed as the node is deployed by the hypothetical deployment runtime.
trait WithChildren{
  var includes : List[Any] = List()
  def contains (a : Any*) = a.foreach ( item => includes = item :: includes)
}

trait Node extends WithChildren with Deployable{
  var ip : String = _
  def deploy = println ("Actual logic to deploy children goes here.")
  def undeploy = println ("Actual logic to undeploy children goes here.")
}

trait Config extends WithChildren with Deployable{
  def deploy = println ("Actual logic to deploy nodes goes here.")
  def undeploy = println ("Actual logic to undeploy nodes goes here.")
}

The Config trait models a collection of nodes. Again, in a real system, it would be the thing I actually pass to a deployment runtime to be enacted.
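
As a rough illustration of what the placeholder deploy logic might do, here is a minimal sketch in which a node walks its includes list and deploys anything that is Deployable. The deployChildren and undeployChildren helpers, the demoNode object and the traversal order are assumptions made for this sketch, and the explicit defaults replace the `= _` initialisers only to keep it self-contained. A real runtime would execute commands remotely rather than print them.

```scala
trait Deployable {
  def deploy: Unit
  def undeploy: Unit
}

trait Named { var name: String = "" }

trait User extends Named with Deployable {
  var uid: Short = 0
  def deploy: Unit = println("useradd -u " + uid + " " + name)
  def undeploy: Unit = println("userdel -r " + name)
}

trait WithChildren {
  var includes: List[Any] = List()
  def contains(a: Any*): Unit = a.foreach(item => includes = item :: includes)

  // Assumed helper: `contains` prepends, so reverse to deploy in insertion order.
  def deployChildren: Unit = includes.reverse.foreach {
    case d: Deployable => d.deploy
    case other         => println("# skipping non-deployable item: " + other)
  }

  // Assumed helper: undeploy in the opposite order, most recent first.
  def undeployChildren: Unit = includes.foreach {
    case d: Deployable => d.undeploy
    case _             => ()
  }
}

trait Node extends WithChildren with Deployable {
  var ip: String = ""
  def deploy: Unit = { println("# deploying node " + ip); deployChildren }
  def undeploy: Unit = { undeployChildren; println("# undeployed node " + ip) }
}

// Hypothetical usage: one node carrying the admin user.
object demoNode extends Node {
  ip = "192.168.1.1"
  contains(new User { name = "admin"; uid = 102 })
}
```

Calling demoNode.deploy prints the node header line followed by "useradd -u 102 admin". Because includes is a List[Any], the pattern match is what decides at run time which children can be deployed; a stricter design could type the list as List[Deployable] instead.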

With all the pieces in place, let's see a simple configuration:
object config1 extends Config{
  contains{
    new Node{
      ip = "192.168.1.1"
      contains{
        new User{
           name  = "admin"
           uid   = 102
        } 
      }
    }
  }
}

In the configuration above, I deploy the admin user on a single node. I can refine this a little bit by subclassing the node trait to create an AdminNode trait which includes the admin user by default.
object adminUser extends User{
  name ="admin"
  uid = 102
}

trait AdminNode extends Node{
 contains{
  adminUser
 }
}

In the snippet above, I've created an object adminUser, which is a singleton instance of the User trait. The adminUser can no longer be refined (subclassed) as Scala (unlike SmartFrog) is not a prototype-based language. However, the object can still be reused and composed into other traits. The other important thing to note is that, in Scala, the entire body of a trait acts as its constructor: it is executed whenever the trait is instantiated. Thanks to this feature, I can define new variables, change existing ones or invoke methods within the curly braces without having to define an explicit constructor method as I would have to if using Java or Groovy.

object config2 extends Config{
  contains{
    new AdminNode{ ip = "192.168.1.2" }
  } 
}
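
The point about trait bodies acting as constructors can be seen in a small sketch. All names here (InitOrder, Base, Refined, InitDemo) are hypothetical and exist only to record the order in which the bodies run.

```scala
object InitOrder {
  // Records the order in which trait and instance bodies are executed.
  var trace: List[String] = List()
}

trait Base {
  InitOrder.trace = InitOrder.trace :+ "Base body"
  var ip: String = "unset"
}

trait Refined extends Base {
  InitOrder.trace = InitOrder.trace :+ "Refined body"
  ip = "192.168.1.2" // overrides the default set in Base
}

object InitDemo {
  def run(): (String, List[String]) = {
    InitOrder.trace = List()
    // The braces after `new Refined` form the body of an anonymous subclass,
    // which runs after the bodies of the traits it extends.
    val r = new Refined {
      InitOrder.trace = InitOrder.trace :+ "anonymous body"
      ip = "10.0.0.1"
    }
    (r.ip, InitOrder.trace)
  }
}
```

InitDemo.run() returns the ip "10.0.0.1" together with the trace "Base body", "Refined body", "anonymous body": parents run first, so the most specific assignment wins. This is the same mechanism that lets new AdminNode{ ip = "192.168.1.2" } set the address after AdminNode has already added the admin user.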


If you want to modify a particular instance of AdminNode in place, let's say to add another user account, that is also easily done.

object config3 extends Config{
  contains{
    new AdminNode{
      ip = "192.168.1.3"
      contains{
        new User{
          name = "demo"
          uid  = 103  // distinct from the admin user's uid (102)
        }
      }
    }
  }
}


One of the benefits of using Scala directly to write the configuration is that I can use constructs such as loops to create or modify the data. For instance, let's create a set of admin nodes from a list of IP addresses.
object config4 extends Config{
  List("192.168.1.2","192.168.1.3","192.168.1.4").foreach{ addr=>
    contains{
     new AdminNode{ ip = addr}
    }
  }
}
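
The same set of nodes could also be built with map and handed to contains in a single call. This is only a sketch of an alternative style: the traits are cut down to the bare minimum so the example stands alone, and config4b is a hypothetical name.

```scala
trait WithChildren {
  var includes: List[Any] = List()
  def contains(a: Any*): Unit = a.foreach(item => includes = item :: includes)
}

trait Node extends WithChildren { var ip: String = "" }

trait Config extends WithChildren

object config4b extends Config {
  val addrs = List("192.168.1.2", "192.168.1.3", "192.168.1.4")
  // Build one node per address, then expand the list into varargs with `: _*`.
  contains(addrs.map(addr => new Node { ip = addr }): _*)
}
```

Note that contains prepends each item, so config4b.includes ends up holding the nodes in reverse order of insertion.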


Just as I modelled users, I can also model applications running on nodes (again by composing traits). In the example below, I model generic applications that are installed through packages (via a package manager à la apt-get) and controlled via Linux services.

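trait Package extends Deployable{
  var packages : List[String] = List()
  // foreach returns Unit, so print inside the loop, one command per package
  override def deploy : Unit = packages.foreach(s => println ("apt-get install " + s))
  override def undeploy : Unit = packages.foreach(s => println ("apt-get remove " + s))
}

// Not shown in the original listing: assumed to be a local trait declaring
// start and stop (java.lang.Runnable only declares run)
trait Runnable{
  def start : Unit
  def stop : Unit
}

trait Services extends Runnable{
  var services : List[String] = List()
  override def start : Unit = services.foreach(s => println ("/etc/init.d/" + s + " start"))
  override def stop : Unit = services.foreach(s => println ("/etc/init.d/" + s + " stop"))
}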


Using those traits, I can then model an application such as an Apache web server, or refine an Apache web server into a Django application server running as an Apache module.
trait WebServer{
 var port = 8080
}

trait Apache2 extends Services with Package with WebServer{
 packages = "apache2" :: packages
 services = "apache2" :: services
 port = 80
}

trait Django extends Apache2{
 packages = "libapache2-mod-python" :: "python-django" :: packages
}
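
To see what such a refinement ends up holding, here is a small self-contained check. It is a sketch under assumptions: Services is reduced to its data, list updates use `::` throughout, and djangoCheck is a hypothetical name.

```scala
trait Deployable {
  def deploy: Unit
  def undeploy: Unit
}

trait Package extends Deployable {
  var packages: List[String] = List()
  def deploy: Unit = packages.foreach(p => println("apt-get install " + p))
  def undeploy: Unit = packages.foreach(p => println("apt-get remove " + p))
}

trait Services { var services: List[String] = List() }

trait WebServer { var port = 8080 }

trait Apache2 extends Services with Package with WebServer {
  packages = "apache2" :: packages
  services = "apache2" :: services
  port = 80
}

trait Django extends Apache2 {
  // Runs after the Apache2 body, so "apache2" is already in the list.
  packages = "libapache2-mod-python" :: "python-django" :: packages
}

object djangoCheck {
  val server = new Django { port = 87 }
  // server.packages == List("libapache2-mod-python", "python-django", "apache2")
  // server.services == List("apache2")
  // server.port     == 87
}
```

Each level only states what it adds or changes: WebServer supplies the default port, Apache2 overrides it, and the instance overrides it again.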

I can then include instances of Apache2 or Django in my hypothetical cluster.
object config5 extends Config{
  contains(
    new AdminNode{
      ip = "192.168.1.3"
      contains{
        new Apache2{ port = 8182 }
      }
    },
    new AdminNode{
      ip = "192.168.1.4"
      contains{
        new Django{ port = 87 }
      }
    }
  )
}

As the configuration information is written directly in Scala, I automatically gain access to interesting features:
  • the compiler highlights syntax and type errors in the description
  • I can use an IDE for syntax highlighting, auto-completion and refactoring
  • descriptions can be organised into packages and imported as required.

Obviously, as a thought experiment, this ought to be taken with a pinch of salt. I've only touched on some of the language features that are appropriate for expressing composable and reusable models of configuration data. I have not tried (and most likely won't try) to implement a distributed deployment engine that could deploy such configuration descriptions.
