Puppet, Puppet functions & CouchDB: A powerful combination

Are you automating your deployments with Puppet?
We are too. I believe it’s a great tool for managing your infrastructure in an organized way.

But what about all the rich data that is stored in the Puppet node files? Couldn’t that data be useful for other processes and purposes?
Yes, it could, and that’s why we selected a database to store all this CMDB information: CouchDB.

Why do this?


At the time we started separating our data from our Puppet recipes, Hiera was a very young product (or did not exist yet, I don’t remember exactly).

A single source of truth:

We wanted one source for all the objects we are managing. If all scripts and tools read from one data source, then there is no other truth. So Puppet had to comply with this.


I don’t know whether Hiera could do this, but we need our CMDB to be as open and flexible as possible. Our CMDB is a very rich information source, and a lot of other tools and scripts require lightweight access to this data.

Information variety:

Our CouchDB CMDB contains a lot more information than just Puppet configurations. We register all devices (nodes), configurations and services we deliver to customers in our CouchDB CMDB now.
In the future we also intend to store service-levels, service-windows, change-windows, service-owner-information, maintainer-information, etc…

Our CouchDB CMDB also contains all relations between these objects:

  • horizontal relations
    • TCP connections (from loadbalancer to web-proxy, from web-proxy to webserver, from webserver to database, etc)
  • vertical (hierarchical) relations
    • Availability relations: If this object goes down, what would be the impact?
    • Maintainability relations: Which objects are needed to maintain the service, but do not cause an Incident when they are down?
    • Financial relations: How does the cost of a device contribute to the infrastructure cost of a service?
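To make the availability relations a bit more concrete, here is a minimal sketch of how an impact question could be answered once the relations are loaded into memory. The object names, the `DEPENDS_ON` structure and the `impacted_by` helper are all made up for illustration; our real CMDB documents look different.

```ruby
require 'set'

# Hypothetical availability relations: each object lists the objects it
# needs in order to stay available. All names are illustrative only.
DEPENDS_ON = {
  'webshop-service' => ['loadbalancer01'],
  'loadbalancer01'  => ['webproxy01'],
  'webproxy01'      => ['webserver01'],
  'webserver01'     => ['database01'],
  'database01'      => []
}.freeze

# Return every object that is directly or indirectly impacted when
# `failed` goes down, by walking the relations breadth-first.
def impacted_by(failed, depends_on = DEPENDS_ON)
  impacted = Set.new
  queue = [failed]
  until queue.empty?
    current = queue.shift
    depends_on.each do |object, dependencies|
      next unless dependencies.include?(current)
      # Set#add? returns nil when the object was already recorded.
      queue << object if impacted.add?(object)
    end
  end
  impacted.to_a.sort
end
```

Calling `impacted_by('webserver01')` on this sample data returns the web proxy, the load balancer and the service that sit above the web server.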

Separated responsibilities:

We also register objects that are not managed or deployed by Puppet, even objects that are managed by other teams and companies. Why? Because we want to be able to calculate the impact if one of these configurations goes down.

But why CouchDB?

  • Reading from CouchDB is fast
  • You don’t need a driver to access your data: A simple HTTP GET request does the trick
  • It is JSON-based, which makes the data accessible from almost every programming or scripting language
  • The map/reduce mechanism allows you to quickly count/sum data to generate reports.
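As a rough illustration of the "no driver needed" point, the sketch below reads one document using only Ruby's standard library. The server URL, the database name `cmdb` and the helper names are assumptions for the example, not our actual setup.

```ruby
require 'net/http'
require 'uri'

# Hypothetical CouchDB location and database name; adjust to your setup.
COUCHDB_URL = 'http://localhost:5984'
CMDB_DATABASE = 'cmdb'

# Build the URL of a single document: <server>/<database>/<document id>.
# The id is percent-encoded so that ids containing "/" or spaces still work.
def document_uri(doc_id, base_url = COUCHDB_URL, database = CMDB_DATABASE)
  encoded_id = URI.encode_www_form_component(doc_id).gsub('+', '%20')
  URI.parse("#{base_url}/#{database}/#{encoded_id}")
end

# Fetch the raw JSON text of a document with a plain HTTP GET.
def fetch_document_json(doc_id)
  response = Net::HTTP.get_response(document_uri(doc_id))
  unless response.is_a?(Net::HTTPSuccess)
    raise "CouchDB returned HTTP #{response.code} for #{doc_id}"
  end
  response.body
end
```

With a CouchDB instance running at that address, `fetch_document_json('web01.example.com')` would return the document as a JSON string.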

How did we do this?

Using one simple Puppet function.
It does a CouchDB lookup for a configuration, and gets a JSON object in return.
The Puppet function parses the JSON into Ruby objects and delivers them to the Puppet manifest (Ruby DSL).
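The lookup-and-parse flow can be sketched roughly as follows. This is not our actual function, just a minimal illustration: the function name `couchdb_lookup`, its single URL argument and the `parse_cmdb_document` helper are assumptions, and the registration uses the legacy `Puppet::Parser::Functions.newfunction` API (newer Puppet versions provide `Puppet::Functions.create_function` instead).

```ruby
require 'json'

# Turn the JSON text returned by CouchDB into plain Ruby objects
# (Hash, Array, String, ...), which a manifest can work with.
def parse_cmdb_document(json_text)
  JSON.parse(json_text)
rescue JSON::ParserError => e
  raise ArgumentError, "CouchDB response is not valid JSON: #{e.message}"
end

# Register the lookup as a Puppet function only when Puppet is loaded,
# so this file can also be run and tested as plain Ruby.
if defined?(Puppet::Parser::Functions)
  Puppet::Parser::Functions.newfunction(:couchdb_lookup, :type => :rvalue) do |args|
    require 'net/http'
    require 'uri'
    # args[0] is assumed to be the full URL of the CouchDB document.
    parse_cmdb_document(Net::HTTP.get(URI.parse(args[0])))
  end
end
```

Keeping the parsing in its own small method makes it possible to test that part without a Puppet master or a running CouchDB.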

One small drawback:

  • You really need to check the data that is returned from the CouchDB lookup.
    The CouchDB document can contain all sorts of key/values, and you need to make sure that this data is valid before you continue executing the second part of your manifest.
    At least check if the obligatory keys are present.
    If you don’t check the CouchDB data, and the data is invalid, then you may get some nasty Puppet errors in your Puppet agent log, which are hard to understand.
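A check along these lines could look like the sketch below. The list of required keys is purely hypothetical (your documents will need their own list), and the helper names are made up for the example.

```ruby
# Hypothetical list of keys that every node document must carry.
REQUIRED_KEYS = %w[hostname environment roles].freeze

# Return the required keys that are absent (or nil) in the document.
def missing_keys(document, required = REQUIRED_KEYS)
  required.reject { |key| document.key?(key) && !document[key].nil? }
end

# Raise a readable error early instead of letting the manifest fail later
# with a hard-to-understand message; return the document when it is valid.
def validate_document!(document, required = REQUIRED_KEYS)
  unless document.is_a?(Hash)
    raise ArgumentError, "expected a JSON object, got #{document.class}"
  end
  missing = missing_keys(document, required)
  unless missing.empty?
    raise ArgumentError, "CMDB document is missing required keys: #{missing.join(', ')}"
  end
  document
end
```

Calling `validate_document!` right after the lookup means a broken document stops the run with a message that names the missing keys.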

Here is the Puppet function we created:

Feel free to use it, give feedback, etc.