The following set of rules should govern how Puppet configuration data is constructed & maintained. These are rules & guidelines we compiled where I work to help guide other infrastructure developers in building consistent Puppet configurations. There are a number of rules & guidelines missing in this list that we're still working toward, but it's a great list to get started with.

Nodes get only one role

site.pp:

node /\w{2}[0-9]webapp1-[0-9]+/ {
  include roles::webapp1
}

Roles contain one or more profiles

/path/to/modules/roles/manifests/webapp1.pp:

class roles::webapp1 {
  include profiles::tomcat

  include profiles::webapp1
}

Profiles define application stacks

/path/to/modules/profiles/manifests/webapp1.pp:

class profiles::webapp1 inherits profiles::base {

  $java_version = hiera('profiles::webapp1::java_version')
  $app_port     = hiera('profiles::webapp1::app_port')

  class { 'java':
    version => $profiles::webapp1::java_version,
  }

  firewall { "500 allow tcp port ${app_port}":
    proto  => 'tcp',
    port   => $app_port,
    action => 'accept',
  }
}

Profiles define system default configurations

Multiple system configurations should get multiple profiles - one for each type of configuration. Putting too much logic into each profile defeats the purpose & adds complexity to the system. A profile should describe a particular implementation of a service & if there is another similar service that requires slightly different configuration, another profile should be created.
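As a rough sketch of what "one profile per configuration" could look like, here are two small profiles for two flavors of the same service, with no branching inside either one. The profile names, the Hiera keys & the 'tomcat' class with its 'http_port' parameter are all hypothetical placeholders, not taken from a real module.

```puppet
# /path/to/modules/profiles/manifests/tomcat_internal.pp (hypothetical)
class profiles::tomcat_internal inherits profiles::base {
  $http_port = hiera('profiles::tomcat_internal::http_port')

  # 'tomcat' and 'http_port' stand in for whatever module is actually used.
  class { 'tomcat':
    http_port => $http_port,
  }
}

# /path/to/modules/profiles/manifests/tomcat_public.pp (hypothetical)
class profiles::tomcat_public inherits profiles::base {
  $http_port = hiera('profiles::tomcat_public::http_port')

  class { 'tomcat':
    http_port => $http_port,
  }

  # Only the public flavor opens the port in the firewall.
  firewall { "500 allow tcp port ${http_port}":
    proto  => 'tcp',
    port   => $http_port,
    action => 'accept',
  }
}
```

A role then picks exactly one of the two, which keeps each profile readable on its own.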

Hiera data is explicitly queried via profiles, never modules

Hiera data is then passed to modules via class parameters from the profile. This keeps all the information that defines a given service contained in the profile, reducing the number of places an administrator must look while troubleshooting.

/path/to/modules/profiles/manifests/webapp1.pp:

class profiles::webapp1 inherits profiles::base {

  $java_version = hiera('profiles::webapp1::java_version')
  $app_port     = hiera('profiles::webapp1::app_port')

  class { 'java':
    version => $java_version,
  }

  [ .. snip ..]
}

Module variables that can accept customization are defined as class parameters

class myfoo (
  $username  = $myfoo::params::username,
  $groupname = $myfoo::params::groupname
) inherits myfoo::params {
  file { '/path/to/file/myfoo.conf':
    ensure => 'present',
    owner  => $myfoo::username,
    group  => $myfoo::groupname,
    mode   => '0600',
  }
}

Class parameters inherit defaults from a local params.pp

/path/to/modules/myfoo/manifests/init.pp:

class myfoo (
  $username  = $myfoo::params::username,
  $groupname = $myfoo::params::groupname
) inherits myfoo::params {
[ .. snip .. ]

/path/to/modules/myfoo/manifests/params.pp:

class myfoo::params {
  case $::operatingsystem {
    'RedHat', 'CentOS': {
      $username  = 'nobody'
      $groupname = 'nogroup'
    }
    default: {
      $username  = 'myuser'
      $groupname = 'mygroup'
    }
  }
}

Catalogs should fail to compile when required values are unset

Catalogs that compile with bad defaults, or with empty values for data that is required to properly set up a service, leave the system in an unreliable & unpredictable state. If a configuration is set up in such a way that it will fail with missing data, work-arounds should not be employed to hide those issues. Instead the issues should be resolved or the logic should be refactored such that the particular scenario ceases to exist.
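One way to get this behavior inside a module is to leave the default off a required parameter & to reject obviously bad values with fail(). The sketch below assumes a hypothetical 'myfoo::listener' class, parameter & file path; none of them come from a real module.

```puppet
# Hypothetical class: 'listen_address' has no default on purpose, so
# compilation stops when no value is supplied for it.
class myfoo::listener (
  $listen_address
) {
  # Reject an explicitly empty value instead of writing a broken config.
  if $myfoo::listener::listen_address == '' {
    fail('myfoo::listener: listen_address must not be empty')
  }

  file { '/path/to/file/myfoo-listener.conf':
    ensure  => 'present',
    content => "listen = ${myfoo::listener::listen_address}\n",
  }
}
```

The failure shows up at compile time, before anything on the node is touched.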

Hiera lookup functions shouldn't have defaults

DON'T DO THIS:

hiera('foo', '')

DO THIS (note lack of 2nd parameter which provides a default):

hiera('foo')

Do not hack in defaults that allow a run to complete only to be changed on subsequent runs

See: idempotence. This follows closely from the rule about having catalogs break if they can't compile properly. An example of this scenario is defaulting a Hiera lookup to an empty string if no key is found in the Hiera data. If a Hiera lookup is expected to work & it doesn't, this should be cause for alarm & lead to troubleshooting. Including a bogus default simply masks the issue & results in unexpected problems down the line.

Don't use %{calling_class} or %{calling_module} in Hiera hierarchies

See Hiera pseudo variables docs

Using these two built-in variables within Hiera can be very tempting & indeed, they're clever, but they can end up causing more trouble than they're worth down the line. If the logic of a module is changed such that the name of the class that originally depended on the Hiera data changes, the auto-loaded Hiera data will no longer match. The same goes for modules whose name is changed at some point. This can lead to very confusing results. That being said... see the rule above about having profiles explicitly load Hiera data & pass that data to module class parameters rather than using auto-loading.

Users, directories, packages defined as virtual resources go into a common 'virtuals' module

If any module conflicts with a virtual resource, the virtual resource should be removed or refactored in favor of explicitly defined resources. For instance, if a package is defined as a virtual resource but a Forge module is installed that explicitly installs the same package, there will be a conflict & the virtual resource should be eliminated.

In most circumstances virtual resources should be avoided anyhow as each module should be responsible for managing a very defined service. Only that module should need to define the resources required to configure & manage the service.
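For the cases where virtual resources are kept, a minimal sketch of the common 'virtuals' module might look like the following. The class name 'virtuals::users', the file path & the 'deploy' user with its attributes are hypothetical.

```puppet
# /path/to/modules/virtuals/manifests/users.pp (hypothetical)
class virtuals::users {
  # The leading '@' makes the resource virtual: it is declared here but
  # is not added to a catalog until something realizes it.
  @user { 'deploy':
    ensure => 'present',
    uid    => '1500',
    shell  => '/bin/bash',
  }
}
```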

Don't use concatenation to build rule-based configurations (firewall, sysctl, etc.)

Use types, providers & definitions. See the 'firewall' type in the Puppet Firewall module as a good example, or the 'apache::vhost' defined type in the Puppet Apache module.
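To illustrate the idea, each rule can be its own resource, wrapped in a small defined type, instead of a text fragment glued into one file. This is only a sketch: the 'profiles::allow_tcp' define & its parameters are hypothetical, & the firewall attributes simply mirror the ones used in the profile example earlier.

```puppet
# /path/to/modules/profiles/manifests/allow_tcp.pp (hypothetical)
define profiles::allow_tcp (
  $port,
  $order = '500'
) {
  # One firewall resource per rule; the numeric prefix controls ordering.
  firewall { "${order} allow tcp port ${port} (${title})":
    proto  => 'tcp',
    port   => $port,
    action => 'accept',
  }
}

# Declared from inside a profile class, one declaration per rule:
profiles::allow_tcp { 'webapp1':
  port => '8080',
}
```

Because every rule is a separate resource, two profiles can each add rules without knowing about one another.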

Virtual resources should only be realized in profiles - never in modules

This is another rule that aims to make configurations explicit & easy to diagnose. Configurations should be defined in consistent & expected locations rather than scattered throughout the entire Puppet configuration. Realizing virtual resources only in profiles eliminates the temptation to make a module depend on a virtual resource that may not exist in other environments or installations.
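A profile that realizes a virtual resource could be sketched roughly as follows. The 'virtuals::users' class & the 'deploy' user are hypothetical names, & the rest of the profile is left out.

```puppet
# Excerpt of a profile; 'virtuals::users' is assumed to declare @user { 'deploy': }.
class profiles::webapp1 inherits profiles::base {
  include virtuals::users

  # The only place the virtual user is pulled into the catalog.
  realize(User['deploy'])
}
```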

Where appropriate, modules should follow the install/config/service class separation

This again follows the pattern of consistency & clarity. Split the management of the service into specific classes such that it's easy to identify where there are issues. This also keeps the logic organized into smaller chunks that are easier to manage & understand when working on them.
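A stripped-down sketch of that separation is shown below, with class parameters left out to keep it short. The package & service names are assumed to match the hypothetical 'myfoo' module, & each class would live in its own file under the module's manifests directory.

```puppet
class myfoo {
  include myfoo::install
  include myfoo::config
  include myfoo::service

  # Install first, then configure; config changes restart the service.
  Class['myfoo::install'] -> Class['myfoo::config'] ~> Class['myfoo::service']
}

class myfoo::install {
  package { 'myfoo':
    ensure => 'installed',
  }
}

class myfoo::config {
  file { '/path/to/file/myfoo.conf':
    ensure => 'present',
    mode   => '0600',
  }
}

class myfoo::service {
  service { 'myfoo':
    ensure => 'running',
    enable => true,
  }
}
```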

Generally modules should have a single generic interface for each use

init.pp if a single use - client/server.pp if providing different functionality for client & server, etc.

Think of init.pp as the user interface for the module that provides the common access to all of its functionality. Restricting interaction to a single class within the init.pp makes it easier for the user to access the module & understand its functionality.

Module names should be prefixed with a site-specific ID to distinguish them from Forge modules

Use an underscore ('_') to separate the prefix from the module. This is applied to modules that reside exclusively within a site's environment. Any Forge modules that are depended on by other Forge modules can then be installed without concern over whether they will conflict with similarly named internally written modules.

Templates should only use locally defined variables

Define all used variables as class parameters & customize class parameter values in profiles when calling the class. Never call a variable from another module, as it makes a given module's logic depend on other modules which may or may not exist. Inter-module dependencies also make the configuration data very monolithic and difficult to re-organize in the future when systems & infrastructures change (trust me, they will & in unexpected ways).
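As a small sketch of the rule, the class below exposes the value the template needs as its own parameter. The 'mybar' module, its 'listen_port' parameter & the template name are hypothetical.

```puppet
# /path/to/modules/mybar/manifests/init.pp (hypothetical)
class mybar (
  $listen_port
) {
  file { '/path/to/file/mybar.conf':
    ensure  => 'present',
    # Renders /path/to/modules/mybar/templates/mybar.conf.erb
    content => template('mybar/mybar.conf.erb'),
  }
}

# The template then reads only this class's own variables, e.g.:
#   port = <%= @listen_port %>
# and never reaches into another module, e.g.:
#   port = <%= scope.lookupvar('otherfoo::port') %>
```

The profile that declares 'mybar' decides where the port value comes from, so the module itself stays free of outside lookups.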

All variables should be explicitly scoped

Use fully qualified variable names: module::class::variable. Non-qualified variables can easily be confused with variables from other portions of the catalog since Puppet will work its way up from the most local scope available to the most global scope looking for the requested variable. If the variable is not fully-qualified you may end up with very unpredictable results that are extremely difficult to find & debug.
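A short sketch of what fully qualified names look like in practice, using a hypothetical 'mybaz' class & parameter:

```puppet
class mybaz (
  $config_dir
) {
  # Class variables are referenced by their full name, even locally.
  $config_file = "${mybaz::config_dir}/mybaz.conf"

  file { $mybaz::config_file:
    ensure => 'present',
  }

  # Facts live at top scope, hence the leading '::'.
  notice("Managing ${mybaz::config_file} on ${::operatingsystem}")
}
```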

Use 'puppet module generate' to create new modules

The 'puppet module generate' command requires prefixing modules with an organization followed by a dash followed by the name of the module:

<organization>-<module_prefix>_<module_name>

All commits of puppet code must not have errors as reported by puppet-lint

This should be enforced in a pre-commit hook in SVN. Occasionally, there are times when you have to write code that puppet-lint thinks is bad, but is actually ok. A common example of this is a string that contains special characters. Typically, puppet-lint requires that a string containing a variable be enclosed in double quotes. However, if you need a string that contains a dollar sign that Puppet should not evaluate as a variable, you have to enclose it in single quotes. Ignoring a check should be done as sparingly as possible and only when there is a legitimate reason for it.

Example:

$dovecot_dest = '$recipient'

In the above example from the postfix module, we need to pass $recipient unmodified to the main.cf file so that it gets expanded by postfix as a variable (not by Puppet). Thus, we enclose it with single-quotes instead of double quotes. To tell puppet-lint to ignore this error, we can add a special control comment to the line:

$dovecot_dest = '$recipient' # lint:ignore:single_quote_string_with_variables

You can also surround an entire block of code:

# lint:ignore:single_quote_string_with_variables
<code block>
# lint:endignore

More details on puppet-lint are available here: http://puppet-lint.com/