Skip to content

Build an SNMP Profile

SNMP profiles are our way of providing out-of-the-box monitoring for certain makes and models of network devices.

This tutorial will walk you through the steps of building a basic SNMP profile that collects OID metrics from HP iLO4 devices.

Feel free to read the Introduction to SNMP if you need a refresher on SNMP concepts such as OIDs and MIBs.

Ready? Let's get started!

Research

The first step to building an SNMP profile is doing some basic research about the device, and which metrics we want to collect.

General device information

Generally, you'll want to search the web and find out about the following:

  • Device name, manufacturer, and device sysobjectid.

  • Understand what the device does, and what it is used for. (Which metrics are relevant varies between routers, switches, bridges, etc. See Networking hardware.)

    E.g. from the HP iLO Wikipedia page, we can see that iLO4 devices are used by system administrators for remote management of embedded servers.

  • Available versions of the device, and which ones we target.

    E.g. HP iLO devices exist in multiple versions (version 3, version 4...). Here, we are specifically targetting HP iLO4.

  • Supported MIBs and OIDs (often available in official documentation), and associated MIB files.

    E.g. we can see that HP provides a MIB package for iLO devices here.

Metrics selection

Now that we have gathered some basic information about the device and its SNMP interfaces, we should decide which metrics we want to collect. (Devices often expose thousands of metrics through SNMP. We certainly don't want to collect them all.)

Devices typically expose thousands of OIDs that can span dozens of MIB, so this can feel daunting at first. Remember, never give up!

Some guidelines to help you in this process:

  • 10-40 metrics is a good amount already.
  • Explore base profiles to see which ones could be applicable to the device.
  • Explore manufacturer-specific MIB files looking for metrics such as:
    • General health: status gauges...
    • Network traffic: bytes in/out, errors in/out, ...
    • CPU and memory usage.
    • Temperature: temperature sensors, thermal condition, ...
    • Power supply.
    • Storage.
    • Field-replaceable units (FRU).
    • ...

Implementation

It might be tempting to gather as many metrics as possible, and only then start building the profile and writing tests.

But we recommend you start small. This will allow you to quickly gain confidence on the various components of the SNMP development workflow:

  • Editing profile files.
  • Writing tests.
  • Building and using simulation data.

Add a profile file

Add a .yaml file for the profile with the sysobjectid and a metric (you'll be able to add more later).

For example:

sysobjectid: 1.3.6.1.4.1.232.9.4.10

metrics:
  - MIB: CPQHLTH-MIB
    symbol:
      OID: 1.3.6.1.4.1.232.6.2.8.1.0
      name: cpqHeSysUtilLifeTime

Tip

sysobjectid can also be a wildcard pattern to match a sub-tree of devices, eg 1.3.6.1.131.12.4.*.

Generate a profile file from a collection of MIBs

You can use ddev to create a profile from a list of mibs.

$  ddev meta snmp generate-profile-from-mibs --help

This script requires a list of ASN1 MIB files as input argument, and copies to the clipboard a list of metrics that can be used to create a profile.

Options

-f, --filters is an option to provide the path to a YAML file containing a collection of MIB names and their list of node names to be included.

For example:

RFC1213-MIB:
- system
- interfaces
- ip
CISCO-SYSLOG-MIB: []
SNMP-FRAMEWORK-MIB:
- snmpEngine

Will include system, interfaces and ip nodes from RFC1213-MIB, no node fro, CISCO-SYSLOG-MIB and node snmpEngine from SNMP-FRAMEWORK-MIB.

Note that each MIB:node_name correspond to exactly one and only one OID. However, some MIBs report legacy nodes that are overwritten.

To resolve, edit the MIB by removing legacy values manually before loading them with this profile generator. If a MIB is fully supported, it can be omitted from the filter as MIBs not found in a filter will be fully loaded. If a MIB is not fully supported, it can be listed with an empty node list, as CISCO-SYSLOG-MIB in the example.

-a, --aliases is an option to provide the path to a YAML file containing a list of aliases to be used as metric tags for tables, in the following format:

aliases:
- from:
    MIB: ENTITY-MIB
    name: entPhysicalIndex
  to:
    MIB: ENTITY-MIB
    name: entPhysicalName

MIBs tables most of the time define one or more indexes, as columns within the same table, or columns from a different table and even a different MIB. The index value can be used to tag table's metrics. This is defined in the INDEX field in row nodes.

As an example, entPhysicalContainsTable in ENTITY-MIB is as follows:

entPhysicalContainsEntry OBJECT-TYPE
SYNTAX      EntPhysicalContainsEntry
MAX-ACCESS  not-accessible
STATUS      current
DESCRIPTION
        "A single container/'containee' relationship."
INDEX       { entPhysicalIndex, entPhysicalChildIndex }  <== this is the index definition
::= { entPhysicalContainsTable 1 }

or its JSON dump, where INDEX is replaced by indices:

"entPhysicalContainsEntry": {
    "name": "entPhysicalContainsEntry",
    "oid": "1.3.6.1.2.1.47.1.3.3.1",
    "nodetype": "row",
    "class": "objecttype",
    "maxaccess": "not-accessible",
    "indices": [
      {
        "module": "ENTITY-MIB",
        "object": "entPhysicalIndex",
        "implied": 0
      },
      {
        "module": "ENTITY-MIB",
        "object": "entPhysicalChildIndex",
        "implied": 0
      }
    ],
    "status": "current",
    "description": "A single container/'containee' relationship."
  },

Indexes can be replaced by another MIB symbol that is more human friendly. You might prefer to see the interface name versus its numerical table index. This can be achieved using metric_tag_aliases.

Add unit tests

Add a unit test in test_profiles.py to verify that the metric is successfully collected by the integration when the profile is enabled. (These unit tests are mostly used to prevent regressions and will help with maintenance.)

For example:

def test_hp_ilo4(aggregator):
    run_profile_check('hp_ilo4')

    common_tags = common.CHECK_TAGS + ['snmp_profile:hp-ilo4']

    aggregator.assert_metric('snmp.cpqHeSysUtilLifeTime', metric_type=aggregator.MONOTONIC_COUNT, tags=common_tags, count=1)
    aggregator.assert_all_metrics_covered()

We don't have simulation data yet, so the test should fail. Let's make sure it does:

$ ddev test -k test_hp_ilo4 snmp:py38
[...]
======================================= FAILURES ========================================
_____________________________________ test_hp_ilo4 ______________________________________
tests/test_profiles.py:1464: in test_hp_ilo4
    aggregator.assert_metric('snmp.cpqHeSysUtilLifeTime', metric_type=aggregator.GAUGE, tags=common.CHECK_TAGS, count=1)
../datadog_checks_base/datadog_checks/base/stubs/aggregator.py:253: in assert_metric
    self._assert(condition, msg=msg, expected_stub=expected_metric, submitted_elements=self._metrics)
../datadog_checks_base/datadog_checks/base/stubs/aggregator.py:295: in _assert
    assert condition, new_msg
E   AssertionError: Needed exactly 1 candidates for 'snmp.cpqHeSysUtilLifeTime', got 0
[...]

Good. Now, onto adding simulation data.

Add simulation data

Add a .snmprec file named after the community_string, which is the value we gave to run_profile_check():

$ touch snmp/tests/compose/data/hp_ilo4.snmprec

Add lines to the .snmprec file to specify the sysobjectid and the OID listed in the profile:

1.3.6.1.2.1.1.2.0|6|1.3.6.1.4.1.232.9.4.10
1.3.6.1.4.1.232.6.2.8.1.0|2|1051200

Run the test again, and make sure it passes this time:

$ ddev test -k test_hp_ilo4 snmp:py38
[...]

tests/test_profiles.py::test_hp_ilo4 PASSED                                                                                        [100%]

=================================================== 1 passed, 107 deselected in 9.87s ====================================================
________________________________________________________________ summary _________________________________________________________________
  py38: commands succeeded
  congratulations :)

Rinse and repeat

We have now covered the basic workflow — add metrics, expand tests, add simulation data. You can now go ahead and add more metrics to the profile!

Next steps

Congratulations! You should now be able to write a basic SNMP profile.

We kept this tutorial as simple as possible, but profiles offer many more options to collect metrics from SNMP devices.


Last update: November 30, 2020