Observium Mibs



This page is outdated as of Feb 26 2015, as state sensors have been rewritten. I keep an updated document on:
https://github.com/mgmoerman/docs/blob/master/observium-alert-checkers.md

Observium straight out of the SVN repository (if you bought the subscription) doesn’t come with any alert checkers, which is unfortunate, as you need to figure out how the alerting system works by trial and error. The goal of this blog post is to give some examples of generic alert checkers, and to provide some more explanation of Metrics & Attributes and some of the values that go with them. This document is of course not complete, and can always be improved. Please give me feedback to improve it.

Observium has a very powerful way of using entity types & check conditions to do alerting. But you do need to know how this is implemented.

There is some documentation on the Observium site itself, which is useful to read.

Creating an alert checker

Let’s go through the steps involved in actually creating an alert checker in Observium.

Entity type

First of all, when you create an alert, you’ll need to pick the entity type you are building the alert for. An entity type is nothing more than a “thing” for which you would like to see alerts.

These are the ones that are available as of 12/12/2014:

  • Device
  • Memory
  • Storage
  • Processor
  • BGP Peer
  • Netscaler vServer
  • Netscaler Service
  • Toner
  • Port
  • Sensor

They kind of speak for themselves: if you want alerts on things that go on with ports, pick Port; if you want something that has to do with a sensor, pick Sensor. Device is a very generic one, and will just give you status information: whether the device is up or down, its uptime, and the response time for ping/SNMP. The entity type Device has nothing to do with the ports or sensors on the device itself; for alerting on those, pick Port or Sensor.

Alert Checker details

Once you’ve picked the entity type, there are a couple more things that need to be filled in, but these are simple: pick a name for the alert, and pick a message you want to be included when an alert is sent out.

Use Alert Delay to set the number of poller runs that a condition of your alert checker should persist before it actually starts alerting. This is useful when, for example, you’re creating a check for processor usage but you don’t want to be alerted on every CPU spike. If you set a delay of, say, 2, it’ll take 2 poller runs before it actually alerts (provided the condition you are checking for hasn’t changed, of course).
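To make the delay idea a bit more concrete, here is a minimal Python sketch. It is not Observium code; the function name `runs_until_alert` is made up for illustration, and it assumes the alert fires as soon as the condition has held for `delay` consecutive poller runs (Observium's exact counting may differ by one run).

```python
def runs_until_alert(condition_results, delay):
    """Return the 1-based poller run on which the alert would fire, or None.

    condition_results: one boolean per poller run, True when the check
    condition is met (i.e. something is wrong) on that run.
    delay: assumed number of consecutive failing runs needed before alerting.
    """
    consecutive = 0
    for run, failing in enumerate(condition_results, start=1):
        # A single good run resets the counter, so short spikes never alert.
        consecutive = consecutive + 1 if failing else 0
        if consecutive >= max(delay, 1):
            return run
    return None


# A CPU spike on run 1 recovers on run 2, so nothing fires until the
# condition has persisted for two runs in a row (runs 3 and 4).
print(runs_until_alert([True, False, True, True], delay=2))
```

With a delay of 1 the same input would alert on the very first run, which is exactly the noise the delay is meant to filter out.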

The Send Recovery button is self-explanatory, and Severity is currently not in use.

Checker Conditions

Then we come to the Checker Conditions. This is where you actually implement the check for a specific entity.

It’s important to know what Metrics & Attributes are. See the overview below for a complete list of Metrics & Attributes.

When filling in the fields for Checker Conditions, you use the Metrics mentioned on this page.

These need to be single-line entries. You can put in as many as you want, but you usually have one to check a single condition, or two to check, for example, an upper and a lower limit. Use the boolean to choose whether ANY or ALL of these conditions need to match.

A single line consists of three values:

  • the actual metric
  • a “test” (le, ge, lt, gt, ne, match and notmatch)
  • a value
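The way such condition lines combine can be sketched in a few lines of Python. This is only an illustration of the "metric test value" format and the ANY/ALL boolean, not how Observium implements it internally. The function names are hypothetical, the test names `equals`, `notequals`, `greater` and `less` are taken from the example alerts in this post, and `match`/`notmatch` are left out because their exact matching rules are not described here.

```python
import operator

# Supported tests in this sketch (match/notmatch deliberately omitted).
TESTS = {
    "gt": operator.gt, "greater": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt, "less": operator.lt,
    "le": operator.le,
    "ne": operator.ne, "notequals": operator.ne,
    "equals": operator.eq,
}


def _coerce(value):
    """Compare numerically when possible, otherwise as strings."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return str(value)


def condition_holds(line, metrics):
    """Evaluate one single-line condition such as 'storage_perc ge 85'."""
    metric, test, value = line.split(None, 2)
    return TESTS[test](_coerce(metrics[metric]), _coerce(value))


def checker_matches(lines, metrics, boolean="ANY"):
    """Combine all condition lines using the ANY or ALL boolean."""
    results = [condition_holds(line, metrics) for line in lines]
    return any(results) if boolean == "ANY" else all(results)


port = {"ifAdminStatus": "up", "ifOperStatus": "down"}
print(checker_matches(
    ["ifAdminStatus equals up", "ifOperStatus notequals up"], port, "ALL"))
```

The example at the bottom mirrors the "port is enabled, but operationally down" check: both lines have to be true at the same time, so the boolean is ALL rather than ANY.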

Associations

In these input fields you’ll create the first association rule; in other words, which subset of the entity type you selected needs alerting based on the conditions specified in the previous pane. When initially creating an alert checker, it allows for only 1 association rule. Once it’s added, you can add more association rules to it later on.

These association rules are made from a “device association” and an “entity association”. In the first input field you’ll do your device matching, based on the attributes for devices. In the second input field you’ll do your entity matching, using the attributes for the entity type you want to associate it with (this can of course be different from the condition you’re checking for).

This works in sort of the same way as the Checker Conditions. It uses the same line method (metric,test,value), however with some exceptions:

  • instead of using metrics, you’ll be using attributes
  • you can’t use a device attribute twice in the same association rule, so for example multiple “hostname match bla” statements within the same association rule won’t work
  • for a single device association line, you can have multiple entity association lines

That last exception allows for more specific filtering. For example, you might want to match all sensors whose class (sensor_class) is “state”, but when that nets you too many results, you can add a match on the description (sensor_descr). Or you might want to match all ports of type (ifType) ethernetCsmacd, but only the ones with a specific description (ifAlias).
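As a rough mental model of how an association rule narrows things down, consider the Python sketch below. It rests on several assumptions that are mine, not Observium's: every line within one rule has to match, an entity is associated if at least one rule matches, a lone `*` matches everything, and `match` is treated as a case-insensitive shell-style wildcard. All function names are hypothetical.

```python
from fnmatch import fnmatch


def attr_matches(line, attrs):
    """Evaluate one 'attribute test value' association line."""
    if line.strip() == "*":
        return True  # assumption: a lone * matches everything
    attribute, test, value = line.split(None, 2)
    actual = str(attrs.get(attribute, ""))
    if test == "equals":
        return actual == value
    if test == "match":
        # assumption: shell-style wildcard, compared case-insensitively
        return fnmatch(actual.lower(), value.lower())
    raise ValueError(f"test not covered by this sketch: {test}")


def is_associated(rules, device_attrs, entity_attrs):
    """rules: list of (device_lines, entity_lines) tuples."""
    for device_lines, entity_lines in rules:
        device_ok = all(attr_matches(l, device_attrs) for l in device_lines)
        entity_ok = all(attr_matches(l, entity_attrs) for l in entity_lines)
        if device_ok and entity_ok:
            return True
    return False


rules = [(["*"], ["sensor_class equals state", "sensor_descr match *psu*"])]
device = {"hostname": "router1", "os": "junos"}
print(is_associated(rules, device,
                    {"sensor_class": "state", "sensor_descr": "PSU 1 status"}))
print(is_associated(rules, device,
                    {"sensor_class": "state", "sensor_descr": "Fan 1 status"}))
```

The second entity association line is what cuts the set of state sensors down to just the power supplies, which is the kind of extra filtering described above.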

Example alerts

If you scrolled down here to just copy/paste some alert-checkers, perfectly fine, but don’t complain if they don’t work, PLEASE read how these work above.

The following is a set of very useful alert checkers. Where an alert has more than one check condition, the conditions are separated by a semicolon here; in Observium each one goes on its own line.

| Alert | Entity type | Check Conditions | Check Conditions boolean | Device match | Entity match |
|---|---|---|---|---|---|
| Device down | Device | device_status equals 0 | ANY | * | * |
| Processor usage is above 80% | Processor | processor_usage greater 80 | ALL | * | processor_descr match processor |
| Memory usage is above 70% | Memory | mempool_perc greater 70 | ALL | * | * |
| State sensor is in ALERT state! | Sensor | sensor_event equals alert | ANY | * | sensor_class equals state |
| Fanspeed is above or under threshold | Sensor | sensor_value greater @sensor_limit; sensor_value less @sensor_limit_low | ANY | * | sensor_class equals fanspeed |
| Temperature is higher than 50 degrees | Sensor | sensor_value gt 50 | ANY | * | sensor_class equals temperature |
| Traffic exceeds 85% | Port | ifInOctets_perc ge 85; ifOutOctets_perc ge 85 | ANY | * | ifType equals ethernetCsmacd |
| BGP Session down | BGP Peer | bgpPeerState notequals established | ANY | * | bgpPeerRemoteAs equals 41552 |
| Storage exceeds 85% of disk capacity | Storage | storage_perc ge 85 | ANY | * | storage_type equals hrStorageFixedDisk |
| Port has encountered errors or discards | Port | ifInErrors_rate gt 1; ifOutErrors_rate gt 1 | ANY | * | ifType equals ethernetCsmacd |
| Port is enabled, but operationally down | Port | ifAdminStatus equals up; ifOperStatus notequals up | ALL | * | ifType equals ethernetCsmacd |

Per entity overview of Attributes, Metrics and their values (if any)

Device

| Metrics | Values |
|---|---|
| device_status | 0 = down, 1 = up |
| device_status_type | reason for down, ‘snmp’/‘ping’ |
| device_ping | response in ms |
| device_snmp | response in ms |
| device_uptime | in seconds |
| device_duration_poll | in seconds |

| Attributes | Values |
|---|---|
| hostname | Self explanatory, this is the hostname for the device |
| os | cisco, asa, junos, linux, printer, generic, etc. For an up-to-date list see /opt/observium/includes/definitions/os.inc.php |
| type | network, server, workstation, storage, voip, firewall |
| sysName | Derived through SNMP |
| sysDescr | Derived through SNMP |
| sysContact | Derived through SNMP |
| hardware | Derived through SNMP |
| serial | Derived through SNMP |

Port

| Metrics | Values |
|---|---|
| ifInOctets_rate & ifOutOctets_rate | number |
| ifInOctets_perc & ifOutOctets_perc | 0-100 percentage |
| ifInUcastPkts_rate & ifOutUcastPkts_rate | number |
| ifInErrors_rate & ifOutErrors_rate | number |
| rx_ave_pktsize & tx_ave_pktsize | |
| ifOperStatus | up/down |
| ifAdminStatus | up/down |
| ifSpeed | interface speed derived through SNMP in mbit |
| ifMtu | number |
| ifDuplex | full/half |

| Attributes | Values |
|---|---|
| ifSpeed | interface speed in a mbit number |
| ifAlias | the interface description |
| ifDescr | Location of the interface (blade, slot, etc.) |
| ifName | |
| ifType | name of interface as described by IANA, see https://www.iana.org/assignments/ianaiftype-mib/ianaiftype-mib |
| ifPhyAddress | MAC address of the interface |
| port_descr_type | |
| port_descr_descr | |
| port_descr_speed | |
| port_descr_circuit | |
| port_descr_notes | |

Memory

| Metrics | Values |
|---|---|
| mempool_free | |
| mempool_perc | 0-100 percentage |
| mempool_used | |

| Attributes | Values |
|---|---|
| mempool_descr | |
| mempool_mib | |
| mempool_index | |

Processor

| Metrics | Values |
|---|---|
| processor_usage | 0-100 percentage |

| Attributes | Values |
|---|---|
| processor_descr | |
| processor_type | |
| processor_oid | |

Storage

| Metrics | Values |
|---|---|
| storage_free | |
| storage_perc | 0-100 percentage |
| storage_used | |

| Attributes | Values |
|---|---|
| storage_descr | |
| storage_type | |
| storage_mib | |
| storage_index | |

BGP Peer

| Metrics | Values |
|---|---|
| bgpPeerState | established |
| bgpPeerAdminStatus | |
| bgpPeerFsmEstablishedTime | |

| Attributes | Values |
|---|---|
| as_text | |
| bgpPeerRemoteAs | |
| bgpPeerRemoteAddr | |
| bgpPeerLocalAddr | |
| bgpPeerIdentifier | |

Sensor

| Metrics | Values |
|---|---|
| sensor_value | number |
| sensor_event | up, warning, alert, down |

| Attributes | Values |
|---|---|
| sensor_descr | |
| sensor_class | voltage, current, power, frequency, humidity, fanspeed, temperature, dbm, state |
| sensor_type | |
| sensor_index | |
| poller_type | possible types: snmp, agent, ipmi |

Toner

| Metrics | Values |
|---|---|
| toner_current | |

| Attributes | Values |
|---|---|
| toner_descr | |

Netscaler vServers

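| Metrics | Values |
|---|---|
| vsvr_state | |
| vsvr_bps_in | |
| vsvr_bps_out | |

| Attributes | Values |
|---|---|
| vsvr_name | this matches vsvr_fullname, except when it is longer than 32 characters, in which case it becomes a random string |
| vsvr_fullname | |
| vsvr_label | |
| vsvr_ip | |
| vsvr_ipv6 | |
| vsvr_port | |
| vsvr_type | |
| vsvr_entitytype | |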

Netscaler Services

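| Metrics | Values |
|---|---|
| svc_state | |
| svc_bps_in | |
| svc_bps_out | |

| Attributes | Values |
|---|---|
| svc_name | this matches svc_fullname, except when it is longer than 32 characters, in which case it becomes a random string |
| svc_fullname | |
| svc_label | |
| svc_ip | |
| svc_port | |
| svc_type | |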

Observium Install Mibs

Observium is an amazing quasi-opensource solution used to monitor up/down and performance of your networks. It allows you to monitor things such as interface usage, CPU, memory, disk, temperature, BGP, SLA etc.

To upgrade your existing Observium installation, follow the steps below.

Connect to your Observium server using either ssh or your hypervisor’s ‘console’ feature. I recommend ssh as it will be easier to copy/paste.

First, you will need to move to the directory in which your Observium is installed.

cd /opt

Now you will rename the observium directory to observium-old. (You can choose any name you wish.)

mv observium observium-old

Next you will need to download the latest version. The great thing about Observium is that the link below always points to the ‘latest’ Observium version. There is no need to figure out the actual version number and add it to the link.

wget -O observium-community-latest.tar.gz http://www.observium.org/observium-community-latest.tar.gz

NOTE: If you do not have ‘wget’ installed on your server, you can easily install it by entering

yum install wget

Next you will unpack the file you downloaded using wget in the previous step.

tar zxvf observium-community-latest.tar.gz

This will untar the file you downloaded into a directory named ‘observium’ in the /opt parent directory.

Now that the file has been extracted, you will need to restore the RRD, log and config.php files from your original installation.

RRD Files

mv /opt/observium-old/rrd observium/

Log Files

mv /opt/observium-old/*log* observium/

PHP Config file

mv /opt/observium-old/config.php observium/

Now that your files have been restored, you will need to update the database schema.

/opt/observium/discovery.php -u

Observium recommends that if you have not updated in the last 12 months, you should force a rediscovery of all devices. To do this, from the command line enter:

/opt/observium/discovery.php -h all

Once this is complete, you can delete the temporary backup you created earlier (the observium-old directory in /opt).

rm -rf observium-old




