Yeti monitoring with RIPE Atlas

1. Purpose of Yeti Monitoring System

Currently the Yeti testbed lacks sufficient information to reflect the status
of Yeti operation in terms of availability, consistency and performance. It
also lacks tools to measure the results of experiments such as introducing more
root servers, KSK rollover, Multi-ZSKs, etc. Thus, there is a need to monitor
the functionality, availability and consistency of the Yeti Distribution Master
as well as the Yeti root servers.

The basic idea is to set up regular monitoring tasks in Atlas that query the
SOA record of the fifteen root servers over both UDP and TCP to check the
consistency of the SOA record. A Nagios plugin periodically fetches the results
of the Atlas monitoring tasks and parses them to trigger alerts. An alert email
is sent when there is an exception.
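
The consistency check described above can be sketched roughly as follows. This is only an illustration, not the actual check_yeti code: it assumes the SOA serial has already been extracted from each server's response, it takes the most common serial as the reference value, and the function name `find_inconsistent` and the sample server names and serials are hypothetical.

```python
from collections import Counter


def find_inconsistent(serials):
    """Return the servers whose SOA serial differs from the most common one.

    `serials` maps a server name or address to the SOA serial it returned
    (hypothetical input; decoding the Atlas results is not shown here).
    """
    if not serials:
        return []
    # Assumption: the serial reported by most servers is the reference value.
    reference, _ = Counter(serials.values()).most_common(1)[0]
    return sorted(name for name, serial in serials.items() if serial != reference)


if __name__ == "__main__":
    # Hypothetical sample data: one server lags behind the others.
    observed = {
        "server-a": 2016010100,
        "server-b": 2016010100,
        "server-c": 2015123100,
    }
    print(find_inconsistent(observed))
```

A non-empty list would be the "exception" case that leads to an alert.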

2. check_yeti, the Nagios plugin

2.1 Design of the plugin

check_yeti gets the test results through the Atlas API and analyses them,
then outputs the status, which is displayed in the Nagios web interface.

Sample code for getting the results from Atlas:

# Query the results of the last hour (3600 s) before stop_time.
start_time=str(int(stop_time) - 3600)
url=base_url + targetid + "/results" + "?start=" + start_time + "&" + \
                                       "stop=" + stop_time + "&format=json"
# Save the JSON results to outfile.
urllib.urlretrieve(url, outfile)
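
For readers on Python 3, the same request could be sketched as below. The value of `base_url` is not shown in the sample above, so the endpoint used here (the public RIPE Atlas v2 API) is an assumption and may differ from what check_yeti used; the function names `build_results_url` and `fetch_results` are hypothetical.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlretrieve

# Assumption: the original base_url is not shown; this is the public
# RIPE Atlas v2 endpoint and may not match the one used by check_yeti.
BASE_URL = "https://atlas.ripe.net/api/v2/measurements/"


def build_results_url(measurement_id, stop_time=None, window=3600):
    """Build the results URL for the `window` seconds ending at stop_time."""
    stop = int(time.time()) if stop_time is None else int(stop_time)
    query = urlencode({"start": stop - window, "stop": stop, "format": "json"})
    return "{}{}/results/?{}".format(BASE_URL, measurement_id, query)


def fetch_results(measurement_id, outfile, stop_time=None):
    """Download the last hour of results into outfile (needs network access)."""
    urlretrieve(build_results_url(measurement_id, stop_time), outfile)
```

Keeping the URL construction separate from the download makes it possible to check the query window without network access.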

2.2 Checking Algorithm

  • Atlas:
    a. set up a regular monitoring task for each of the 15 root servers
    b. use 10 probes each time
  • Nagios:
    a. get the results for each root server over one hour
    b. analyse the results
    c. return a status code

    OK: more than six probes return an OK result
    WARNING: the number of OK results is below 4
    CRITICAL: no probe returns an OK result
    UNKNOWN: no data returned

PS: these are the Nagios status codes (plugin exit codes 0, 1, 2 and 3 respectively).
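
The status mapping above could be sketched as follows. This is not the check_yeti source. The text leaves counts between four and six unspecified; the sketch treats them as WARNING, which is an assumption. The names `classify` and `status_line` are hypothetical.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(ok_count, total):
    """Map the number of probes with an OK result to a Nagios status code."""
    if total == 0:
        return UNKNOWN   # no data returned
    if ok_count == 0:
        return CRITICAL  # no probe returned an OK result
    if ok_count > 6:
        return OK        # more than six probes returned an OK result
    # Assumption: the text only names WARNING for counts below 4; counts of
    # 4 to 6 are not specified and are treated as WARNING here as well.
    return WARNING


def status_line(ok_count, total):
    """Return the exit code and the one-line message a plugin would print."""
    status = classify(ok_count, total)
    return status, "{} - {}/{} probes OK".format(STATUS_NAMES[status], ok_count, total)
```

A plugin built on this would print the message and call `sys.exit` with the returned code, since Nagios reads the state from the exit status.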


3. Deployments

3.1 check_yeti

Make check_yeti executable and put it into Nagios’s plugin directory.

3.2 hosts.cfg

Define the monitored server

define host{
    use                     linux-server-yeti-rootserver
    address                 240c:f:1:22::6
}

Each server should be defined separately.

3.3 commands.cfg

Define the check_yeti check command

define command{
        command_name    check_yeti
        command_line    $USER1$/check_yeti $ARG1$
}

$USER1$: Nagios’s plugin directory
$ARG1$: plugin input parameter, the ID of the monitoring task
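
On the plugin side, the single argument passed by Nagios could be read as in the sketch below. The actual argument handling of check_yeti is not shown in this document, so the use of `argparse` and the argument name `measurement_id` are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the plugin's command line: one positional measurement ID."""
    parser = argparse.ArgumentParser(prog="check_yeti")
    # Hypothetical argument name; it receives the value Nagios passes as $ARG1$.
    parser.add_argument("measurement_id", help="ID of the Atlas monitoring task")
    return parser.parse_args(argv)
```

With the command definition above, an invocation with `1369633` as the argument would yield `measurement_id == "1369633"`.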

3.4 service.cfg

Define check_yeti monitoring service

define service{
    use                             generic-service
    service_description             check_yeti
    check_command                   check_yeti!1369633
}

generic-service: the Nagios monitoring template, which defines settings such as the check interval (2 hours), the alert interval and the alert level
1369633: ID of the monitoring task
Each server should be defined separately.

3.5 contacts.cfg

Define alert contacts

define contact{
    contact_name      yeti
    use               generic-contact
}

contact_name: contact name, used directly in the template
email: contact email addresses, separated by commas

3.6 Start nagios

service nagios restart

4. Display Atlas monitoring status on website

  1. set up dnsdomainmon tasks in Atlas and get the zone ID
  2. use the Atlas API
  3. key parameter: zone: "3069263"
  4. sample code
	    <!DOCTYPE html>
	    <title>domainmon test</title>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src=""></script>
	    <script type="text/javascript" src="/variables.js?v=Archangel"></script>
	    <script src="/easteregg.js?v=Archangel"></script>
	    <script type="text/javascript" src="" ></script>
	    <div id="domainmon"></div>
	            var dnsmon;
	            $(function() {
	                var hasUdp = true;
	                var hasTcp = true;
	                function onDraw(params) {
	                    var tab;
	                    if (params.isTcp) {
	                        tab = $(".protocol-tabs a[data-protocol=tcp]");
	                    } else {
	                        tab = $(".protocol-tabs a[data-protocol=udp]");
	                    }
	                }
	                dnsmon = initDNSmon(
	                        "#domainmon", // assumed: selector of the container div above
	                        {
	                            dev: false,
	                            lang: "en",
	                            load: onDraw,
	                            change: onDraw
	                        }, {
	                            type: "zone-servers",
	                            zone: "3069263",
	                            isTcp: hasUdp ? false : true