Knowledge Base

Knowledge base articles

Welcome to the SysOrb knowledge base. Here you will find detailed explanations of questions regarding SysOrb. The list is under continuous development, and new articles will be added. You can follow the links below or enter a keyword to search all of the articles.

Explanation of the ScoreKeeper strategy

First of all, every check has a "state", which is green, yellow or red depending on the last measurement of e.g. the CPU utilization, compared against the "Warn when below", "Alert when ..." settings. For checkin, which determines whether a Node is late, the state is green immediately after the Node has completed a checkin and stays green until the period given in "Checkin in every" (plus a little slack) has expired, at which point the state turns red.

If you choose the Immediate strategy, then the color you will see in the SysOrb web interface will be the "state" of the Check.

If, on the other hand, you choose ScoreKeeper, then the "state" is used as the basis for updating a numeric score, and that score is in turn used to decide which color to show in the web interface.

The score moves in the range from zero to "Alert ceiling", with lower scores being better (green). At any point in time, if the score is above "Alert at", the icon in the web interface is red. If the score is not above "Alert at" but is above "Warn at", the icon is yellow. If the score is at or below "Warn at", the icon is green.
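The mapping from score to icon color can be sketched as below. This is only an illustration of the rule described above, not SysOrb code: the function name `icon_color` and its parameter names are made up for the example, and the default values are the node defaults quoted later in this article. Treating a score exactly equal to "Warn at" as green follows from "not above", but is an assumption.

```python
def icon_color(score, warn_at=200, alert_at=1000):
    """Return the icon color for a ScoreKeeper score (hypothetical helper)."""
    if score > alert_at:
        # Above "Alert at": red.
        return "red"
    if score > warn_at:
        # Not above "Alert at", but above "Warn at": yellow.
        return "yellow"
    # Assumption: a score exactly at "Warn at" still counts as green.
    return "green"


if __name__ == "__main__":
    for s in (0, 200, 201, 1000, 1001):
        print(s, icon_color(s))
```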

The score is conceptually adjusted every five seconds, depending on the state of the check. Typically a green state will decrement the score by 5, a yellow state will increment it by 6, and a red state will increment it by 20. (All of these adjustments are performed every 5 seconds, no matter how frequently measurements are taken.)

For checkin, the adjustments are known as "Checkin score" (green) and "Missed checkin score" (red). (Checkin never has a yellow state.)

For other checks, the adjustments are known as "Good score", "Warning score" and "Alert score".
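A single five-second adjustment could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual SysOrb implementation: the function `next_score` and its parameter names are hypothetical, and the defaults are the values used in the example later in this article. The article says a yellow state stops raising the score at "Warn ceiling"; what happens when a yellow state meets a score already above that ceiling is not described, so the sketch simply leaves such a score unchanged.

```python
def next_score(score, state, good=-5, warning=6, alert=20,
               warn_ceiling=300, alert_ceiling=1100):
    """Apply one five-second adjustment to a score (hypothetical helper)."""
    if state == "green":
        # The score never drops below zero.
        return max(0, score + good)
    if state == "yellow":
        # Assumption: a score already at or above the warn ceiling
        # is left as it is while the state is yellow.
        if score >= warn_ceiling:
            return score
        return min(warn_ceiling, score + warning)
    if state == "red":
        # A red state stops raising the score at the alert ceiling.
        return min(alert_ceiling, score + alert)
    raise ValueError("unknown state: %r" % (state,))


if __name__ == "__main__":
    score = 0
    for state in ("yellow", "yellow", "red", "green"):
        score = next_score(score, state)
        print(state, score)
```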

ScoreKeeper is useful for NetChecks. Ping times, for instance, may fluctuate, and you do not want an alert just because a few measurements showed a high round trip time.

This is an example of how it works:

The default settings on the node are listed below. You can see them if you click configure - edit node.

Warn At : 200
Warn ceiling : 300
Alert At : 1000
Alert ceiling : 1100

The ScoreKeeper parameters on a check are by default:

Good Score : -5
Warn Score : 6
Alert Score : 20

If we say that a check's score is 0 and the check goes into the Warning state, then 6 will be added to the score every 5 seconds.

This means it will take:

5 seconds * (200 / 6) ~= 167 seconds ~= 2.8 minutes before you receive a warning from the check.

The score then keeps rising for about another 1.4 minutes (5 seconds * (300 - 200) / 6 ~= 83 seconds), until it reaches the Warn ceiling of 300, after which it will not increase any more.

If the check then goes into the Alert state, the score will count up from 300, adding 20 every 5 seconds. This means that it will take:

5 seconds * (1000 - 300) / 20 = 175 seconds ~= 3 minutes before you receive an alert.

The score stops increasing when it reaches the Alert ceiling of 1100.

Now let us say that a check goes directly into the Alert state, starting from a score of 0. You will then receive an alert after approximately 4 minutes: 5 seconds * (1000 / 20) = 250 seconds ~= 4 minutes.
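The three calculations above all follow the same pattern, which the short sketch below reproduces. The helper `seconds_to_climb` is a made-up name for this example. Like the arithmetic in the article, it uses plain division and ignores the fact that the score only changes in whole five-second steps, so the results are approximations.

```python
TICK_SECONDS = 5  # the score is adjusted every five seconds


def seconds_to_climb(start, target, step, tick_seconds=TICK_SECONDS):
    """Approximate time for a score to rise from start to target (hypothetical helper)."""
    return tick_seconds * (target - start) / step


if __name__ == "__main__":
    cases = [
        ("0 -> Warn at, warning state", 0, 200, 6),
        ("Warn ceiling -> Alert at, alert state", 300, 1000, 20),
        ("0 -> Alert at, alert state", 0, 1000, 20),
    ]
    for label, start, target, step in cases:
        seconds = seconds_to_climb(start, target, step)
        print("%s: %.0f seconds, about %.1f minutes" % (label, seconds, seconds / 60))
```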

When the error is corrected, or the check is OK again, SysOrb starts counting the score down by 5 every 5 seconds. This means that after approx. 1½ minutes the check will go from alert to warning (SysOrb counts down from 1100 to 1000 to get out of the range above "Alert At"). It will then take about 13 minutes before the check goes out of warning (the score has to drop by a further 800 before it reaches "Warn At").
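The recovery can also be checked by stepping through it tick by tick, as in the sketch below. The function `recovery_times` and its parameters are hypothetical and exist only for this illustration. It assumes the check stays green for the whole recovery and that the icon changes as soon as the score is no longer above the relevant threshold.

```python
def recovery_times(start=1100, good_score=-5, warn_at=200, alert_at=1000,
                   tick_seconds=5):
    """Return (seconds until the icon leaves red, seconds until it is green).

    Hypothetical helper; assumes a constant green state during recovery.
    """
    if good_score >= 0:
        raise ValueError("good_score must be negative for the score to fall")
    score = start
    elapsed = 0
    left_alert = None
    while score > warn_at:
        # One five-second adjustment in the green state.
        score = max(0, score + good_score)
        elapsed += tick_seconds
        if left_alert is None and score <= alert_at:
            # The score is no longer above "Alert at": the icon turns yellow.
            left_alert = elapsed
    return left_alert, elapsed


if __name__ == "__main__":
    left_alert, total = recovery_times()
    print("alert -> warning after %d seconds" % left_alert)
    print("warning -> ok after a further %d seconds" % (total - left_alert))
```

With the defaults this gives 100 seconds to leave alert and a further 800 seconds to leave warning, in line with the approximate 1½ and 13 minutes given above.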

We recommend that you leave the score and warning / alert / ceiling values to their defaults, and only change them if you both understand what the change will mean, and actually have a need to change them.