Knowledge Base
Knowledge base articles
Welcome to the SysOrb knowledge base. Here you will find detailed explanations of questions regarding SysOrb. The list is under continuous development, and new articles will be added. You can follow the links below or enter a keyword to search all of the articles.
- Agent crashes
- Automatic rescan of all nodes in a domain
- Bandwidth consumption estimate
- Basic database tuning
- Can I customize the reports in SysOrb?
- Disabling specific checks
- Does SysOrb support SNMP traps?
- Explanation of the score keeper strategy.
- How does HTTP Netcheck work in SysOrb?
- How to automatically update alert group on all nodes to “As domain”
- How to backup SysOrb database
- How to change time interval value of agent_checkin_delay
- How to move the SysOrb server to a new server
- How to quickly set downtime on an agent
- I cannot get all of the "performance counters/cache" entries to appear in SysOrb what should I do?
- Migrating MIB information from one SysOrb installation to another
- Migrating the configuration of a Windows SysOrb server from 32 to 64 bits
- No empty blocks in meta database.
- Script to get a list of nodes which have no AlertGroup
- SysOrb agent is not checking in to the Sysorb server
- SysOrb agent stops checking in from a windows server with very long system uptime
- SysOrb server shuts down unexpectedly
- System uptime no longer updates on windows
- Unable to monitor hardware (fans, disks, temperatures etc.) on Windows
- Upgrading SysOrb on Windows
- Uploading a SysOrb database to Evalesco
- What does KiB and MiB mean?
- What is IPMI ?
- Windows agent late for check-in every hour
Explanation of the score keeper strategy.
First of all, every check has a "state", which is green, yellow or red depending on the last measurement of, e.g., the CPU time utilization (according to the "Warn when below", "Alert when ..." settings). For the check that determines whether a Node is late for checkin, the state is green immediately after the Node has completed a checkin and stays green until the period given in "Checkin in every" (plus a little slack) has expired, at which point the state turns red.
If you choose the Immediate strategy, then the color you will see in the SysOrb web interface will be the "state" of the Check.
If, on the other hand, you choose ScoreKeeper, then the "state" is used as a basis for updating a numeric score, and that score is in turn used to decide the color shown in the web interface.
The score moves in the range from zero to "Alert ceiling", with lower scores meaning better (green). At any point in time, if the score is above "Alert at", the icon in the web interface will be red; if the score is not above "Alert at" but is above "Warn at", the icon will be yellow; and if the score is below "Warn at", the icon will be green.
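The mapping from score to icon color can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, the default thresholds are the node defaults quoted in the example later in this article, and treating a score exactly equal to "Warn at" as green is an assumption, since the text only covers scores above or below it.

```python
def icon_color(score, warn_at=200, alert_at=1000):
    """Return the icon color for a ScoreKeeper score (illustrative sketch)."""
    if score > alert_at:
        return "red"
    if score > warn_at:
        return "yellow"
    # Assumption: a score exactly equal to "Warn at" is still shown as green.
    return "green"


if __name__ == "__main__":
    for s in (0, 250, 1050):
        print(s, icon_color(s))
```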
The score is conceptually adjusted every five seconds, depending on the state of the check. Typically a green state will decrement the score by 5, a yellow state will increment it by 6, and a red state will increment it by 20. (All of these adjustments are performed every 5 seconds, no matter the frequency of the measurements.)
For checkin, the adjustments are known as "Checkin score" (green) and "Missed checkin score" (red). (Checkin never has a yellow state.)
For other checks, the adjustments are known as "Good score", "Warning score" and "Alert score".
ScoreKeeper is useful for NetChecks. Ping times, for instance, may fluctuate, and you do not want an alert just because a few measurements showed a high round trip time.
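A single five-second adjustment might look like the Python sketch below. It assumes the typical values quoted above (-5, +6, +20) and the default "Alert ceiling" of 1100 from the example later in this article. The names are hypothetical and this is not SysOrb's actual implementation; it only keeps the score within the range from zero to "Alert ceiling" and leaves out the "Warn ceiling" setting.

```python
# Typical per-tick adjustments for each state, as quoted in the text.
ADJUSTMENTS = {"green": -5, "yellow": 6, "red": 20}


def adjust_score(score, state, alert_ceiling=1100):
    """Apply one five-second adjustment and clamp to [0, alert_ceiling]."""
    new_score = score + ADJUSTMENTS[state]
    return max(0, min(new_score, alert_ceiling))


if __name__ == "__main__":
    score = 0
    for state in ("yellow", "yellow", "red", "green"):
        score = adjust_score(score, state)
        print(state, score)
```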
This is an example of how it works:
The default settings on the node, which you can see by clicking Configure and then Edit node, are:
Warn At : 200
Warn ceiling : 300
Alert At : 1000
Alert ceiling : 1100
By default, the score keeper parameters on a check are:
Good Score : -5
Warn Score : 6
Alert Score : 20
Suppose a check's score is 0 and the check goes into the Warning state. Then 6 will be added to the score every 5 seconds,
meaning it will take:
5 seconds * (200 / 6) ~= 167 seconds ~= 2.8 minutes before you receive a warning from the check.
Then, after about another 1.4 minutes (5 seconds * (300 - 200) / 6 ~= 83 seconds), the score will reach the Warn ceiling at 300 and will not increase any more.
If the check then goes into the Alert state, the score will count up from 300, adding 20 every 5 seconds. This means that it will take:
5 seconds * (1000 - 300) / 20 = 175 seconds ~= 3 minutes before you receive an alert.
The score will stop increasing when it reaches the Alert ceiling at 1100.
Now let us say that a check goes directly to the Alert state, starting from a score of 0. Then you will receive an alert after approximately 4 minutes: 5 seconds * (1000 / 20) = 250 seconds ~= 4 minutes.
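The count-up estimates above all use the same formula, which can be written as a small Python helper. The function name and parameters are hypothetical, and the result is the same continuous approximation used in the text: it ignores that the score only changes in whole five-second steps and must end up strictly above the threshold.

```python
def approx_seconds(start, target, step_per_tick, tick_seconds=5):
    """Approximate time for the score to move from start to target."""
    return tick_seconds * (target - start) / step_per_tick


if __name__ == "__main__":
    # Warning state from 0 up to "Warn At" (200), adding 6 per tick.
    print(approx_seconds(0, 200, 6) / 60)      # roughly 2.8 minutes
    # Alert state from the Warn ceiling (300) up to "Alert At" (1000).
    print(approx_seconds(300, 1000, 20) / 60)  # roughly 2.9 minutes
    # Alert state straight from 0 up to "Alert At" (1000).
    print(approx_seconds(0, 1000, 20) / 60)    # roughly 4.2 minutes
```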
When the error is corrected, or the check is OK again, SysOrb will start counting the score down by 5 every 5 seconds. This means that after a little over 1½ minutes (100 seconds) the check will go from alert to warning (SysOrb counts down from 1100 to 1000 to get below "Alert At"). It will then take about 13 minutes before the check goes out of warning (the score has to count down by 800 before it is below "Warn At").
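The whole example, including the recovery phase, can be replayed tick by tick with the Python sketch below. It uses the default values listed above; all names are hypothetical and it is not SysOrb's actual code. Two points are assumptions: a Warning state leaves a score that is already above the Warn ceiling unchanged, and a check leaves a level once its score is no longer above the corresponding threshold.

```python
TICK_SECONDS = 5
STEP = {"green": -5, "yellow": 6, "red": 20}


def step(score, state, warn_ceiling=300, alert_ceiling=1100):
    """Apply one five-second adjustment for the given state."""
    if state == "green":
        return max(0, score + STEP["green"])
    if state == "yellow":
        # Assumption: the Warning state never pushes the score past the
        # Warn ceiling and leaves an already higher score unchanged.
        return max(score, min(score + STEP["yellow"], warn_ceiling))
    return min(score + STEP["red"], alert_ceiling)


def seconds_until(score, state, done):
    """Tick in a fixed state until done(score); return (seconds, score)."""
    seconds = 0
    while not done(score):
        new_score = step(score, state)
        if new_score == score:
            raise ValueError("score is stuck; condition is never reached")
        score = new_score
        seconds += TICK_SECONDS
    return seconds, score


if __name__ == "__main__":
    # Recovery from the Alert ceiling: first out of alert, then out of warning.
    print(seconds_until(1100, "green", lambda s: s <= 1000))  # (100, 1000)
    print(seconds_until(1000, "green", lambda s: s <= 200))   # (800, 200)
    # Going straight into the Alert state from a score of 0.
    print(seconds_until(0, "red", lambda s: s > 1000))        # (255, 1020)
```

Because the score moves in whole steps and must end up strictly above "Alert At", the simulated time to an alert (255 seconds) is one tick longer than the 250-second estimate in the text.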
We recommend that you leave the score and warning / alert / ceiling values to their defaults, and only change them if you both understand what the change will mean, and actually have a need to change them.