Friday, June 12, 2015

One Tool to rule them all ...


Data Recording Module: five default data recorders, written in Perl 5, collect and report overall system utilization and queuing, per-CPU statistics, per-disk IO throughput and errors, per-NIC IO throughput, and the computer system inventory data. They are sysrec, cpurec, diskrec, nicrec and hdwrec.

But why not have a single process that is simple to start, stop and operate? Does it matter, anyway?


One tool cannot perform all jobs. We believe that.

We have assigned different tasks to different recorders, making it easy to update one data recorder without breaking the others. We also wanted to separate functionality based on the four main system resources: CPU, memory, disk and network.
In addition, we think it is very important to have a description of the hardware and software running on that computer system, a sort of inventory, so that everybody easily understands what we are looking at. With that in mind, we ended up using the following data recorders:
  • sysrec: overall system CPU, MEM, DISK, NIC utilization and saturation
  • cpurec: per-CPU statistics
  • diskrec: per-disk statistics, throughput in KBytes and IOPS
  • nicrec: per-NIC statistics, throughput in KBytes and packets, along with link saturation and errors
  • hdwrec: hardware and software inventory, such as the number of CPUs, total physical RAM installed and number of disks


Footprint

How about system utilization when running Kronometrix? This is the general footprint in terms of CPU, memory and disk used:
  • CPU: on an idle host, all data recorders use less than 0.5%. On a busy system, at 95-100% utilization, the data recorders use up to 3%
  • Memory: all default data recorders use up to 64MB RAM, including the transport utility. Windows data recorders use up to 128MB RAM
  • Disk: the default installation, without raw data, uses up to 75MB of disk space, and the data recorders are not disk IO intensive applications

Keep it simple

We can add or change a data recorder within minutes. Having the data recorders written in Perl 5 allows us to change or add new functions very easily. In addition, we can run recorders at different time resolutions if needed. And if we don't need certain functions, for example network traffic per NIC, we simply shut down nicrec without affecting the other recorders. So it's easy and simple.
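To make the idea of independent recorders with their own time resolutions concrete, here is a minimal sketch in Python. It is not how Kronometrix is implemented (the real recorders are Perl 5 programs). The recorder names come from this post, while the intervals, the "enabled" flag and the helper functions are hypothetical and only serve as illustration.

```python
# Hypothetical per-recorder settings. The names come from the post;
# the intervals (in seconds) and the "enabled" flag are assumptions.
RECORDERS = {
    "sysrec":  {"interval": 60,   "enabled": True},
    "cpurec":  {"interval": 60,   "enabled": True},
    "diskrec": {"interval": 60,   "enabled": True},
    "nicrec":  {"interval": 60,   "enabled": True},
    "hdwrec":  {"interval": 3600, "enabled": True},
}


def due_recorders(elapsed_seconds, recorders):
    """Return the sorted names of enabled recorders due at this second."""
    return sorted(
        name
        for name, cfg in recorders.items()
        if cfg["enabled"] and elapsed_seconds % cfg["interval"] == 0
    )


def disable(name, recorders):
    """Return a copy of the settings with one recorder switched off."""
    updated = {key: dict(value) for key, value in recorders.items()}
    updated[name]["enabled"] = False
    return updated


# Switching off nicrec leaves every other recorder's schedule untouched.
without_nicrec = disable("nicrec", RECORDERS)
print(due_recorders(60, RECORDERS))
print(due_recorders(60, without_nicrec))
```

Because each recorder carries its own settings, changing or removing one entry does not touch the others, which is the property described above.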

Raw Data

A single recorder means lots of metrics to report. Say we had agent_one, a main data recorder that looks at the overall system, the CPUs, the disks and so on. The payload would increase when running a single recorder. And because we want to store the data collected, we would still need to split and analyse the data for CPUs, disks, etc. separately.
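As a rough illustration of the extra work a single recorder would create, the Python sketch below splits one combined sample into per-resource records. The payload layout, the metric names and the split_payload function are all hypothetical. They do not describe the Kronometrix raw data format, only the kind of splitting step that separate recorders avoid.

```python
def split_payload(timestamp, payload):
    """Split one combined sample into per-device records, grouped by resource."""
    records = {}
    for resource, devices in payload.items():
        records[resource] = [
            {"timestamp": timestamp, "device": device, **metrics}
            for device, metrics in sorted(devices.items())
        ]
    return records


# Hypothetical combined sample, as a single "agent_one" might report it.
sample = {
    "cpu":  {"cpu0": {"util_pct": 12.5}, "cpu1": {"util_pct": 3.0}},
    "disk": {"sda": {"read_kb": 120, "write_kb": 48}},
}

# The timestamp is an arbitrary epoch value chosen for the example.
records = split_payload(1434067200, sample)
for resource, rows in records.items():
    print(resource, rows)
```

With one recorder per resource, each recorder already writes its own stream, so a step like this is not needed before storing or analysing the data.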

The Package

Once upon a time Kronometrix used no more than 1MB of disk space. And we were happy with that. But we soon discovered that people from the financial and banking sector were not happy about changing or installing new libraries on their systems to allow us to run Kronometrix. Worse, such sites usually have very strict requirements about which operating system packages are allowed to be installed.
So we needed to rethink and adopt another mechanism to deploy Kronometrix on such networks. We ended up shipping our own Perl distribution and OpenSSL with Kronometrix. This way we were able to keep running without requiring any extra dependencies from customers.

One Tool to rule them all

Our approach is simple, easy and offers flexibility on different networks. The Kronometrix data recording package is automated for the majority of operating systems out there: FreeBSD, RedHat, CentOS, ClusterLinux, OpenSUSE, Debian, Ubuntu, Solaris, Windows.


