Wednesday, June 10, 2015

Who are you? The story about DBus and cloning virtual machines

The Kronometrix Data Recording Module ships with a transport utility called sender, which delivers all raw data to one or more data analytics appliances over HTTP or HTTPS.

sender, a Perl5 citizen, checks for raw data updates and relies on a host UUID from the operating system to deliver that raw data. If it finds a host UUID, it uses it for the entire duration of its execution. If it finds none, it generates a new one and stores it in its configuration file, kronometrix.json. The analytics appliance relies on each data source being unique and valid, properly checked or generated by the transport utility, sender.
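The decision described above can be sketched in a few lines of Python. This is only an illustration, not sender's actual Perl5 code: the function name `resolve_dsid`, the `"dsid"` key inside kronometrix.json, and the reuse of a previously stored value are assumptions made for the example.

```python
import json
import uuid
from pathlib import Path
from typing import Optional


def resolve_dsid(os_uuid: Optional[str], config_path: Path) -> str:
    """Return the host identifier to use, storing a generated one if needed."""
    # Prefer the identifier reported by the operating system.
    if os_uuid:
        return os_uuid

    # Assumption: a value generated on an earlier run is reused.
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())
    if config.get("dsid"):
        return config["dsid"]

    # Nothing found: generate a random UUID and store it for later runs.
    config["dsid"] = str(uuid.uuid4())
    config_path.write_text(json.dumps(config, indent=2))
    return config["dsid"]
```

Called with an OS-provided value, the function returns it untouched and never writes the file; called with `None`, it creates the entry once and returns the same value on every later call.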

But what happens if this does not really work?


Data Source Id

Kronometrix uses the concept of a data source id (DSID) to identify a data source point from which one or many data messages are received. For example, for IT computer performance data, a DSID identifies a computer system host, physical or virtual, connected or not to a TCP/IP network.

The DSID is obtained from operating system core functions. For example:
  • Linux: we speak to DBus and try to get it from the machine-id file
  • FreeBSD: we ask the sysctl interface for kern.hostuuid
  • otherwise: we compute one using the UUID::Tiny Perl5 module
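The lookup order in the list above might look roughly like this in Python. It is a sketch under assumptions, not the Perl5 implementation: the two machine-id paths are common locations that vary by distribution, the function names are made up for the example, the file is read directly instead of going through a DBus API, and Python's `uuid` module stands in for UUID::Tiny. The sketch returns the raw value and does not try to reproduce how sender formats it.

```python
import platform
import subprocess
import uuid
from pathlib import Path
from typing import Optional, Sequence

# Assumed locations of the machine-id file; they vary by distribution.
MACHINE_ID_PATHS = (Path("/var/lib/dbus/machine-id"), Path("/etc/machine-id"))


def read_machine_id(paths: Sequence[Path] = MACHINE_ID_PATHS) -> Optional[str]:
    """Return the first non-empty machine-id found, or None."""
    for path in paths:
        try:
            value = path.read_text().strip()
        except OSError:
            continue  # missing or unreadable file: try the next candidate
        if value:
            return value
    return None


def read_freebsd_hostuuid() -> Optional[str]:
    """Ask sysctl for kern.hostuuid; return None if the call fails."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "kern.hostuuid"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def host_dsid() -> str:
    """Try the platform source first, then fall back to a random UUID."""
    system = platform.system()
    found = None
    if system == "Linux":
        found = read_machine_id()
    elif system == "FreeBSD":
        found = read_freebsd_hostuuid()
    return found or str(uuid.uuid4())
```

Note that nothing in this chain can tell whether the value it reads is actually unique to the host; it only checks that a value exists.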


Who are you?

Working closely with one of our customers, we saw that they were not receiving data from a number of virtual machines where we had previously installed Kronometrix. Looking into this, we discovered that sender was producing the same data source id on all of those virtual machines:

"dsid" : "96d5b4a4-d0fa-54a8-ba74-14cc978041f1"
So to the Kronometrix Analytics Appliance all these hosts looked like one and the same data source, since they shared a single DSID. Not good.


What's wrong?

It turned out to be simple: one VM had been used as the template for cloning the other VMs, and by mistake the machine-id file was cloned as well. DBus on CentOS 6.x did produce a sane and valid machine-id, but then the VM configuration was taken and cloned, machine-id included. As a result, at the operating system level all these hosts were identified by the same host UUID. No software reported or complained about this.
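A check along these lines would have flagged the cloned identifier. This is a minimal sketch, assuming you can collect (hostname, DSID) pairs from your machines by some means; the function name and the hostnames in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_shared_dsids(hosts: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each DSID reported by more than one host to those hostnames."""
    by_dsid = defaultdict(set)
    for hostname, dsid in hosts:
        # Normalise case and whitespace so the same id always compares equal.
        by_dsid[dsid.strip().lower()].add(hostname)
    return {d: sorted(names) for d, names in by_dsid.items() if len(names) > 1}
```

For instance, if the hypothetical hosts vm01 and vm02 both report the DSID shown earlier and vm03 reports a different one, the result has a single entry listing vm01 and vm02 under the shared DSID.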

Our system was able to discover the trouble immediately, and we proposed a fix to the Operation Center group. Later the cloning procedure was fixed to ensure the machine-id on Linux is no longer cloned, and data was finally flying to our appliance. Nice and easy.
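How the cloning procedure was changed is not covered here. One possible approach, shown purely as an assumption, is to give each fresh clone a new machine-id before any agent starts. The sketch below writes a random 128-bit value as 32 lowercase hex characters, which is the format of a D-Bus machine-id; the function name is hypothetical and the path is passed in by the caller, since its location depends on the distribution.

```python
import uuid
from pathlib import Path


def reset_machine_id(path: Path) -> str:
    """Overwrite the machine-id file at `path` with a fresh random value."""
    new_id = uuid.uuid4().hex  # 32 lowercase hex characters
    path.write_text(new_id + "\n")
    return new_id
```

On a real system this would have to run with sufficient privileges on the clone itself, not on the template.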

No more clones :)
