Week 2018-10 – Continuous Data Generation and Monitoring

After setting up a TIG stack to monitor the CM8 system and creating the first dashboards, it is now time for a better data basis. Currently the CM8 system is not actively used, and the transfer of the metrics to InfluxDB has to be triggered explicitly. This is good for the manual validation of the dashboards, but not very realistic. The task is therefore to put some load on the CM8 system and to capture the metrics automatically.

To generate some activity on the CM8 system, a custom Java application was implemented using the CM8 API. In the first iteration two operations are used: create item and retrieve object.

The create item operation creates a new item of a configurable type (currently noindex) without additional metadata. A block of 100 KB of random data is added to this item. This is repeated for a configurable number of iterations and can also run in multiple threads concurrently.
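The create loop can be sketched roughly as follows. This is an illustration in Python, not the Java application itself: create_item is a hypothetical stand-in for the CM8 API call, and the sketch assumes that every thread runs the full number of iterations, which the description above does not specify.

```python
import os
from concurrent.futures import ThreadPoolExecutor

PAYLOAD_SIZE = 100 * 1024  # 100 KB of random data per item


def create_item(item_type, payload):
    """Hypothetical stand-in for the CM8 API call; returns the payload size."""
    return len(payload)


def run_creates(item_type, iterations, create=create_item):
    """Create `iterations` items, each with a fresh block of random data."""
    created = 0
    for _ in range(iterations):
        create(item_type, os.urandom(PAYLOAD_SIZE))
        created += 1
    return created


def run_concurrent(item_type, iterations, threads, create=create_item):
    """Run the create loop in `threads` workers and return the total item count.

    Assumption: each worker performs `iterations` creates on its own.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(run_creates, item_type, iterations, create)
            for _ in range(threads)
        ]
        return sum(f.result() for f in futures)
```

Passing the create function as a parameter keeps the loop independent of the actual API, so the load pattern can be tried out without a running CM8 system.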

The retrieve operation first executes a query to get the IDs of the documents. In each iteration it picks a random document ID from the list, gets the item, and then retrieves its content. Thus an equal number of get item events and retrieve object events is generated. This might be changed in the future to get a higher ratio of get item to retrieve object events.
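A minimal sketch of this retrieve loop, again in Python rather than Java. The three callables query_ids, get_item and get_content are hypothetical placeholders for the corresponding CM8 API calls.

```python
import random


def run_retrieves(query_ids, get_item, get_content, iterations, rng=random):
    """Query the document IDs once, then fetch a random item and its content
    in every iteration. Returns the number of iterations performed."""
    ids = query_ids()
    if not ids:
        return 0  # nothing to retrieve yet
    for _ in range(iterations):
        doc_id = rng.choice(ids)   # pick a random document ID
        item = get_item(doc_id)    # one get item event
        get_content(item)          # one retrieve object event
    return iterations
```

Because each iteration calls get_item and get_content exactly once, the two event counts stay equal. Changing the ratio would mean calling one of them only in some iterations.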

A small shell script wraps the two operations. Its main purpose is to vary the load over the day. Based on the current hour, a range for the number of operations is defined and a random number within this range is chosen. The script is called every two minutes from cron.
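The hour-based selection could look roughly like this. The real wrapper is a shell script; this Python sketch only illustrates the logic, and the hour bands and operation counts in BANDS are made-up values, since the actual ranges are not given here.

```python
import random
from datetime import datetime

# Hypothetical (start_hour, end_hour, min_ops, max_ops) bands.
# The real ranges used by the script are not stated in this post.
BANDS = [
    (0, 6, 0, 5),
    (6, 9, 10, 50),
    (9, 17, 50, 200),
    (17, 22, 10, 50),
    (22, 24, 0, 5),
]


def ops_range(hour):
    """Return the (min, max) number of operations for the given hour."""
    for start, end, low, high in BANDS:
        if start <= hour < end:
            return low, high
    raise ValueError("hour must be between 0 and 23")


def pick_operations(hour=None, rng=random):
    """Choose a random number of operations within the range for this hour."""
    if hour is None:
        hour = datetime.now().hour
    low, high = ops_range(hour)
    return rng.randint(low, high)  # both bounds are inclusive
```

A crontab schedule of `*/2 * * * *` runs a command every two minutes, so each run would draw a new operation count for the current hour.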

To get the monitoring information from DB2 into InfluxDB we plan to use Telegraf. As a first step, however, a Python script is used instead. It uses the Python package ibm_db to execute queries against the library server database, and the package influxdb to send the data directly to InfluxDB. The script and the required packages and drivers are built into a Docker container that is again scheduled via cron.
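A minimal sketch of such a transfer script, under several assumptions: the connection string, the SQL query, the measurement name, the tag columns and the InfluxDB host and database are all placeholders supplied by the caller, and the split into three functions is only one possible layout, not the actual script.

```python
from datetime import datetime, timezone


def rows_to_points(measurement, rows, tag_keys=(), when=None):
    """Convert database rows (dicts) into point dicts for InfluxDB.

    Columns listed in `tag_keys` become tags, all other non-null columns
    become fields. Rows without any field are skipped.
    """
    when = when or datetime.now(timezone.utc)
    points = []
    for row in rows:
        tags = {k: str(row[k]) for k in tag_keys}
        fields = {k: v for k, v in row.items()
                  if k not in tag_keys and v is not None}
        if fields:
            points.append({
                "measurement": measurement,
                "tags": tags,
                "time": when.isoformat(),
                "fields": fields,
            })
    return points


def fetch_rows(dsn, sql):
    """Run `sql` against DB2 and return all rows as dicts."""
    import ibm_db  # imported lazily; needs the DB2 driver to be installed
    conn = ibm_db.connect(dsn, "", "")
    try:
        stmt = ibm_db.exec_immediate(conn, sql)
        rows = []
        row = ibm_db.fetch_assoc(stmt)  # returns False after the last row
        while row:
            rows.append(row)
            row = ibm_db.fetch_assoc(stmt)
        return rows
    finally:
        ibm_db.close(conn)


def transfer(dsn, sql, measurement, influx_host, influx_db, tag_keys=()):
    """Query DB2 once and write the result to InfluxDB."""
    from influxdb import InfluxDBClient  # imported lazily, see above
    client = InfluxDBClient(host=influx_host, port=8086, database=influx_db)
    client.write_points(rows_to_points(measurement, fetch_rows(dsn, sql), tag_keys))
```

Keeping the row conversion separate from the database and InfluxDB calls makes it possible to check the generated points without access to either system.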

The CM8 system now starts to fill with content and the activity can be seen on the dashboard.