Thursday, March 19, 2015

System data collection: overhead of long-lived and at-interval collectors

In systems monitoring, data points are harvested with the help of collection agents, or collectors. Collectors associate metric values with a point in time and periodically report the result to a time series database.

For my pet projects, I use OpenTSDB's tcollector, which fundamentally supports two modes of running a collection agent:
  • at-interval - tcollector repeatedly spawns the measuring program in a subprocess at a constant interval; the program reports its measurements and exits
  • long-lived - tcollector spawns a single subprocess for the data collection agent. The agent outputs datapoints at a constant interval in a continuous loop.

Long-lived collectors are always preferred because the fork-exec cycle, which an at-interval collector repeats on every run, is an expensive operation, so at-interval collectors will always incur more overhead.
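To get a rough feel for that cost, here is a minimal sketch that times spawning a trivial child process against doing a trivial piece of work in-process. It is not part of tcollector; the function names time_spawns and time_calls are made up for illustration, and the child is simply a Python interpreter running "pass", so the number includes interpreter startup, which a Python at-interval collector would also pay on every run.

```python
import subprocess
import sys
import time


def time_spawns(count=5):
    """Average seconds to spawn a trivial child process and wait for it."""
    start = time.perf_counter()
    for _ in range(count):
        # fork-exec a fresh interpreter that does nothing and exits
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / count


def time_calls(count=5):
    """Average seconds for a trivial in-process call (stand-in for a measurement)."""
    start = time.perf_counter()
    for _ in range(count):
        time.time()
    return (time.perf_counter() - start) / count
```

Printing time_spawns() next to time_calls() on your own machine shows the gap; the absolute numbers depend heavily on the hardware and the operating system.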

I wrote a simple collector reporting the CPU temperature in Celsius as read from the /usr/bin/acpi command. I ran it at first as an at-interval collector every minute and subsequently switched it to a functionally equivalent long-lived version. Here is the result:
As reported by the collector itself, the long-lived version keeps the executing CPU an entire three degrees Celsius cooler.
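For illustration, a collector along these lines could be sketched as follows. This is not my original code but a minimal reconstruction under a few assumptions: the metric name sys.cpu.temperature is made up, the output of "acpi -t" is assumed to look like "Thermal 0: ok, 46.0 degrees C", and datapoints are printed to stdout as "metric timestamp value" lines for tcollector to pick up.

```python
import re
import subprocess
import sys
import time

METRIC = "sys.cpu.temperature"  # hypothetical metric name


def parse_acpi_temperature(output):
    """Return the first temperature in degrees C found in the text, or None.

    Assumes lines shaped like "Thermal 0: ok, 46.0 degrees C".
    """
    match = re.search(r"(-?\d+(?:\.\d+)?) degrees C", output)
    return float(match.group(1)) if match else None


def format_datapoint(metric, timestamp, value):
    """Format one datapoint as a "metric timestamp value" line."""
    return "%s %d %.1f" % (metric, timestamp, value)


def report_once():
    """Take one measurement and print it; this alone is the at-interval body."""
    output = subprocess.check_output(["/usr/bin/acpi", "-t"], text=True)
    value = parse_acpi_temperature(output)
    if value is not None:
        print(format_datapoint(METRIC, int(time.time()), value))
        sys.stdout.flush()  # do not let buffering delay the datapoint


def run_long_lived(interval=60):
    """Long-lived variant: the same measurement in a continuous loop."""
    while True:
        report_once()
        time.sleep(interval)
```

The at-interval version would call report_once() and exit; the long-lived version would call run_long_lived() and never return. Note that in this sketch even the long-lived variant still spawns acpi once per interval; what it saves is the repeated start of the collector process itself.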
