[MONITORING] How to build your monitoring dashboards?

Morgan Freeman staring at a wall with many small pictures.

You build a great product. You offer it as a service. You define quality and performance Service Level Agreements (SLAs) for your clients. You deploy monitoring to track Service Level Indicators (SLIs) and make sure you meet those SLAs.

And now you want to create monitoring dashboards to visualize the metrics you collect and understand your product’s behavior. How should you do that? Which dashboards should you create? Which metrics should you add to each dashboard?

Strategy 1: Yeah, I do not know. We have metrics, we plot metrics

  • All metrics, one dashboard: One image is worth 1 word, so you add 1000 small charts to one dashboard. Then you wonder why metrics or trends get missed.

Strategy 2: Overview. Top-down. Left-right. Cohesive. Consistent.

  • Overview dashboard: Build a dashboard that gives a quick overview of the health of your system. Put one panel at the top that trumps everything else and shows the highest-level metric for system performance (or whatever you are tracking). A single glance at that panel should tell you whether the system is OK or not.
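To make the top panel idea concrete, here is a minimal sketch of how a single "is the system OK?" status could be derived from a handful of SLIs. Everything in it is an assumption for illustration: the `Sli` class, the `top_panel_status` function, the metric names, and the numbers are hypothetical and not tied to any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Sli:
    """One Service Level Indicator with its SLA threshold (hypothetical model)."""
    name: str
    value: float                    # latest measured value
    target: float                   # threshold promised in the SLA
    higher_is_better: bool = True   # False for metrics such as latency

    def is_healthy(self) -> bool:
        # Compare the measurement against the threshold in the right direction.
        if self.higher_is_better:
            return self.value >= self.target
        return self.value <= self.target


def top_panel_status(slis):
    """Collapse all SLIs into the one status shown in the top panel."""
    failing = [s.name for s in slis if not s.is_healthy()]
    return ("OK" if not failing else "DEGRADED", failing)


# Made-up sample values, for illustration only.
slis = [
    Sli("availability_pct", 99.95, 99.9),
    Sli("p99_latency_ms", 480.0, 300.0, higher_is_better=False),
]
status, failing = top_panel_status(slis)
print(status, failing)  # DEGRADED ['p99_latency_ms']
```

The top panel shows only `status`. The list of failing SLIs tells you which lower-level panel to look at next, which matches the top-down reading order of the dashboard.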

Wearing the Site Reliability Engineer and Software Development Engineer hats. Having fun with very large systems.