Milliseconds matter: How Stack Overflow uses Grafana to optimize its systems
Stack Overflow’s custom monitoring tool Opserver provided purpose-built dashboards, but lacked a way to easily create custom self-service dashboards. They needed a tool to allow their developers and SRE teams the freedom to quickly create custom-tailored dashboards to visualize data from OpenTSDB, Elasticsearch, and their alerting tools, all in the same experience. They wanted to empower the teams that knew the data the best.
Stack Overflow’s Ad-Server team was the first to discover Grafana. They were searching for a tool to create custom server latency dashboards. Displaying ads on a website is latency-sensitive; a millisecond delay can have a huge impact on revenue. Server latency also affects which ads are displayed to whom on the site. The quicker the ad is served, the more targeted it can be for the user. Grafana’s real-time dashboards were critical in discovering where the Ad-Server team could optimize to get the best server performance possible. Grafana quickly spread from the Ad-Server team to other teams at Stack Overflow, since it can visualize data from many different data sources, both open source and commercial. For Stack Overflow, this meant OpenTSDB data could be visualized alongside Elasticsearch data, which could in turn be viewed alongside their custom alerting data from Bosun.
The Bosun alerting system and a new Grafana plugin
Bosun is an open source alerting system that Stack Overflow created. It has an expressive domain-specific language for evaluating alerts and creating detailed notifications. It also tests alerts against history for a faster development experience. Bosun is robust, but comes with a complex user interface and an often steep learning curve. Its visualization options are also limited. In a recent GrafanaCon talk, Kyle Brandt, Stack Overflow’s Director of Site Reliability, described monitoring as “a medium for humans to communicate with other humans through machines.” The Bosun project reflects a deep understanding of the impact of alerting on culture, which Kyle is extremely sensitive to. An intuitive UI and consistent user experience are key to making complex systems easier to understand, something Grafana has always prioritized. So the team decided to build a plugin to bring the power of Bosun into Grafana’s user-friendly interface.
Grafana’s plugin architecture allowed the Stack Overflow team to create a data source plugin for Bosun to visualize Bosun alerting data directly in Grafana. Since Grafana is more user-friendly, people tend to pick it up more naturally. We have turned Grafana users into Bosun consumers, and soon hope to turn Bosun consumers into Bosun authors.
– Kyle Brandt, Director, Site Reliability, Stack Overflow
Stack Overflow is focused on the audience and placing yourself in their shoes. Think of what they know and what they might not know. Don’t tell them what they need to know; show them what they need to know on a Grafana dashboard.
– Kyle Brandt, Director, Site Reliability, Stack Overflow
Grafana allows teams across Stack Overflow to quickly and easily build custom self-service dashboards for what’s important to them, no matter where the data lives or which database it’s stored in. Because Grafana is open source and has a robust plugin architecture, the Bosun team was able to create a plugin that builds on Bosun’s powerful alerting system, and can now visualize its data in new ways. The new plugin lets users who are new to Bosun write queries and set alerts directly from Grafana’s UI, while keeping the flexibility to use Bosun’s native expression language. Given the popularity of the Bosun plugin internally, the team shared it with the entire Grafana community, and it has been installed thousands of times by users of both projects.