Effective Monitoring and Alerting: For Web Operations (Paperback)

Slawek Ligus

  • Publisher: O'Reilly
  • Publication date: 2013-01-08
  • List price: $720
  • Sale price: $576 (20% off)
  • Language: English
  • Pages: 164
  • Binding: Paperback
  • ISBN: 1449333524
  • ISBN-13: 9781449333522
  • Related categories: Big Data
  • Ships immediately (fewer than 3 in stock)

Product description

With this practical book, you’ll discover how to catch complications in your distributed system before they develop into costly problems. Based on his extensive experience in systems ops at large technology companies, author Slawek Ligus describes an effective data-driven approach for monitoring and alerting that enables you to maintain high availability and deliver a high quality of service.

Learn methods for measuring state changes and data flow in your system, and set up alerts to help you recover quickly from problems when they do arise. If you’re a system operator waging the daily battle to provide the best performance at the lowest cost, this book is for you.

  • Monitor every component of your application stack, from the network to user experience
  • Learn how to draw the right conclusions from the metrics you obtain
  • Develop a robust alerting system that can identify problematic anomalies—without raising false alarms
  • Address system failures by their impact on resource utilization and user experience
  • Plan an alerting configuration that scales with your expanding network
  • Learn how to choose appropriate maintenance times automatically
  • Develop a work environment that fosters flexibility and adaptability
