Study Notes - CloudWatch
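CloudWatch is AWS's fully managed monitoring service (Managed Service). Many AWS services use it by default, so understanding its basic concepts and applications is an important part of learning AWS.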
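Core Concepts of Log Processing
I used the Big Data processing pipeline (Pipeline) as a reference, shown in the figure below:
Source: AWS Summit 2016: Big Data Architectural Patterns and Best Practices, P9
Based on this concept, Log Process is divided into four stages: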
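Ingest
: Collect Log data from where it is produced. This stage sometimes includes ETL (Extract-Transform-Load) and DLP (Data Leak Prevention).
Store
: Write the Log to a storage backend. Common OSS options include Elasticsearch, InfluxDB, Prometheus …
Process / Analyze
: Turn the data into meaningful information, such as API Top 10, HTTP 5XX, Latency … ETL is sometimes placed here instead.
Visualize / Action
: Present the analyzed information visually, or turn it into an action, that is, the trigger point for automation.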
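To make the four stages concrete, here is a minimal sketch in plain Python. The log format, the function names, and the alert threshold are all assumptions made up for illustration; nothing here is a CloudWatch API.

```python
from collections import Counter

# Hypothetical access-log lines: "<method> <path> <status> <latency_ms>"
RAW_LINES = [
    "GET /api/users 200 35",
    "GET /api/orders 502 120",
    "POST /api/orders 500 310",
    "GET /api/users 200 28",
    "GET /api/users 200 40",
]


def ingest(lines):
    """Ingest: parse raw lines into structured records (a tiny ETL step)."""
    records = []
    for line in lines:
        method, path, status, latency = line.split()
        records.append({
            "method": method,
            "path": path,
            "status": int(status),
            "latency_ms": int(latency),
        })
    return records


def store(records, storage):
    """Store: append records to a storage backend (here just a list)."""
    storage.extend(records)
    return storage


def analyze(storage):
    """Process / Analyze: derive the HTTP 5XX count and the top 10 paths."""
    errors = sum(1 for r in storage if 500 <= r["status"] <= 599)
    top_paths = Counter(r["path"] for r in storage).most_common(10)
    return {"http_5xx": errors, "top_paths": top_paths}


def act(metrics, threshold=1):
    """Visualize / Action: turn a metric into a state once it crosses a threshold."""
    return "ALARM" if metrics["http_5xx"] > threshold else "OK"


metrics = analyze(store(ingest(RAW_LINES), []))
state = act(metrics)
```

In a real setup each function would be a separate service, but the hand-off between stages stays the same: raw lines in, structured records stored, metrics derived, and an action triggered.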
The figure below maps these stages to CloudWatch features:
Below are the common Solutions / Tools for each stage of the Data Process Pipeline:
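Ingest
- CloudWatch Agent: installed on EC2, or packaged as a Docker container in Sidecar form; it is responsible for collecting Log data.
- Kinesis Agent for Kinesis Stream, Kinesis Firehose, CloudWatch Logs
- awslogs: the older Log Agent, with poorer performance
- ETL Solutions: AWS Glue, Xplant
- OSS: Fluentd, Logstash, Beats
- Architecturally, the agent usually runs as a Sidecar attached to the main application.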
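As a rough illustration of what the CloudWatch Agent needs in order to collect a log file, the sketch below builds a minimal agent configuration as a Python dict and serializes it to JSON. The helper name `build_agent_config`, the file path, and the log group name are hypothetical; the nested `logs` / `logs_collected` / `files` / `collect_list` layout and the `{instance_id}` placeholder follow the agent's configuration file format as I understand it, so check the official reference before relying on it.

```python
import json


def build_agent_config(file_path, log_group, log_stream="{instance_id}"):
    """Return a minimal CloudWatch Agent config that tails one log file."""
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": file_path,
                            "log_group_name": log_group,
                            # The agent substitutes {instance_id} at runtime.
                            "log_stream_name": log_stream,
                        }
                    ]
                }
            }
        }
    }


# Hypothetical path and log group name, for illustration only.
config = build_agent_config("/var/log/myapp/app.log", "myapp/application")
config_json = json.dumps(config, indent=2)
```

The resulting JSON is what would be saved as the agent's configuration file, whether the agent runs directly on EC2 or in a Sidecar container.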
Store
: CloudWatch Logs
- CloudWatch Logs: the service that stores Log data; it is a Storage Service, abbreviated as CWL
- OSS: Elasticsearch, InfluxDB, Prometheus
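For a feel of how Log data lands in CWL, here is a small sketch that prepares a request for the `put_log_events` call of the boto3 `logs` client. The helper names and the group and stream names are made up for this example, and `send` assumes boto3 is installed and AWS credentials are configured.

```python
def build_put_log_events_request(log_group, log_stream, entries):
    """Build keyword arguments for the CloudWatch Logs PutLogEvents call.

    `entries` is an iterable of (timestamp_ms, message) tuples. The events
    are sorted by timestamp because a batch must be in chronological order.
    """
    events = [
        {"timestamp": ts, "message": msg}
        for ts, msg in sorted(entries, key=lambda e: e[0])
    ]
    return {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "logEvents": events,
    }


def send(request):
    """Send the batch to CWL (assumes boto3 and credentials are available)."""
    import boto3

    return boto3.client("logs").put_log_events(**request)


# Hypothetical names and timestamps (milliseconds since the epoch).
request = build_put_log_events_request(
    "myapp/application",
    "web-1",
    [(1700000002000, "second line"), (1700000001000, "first line")],
)
```

In practice an agent does this batching for you; the sketch only shows the shape of the data that ends up in a log stream.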
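Process / Analyze
- CloudWatch Filter: a feature for analyzing CWL data, but it only accepts simple Filter conditions; for anything complex, use CloudWatch Logs Insights.
- CloudWatch Logs Insights: a feature for analyzing data stored in CWL; it accepts SQL-like queries, with functionality similar to …
- OSS: Elasticsearch, InfluxDB, Prometheus, Hadoop/Spark Ecosystem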
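The difference between a simple Filter and a Logs Insights query is easier to see side by side. The sketch below builds the request for each using the boto3 `logs` client calls `put_metric_filter`, `start_query`, and `get_query_results`. The filter name, metric name, namespace, and log group are hypothetical, and `run_query` assumes boto3 is installed with working credentials.

```python
import time

# A Logs Insights query: the 20 most recent messages containing "ERROR".
INSIGHTS_QUERY = """
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
""".strip()


def build_metric_filter_request(log_group):
    """Simple Filter: count every log event that contains the term ERROR."""
    return {
        "logGroupName": log_group,
        "filterName": "error-count",       # hypothetical name
        "filterPattern": "ERROR",
        "metricTransformations": [
            {
                "metricName": "ErrorCount",   # hypothetical metric
                "metricNamespace": "MyApp",   # hypothetical namespace
                "metricValue": "1",           # add 1 per matching event
            }
        ],
    }


def build_start_query_request(log_group, minutes=60, now=None):
    """Logs Insights: query the last `minutes` of data in a log group."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - minutes * 60,  # epoch seconds
        "endTime": end,
        "queryString": INSIGHTS_QUERY,
    }


def run_query(request, poll_seconds=1.0):
    """Start the query and poll until it leaves the Scheduled/Running states."""
    import boto3

    logs = boto3.client("logs")
    query_id = logs.start_query(**request)["queryId"]
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            return response
        time.sleep(poll_seconds)
```

The Filter runs continuously and emits a Metric as new events arrive, while the Insights query is run on demand over a time range.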
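Visualize / Action
- CloudWatch Metric: a Metric produced through a Filter or Insights; it is Time-Series data.
- CloudWatch Dashboard: a collection of Metrics presented together, typically used as a visual monitoring board.
- CloudWatch Alarms: an event-driven Event Source, usually wired through SNS to Lambda, which performs the actual action.
- CloudWatch Events / Rules: a Cron-like service that runs Schedule Tasks based on conditions; it is also an Event Source.
- OSS: Kibana, Grafana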
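The Alarm to SNS to Lambda chain can be sketched as a small Lambda handler. It assumes the usual shape of an SNS-triggered Lambda event, where each record carries the alarm notification as a JSON string in `Sns.Message` with fields such as `AlarmName` and `NewStateValue`. The `remediate:` and `ignore:` strings are placeholders for whatever action you would really run.

```python
import json


def handler(event, context):
    """Lambda entry point for SNS notifications sent by CloudWatch Alarms."""
    actions = []
    for record in event["Records"]:
        # SNS delivers the alarm notification as a JSON-encoded string.
        alarm = json.loads(record["Sns"]["Message"])
        name = alarm["AlarmName"]
        state = alarm["NewStateValue"]  # "OK", "ALARM", or "INSUFFICIENT_DATA"
        if state == "ALARM":
            actions.append(f"remediate:{name}")  # placeholder for a real action
        else:
            actions.append(f"ignore:{name}")
    return actions


# A hand-written sample event for local testing (values are made up).
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps(
            {"AlarmName": "high-5xx", "NewStateValue": "ALARM"})}},
        {"Sns": {"Message": json.dumps(
            {"AlarmName": "high-latency", "NewStateValue": "OK"})}},
    ]
}
result = handler(sample_event, None)
```

Returning the list of decisions keeps the handler easy to test locally before wiring it to a real SNS topic.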
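Article Series
Below are the common application scenarios:
- System resource monitoring
- Dashboard: system message board (real time)
- Custom monitoring metrics (Metric)
- Real-time Log collection and storage
- Log batch backup (S3)
To address needs like these, I put together the following series of articles: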
- Study Notes - CloudWatch
- Study Notes - CloudWatch Core Functions
- Study Notes - CloudWatch Agent for Linux
- Study Notes - CloudWatch awslogs
- Study Notes - CloudWatch Metrics
- Study Notes - CloudWatch FAQ
- Solution - CloudWatch for Monitoring and Alarm Systems
- Solution - CloudWatch for Log Analysis
- Solution - CloudWatch for Performance Testing
- 2017/06/21: A Brief Talk on System Monitoring and CloudWatch Applications - AWS User Group Taiwan
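Afterword
Whenever I work on system monitoring, I keep thinking back to when I was studying digital music technology, especially the EQ / Filter tools commonly used in mixing … conceptually they are basically identical to a monitoring system's Alarm XD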
Further Reading
On This Site
- What is Monitoring?
- Monitoring vs Observability
- AWS Certified SysOps Administrator - Associate: Preparation Notes
- Resource Provisioning and DevOps
- Common Problems in Automated Software Testing
- A Brief Look at the Stages and Strategies of Software Testing
- AWS Study Roadmap
References
- Amazon CloudWatch » Developer Guide
- Build More Reliable and Secure Windows Services Using Amazon Kinesis Agent for Microsoft Windows
CloudWatch New Features
- 2019/06/20: AWS Lambda Console shows recent invocations using CloudWatch Logs Insights
- 2019/06/11: Amazon CloudWatch Launches Dynamic Labels on Dashboards
- 2019/05/23: CloudWatch Logs adds support for percentiles in metric filters
- 2019/05/21: Introducing Amazon CloudWatch Container Insights for Amazon EKS and Kubernetes - Now in Preview
- 2019/04/02: Amazon CloudWatch Launches Search Expressions
- 2018/11/28: Announcing Amazon CloudWatch Logs Insights – Fast, Interactive Log Analytics
- 2018/11/20: Amazon CloudWatch Introduces Automatic Dashboards to Monitor all AWS Resources
- 2018/11/20: Amazon CloudWatch Launches Ability to Add Alarms on Metric Math Expressions
- 2018/11/02: Amazon RDS Now Sends Events to Amazon CloudWatch Events
- 2018/10/22: Amazon CloudWatch Events Adds the Ability to Share Events Across All Accounts in an Organization
- 2018/10/04: Amazon CloudWatch Launches Client-side Metric Data Aggregations
- 2018/09/29: Amazon CloudWatch Agent adds Custom Metrics Support
- 2018/09/28: Changes to Tags on AWS Resources Now Generate Amazon CloudWatch Events
- 2018/09/24: Amazon CloudWatch adds Ability to Build Custom Dashboards Outside the AWS Console
- 2018/09/25: Amazon S3 Announces New Features for S3 Select
- 2018/06/28: Amazon CloudWatch Adds VPC Endpoint Support to AWS PrivateLink
Changelog
- 2018/12/25: Restructured the article and split it into six posts.
- 2018/12/16:
  - Adjusted the article ToC
  - Added CloudWatch Logs Insights, CloudWatch Agent, Metric Math
- 2017/03/02: Initial version