When running in AWS with CloudWatch integration enabled, several key performance indicators (KPIs) are available for monitoring and alerting.
The following metrics are provided:
- Available Memory: The percentage of the memory the proxy is allowed to allocate that is still available for use. This value can change dramatically over time, but if it averages below 20% over a five-minute interval, an alert should be generated;
- CPU Usage: The percentage of CPU time used by this proxy. Running multiple proxies may make this metric harder to gauge, but in a single-proxy environment, an alert should be raised if it stays above 70% for more than five minutes;
- DB Transaction Percent: The percentage of queries that are considered to be in transactions;
- Cache Hit Percentage: The percent of queries that are cache hits. This will vary for each customer, but can be used to set an alarm if caching drops below an expected value;
- DB Query Rate: The rate at which queries are made against the database, i.e. queries that are not served from the cache;
- DB Query Time: The average response time from the database. This can be used to generate alerts if above the expected SLA;
- Average Response Time: The average response time for all queries, cached or not. This can also be used to generate alerts if above an expected SLA;
- DB Read Percentage: The percent of queries that are reads, which is a prerequisite for caching or read/write split;
- SQL Exceptions: The number (per second) of SQL exceptions generated as a result of executing queries.
Each of these metrics is reported once per minute, as part of the per-minute logging that is also sent, along with other logs, to the CloudWatch log channel.
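
The memory and CPU thresholds suggested above can be expressed as CloudWatch alarm definitions. The following is a minimal sketch rather than a definitive configuration: the namespace, metric names, and alarm names are hypothetical placeholders (this section does not state the names the proxy actually publishes), so they should be replaced with the values shown in the CloudWatch console. The dictionary keys follow the parameter names of the CloudWatch `PutMetricAlarm` API.

```python
from typing import Any, Dict

# ASSUMPTION: placeholder names only. The real namespace and metric names
# published by the proxy are not given here; look them up in CloudWatch.
NAMESPACE = "ExampleProxy"
MEMORY_METRIC = "AvailableMemory"
CPU_METRIC = "CPUUsage"


def low_memory_alarm(namespace: str = NAMESPACE) -> Dict[str, Any]:
    """Alarm when available memory averages below 20% over one five-minute period."""
    return {
        "AlarmName": "proxy-available-memory-low",  # hypothetical name
        "Namespace": namespace,
        "MetricName": MEMORY_METRIC,
        "Statistic": "Average",
        "Period": 300,            # seconds: a single five-minute average
        "EvaluationPeriods": 1,
        "Threshold": 20.0,
        "ComparisonOperator": "LessThanThreshold",
    }


def high_cpu_alarm(namespace: str = NAMESPACE) -> Dict[str, Any]:
    """Alarm when CPU usage is above 70% for five consecutive one-minute periods."""
    return {
        "AlarmName": "proxy-cpu-usage-high",  # hypothetical name
        "Namespace": namespace,
        "MetricName": CPU_METRIC,
        "Statistic": "Average",
        "Period": 60,             # seconds: matches the per-minute reporting
        "EvaluationPeriods": 5,   # five breaching minutes in a row
        "Threshold": 70.0,
        "ComparisonOperator": "GreaterThanThreshold",
    }
```

Assuming boto3 is installed and AWS credentials are configured, each dictionary can be passed as keyword arguments, for example `boto3.client("cloudwatch").put_metric_alarm(**high_cpu_alarm())`. In practice an `AlarmActions` entry (such as an SNS topic ARN) would usually be added so the alarm notifies someone.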