Canary Analysis
Define PromQL queries, thresholds, and evaluation windows in AnalysisTemplate resources. The controller evaluates them at every canary step.
Canary Analysis Templates
AnalysisTemplate Resource
An AnalysisTemplate defines what metrics to query, how to evaluate them, and what counts as a pass or failure.
apiVersion: kubestead.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-analysis
spec:
  args:
    - name: service
  metrics:
    - name: error-rate
      interval: 30s
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{status=~"5..",service="{{args.service}}"}[2m]))
            /
            sum(rate(http_requests_total{service="{{args.service}}"}[2m]))
      successCondition: result[0] < 0.003
      failureCondition: result[0] > 0.010
      failureLimit: 1
    - name: p99-latency
      interval: 30s
      provider:
        prometheus:
          query: |
            histogram_quantile(0.99,
              sum(rate(http_request_duration_seconds_bucket{
                service="{{args.service}}"
              }[2m])) by (le)
            )
      successCondition: result[0] < 0.5
      failureCondition: result[0] > 1.0
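The {{args.service}} placeholders in the queries are filled in from the template's args before a query is sent. The following is a minimal Python sketch of that substitution, not the controller's actual implementation; the function name render_query and the exact placeholder grammar are assumptions made for illustration.

```python
import re

# Hypothetical helper: matches placeholders of the form {{args.<name>}}.
_PLACEHOLDER = re.compile(r"\{\{\s*args\.([A-Za-z0-9_-]+)\s*\}\}")


def render_query(template: str, args: dict) -> str:
    """Replace every {{args.NAME}} in template with args[NAME]."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in args:
            # Fail loudly rather than send a query with an unresolved placeholder.
            raise KeyError(f"missing value for template arg: {name}")
        return str(args[name])

    return _PLACEHOLDER.sub(substitute, template)


query = render_query(
    'sum(rate(http_requests_total{service="{{args.service}}"}[2m]))',
    {"service": "checkout"},
)
print(query)
```

Raising on a missing arg is a design choice in this sketch: an unresolved placeholder would otherwise produce a query that matches no series and could be mistaken for a healthy result.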
Evaluation Logic
At each evaluation interval, the controller runs all configured metric queries. The result is one of:
- Successful: all metrics pass their successCondition
- Failed: any metric exceeds its failureCondition more than failureLimit times
- Inconclusive: metrics backend unavailable; behavior controlled by inconclusivePolicy
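The three outcomes above can be sketched as a small Python function. This is an illustration under stated assumptions, not the controller's code: MetricSpec and evaluate are hypothetical names, a missing measurement (None) stands in for an unavailable backend, failureLimit is assumed to default to 0 when omitted, and a value that falls between the two thresholds is treated as Inconclusive, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MetricSpec:
    name: str
    success_below: float    # mirrors successCondition: result[0] < X
    failure_above: float    # mirrors failureCondition: result[0] > Y
    failure_limit: int = 0  # assumed default when failureLimit is omitted


def evaluate(specs: List[MetricSpec],
             samples: Dict[str, List[Optional[float]]]) -> str:
    """Classify a run from per-metric measurements (None = no data)."""
    # Failed: any metric breaches failureCondition more than failureLimit times.
    for spec in specs:
        breaches = sum(
            1 for v in samples[spec.name]
            if v is not None and v > spec.failure_above
        )
        if breaches > spec.failure_limit:
            return "Failed"

    # Inconclusive: at least one measurement could not be obtained.
    if any(v is None for spec in specs for v in samples[spec.name]):
        return "Inconclusive"

    # Successful: every measurement of every metric satisfies successCondition.
    if all(v < spec.success_below for spec in specs for v in samples[spec.name]):
        return "Successful"

    # Assumption: values between the two thresholds are neither pass nor fail.
    return "Inconclusive"


# Thresholds copied from the error-rate-analysis template above.
SPECS = [
    MetricSpec("error-rate", success_below=0.003, failure_above=0.010, failure_limit=1),
    MetricSpec("p99-latency", success_below=0.5, failure_above=1.0),
]

print(evaluate(SPECS, {"error-rate": [0.001, 0.002], "p99-latency": [0.2, 0.3]}))
```

Checking for failure before checking for missing data is also an assumption; a real controller might instead stop at the first unavailable query.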
Datadog Integration
provider:
  datadog:
    query: "avg:http.request.errors{service:{{args.service}}} by {service}.as_rate()"
    apiVersion: v2