Install Prometheus with CDK and remote write data aggregated by recording rules to New Relic to reduce the amount of data sent
Prometheus' recording rules are a feature that lets you create new metrics from existing ones with PromQL. Sending aggregated data instead of raw data can reduce the amount of data sent to New Relic. However, the Prometheus run by newrelic-prometheus-configurator, which is included in newrelic-bundle, is in Agent mode, so it doesn't support recording rules.
Install newrelic-bundle to EKS cluster with CDK and monitor it - sambaiz-net
So I install my own Prometheus with Helm and have it remote write to New Relic.
cluster.addHelmChart('PrometheusOperatorChart', {
  release: 'kube-prometheus-stack',
  chart: 'kube-prometheus-stack',
  repository: 'https://prometheus-community.github.io/helm-charts',
  version: '59.1.0',
  namespace: 'prometheus',
  createNamespace: true,
  values: {
    defaultRules: {
      create: false,
    },
    prometheus: {
      enabled: true,
      prometheusSpec: {
        storageSpec: {
          volumeClaimTemplate: {
            spec: {
              resources: {
                requests: {
                  storage: '30Gi',
                },
              },
            },
          },
        },
        retention: '7d',
        retentionSize: '25GB',
        remoteWrite: [{
          url: 'https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=prometheus',
          authorization: {
            type: 'Bearer',
            credentials: {
              key: secretKey,
              name: secretName,
            }
          },
          writeRelabelConfigs: [
            {
              sourceLabels: ['aaaa'],
              separator: ';',
              regex: 'bbbb',
              action: 'keep',
            },
          ],
        }],
      },
    },
    nodeExporter: {
      enabled: false,
    },
    alertmanager: {
      enabled: false,
    },
    grafana: {
      enabled: false,
    },
  },
})
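The remoteWrite authorization above reads its Bearer token from a Kubernetes Secret identified by secretName and secretKey, which are not defined in the snippet. As a minimal sketch, assuming hypothetical values for both names and a placeholder license key, the Secret manifest could be built as a plain object like this. The operator resolves the Secret in the same namespace as the Prometheus resource, so the sketch uses the prometheus namespace.

```typescript
// Hypothetical names: the original only shows that secretName and secretKey exist.
const secretName = 'newrelic-license';
const secretKey = 'licenseKey';

// Builds a plain Kubernetes Secret manifest holding the New Relic license key.
function buildLicenseSecret(namespace: string, licenseKey: string) {
  return {
    apiVersion: 'v1',
    kind: 'Secret',
    metadata: { name: secretName, namespace },
    type: 'Opaque',
    // stringData lets the value be written as plain text; Kubernetes encodes it.
    stringData: { [secretKey]: licenseKey },
  };
}

// Placeholder value; a real key should come from a secret store, not source code.
const manifest = buildLicenseSecret('prometheus', 'REPLACE_WITH_LICENSE_KEY');

// With a CDK eks.Cluster, this object could be applied with
// cluster.addManifest('NewRelicLicenseSecret', manifest).
console.log(JSON.stringify(manifest, null, 2));
```

Note that a key passed this way ends up in the synthesized template, so this is only an illustration of the shape the operator expects. The namespace must also exist before the Secret is applied.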
Installing this chart starts a Prometheus server through the operator's CRDs, such as Prometheus. There is also a chart that doesn't use the operator, but its extraScrapeConfigs value is a string, so incorrect indentation causes an error, which makes it hard to set up. The CRDs provided by the operator also have the advantage that the application side can easily start a Prometheus server and set up scraping as needed.
const extraScrapeConfigs =
`- job_name: newrelic-pods
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  ...`
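One way to sidestep the indentation problem with a string value like this is to build the scrape configs as objects and serialize each one as JSON, since a JSON object is also a valid YAML flow mapping. The following is only a sketch of that idea, using the fields shown above. The variable names are hypothetical, and whether the result fits depends on how the chart embeds the string.

```typescript
// Scrape configs as plain objects; field values are taken from the snippet above.
const scrapeConfigs = [
  {
    job_name: 'newrelic-pods',
    honor_timestamps: true,
    scrape_interval: '30s',
    scrape_timeout: '10s',
  },
];

// Each list item becomes "- {json}" on a single line. A JSON object is a valid
// YAML flow mapping, so there is no nested indentation to get wrong.
const extraScrapeConfigsYaml: string = scrapeConfigs
  .map((config) => `- ${JSON.stringify(config)}`)
  .join('\n');

console.log(extraScrapeConfigsYaml);
```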
Alertmanager and other components are also started by default, but since I don't use them this time, I disabled them.
You can add scrape targets with ServiceMonitor or PodMonitor, and define recording rules with PrometheusRule. The names of recorded metrics should follow the format level:metric:operations.
If podMonitorSelectorNilUsesHelmValues is left at its default of true, service|podMonitorSelector becomes a matchLabels on the release label (the Helm release name, kube-prometheus-stack here), so a monitor without this label won't be loaded into the configuration. If it is loaded into the configuration but doesn't appear in targets, check that the values of selector and so on are correct. port in podMetricsEndpoints accepts only a port name; a port number can be passed as targetPort instead, which corresponds to __meta_kubernetes_pod_container_port_number.
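The label matching described above can be sketched as a simple check: a monitor is selected only when every key and value in matchLabels is present in its labels. This is an illustrative model rather than the operator's actual code, and the function name is hypothetical.

```typescript
type Labels = Record<string, string>;

// True when every key/value pair in matchLabels is present in the object's labels.
function matchesSelector(matchLabels: Labels, labels: Labels): boolean {
  return Object.entries(matchLabels).every(([key, value]) => labels[key] === value);
}

// Selector that results from the default podMonitorSelectorNilUsesHelmValues: true
// when the Helm release is named kube-prometheus-stack.
const podMonitorSelector: Labels = { release: 'kube-prometheus-stack' };

// A PodMonitor carrying the release label is picked up.
console.log(matchesSelector(podMonitorSelector, { release: 'kube-prometheus-stack' })); // true
// A PodMonitor without the label is ignored.
console.log(matchesSelector(podMonitorSelector, { app: 'testapp' })); // false
```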
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: testapp-monitor
  namespace: prometheus
  labels:
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      app: testapp
  namespaceSelector:
    matchNames:
      - testapp
  podMetricsEndpoints:
    - port: monitor
      path: /metrics
      interval: 5m
      scrapeTimeout: 10s
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example
  namespace: prometheus
  labels:
    release: kube-prometheus-stack
spec:
  groups:
    - name: example
      interval: 5m
      rules:
        - record: code:prometheus_http_requests_total:sum
          expr: sum by (code) (prometheus_http_requests_total)
          labels:
            aaaa: bbbb
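The aaaa: bbbb label added by the recording rule is what lets the keep rule in writeRelabelConfigs forward only the aggregated series. As a rough model of that keep action: the values of sourceLabels are joined with the separator, and the series is kept only if the fully anchored regex matches. Prometheus uses RE2, so the JavaScript RegExp here is an approximation that holds for simple patterns, and the function name is hypothetical.

```typescript
type Labels = Record<string, string>;

interface KeepRule {
  sourceLabels: string[];
  separator: string;
  regex: string;
}

// Models the "keep" relabel action: missing labels count as empty strings,
// and the regex must match the whole joined value.
function isKept(series: Labels, rule: KeepRule): boolean {
  const joined = rule.sourceLabels
    .map((label) => series[label] ?? '')
    .join(rule.separator);
  return new RegExp(`^(?:${rule.regex})$`).test(joined);
}

// Same values as the writeRelabelConfigs entry in the Helm values.
const keepRule: KeepRule = { sourceLabels: ['aaaa'], separator: ';', regex: 'bbbb' };

// Series produced by the recording rule carries aaaa=bbbb and is sent.
const recorded: Labels = {
  __name__: 'code:prometheus_http_requests_total:sum',
  aaaa: 'bbbb',
  code: '200',
};
// Raw series has no aaaa label and is dropped before remote write.
const raw: Labels = { __name__: 'prometheus_http_requests_total', code: '200' };

console.log(isKept(recorded, keepRule)); // true
console.log(isKept(raw, keepRule)); // false
```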
Port forward to the Prometheus server and query the recording rule's metric with PromQL.
$ kubectl -n prometheus port-forward svc/prometheus-server 9090:80
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090
$ curl -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=code:prometheus_http_requests_total:sum' | jq
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "code:prometheus_http_requests_total:sum",
"aaaa": "bbbb",
"code": "200"
},
"value": [
1716339421.309,
"182"
]
},
{
"metric": {
"__name__": "code:prometheus_http_requests_total:sum",
"aaaa": "bbbb",
"code": "302"
},
"value": [
1716339421.309,
"2"
]
}
]
}
}
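To work with a response like this programmatically, each sample's value is a pair of a timestamp and a string-encoded number, keyed by the labels in metric. A small sketch that turns the response above into a map from code to value follows. The type and function names are hypothetical, and the sample object simply repeats the output shown.

```typescript
// Shape of an instant-query response with resultType "vector".
interface InstantVectorResponse {
  status: string;
  data: {
    resultType: string;
    result: { metric: Record<string, string>; value: [number, string] }[];
  };
}

// Maps each sample's "code" label to its numeric value.
function valuesByCode(response: InstantVectorResponse): Record<string, number> {
  const out: Record<string, number> = {};
  for (const sample of response.data.result) {
    const code = sample.metric['code'] ?? '';
    // value[0] is the evaluation timestamp, value[1] the sample value as a string.
    out[code] = Number(sample.value[1]);
  }
  return out;
}

const response: InstantVectorResponse = {
  status: 'success',
  data: {
    resultType: 'vector',
    result: [
      {
        metric: { __name__: 'code:prometheus_http_requests_total:sum', aaaa: 'bbbb', code: '200' },
        value: [1716339421.309, '182'],
      },
      {
        metric: { __name__: 'code:prometheus_http_requests_total:sum', aaaa: 'bbbb', code: '302' },
        value: [1716339421.309, '2'],
      },
    ],
  },
};

console.log(JSON.stringify(valuesByCode(response))); // {"200":182,"302":2}
```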
You can confirm that it has also been sent to New Relic.
FROM Metric
SELECT latest(`code:prometheus_http_requests_total:sum`)
WHERE prometheus_server = 'prometheus' FACET code