When Spark jobs run in cluster mode, their logs are not written to the step/ directory, which makes them hard to check from the console, so this post aggregates them to New Relic.
Launch an EMR cluster with AWS CLI and run Spark applications - sambaiz-net
Monitor infrastructure and applications with New Relic - sambaiz-net
Option 1. Send logs with a self-installed Fluent Bit
Install Fluent Bit, a memory-saving counterpart of Fluentd, and send logs with it.
Install and configure Fluent Bit
Install Fluent Bit in a bootstrap action and place New Relic's output plugin.
Since Fluent Bit 1.9, the settings file can also be written in YAML, but that format is currently “not recommended for production”, so I used the classic format this time.
In cluster mode, stdout/stderr logs are written to /mnt/var/log/hadoop-yarn/containers/application_id/container_id/(stdout|stderr) on the core node running the Driver.
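For example, the files can be checked directly on the core node; stdout and stderr should appear among them (the application and container IDs below are hypothetical). The bootstrap action script is as follows.

$ ls /mnt/var/log/hadoop-yarn/containers/application_1660000000000_0001/container_1660000000000_0001_01_000001/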
#!/bin/bash -e
sudo tee /etc/yum.repos.d/fluent-bit.repo << EOF > /dev/null
[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/amazonlinux/2/\$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1
EOF
sudo yum install -y fluent-bit-1.9.7-1

cd /etc/fluent-bit/
sudo wget https://github.com/newrelic/newrelic-fluent-bit-output/releases/download/v1.14.0/out_newrelic-linux-arm64-1.14.0.so
sudo tee plugins.conf << EOF > /dev/null
[PLUGINS]
    Path /etc/fluent-bit/out_newrelic-linux-arm64-1.14.0.so
EOF
sudo tee fluent-bit.conf << EOF > /dev/null
[SERVICE]
    plugins_file plugins.conf
    http_server On
    http_listen 0.0.0.0
    http_port 2020

[INPUT]
    name tail
    path /mnt/var/log/hadoop-yarn/containers/*/*/stdout

[FILTER]
    name modify
    match *
    add emr_cluster_name $(aws emr list-clusters --query "Clusters[?Id=='$(sudo cat /mnt/var/lib/info/job-flow.json | jq -r ".jobFlowId")']" --region <region> | jq -r ".[0].Name")

[OUTPUT]
    name newrelic
    match *
    licenseKey \${NEW_RELIC_LICENSE_KEY}
EOF

sudo mkdir -p /lib/systemd/system/fluent-bit.service.d
sudo tee /lib/systemd/system/fluent-bit.service.d/newrelicenv.conf << EOF > /dev/null
[Service]
Environment="NEW_RELIC_LICENSE_KEY=<license_key>"
EOF
sudo systemctl start fluent-bit
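Registering the script above as a bootstrap action when creating the cluster looks roughly like this (a sketch: the bucket, release label and instance settings are placeholders, and the arm64 plugin above assumes Graviton instances such as m6g).

aws emr create-cluster \
  --name spark-cluster \
  --release-label emr-6.7.0 \
  --applications Name=Spark \
  --instance-type m6g.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --bootstrap-actions Path=s3://<bucket>/install-fluent-bit.sh \
  --region <region>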
Send Fluent Bit metrics
When Fluent Bit is launched with http_server On, the following metrics can be retrieved.
$ curl localhost:2020/api/v1/metrics | jq
{
  "input": {
    "tail.0": {
      "records": 0,
      "bytes": 0,
      "files_opened": 0,
      "files_closed": 0,
      "files_rotated": 0
    }
  },
  "filter": {},
  "output": {
    "newrelic.0": {
      "proc_records": 0,
      "proc_bytes": 0,
      "errors": 0,
      "retries": 0,
      "retries_failed": 0,
      "dropped_records": 0,
      "retried_records": 0
    }
  }
}
Install the New Relic Infrastructure agent and send these metrics with New Relic Flex.
#!/bin/bash -e
sudo curl -o /etc/yum.repos.d/newrelic-infra.repo https://download.newrelic.com/infrastructure_agent/linux/yum/amazonlinux/2/aarch64/newrelic-infra.repo
sudo yum -q makecache -y --disablerepo='*' --enablerepo='newrelic-infra'
sudo yum install newrelic-infra -y
NEW_RELIC_LICENSE_KEY=<license_key>
echo "license_key: ${NEW_RELIC_LICENSE_KEY}" | sudo tee -a /etc/newrelic-infra.yml
sudo tee /etc/newrelic-infra/integrations.d/fluentbit.yml << EOF > /dev/null
integrations:
  - name: nri-flex
    config:
      name: fluentbit
      apis:
        - event_type: fluentbit
          url: http://localhost:2020/api/v1/metrics
EOF
sudo systemctl start newrelic-infra
Fetch the values with NRQL.
FROM fluentbit SELECT max(output.newrelic.0.errors) FACET tags.Name TIMESERIES
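The forwarded logs themselves can also be checked via the attribute added by the modify filter (the cluster name below is a placeholder).

FROM Log SELECT message WHERE emr_cluster_name = '<cluster_name>' SINCE 30 minutes ago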
Option 2. Send logs with the Infrastructure agent's Fluent Bit
In fact, the Infrastructure agent also has a feature to send logs with Fluent Bit: if you write settings in /etc/newrelic-infra/logging.d/logging.yml, the logs are sent. It can import a normal Fluent Bit settings file, and the OUTPUT to New Relic is inserted dynamically.
sudo tee /etc/newrelic-infra/logging.d/fluentbit.conf << EOF > /dev/null
[SERVICE]
    plugins_file plugins.conf
    http_server On
    http_listen 0.0.0.0
    http_port 2020

[INPUT]
    name tail
    path /mnt/var/log/hadoop-yarn/containers/*/*/stdout

[FILTER]
    name modify
    match *
    add emr_cluster_name $(aws emr list-clusters --query "Clusters[?Id=='$(sudo cat /mnt/var/lib/info/job-flow.json | jq -r ".jobFlowId")']" --region ap-northeast-1 | jq -r ".[0].Name")
EOF
sudo tee /etc/newrelic-infra/logging.d/logging.yml << EOF > /dev/null
logs:
  - name: fluentbit-import
    fluentbit:
      config_file: /etc/newrelic-infra/logging.d/fluentbit.conf
EOF
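After placing these files, restarting the agent should make its embedded Fluent Bit pick up the imported config; a quick way to confirm it spawned is to look for the process (a sketch, assuming the bundled binary is named fluent-bit).

sudo systemctl restart newrelic-infra
ps aux | grep [f]luent-bit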
References
hadoop - Where does YARN application logs get stored in EMR before sending to S3 - Stack Overflow