Deploy a container to ECS on Fargate, execute commands with ECS Exec, and forward ports with Session Manager


When investigating an application running in a remote non-container environment, sshd is usually running, so commands can be executed over an SSH connection. In a container environment, on the other hand, sshd usually isn’t running, so the same approach doesn’t work. sshd can be run in the container if absolutely necessary, but it is better avoided because it requires opening ports and managing keys. In this article, commands are executed and ports are forwarded in a container running in ECS on Fargate without running sshd.

Deploy an application with AWS Copilot, which is a CLI to deploy an application to App Runner or ECS on Fargate.

Features and pricing of AWS App Runner, and its CloudFormation resources - sambaiz-net

It doesn’t currently support ECS on EC2, but when the use case fits, it creates the required resources such as a VPC and an ELB according to the service type, so deploying an application is easy. Existing resources created with CDK and so on can also be used.

Create an ALB and an ECS (EC2) cluster with CDK, and deploy a Docker Compose configuration with ecs-cli - sambaiz-net

Deploy

Initialize the application with copilot init; with --deploy, a Load Balanced Web Service on Fargate is deployed to the test environment.

$ brew install aws/tap/copilot-cli
$ copilot -v
copilot version: v1.19.0

$ copilot init --app testapp         \
  --name app                         \
  --type 'Load Balanced Web Service' \
  --dockerfile './Dockerfile'        \
  --deploy
...
✔ Deployed service app.
Recommended follow-up action:
  - You can access your service at http://****.ap-northeast-1.elb.amazonaws.com over the internet.
- Be a part of the Copilot ✨community✨!
  Ask or answer a question, submit a feature request...
  Visit 👉 https://aws.github.io/copilot-cli/community/get-involved/ to see how!

$ copilot env ls
test

$ curl http://****.ap-northeast-1.elb.amazonaws.com/
ok

A manifest like the following is created.

$ cat copilot/app/manifest.yml 
name: app
type: Load Balanced Web Service

http:
  path: '/'

image:
  build: Dockerfile
  port: 8080

cpu: 256
memory: 512
count: 1
exec: true

Command execution with ECS Exec

ECS Exec is a feature that runs commands in a container, like “kubectl exec”, and Copilot provides it as copilot svc exec. If the Session Manager plugin for the AWS CLI isn’t installed on the local machine yet, Copilot offers to install it first.

$ copilot svc exec --app testapp --env test --name app
Looks like the Session Manager plugin is not installed yet.
Would you like to install the plugin to execute into the container? Yes

$ session-manager-plugin --version
1.2.339.0
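
Before opening a session, it can be useful to confirm that the local tools are in place. The following is a minimal sketch, not part of Copilot; check_prereqs is a hypothetical helper name, and it only checks that the given commands are on PATH.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints "ok" and returns 0 when all are present; otherwise prints
# "missing: <names>" and returns 1.
check_prereqs() {
  missing=""
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Example:
#   check_prereqs aws session-manager-plugin
```

If the plugin is reported as missing, copilot svc exec offers to install it as shown above.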

If no command is passed, or /bin/sh is passed, a shell is started.

$ copilot svc exec --app testapp --env test --name app --command /bin/sh
Execute `/bin/sh` in container app in task e136ade28eeb48c5ae69b3681e0e0741.

Starting session with SessionId: ecs-execute-command-0ff9d37f33292cbcd
/ # hostname
ip-10-0-0-147.ap-northeast-1.compute.internal

Internally, the SSM agent binaries are mounted into the container and a session is created in Session Manager, a capability of Systems Manager (formerly SSM). The agent processes appear in the process list.

/ # yum -y install procps
/ # ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  1.1 2689044 47296 ?       Ssl  03:31   0:02 java App
root         8  0.0  0.3 1323312 15552 ?       Ssl  03:31   0:00 /managed-agents/execute-command/amazon-ssm-agent
root        35  0.0  0.7 1410172 28080 ?       Sl   03:31   0:00 /managed-agents/execute-command/ssm-agent-worker
root        46  0.2  0.5 1327588 23196 ?       Sl   04:10   0:01 /managed-agents/execute-command/ssm-session-worker ecs-execute-command-0893f9d72f4a53852
root        55  0.0  0.0  13564  3216 pts/0    Ss   04:10   0:00 /bin/sh
root       213  0.0  0.0  51672  3812 pts/0    R+   04:17   0:00 ps aux
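
For reference, a session of the same kind can also be opened with the AWS CLI, which calls the ECS ExecuteCommand API. The following is a minimal sketch under that assumption; ecs_exec is a hypothetical wrapper name, and the cluster name and task ID in the example comment are the ones that appear elsewhere in this article.

```shell
#!/bin/sh
# Hypothetical wrapper around "aws ecs execute-command".
# Arguments: cluster name, task ID, container name, command (default: /bin/sh).
ecs_exec() {
  aws ecs execute-command \
    --cluster "$1" \
    --task "$2" \
    --container "$3" \
    --interactive \
    --command "${4:-/bin/sh}"
}

# Example:
#   ecs_exec testapp-test-Cluster-VHaYIQCdoUj8 e136ade28eeb48c5ae69b3681e0e0741 app
```

The Session Manager plugin is still required locally, and ECS Exec must be enabled for the service (exec: true in the manifest).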

Port forwarding with Session Manager

Session Manager also supports SSH tunneling, so an SSH connection can be established without opening a port. However, port forwarding can be performed by the SSM agent alone, without sshd, by running start-session with the AWS-StartPortForwardingSession document.

$ aws ssm start-session --target [instance_id] --document-name AWS-StartPortForwardingSession --parameters '{"portNumber":["5005"], "localPortNumber":["5005"]}'
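
The JSON passed to --parameters is easy to mistype, so the command above can be wrapped in a small function. This is only a sketch; port_forward_params and port_forward are hypothetical helper names.

```shell
#!/bin/sh
# Build the --parameters JSON for the AWS-StartPortForwardingSession document.
# Arguments: remote port, local port.
port_forward_params() {
  printf '{"portNumber":["%s"],"localPortNumber":["%s"]}' "$1" "$2"
}

# Start a port forwarding session.
# Arguments: target, remote port, local port.
port_forward() {
  aws ssm start-session \
    --target "$1" \
    --document-name AWS-StartPortForwardingSession \
    --parameters "$(port_forward_params "$2" "$3")"
}

# Example:
#   port_forward [instance_id] 5005 5005
```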

Using a self-installed SSM agent

Install the SSM agent and AWS CLI.

FROM amazoncorretto:18-al2-full AS builder
COPY . .
RUN javac -d ./out src/App.java

RUN yum install -y unzip \
    && curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" \
    && unzip awscliv2.zip

FROM amazoncorretto:18.0.1-al2

# Install SSM Agent & jq
RUN yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm jq \
    && rm -rf /var/cache/yum/* \
    && yum clean all

# Install AWS CLI
COPY --from=builder ./aws ./aws
RUN ./aws/install

COPY ./entrypoint.sh .
RUN chmod u+x ./entrypoint.sh

COPY --from=builder ./out .
EXPOSE 8080
CMD ["java", "App"]

entrypoint.sh creates an activation with aws ssm create-activation, registers the container as a managed instance in Systems Manager with amazon-ssm-agent -register, and calls deregister-managed-instance when the container is terminated.

$ cat entrypoint.sh
#!/bin/bash -e

SERVICE_NAME_FULL="${COPILOT_ENVIRONMENT_NAME}_${COPILOT_APPLICATION_NAME}_${COPILOT_SERVICE_NAME}"

ASSUME_ROLE=$(aws sts assume-role --role-arn "${ACTIVATION_ROLE_ARN}" --role-session-name "${SERVICE_NAME_FULL}")
export AWS_ACCESS_KEY_ID=$(echo $ASSUME_ROLE | jq -r .Credentials.AccessKeyId)
export AWS_SECRET_ACCESS_KEY=$(echo $ASSUME_ROLE | jq -r .Credentials.SecretAccessKey)
export AWS_SESSION_TOKEN=$(echo $ASSUME_ROLE | jq -r .Credentials.SessionToken)

echo "- create SSM activation"
ACTIVATION=$(aws ssm create-activation \
    --default-instance-name "${SERVICE_NAME_FULL}" \
    --description "${SERVICE_NAME_FULL} service in ECS on Fargate" \
    --iam-role ${SSM_ROLE_NAME})
ACTIVATION_CODE=$(echo $ACTIVATION | jq -r .ActivationCode)
ACTIVATION_ID=$(echo $ACTIVATION | jq -r .ActivationId)

echo "- register instance"
amazon-ssm-agent -register -code "${ACTIVATION_CODE}" -id "${ACTIVATION_ID}" -region ap-northeast-1 -y

cleanup() {
    echo "- cleanup"

    MANAGED_INSTANCE_ID=$(cat /var/lib/amazon/ssm/registration | jq -r .ManagedInstanceID)
    if [ -n "${MANAGED_INSTANCE_ID}" ]; then
      echo "- deregister instance ${MANAGED_INSTANCE_ID}"
      aws ssm deregister-managed-instance --instance-id "${MANAGED_INSTANCE_ID}" || true
    fi

    if [ -n "${CHILD_PID}" ]; then
      echo "- kill the child process ${CHILD_PID}"
      kill "${CHILD_PID}"
      wait "${CHILD_PID}"
    fi
}
trap "cleanup" SIGTERM

nohup amazon-ssm-agent &

# for troubleshooting
# sleep 30
# cat /var/log/amazon/ssm/amazon-ssm-agent.log || true
# cat /var/log/amazon/ssm/errors.log || true

echo "- execute $* as a child process"
sh -c "$*" &
CHILD_PID=$!
wait "${CHILD_PID}"
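
The trap and wait combination at the end of entrypoint.sh is what lets the cleanup run when the container receives SIGTERM. The following self-contained sketch reproduces only that pattern, with sleep standing in for the application and an echo standing in for the deregistration; demo_graceful_shutdown is a hypothetical name and nothing here talks to AWS.

```shell
#!/bin/sh
# Hypothetical demo of the trap-and-wait pattern used by entrypoint.sh.
# A worker shell starts a child process and waits for it; the caller then
# sends SIGTERM to the worker, the way a stopping container receives it.
demo_graceful_shutdown() {
  sh -c '
    cleanup() {
      echo "cleanup"                    # entrypoint.sh deregisters the instance here
      kill "$CHILD_PID" 2>/dev/null     # stop the child process
      wait "$CHILD_PID" 2>/dev/null
      exit 0
    }
    trap cleanup TERM
    sleep 30 &                          # stands in for the application process
    CHILD_PID=$!
    echo "started"
    wait "$CHILD_PID"                   # wait returns early when TERM arrives
  ' &
  worker_pid=$!
  sleep 1                               # give the worker time to install its trap
  kill -TERM "$worker_pid"
  wait "$worker_pid"
}

# Example:
#   demo_graceful_shutdown
```

Running the child in the background and calling wait matters: a shell that runs the child in the foreground would not execute the trap until the child exits.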

The role passed to create-activation is created by placing a CloudFormation template in addons/. The template’s Parameters must contain App, Env, and Name. Its Outputs can be referenced in the container as environment variables in upper snake case, such as SSM_ROLE_NAME.

$ tree copilot/
copilot/
└── app
    ├── addons
    │   └── ssm-role.yaml
    └── manifest.yml

$ cat copilot/app/addons/ssm-role.yaml
AWSTemplateFormatVersion: 2010-09-09
Parameters:
  App:
    Type: String
    Description: Your application's name.
  Env:
    Type: String
    Description: The environment name your service, job, or workflow is being deployed to.
  Name:
    Type: String
    Description: The name of the service, job, or workflow being deployed.

Resources:
  SSMRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: [ 'ssm.amazonaws.com' ]
            Action: [ 'sts:AssumeRole' ]
            Condition:
              StringEquals:
                aws:SourceAccount: !Ref AWS::AccountId
              ArnEquals:
                aws:SourceArn: !Sub arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:*
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
        - arn:aws:iam::aws:policy/AmazonSSMDirectoryServiceAccess
        - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy

  ActivationRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              AWS: [ !Ref AWS::AccountId ]
            Action: [ 'sts:AssumeRole' ]
      Policies:
        - PolicyName: SSMRolePassPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - iam:PassRole
                Resource: !GetAtt SSMRole.Arn
              - Effect: Allow
                Action:
                  - ssm:CreateActivation
                  - ssm:DeregisterManagedInstance
                Resource: '*'
Outputs:
  SSMRoleName:
    Description: The SSM role name passed to create-activation
    Value: !Ref SSMRole

  ActivationRoleArn:
    Description: The role arn to create activation and pass the SSM role
    Value: !GetAtt ActivationRole.Arn
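
To see which environment variable names the Outputs above map to, the conversion can be approximated as follows. This is a rough sketch of the naming rule, not Copilot’s actual implementation; output_to_env_name is a hypothetical helper name.

```shell
#!/bin/sh
# Approximate the PascalCase -> UPPER_SNAKE_CASE conversion of output names.
# Assumption: a word boundary is a lower-case letter or digit followed by an
# upper-case letter, or the end of an acronym followed by a capitalized word.
output_to_env_name() {
  printf '%s\n' "$1" |
    sed -E 's/([a-z0-9])([A-Z])/\1_\2/g; s/([A-Z]+)([A-Z][a-z])/\1_\2/g' |
    tr '[:lower:]' '[:upper:]'
}

# Example:
#   output_to_env_name SSMRoleName        # SSM_ROLE_NAME
#   output_to_env_name ActivationRoleArn  # ACTIVATION_ROLE_ARN
```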

The iam:PassRole permission, which is needed for create-activation, could in principle be added to the TaskRole by creating an AWS::IAM::ManagedPolicy as follows and adding it to Outputs. However, iam:* is denied by the TaskRole’s DenyIAMExceptTaggedRoles policy, so a separate role is created and assumed instead. Roles created in addons are tagged with the application’s tags, so they can be assumed.

SSMRolePassPolicy:
  Type: AWS::IAM::ManagedPolicy
  Properties:
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - iam:PassRole
          Resource: !GetAtt SSMRole.Arn

Set the entrypoint in the manifest’s entrypoint field rather than as ENTRYPOINT in the Dockerfile so that local runs are not affected. Also disable ECS Exec to avoid a conflict between the two SSM agents.

$ cat copilot/app/manifest.yml 
...
entrypoint: ["./entrypoint.sh"]
command: ["java", "App"]
exec: false

The registration has succeeded if PingStatus is Online after deployment. If the permissions are insufficient, PingStatus becomes ConnectionLost, so check the SSM agent log.

$ aws ssm describe-instance-information
{
    "InstanceInformationList": [
        {
            "InstanceId": "mi-078951953b63f94d0",
            "PingStatus": "Online",
            "LastPingDateTime": "2022-07-21T20:32:05.949000+09:00",
            "AgentVersion": "3.1.1575.0",
            "IsLatestVersion": false,
            "PlatformType": "Linux",
            "PlatformName": "Amazon Linux",
            "PlatformVersion": "2",
            "ActivationId": "842e7174-729e-4dba-a7ba-7caf293ebf28",
            "IamRole": "testapp-test-app-AddonsStack-1JJP0GYRGTT98-SSMRole-MASCURUWZ85I",
            "RegistrationDate": "2022-07-21T20:23:30.200000+09:00",
            "ResourceType": "ManagedInstance",
            "Name": "test_testapp_app",
            "IPAddress": "10.0.1.220",
            "ComputerName": "ip-10-0-1-220.ap-northeast-1.compute.internal"
        }
    ]
}
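
Registration takes a little while after deployment, so the status check can be repeated until the instance comes online. The following is a minimal polling sketch; wait_for_status and the POLL_INTERVAL variable are hypothetical names introduced for this example, and the command to poll is passed in as arguments.

```shell
#!/bin/sh
# Poll a command until it prints the expected value.
# Arguments: expected value, maximum attempts, then the command to run.
# POLL_INTERVAL (seconds, default 5) is a knob specific to this sketch.
wait_for_status() {
  expected=$1
  attempts=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$@" 2>/dev/null) || status=""
    if [ "$status" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "last status: ${status:-unknown}" >&2
  return 1
}

# Example:
#   wait_for_status Online 30 aws ssm describe-instance-information \
#     --filters "Key=InstanceIds,Values=mi-078951953b63f94d0" \
#     --query 'InstanceInformationList[0].PingStatus' --output text
```

If the function returns non-zero, the last observed status is printed to stderr, which helps distinguish ConnectionLost from an instance that never registered.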

This instance is treated as an on-premises instance, so the instance tier must be set to Advanced to use start-session. Once it is set, every on-premises instance is charged by the hour.

$ aws ssm start-session --target mi-078951953b63f94d0 --document-name AWS-StartPortForwardingSession --parameters '{"portNumber":["8080"], "localPortNumber":["5000"]}'
$ curl localhost:5000
ok
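
The instance tier can be changed in the Systems Manager console, and it can also be changed with update-service-setting. The following is a sketch assuming the activation-tier service setting; enable_advanced_tier is a hypothetical helper name. Because the change starts the charges mentioned above, review it before running.

```shell
#!/bin/sh
# Hypothetical helper: switch on-premises instances in one Region to the
# advanced-instances tier.
# Arguments: Region, 12-digit AWS account ID.
enable_advanced_tier() {
  aws ssm update-service-setting \
    --region "$1" \
    --setting-id "arn:aws:ssm:$1:$2:servicesetting/ssm/managed-instance/activation-tier" \
    --setting-value advanced
}

# Example (the account ID is a placeholder):
#   enable_advanced_tier ap-northeast-1 123456789012
```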

Using the SSM agent of ECS Exec

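In fact, a session can be started directly against the same target that ECS Exec uses, as long as ECS Exec is enabled for the service. The target takes the form ecs:<cluster name>_<task ID>_<container runtime ID>.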

$ aws ssm start-session --target ecs:testapp-test-Cluster-VHaYIQCdoUj8_c0add05ab98c49d798ba1cb515c9940d_c0add05ab98c49d798ba1cb515c9940d-527074092 \
  --document-name AWS-StartPortForwardingSession --parameters '{"portNumber":["8080"], "localPortNumber":["5000"]}'
$ curl localhost:5000
ok
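
The target string above consists of the cluster name, the task ID, and the container runtime ID joined with underscores. The following sketch composes it; ecs_exec_target and container_runtime_id are hypothetical helper names, and the lookup assumes the runtimeId field returned by aws ecs describe-tasks.

```shell
#!/bin/sh
# Compose the Session Manager target for a container managed by ECS Exec.
# Arguments: cluster name, task ID, container runtime ID.
ecs_exec_target() {
  printf 'ecs:%s_%s_%s\n' "$1" "$2" "$3"
}

# Look up the runtime ID of a container in a task.
# Arguments: cluster name, task ID, container name.
container_runtime_id() {
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query "tasks[0].containers[?name=='$3'].runtimeId | [0]" \
    --output text
}

# Example:
#   cluster=testapp-test-Cluster-VHaYIQCdoUj8
#   task=c0add05ab98c49d798ba1cb515c9940d
#   ecs_exec_target "$cluster" "$task" "$(container_runtime_id "$cluster" "$task" app)"
```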

Clean up

Clean up the environment when it is no longer needed.

$ copilot app delete testapp
