CGAP-Docker (Production)

CGAP-Docker runs in production on AWS Elastic Container Service (ECS) and is meant to be orchestrated from the 4dn-cloud-infra repository. End users modify the Makefile to suit their immediate build needs with respect to the target AWS account, ECR repository, and tagging strategy. For the specifics of the ECS setup, see 4dn-cloud-infra.

The CGAP application has been orchestrated into the ECS Service/Task paradigm. At the time of writing, all core application services are built into the same Docker image; which entrypoint runs is configured by an environment variable passed to the ECS Task. As such, we have 4 separate services. The following table describes three of them; the fourth, the deployment service, has its own entrypoint (entrypoint_deployment.sh) but no row in the table:

Kind      Use                                          Num  Spot  vCPU  Mem   Notes
Portal    Services standard API requests               1-4  Yes   4     8192  Needs autoscaling
Indexer   Hits /index at 1sec intervals indefinitely   4+   Yes   .25   512   Can auto-scale based on Queue Depth
Ingester  Polls SQS for ingestion tasks                1    No    1     2048  Need API to add tasks
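
To illustrate how one image can serve several services, here is a minimal Python sketch of choosing an entrypoint script from an environment variable. The real dispatch is done by the shell script entrypoint.sh; the script names are taken from this document, but the dictionary keys are assumed values of $application_type and may not match the ones actually used.

```python
import os

# Script names come from deploy/docker/production. The keys are ASSUMED
# values of $application_type, chosen for illustration only.
ENTRYPOINTS = {
    "portal": "entrypoint_portal.sh",
    "indexer": "entrypoint_indexer.sh",
    "ingester": "entrypoint_ingester.sh",
    "deployment": "entrypoint_deployment.sh",
}


def resolve_entrypoint(env=None):
    """Return the entrypoint script for the configured application type."""
    env = os.environ if env is None else env
    app_type = env.get("application_type", "")
    if app_type not in ENTRYPOINTS:
        raise ValueError(f"unknown application_type: {app_type!r}")
    return ENTRYPOINTS[app_type]
```

Failing loudly on an unknown value is a design choice of this sketch: a task started with a misspelled variable stops immediately instead of silently running the wrong service.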

Building an Image

NOTE: the following documentation is preserved for historical reasons in order to understand the build process.

YOU SHOULD NOT BUILD PRODUCTION IMAGES LOCALLY. ALWAYS USE CODEBUILD.

The production application configuration is in deploy/docker/production. A description of all the relevant files follows.

  • Dockerfile - at repo top level - configurable file containing the Docker build instructions for all local and production images.
  • docker-compose.yml - at repo top level - configures the local deployment, unused in production.
  • assume_identity.py - script for pulling global application configuration from Secrets Manager. The secret is meant to be generated by the Datastore stack in 4dn-cloud-infra and filled out manually. The $IDENTITY option configures which secret the application workers use; 4dn-cloud-infra passes it to the ECS Task definitions.
  • entrypoint.sh - resolves which entrypoint is used based on $application_type
  • entrypoint_portal.sh - serves portal API requests
  • entrypoint_deployment.sh - deployment entrypoint
  • entrypoint_indexer.sh - indexer entrypoint
  • entrypoint_ingester.sh - ingester entrypoint
  • install_nginx.sh - script for pulling in nginx
  • cgap_any_alpha.ini - base ini file used to build production.ini on the server from variables set in the global application configuration (GAC)
  • nginx.conf - nginx configuration
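
As a rough illustration of what pulling configuration from Secrets Manager involves, the sketch below reads the secret named by $IDENTITY. It is not the contents of assume_identity.py. It assumes the secret is stored as a JSON string, and the function name load_identity is hypothetical; the client is expected to be a boto3 Secrets Manager client, whose get_secret_value call returns the payload under "SecretString".

```python
import json
import os


def load_identity(client, env=None):
    """Fetch the secret named by $IDENTITY and return it as a dict.

    `client` is expected to behave like boto3.client("secretsmanager").
    ASSUMPTION: the secret value is a JSON object stored as a string.
    """
    env = os.environ if env is None else env
    secret_name = env["IDENTITY"]  # raises KeyError if the task did not set it
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without touching AWS.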

The following instructions describe how to build and push images. They assume an existing ECS setup; orchestrating ECS is outside the scope of this documentation and is covered in 4dn-cloud-infra.

  1. Ensure the orchestrator credentials are sourced, or that your IAM user has been granted sufficient permissions to push to ECR.
  2. Run make ecr-login, which should pull ECR credentials using the currently active AWS credentials.
  3. Run make build-docker-production.
  4. Navigate to Foursight and queue the cluster update check. After around 5 minutes, the new images should be coming online. You can monitor the progress from the Target Groups console on AWS.
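
The monitoring in the last step can also be done from a script instead of the console. The following is a minimal sketch, assuming a boto3 Elastic Load Balancing v2 client and a target group ARN that you look up yourself; the function name summarize_target_health is hypothetical.

```python
from collections import Counter


def summarize_target_health(elbv2_client, target_group_arn):
    """Count the targets in a target group by health state.

    `elbv2_client` is expected to behave like boto3.client("elbv2").
    States reported by AWS include "initial", "healthy", "unhealthy"
    and "draining".
    """
    response = elbv2_client.describe_target_health(TargetGroupArn=target_group_arn)
    return Counter(
        description["TargetHealth"]["State"]
        for description in response["TargetHealthDescriptions"]
    )
```

During a rollout you would expect to see new targets move from "initial" to "healthy" while the old ones show up as "draining".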

Tagging Strategy

A single image tag, typically latest, determines which image ECS will use. This tag is configurable from the 4dn-cloud-infra repository.
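
For reference, a full ECR image reference combines the account, region, repository, and tag. The sketch below builds one using the standard ECR registry hostname format; the function name and the example values are hypothetical and are not taken from the Makefile.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the full image reference that a task definition would point at."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"
```

Because every push reuses the same tag, the reference itself does not change between releases, which is why a forced deployment is needed for ECS to pick up a new image.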

After a new image version has been pushed, issue a forced deployment update to the ECS cluster through Foursight. This action spawns a new set of tasks for all services using the newly pushed image. For the portal, once the new tasks are deemed healthy by ECS and the Application Load Balancer, they are added to the Portal Target Group and immediately begin serving requests. At that point the old tasks begin de-registering from the target group, after which they are spun down. The new tasks for the remaining services come online more quickly since they do not need to pass load balancer health checks. Once the old tasks have been cleaned up, it is safe to trigger a deployment task through the Deployment Service.
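
At the ECS API level, a forced deployment amounts to updating each service with the force-new-deployment flag set. The sketch below shows that idea with a boto3 ECS client; it is an assumption about what such an update looks like, not a description of how the Foursight check is implemented, and the function name is hypothetical.

```python
def force_new_deployment(ecs_client, cluster):
    """Force a new deployment of every service listed for the cluster.

    `ecs_client` is expected to behave like boto3.client("ecs").
    A single list_services call returns one page of results, which is
    enough for a handful of services; larger clusters need pagination.
    """
    response = ecs_client.list_services(cluster=cluster)
    updated = []
    for service_arn in response["serviceArns"]:
        # ECS starts replacement tasks, which pull the image again.
        ecs_client.update_service(
            cluster=cluster, service=service_arn, forceNewDeployment=True
        )
        updated.append(service_arn)
    return updated
```

In normal operation the update should still go through Foursight as described above; the sketch is only meant to make the mechanism concrete.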

Common Issues

In this section we detail some common errors and what to do about them. This section should be updated as the setup evolves.

  1. Error: denied: User:<ARN> is not authorized to perform: ecr:InitiateLayerUpload on resource: <ECR REPO URL>
This error can happen for several reasons:
  • Invalid/incorrect IAM credentials
  • IAM user has insufficient permissions
  • IAM credentials are valid but from a different AWS account
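
The third cause can be checked quickly by comparing the account ID of the active credentials with the account ID embedded in the repository URL. The sketch below assumes the standard ECR hostname format; the function name is hypothetical, and the caller's account is expected to come from something like boto3.client("sts").get_caller_identity()["Account"].

```python
import re

# Standard ECR registry hostname: <12-digit account>.dkr.ecr.<region>.amazonaws.com
_ECR_HOST = re.compile(
    r"^(?:https://)?(\d{12})\.dkr\.ecr\.([a-z0-9-]+)\.amazonaws\.com"
)


def ecr_account_mismatch(caller_account, repo_url):
    """Return True if the credentials belong to a different account than the repo."""
    match = _ECR_HOST.match(repo_url)
    if match is None:
        raise ValueError(f"not an ECR repository URL: {repo_url!r}")
    return match.group(1) != caller_account
```

If this returns False, the credentials are for the right account and the problem is more likely missing permissions on the IAM user.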