Using AWS CodeBuild to execute administrative tasks

This article is a guest post from AWS Serverless Hero Gojko Adzic.

At MindMup, we started using AWS CodeBuild to quickly lift and shift support tasks to the cloud. MindMup is a collaborative mind-mapping tool, used by millions of teachers and students to collaborate on assignments, structure ideas, and organize and navigate complex information. Still, the team behind the product consists of just two people, and we’re both responsible for everything from sales and product management to programming and customer support. One of the key reasons why such a tiny team can support a large group of users is that we tend to automate all recurring tasks in order to free up our time for more productive work.

Administrative support tasks often start as ad-hoc command line scripts, with manual intervention to resolve exceptions. As the scripts stabilize, humans can be less involved, so teams look for ways of scheduling and automating job executions. For infrastructure deployed to AWS, this also means moving away from running scripts from on-premises developers or operations computers to running in the cloud. With utilization-based pricing and on-demand capacity, AWS Lambda and AWS Fargate are the two obvious choices for running such tasks in AWS. There is a third option, often overlooked: CodeBuild. Although CodeBuild is designed for a completely different purpose, it offers some compelling features that make it very easy to set up and run periodic support jobs, especially as a first easy step towards a more systematic solution.

Solution overview

CodeBuild is, as the name suggests, a managed service for executing typical software build jobs. In some ways, such as each job having an associated IAM permissions, CodeBuild is similar to Lambda and Fargate. One of fundamental differences between Codebuild jobs and Lambda functions or Fargate tasks is the location of the executable definition of the job. The executable definition of a Lambda function is in a ZIP archive deployed to Lambda. For Fargate, the executable definition is in a Docker container image, deployed in a task with Amazon ECS or in a Kubernetes pod with Amazon EKS. Both services require an explicit deployment to update the executable definition of a task. For CodeBuild jobs, the executable definition is not deployed to an AWS service. Instead, it is in a source code control system that you can manage locally or using a service such as GitHub or AWS CodeCommit.

Sitting alongside the rest of the source code, each CodeBuild task has an entry-point configuration file, by convention called a buildspec.yml. The buildspec.yml file lists the programming language runtimes required by the job, and the steps to execute before, during and after the build job. For example, the following buildspec.yml sets up a build environment for JavaScript with Node.js 12, installs dependencies, runs tests, and then produces a deployment package using webpack.

version: 0.2

phases:
  install:
    runtime-versions:
      nodejs: 12
    commands:
      - npm install
  build:
    commands:
      - npm test
      - npm run webpack

Usually, the buildspec.yml file involves some variant of installing dependencies, compiling code and running tests, then packaging and versioning artifacts. But the steps of a buildspec.yml file are actually just shell commands, so CodeBuild doesn’t necessarily need to run tasks related to compiling or packaging. It can execute any sequence of Unix commands, scripts, or binaries. This makes CodeBuild a uniquely compelling choice for the transition from running shell scripts on an operations machine to running a shell script in the cloud.

Comparing CodeBuild and Lambda for administrative tasks

The major advantage of CodeBuild over Lambda functions for support jobs is that the scripts can be significantly more flexible. Moving from shell scripts to Lambda functions usually means rewriting the task in a language such as JavaScript or Python. You can execute a shell script from a Lambda function when using Amazon Linux 1 instances, or even use a Bash custom runtime, but when using CodeBuild, you can execute the same shell script without changes.

Lambda functions usually run only in a single language. Support tasks often perform a chain of actions, and different steps might require utilities written in different languages. Running such varied tasks with a Lambda function would require constructing a custom Lambda runtime, or splitting steps into multiple functions with different runtimes, and then somehow coordinating and passing data between them. AWS Step Functions can be used to coordinate the workflow, but most support tasks are a sequence of steps, to be executed in order if the previous one succeeds. With CodeBuild, you can configure the task to include all required runtimes.

Support tasks often need to transform the outputs of one tool and pass it into a different tool. For example, select rows from a database containing expired accounts, then filter out only the user emails, separate the data with commas, and send to an automated mailer with a template. Tools such as grep, awk, and sed become invaluable for such transformations. However, they aren’t available on new Lambda runtimes.

Lambda runtimes based on Amazon Linux 2 bundle only the absolutely minimal operating system packages. Even the basic command line Linux utilities, such as which, are not packaged with the recent Lambda runtimes. On the other hand, CodeBuild runs tasks in a full-blown Linux environment. Executing support tasks through CodeBuild means that you can pipe results into all the standard Unix tools, without having to use half-baked replacements written in a scripting language.

For applications running in the AWS ecosystem, support tasks often need to communicate with AWS services or resources. Standard CodeBuild environments also come with the aws command line tools, so you can use them without any additional setup. This becomes especially important for moving data from and to Amazon S3, where command line tools have operations for batch uploads or downloads or recursive directory synchronization. Those operations are not directly available through the programming language SDK libraries.

It is, of course, possible to install additional binaries to Lambda functions by building them for the right Linux environment. Because the standard shared system libraries are also not in the recent Lambda runtimes, compiling additional tools is akin to building a Linux distribution from scratch. With CodeBuild, most standard tools are included already, and you can add additional tools to the system by using an operating system package manager (apt-get or yum).

CodeBuild execution environments can also be more flexible in terms of execution time and performance constraints. Lambda tasks are currently limited to fifteen minutes. The only performance setting you can influence is the memory size, which proportionally impacts the CPU power. The highest setting is currently 3GB memory, which assigns two virtual cores. CodeBuild allows you to configure tasks which can run for up to 8 hours. You can also explicitly select a compute type, including using GPU processors and going all the way up to 255 GB memory or 72 virtual CPU cores. This makes CodeBuild an interesting choice for tasks that need to potentially run longer than fifteen minutes, that are very computationally intensive, or that need a lot of working memory.

On the other hand, compared to Lambda functions, CodeBuild jobs start significantly slower and running them in parallel is not as easy or convenient. For example, by default you can only run up to 60 CodeBuild tasks in parallel, but this is a soft limit that you can increase. However, support tasks are mostly batch jobs by nature, so saving a few seconds or being able to execute thousands of such tasks in parallel is not usually important.

Comparing CodeBuild or Amazon EC2/Fargate for administrative tasks

Most of the limitations of Lambda functions for admin jobs could be solved by running a virtual machine through Amazon EC2. In fact, running tasks on Amazon EC2 was the usual way of lifting support tasks from the operations computers and moving them into the cloud until Lambda became available. However, due to how Amazon EC2 instances are billed, teams often bundled all the operations tasks on a single Amazon EC2 instance. That instance needed a superset of all the security privileges required by the various tasks, opening potential security risks. That’s where Fargate can help. Fargate runs container-based tasks on demand, offering utilization-based billing and removing many restrictions of Lambda, such as the 15-minute runtime and reduced operating system environment, also allowing you to choose execution environments more flexibly.

This means that, compared to Fargate tasks, CodeBuild execution is more or less comparable in terms of what you can run and how much power you can assign to your tasks. Both can use a custom Docker container, and both run a full-blown operating system with all the standard binaries. They also have similar terms of start-up time and parallelization. However, setting up a CodeBuild job and updating it later is much easier than with Fargate using tasks or pods.

With Fargate, you need to provide a custom Docker container with the right entry point. CodeBuild lets you use custom containers or choose standard images provided by AWS, including Ubuntu or AWS Linux instances. Likewise, configuring a Fargate task involves deploying in an Amazon VPC, and if the task needs to access other AWS services, setting up a NAT gateway. CodeBuild tasks have network access by default, and can be deployed in a VPC if required.

Updating support scripts can also be easier with CodeBuild than with Fargate. Deploying a new version of a task into Fargate involves building a new Docker container and uploading it to a container manager such as Amazon ECS or Amazon EKS. Deploying a new version of a CodeBuild job involves committing to the version control system, without the need to set up a CI/CD pipeline. This makes CodeBuild a compelling way of setting up support tasks, especially for larger organizations with strict access rules. Support people can update tasks definitions by having access to the source code control system, without the need to get access to production resources on AWS.

Fargate environments are transient, similar to Lambda functions. If you want to preserve some files between job runs (for example, compiled task binaries or installed dependencies), you would have to manage that manually with Fargate. CodeBuild supports artifact caching out of the box, so it’s significantly easier to preserve data files or installed dependencies between runs.

Potential downsides

Although taking supporting tasks directly from the source code repository is one of the biggest advantages of CodeBuild over Fargate or Lambda, it can also be a major drawback. Ensuring that the scripts are always in a stable condition requires discipline regarding committing to the trunk. Without such discipline, untested or unstable code might be used for admin tasks by mistake. A potential workaround for teams without good trunk commit discipline would be to use a specific branch for CodeBuild tasks, and then merge code into that branch once it is ready to be released.

Using support scripts directly from a source code repository makes it more complicated to synchronize versions with other deployed software. If you need the support scripts to track the exact version of code that was deployed to other services, it’s probably safer and easier to use Lambda functions or Fargate containers with an explicit deployment step.

Executing support tasks through CodeBuild

CodeBuild jobs take a bit more setup than Lambda functions, but significantly less than Fargate tasks. Below is an example of a CodeBuild job set up through AWS CloudFormation.

Architecture diagram for the CodeBuild being used for administrative tasks

Here are a few things to note:

You can add the required IAM permissions for the task into the Policies section of the CodeBuildRole resource.
The Environment section of the CodeBuildProject resource is where you can define the container image, choose the virtual hardware or set up environment variables to configure the task.
Environment variables are directly available for the shell commands listed in the buildspec.yml file, so this trick allows you to easily parameterize jobs to use resources from the same AWS CloudFormation template.
The Location and BuildSpec properties in the Source section define the source code repository, and the path of the buildspec.yml file within the repository.

Resources:
  CodeBuildRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: "codebuild.amazonaws.com"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: AllowLogs
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 'logs:*'
                Resource: '*'

  CodeBuildProject:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Ref JobName
      ServiceRole: !GetAtt CodeBuildRole.Arn
      Artifacts:
        Type: NO_ARTIFACTS
      LogsConfig:
        CloudWatchLogs:
          Status: ENABLED
      Cache:
        Type: NO_CACHE
      Environment:
        Type: LINUX_CONTAINER
        ComputeType: BUILD_GENERAL1_SMALL
        Image: aws/codebuild/standard:3.0
        EnvironmentVariables:
          - Name: SYSTEM_BUCKET 
            Value: !Ref SystemBucketName
      Source:
        Type: GITHUB
        Location: !Ref GithubRepository 
        GitCloneDepth: 1
        BuildSpec: !Ref BuildSpecPath 
        ReportBuildStatus: False
        InsecureSsl: False
      TimeoutInMinutes: !Ref TimeoutInMinutes

CodeBuild jobs usually run after changes to source code files. Support tasks usually need to run on a periodic schedule. The previous snippet did not define the Triggers property for the CodeBuild job, so it will not track source code changes or run automatically. Instead, you can set up an Amazon CloudWatch Event rule (or optionally use Amazon EventBridge, that provides more sophisticated rules) that will periodically trigger the CodeBuild job. Here is how to do that with AWS CloudFormation:

  RunCodeBuildJobRole:
    Condition: ScheduleRuns
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: "events.amazonaws.com"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: StartTask 
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 'codebuild:StartBuild'
                Resource:
                  - !GetAtt CodeBuildProject.Arn

  RunCodeBuildJobRoleRule:
    Condition: ScheduleRuns
    Type: AWS::Events::Rule
    Properties:
      Name: !Sub '${JobName}-scheduler'
      Description: Periodically runs codebuild job to archive defunct accounts
      ScheduleExpression: !Ref ScheduleRate
      State: ENABLED
      Targets:
        - Arn: !GetAtt CodeBuildProject.Arn
          Id: CodeBuildProject
          RoleArn: !GetAtt RunCodeBuildJobRole.Arn

Note the ScheduleExpression property of the RunCodeBuildJobRoleRule resource. You can use any supported CloudWatch schedule expression there to set up when or how frequently your job runs.

Observability and audit logs

If a support job fails for any reason, people need to know. Luckily, CodeBuild already integrates nicely with CloudWatch to report job statuses, so you can set up another CloudWatch Event rule that tracks failures and alerts someone about it. To make notifications flexible, you can send them to an Amazon SNS topic. You can then subscribe for email notifications or forward those alerts somewhere else easily. The following wires up notifications with an AWS CloudFormation template.

  SnsPublishRole:
    Condition: CreateSNSNotifications
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: "events.amazonaws.com"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: AllowLogs
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 'SNS:Publish'
                Resource:
                  - !Ref SnsTopicArn

  CodeBuildNotificationRule:
    Condition: CreateSNSNotifications
    Type: AWS::Events::Rule
    Properties:
      Name: !Sub '${JobName}-fail-notification'
      Description: Notify about codebuild project failures
      RoleArn: !GetAtt SnsPublishRole.Arn
      EventPattern:
        source:
          - "aws.codebuild"
        detail-type:
          - "CodeBuild Build State Change"
        detail:
          build-status:
            - "FAILED"
            - "STOPPED"
          project-name:
            - !Ref CodeBuildProject
      State: ENABLED
      Targets:
        - Arn: Ref SnsTopicArn
          Id: NotificationTopic

Another option to keep the execution of your tasks under control is to generate a report using the test report functionality introduced a few months ago and specify in the buildspec.yml file about the location of the files that store results you want to include in your report.

Testing administrative tasks

Note the build-status list inside the CodeBuildNotificationRule resource. This defines a list of statuses about which you want to publish alerts. In the previous snippet, the list does not include successful runs. That’s because it’s usually not necessary to take any action when a support job runs successfully. However, during initial testing you may want to add IN_PROGRESS (notify when a task starts) and SUCCEEDED (notify when the job ends without an error).

Finally, one of the biggest challenges when moving scripts from an operations machine to running in CodeBuild is to create the right IAM policies. Command-line users on operations machines usually have a wide set of privileges, and identifying the minimum required for a specific job usually involves starting small, then iterating over failed attempts and opening up required operations. Running that process directly through CodeBuild can be quite slow. Instead, I suggest setting up a separate IAM policy for the job, then assigning it both to the role for the CodeBuild task, and to a command-line role or a command-line user. You can then iterate quickly directly on the command line and identify all required IAM operations, then remove the additional command-line user when done.

Conclusion

The next time you need to move a support task to the cloud, and you need a rich execution environment, consider using CodeBuild, at least as the initial step towards a more systematic solution. It will allow you to quickly get a script up and running with all the benefits of IAM isolation, scheduled execution, and reliable notifications.

Gojko is author of the Running Serverless book and interactive course. He is currently working on Video Puppet, a tool for editing videos as easily as editing text. You can reach out to him on Twitter.

AWS DevOps Blog