Building, Testing and Deploying Java applications on AWS Lambda using Maven and Jenkins
With continuous integration (the practice of continually integrating
code into a shared code repository) and continuous deployment (the
practice of building, testing, and deploying code often), developers can
release software faster and more frequently.
This post shows how the principles of code testing, continuous integration, and continuous deployment can be applied to your AWS Lambda workflows. Using Git, Maven, and Jenkins, you can integrate, test, build and deploy your Lambda functions using these same paradigms.
As a side note, we will be hosting a webinar on Continuous Delivery to AWS Lambda on Thursday, April 28th, which covers more methods of continuous delivery. Register for the webinar.
Prerequisites
- Jenkins — Many of our customers use Jenkins as an automation server to perform continuous integration. While the setup of Jenkins is out of scope for this post, you can still learn about unit testing and pushing your code to Lambda by working through this example.
- A Git repository — This method uses Git commit hooks to perform builds against code to be checked into Git. You can use your existing Git repository, or create a new Git repository with AWS CodeCommit or other popular Git source control managers.
Getting started
In this example, you are building a simple document analysis system, in which metadata is extracted from PDF documents and written to a database, allowing indexing and searching based on that metadata.

You use a Maven-based Java project to accomplish the PDF processing. To explain concepts, we show snippets of code throughout the post.
Overview of event-driven code
To accomplish the document analysis, an Amazon S3 bucket is created to hold PDF documents. When a new PDF document is uploaded to the bucket, a Lambda function analyzes the document for metadata (the title, author, and number of pages), and adds that data to a table in Amazon DynamoDB, allowing other users to search on those fields.

Lambda executes code in response to events. One such event would be the creation of an object in Amazon S3. When an object is created (or even updated or deleted), Lambda can run code using that object as a source.
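To make the shape of such a function concrete, here is a minimal sketch of an S3-triggered handler using the aws-lambda-java-events library. The class name matches the handler setting used later in this post, but the body shown here is illustrative rather than the project's actual code:

package example;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.S3Event;

public class S3EventProcessorExtractMetadata {
    public String handleRequest(S3Event s3event, Context context) {
        // Each record in the event identifies the bucket and key of an
        // object that was created, updated, or deleted.
        String bucket = s3event.getRecords().get(0).getS3().getBucket().getName();
        String key = s3event.getRecords().get(0).getS3().getObject().getKey();
        return "Processed " + bucket + "/" + key;
    }
}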
Create an Amazon DynamoDB table
To hold the document metadata, create a table in DynamoDB, using the Title value of the document as the primary key.

For this example, you can set your provisioned throughput to 1 write capacity unit and 1 read capacity unit.
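If you prefer to script this step, here is a minimal sketch using the AWS SDK for Java (the post itself creates the table through the console; the table name PDFMetadata matches the one used by the sample tests shown later):

// One-time table creation via the AWS SDK for Java (classes from
// com.amazonaws.services.dynamodbv2); throughput matches the values above.
AmazonDynamoDBClient client = new AmazonDynamoDBClient();
client.createTable(new CreateTableRequest()
        .withTableName("PDFMetadata")
        .withAttributeDefinitions(new AttributeDefinition("Title", ScalarAttributeType.S))
        .withKeySchema(new KeySchemaElement("Title", KeyType.HASH))
        .withProvisionedThroughput(new ProvisionedThroughput(1L, 1L)));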
Write the Java code for Lambda
The Java function takes the S3 event as a parameter, extracts the PDF object, analyzes the document for metadata using Apache PDFBox, and writes the results to DynamoDB.

// Get metadata from the document
PDDocument document = PDDocument.load(objectData);
PDDocumentInformation metadata = document.getDocumentInformation();
...
String title = metadata.getTitle();
if (title == null) {
title = "Unknown Title";
}
...
Item item = new Item()
.withPrimaryKey("Title", title)
.withString("Author", author)
.withString("Pages", Integer.toString(document.getNumberOfPages()));
The Maven project comes with a sample S3 event (/src/test/resources/s3-event.put.json) from which you can build your tests.

{
"Records": [
{
"eventVersion": "2.0",
"eventSource": "aws:s3",
"awsRegion": "us-east-1",
"eventTime": "1970-01-01T00:00:00.000Z",
"eventName": "ObjectCreated:Put",
"userIdentity": {
"principalId": "EXAMPLE"
},
"requestParameters": {
"sourceIPAddress": "127.0.0.1"
},
"responseElements": {
"x-amz-request-id": "79104EXAMPLEB723",
"x-amz-id-2": "IOWQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa3Km"
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "testConfigRule",
"bucket": {
"name": "builtonaws",
"ownerIdentity": {
"principalId": "EXAMPLE"
},
"arn": "arn:aws:s3:::builtonaws"
},
"object": {
"key": "blogs/lambdapdf/aws-overview.pdf",
"size": 558985,
"eTag": "ac265da08a702b03699c4739c5a8269e"
}
}
}
]
}
Take care to replace the awsRegion, arn, and key values to match your specific region, Amazon Resource Name, and the key of the PDF document that you’ve uploaded.

Test your code
The sample code you’ve downloaded contains some basic unit tests. One test gets an item from the DynamoDB table and verifies that the expected metadata exists:

@Test
public void checkMetadataResult() {
DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient());
Table table = dynamoDB.getTable("PDFMetadata");
Item item = table.getItem("Title", "Overview of Amazon Web Services");
assertEquals(31, item.getInt("Pages"));
assertEquals("sajee@amazon.com", item.getString("Author"));
assertEquals("Overview of Amazon Web Services", item.getString("Title"));
}
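For that check to pass, the handler has to have run against the sample event first. Here is a sketch of a test that drives it, assuming the handler class sketched earlier, S3EventNotification.parseJson from the aws-java-sdk-s3 library, and that the handler tolerates a null Context:

@Test
public void processSampleEvent() throws IOException {
    // Load the sample event from the test resources and replay it against the handler.
    String json = new String(Files.readAllBytes(
            Paths.get("src/test/resources/s3-event.put.json")), StandardCharsets.UTF_8);
    S3Event event = new S3Event(S3EventNotification.parseJson(json).getRecords());
    new S3EventProcessorExtractMetadata().handleRequest(event, null);
}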
Before continuing, test your code to ensure that everything is working:

mvn test

After ensuring there are no errors, check your DynamoDB table to see the metadata now added to your table.
The code executes because of the sample event in your Maven project, but how does it work when a new PDF is added to your bucket? To test this, complete the connection between Amazon S3 and Lambda.
Create a Lambda function
Use mvn package to package your working code, then upload the resulting JAR file to a Lambda function.

- In the Lambda console, create a new function and set runtime to Java 8.
- Set function package to the project JAR file Maven created in the target folder.
- Set handler to “example.S3EventProcessorExtractMetadata”.
- Create a new value for role based on the Basic With DynamoDB option. A role gives your function access to interact with other services from AWS. In this case, your function interacts with both Amazon S3 and Amazon DynamoDB. In the window that opens, choose View Policy Document, then choose Edit to edit your policy document.
While this document gives your function access to your DynamoDB resources, you need to add access to your S3 resources as well. Use the policy document below to replace the original.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem"
      ],
      "Resource": "*"
    }
  ]
}

It is important to note that this policy document allows your Lambda function access to all S3 and DynamoDB resources. You should lock your roles down to interact only with those specific resources that you wish the function to have access to.
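For example, the S3 and DynamoDB statements above could be scoped to just the bucket and table used in this post. A sketch, assuming the builtonaws bucket and PDFMetadata table (123456789012 is a placeholder account ID):

{
  "Effect": "Allow",
  "Action": ["s3:GetObject"],
  "Resource": "arn:aws:s3:::builtonaws/*"
},
{
  "Effect": "Allow",
  "Action": ["dynamodb:GetItem", "dynamodb:PutItem"],
  "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/PDFMetadata"
}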
After completing your policy document and reviewing the function settings, choose Create Function.
Create an Amazon S3 bucket
- Create a bucket in the Amazon S3 console. Note that bucket names in S3 share a global namespace: you need to choose a unique name.
- After the bucket is created, upload the Overview of Amazon Web Services PDF to your bucket. We’ve included this white paper to use in unit tests for debugging your Lambda function.
- Manage events for your S3 bucket by going to the root bucket properties and choosing Events:
- Give your event a name, such as “PDFUploaded”.
- For Events, choose Object Created (all).
- For Prefix, list the key prefix for the subdirectory that holds your PDFs, if any. If you want to upload PDF documents to the root bucket, you can leave this blank. If you made a “subdirectory” called “pdf”, then the prefix would be “pdf”.
- Leave Suffix blank, choose Lambda function as the Send To option, and choose the Lambda function you created.
- Choose Save to save your S3 event.
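These console steps are equivalent to attaching a notification configuration to the bucket. For reference, a sketch of that document in the s3api format (the function ARN is a placeholder, and the prefix filter assumes the “pdf” subdirectory mentioned above):

{
  "LambdaFunctionConfigurations": [
    {
      "Id": "PDFUploaded",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:extractPDFMeta",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            { "Name": "prefix", "Value": "pdf" }
          ]
        }
      }
    }
  ]
}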
Test everything
Test the entire process by uploading a new PDF to your bucket. Verify that a new entry was added to your DynamoDB table.

(To troubleshoot any errors, choose Monitoring in your Lambda function to view logs generated by Amazon CloudWatch.)
Enter Jenkins
At this point, you have created a testable Java function for Lambda that uses an S3 event to analyze metadata from a PDF document and stores that information in a DynamoDB table.

In a CI/CD environment, changes to the code might be made and uploaded to a code repository on a frequent basis. You can bring those principles into this project now by configuring Jenkins to perform builds, package a JAR file, and ultimately push the JAR file to Lambda. This process can be based off of a Git repo, by polling the repo for changes, or using Git’s built-in hooks for post-receive or post-commit actions.
Build hooks
Use the post-commit hook to trigger a Jenkins build when a commit is made to the repo. (For the purposes of this post, the repo was cloned to the Jenkins master, allowing you to use the post-commit hook.)

To enable Jenkins to build off Git commits, create a Jenkins project for your repo with the Git plugin, set Build Trigger to “Poll SCM”, and leave Schedule blank.
In your project folder, find .git/hooks/post-commit and add the following:

#!/bin/sh
curl http://<jenkins-master>:8080/job/<your-project-name>/build?delay=0sec

(Git only runs hooks that are marked executable, so you may need to chmod +x the post-commit file.)
This ensures that when a commit is made in this project, a request is made to your project’s build endpoint on Jenkins. You can try it by adding or modifying a file, committing it to your repo, and examining the build history and console output in your Jenkins dashboard for the status update.

(For more information about implementing a post-receive hook, see the Integrating AWS CodeCommit with Jenkins AWS DevOps Blog post.)
Deploy code to Lambda
You may notice in the console output a command for aws sns publish --topic-arn .... In this project, we’ve added a post-build step to publish a message via Amazon Simple Notification Service (Amazon SNS) as an SMS message. You can add a similar build step to do the same, or take advantage of SNS delivery to HTTP(S) endpoints to post status messages to team chat applications or a distribution list.

However, to be able to push the code to AWS Lambda after a successful commit and build, look at adding a post-build step.
- In the configuration settings for your project, choose Add build step and Invoke top-level Maven targets, setting Goal to “package”. This packages up your project as a JAR file and places it into the target directory.
- Add a second build step by choosing Add build step and the Execute shell option.
- For Command, add the following Lambda CLI command (substitute the function-name and zip-file values as necessary; this assumes the AWS CLI is installed on the Jenkins host and configured with credentials that allow lambda:UpdateFunctionCode):
aws lambda update-function-code --function-name extractPDFMeta --zip-file fileb://target/lambda-java-example-1.0.jar
Take it a step further
The code and architecture described here are meant to serve as illustrations for testing your Lambda functions and building out a continuous deployment process using Maven, Jenkins, and AWS Lambda. If you’re running this in a production environment, there may be additional steps you would want to take. Here are a few:

- Add additional unit tests
- Build in additional features and sanity checks, for example, to make sure documents to be analyzed are actually PDF documents
- Adjust the Write Capacity Units (WCU) of your DynamoDB table to accommodate higher levels of traffic
- Add an additional Jenkins post-build step to integrate Amazon SNS to send a notification about a successful Lambda deployment