AWS Lambda PostgreSQL Client

For a while now the teams i’m working with are doing really exciting things with the PostgreSQL database. I’m sure there will be more to share on that soon!

We’re running in the AWS cloud and the vast majority of the time we really like it. One minor gripe we have though is that certain PostgreSQL extensions are not available. The one at the top of our AWS wish list is pg_cron.

With this constraint in mind last week our Lead Database Engineer asked for some help with a cost effective way to run scheduled jobs against a PostgreSQL database.

We briefly had a chat about the bring your own runtime for AWS Lambda and also using a container with crontab. I said i’d have a go at the Lambda. Here is how i got on.

Docker Build

FROM centos:7
RUN yum -y install yum-utils rpmdevtools
RUN yum -y install
RUN yumdownloader postgresql11
RUN yumdownloader postgresql11-libs
RUN rpmdev-extract *rpm
RUN mkdir -p /var/task
RUN cp -r /tmp/postgresql11-libs-11.6-2PGDG.rhel7.x86_64/usr/pgsql-11/lib/* /var/task
RUN cp -r /tmp/postgresql11-11.6-2PGDG.rhel7.x86_64/usr/pgsql-11/bin/* /var/task
WORKDIR /var/task
RUN zip -r9 /tmp/ *

I needed the PostgreSQL client and associated libraries. These were extracted using yumdownloader and rmpdev-extract tools into /var/task and a zip file created. With the dockerfile specified to do all this for me i just needed to do a docker build and a docker run to get hold of the zip file.

docker build -t pg-scheduler-build ./docker
docker run -it -v /~Documents/git/docker-postgresql-scheduler:/tmp/macbook pg-scheduler-build
cp /tmp/macbook

With the zip file now on my Mac i can take a quicklook at what has been bundled up.

clusterdb pg_isready pg_waldump
createdb pg_receivewal pgbench
createuser pg_archivecleanup pg_restore psql
dropdb libpgfeutils.a pg_basebackup pg_rewind reindexdb
dropuser pg_config pg_test_fsync vacuumdb pg_dump pg_test_timing pg_dumpall pg_upgrade

Lambda Packaging

Here is bootstrap


set -euo pipefail

# Initialization - load function handler
source $LAMBDA_TASK_ROOT/"$(echo $_HANDLER | cut -d. -f1).sh"

# Processing
while true
# Get an event
EVENT_DATA=$(curl -sS -LD "$HEADERS" -X GET "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/next")
REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)

# Execute the handler function from the script
RESPONSE=$($(echo "$_HANDLER" | cut -d. -f2) "$EVENT_DATA")

# Send the response
curl -X POST "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/$REQUEST_ID/response" -d "$RESPONSE"

and here is

function handler () {
echo "$EVENT_DATA" 1>&2;
RESPONSE="Echoing request: '$EVENT_DATA'"
export PGPASSWORD=xxxxxxxxxxxxx
echo `/var/task/psql -d demodb -h -U xxxxxxx -f create-table.sql`

For testing i hardcoded the credentials and RDS endpoint. Obviously this is a poor example and this won’t be what goes into any non-play environment. You can use Hashicorp Vault, AWS Systems Manager Parameter Store or AWS Secrets Manager to fix this.

I also added create_table.sql so that the Lambda could execute some SQL using the psql binary.

insert into wehavelearnt (id, topic) values (1 , 'terraform is for infrastructure as code');
insert into wehavelearnt (id, topic) values (2 , 'it has providers, resources, variable, outputs and modules');
insert into wehavelearnt (id, topic) values (3 , 'the is a PostgreSQL provider');
insert into wehavelearnt (id, topic) values (4 , 'the main commands are init, plan, apply and destroy');

Next the zip file for the lambda gets created.

bootstrap pg_isready pgbench
clusterdb pg_receivewal psql
create-table.sql pg_restore reindexdb
createdb pg_archivecleanup pg_rewind vacuumdb
createuser pg_basebackup pg_test_fsync
dropdb libpgfeutils.a pg_config pg_test_timing
dropuser pg_dump pg_upgrade pg_dumpall pg_waldump
> zip ../ *

Lambda Creation

For runtime select provide your own bootstrap.

The IAM role needs to have permissions to access your RDS database resource along with the Lambda basic execution role.

For the network settings as the RDS database i was using was in a VPC i needed to specify that in the network settings also and provide the security group associated with the RDS database that allows access via the standard PostgreSQL port 5432.

When you hit save you should end up with a green success message.

To do a basic test of the lambda. Click test and create a simple event. It doesn't really matter what is in it so just use the Hello World example.

Now click the Test button to execute the Lambda.

Check the RDS database and we can see that the rows have been inserted.

xxxxxxxxx@xxxxxxxx:xxxxxxxx> select * from wehavelearnt;
| id | topic |
| 1 | terraform is for infrastructure as code |
| 2 | it has providers, resources, variable, outputs and modules |
| 3 | the is a PostgreSQL provider |
| 4 | the main commands are init, plan, apply and destroy |
Time: 0.076s

Thanks to all the people who published the material below that helped me.

Data Architect @CDL_Software , AWS Community Builder, 12 x AWS Certified. GCP, Azure, SCRUM, DAMA CDMP, Hashicorp & Kafka Certified

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store