
Data Architect @CDL_Software , AWS Community Builder, 12 x AWS Certified. GCP, Azure, SCRUM, DAMA CDMP, Hashicorp & Kafka Certified

It’s week two of re:Invent and that includes the first ever dedicated keynote for Machine Learning. Here are the features that I found interesting.


Most organisations that process data will have experienced the concept of data in silos. This is where an application is built for a particular purpose and tied to a data store. While this may solve a particular business problem, as time passes developers and engineers may start to spend time extracting data from these silos for other purposes such as analytics and machine learning.

If you are lucky, your teams might have provided APIs to access the data, but what if that API is missing two key fields that you need, or returns too much data?

For older software that…

It’s a different experience this year. The chat with my teammates is a mixture of discussion about new features and pictures of good times in Vegas from previous re:Invent conferences.

Andy Jassy has finished the first keynote of 2020 and I was not disappointed. Lots of great new features that we have use cases for.

Here are my favourite data-related features announced during the Andy Jassy re:Invent keynote.

Glue Elastic Views

Most data teams and customers I work with have data in multiple places. You might have a CRM system, an accounts system, document management etc. …

Here is my list of sessions for week one of re:Invent focussed around data and analytics.

  • How to use fully managed Jupyter notebooks in Amazon SageMaker
  • What’s new with Amazon S3
  • How BMW Group uses AWS serverless analytics for a data-driven ecosystem
  • Innovate faster with applications on AWS storage
  • Embed analytics in your applications with Amazon QuickSight
  • Discovering insights from customer surveys at McDonald’s
  • What’s new in Amazon ElastiCache
  • How FINRA operates PB-scale analytics on data lakes with Amazon Athena
  • How Zynga modernized mobile analytics with Amazon Redshift RA3
  • Implementing MLOps practices with Amazon SageMaker
  • Break down data silos: Build…

My previous Mac was starting to overheat so it was swapped for another.

I decided to install things as I required them rather than restore from a Time Machine backup, as I thought I’d probably built up a lot of things that I did not need over the last couple of years.

Here are my notes on the basic tools and config for a new Mac that I find useful working mostly with AWS.



Parallels Toolbox

Xcode Command Line Tools

xcode-select --install

Switch to ZSH

On Mac the default interactive shell is now zsh.
For more details, please visit

chsh -s /bin/zsh

Install Oh My Zsh

sh -c "$(curl…

For a while now the teams I’m working with have been doing really exciting things with the PostgreSQL database. I’m sure there will be more to share on that soon!

We’re running in the AWS cloud and the vast majority of the time we really like it. One minor gripe we have though is that certain PostgreSQL extensions are not available. The one at the top of our AWS wish list is pg_cron.

With this constraint in mind, last week our Lead Database Engineer asked for some help with a cost-effective way to run scheduled jobs against a PostgreSQL database.

I’m a big fan of Nest products, having installed them in a few properties over the years. The cameras in particular paid for themselves several times over, as they captured footage of a lorry that crashed into my wall and drove off.

Whilst most Nest products are very simple to set up, I think the Nest Hello (doorbell) is not quite as simple as I have come to expect from Nest. If you live in the UK, most wired doorbells expect 8–12V AC, whereas the Nest expects 16–24V AC.

Here is how I set up my Nest Hello.

First of all, here…

A very welcome announcement at AWS re:Invent today

Here are my notes on getting it up and running.

The docs are pretty good at taking you through this. A couple of tips, however. First, you will need to update the AWS CLI on your EC2 machine via pip, as at the time of writing the latest Amazon Linux AMI did not have a version that includes the “aws kafka” command. Secondly, my attempt at using the AWS console to create the Kafka cluster did not work out. It created a cluster that had no security group associated with it…
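The AWS CLI tip can be sketched as a small shell check. This is only an illustration: the function names `has_kafka_command` and `ensure_kafka_cli` are made up for this example, and it assumes that `aws kafka help` exits with a non-zero status when the installed CLI does not know the `kafka` service. Depending on how pip is set up on the machine, the upgrade may also need `sudo` or `--user`.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not from the AWS docs.

# Returns 0 if the installed AWS CLI recognises the "kafka" service command.
# Assumption: an unknown service makes "aws <service> help" exit non-zero.
has_kafka_command() {
  aws kafka help >/dev/null 2>&1
}

# Upgrades the AWS CLI via pip only when the "kafka" command is missing.
ensure_kafka_cli() {
  if has_kafka_command; then
    echo "aws kafka is available"
  else
    echo "aws kafka not found, upgrading the AWS CLI via pip"
    pip install --upgrade awscli
  fi
}

# Example usage on the EC2 machine (left commented out on purpose):
# ensure_kafka_cli && aws kafka list-clusters
```

Running `ensure_kafka_cli` before any other `aws kafka` call should make it obvious whether the CLI version is the problem, rather than credentials or networking.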

We use a number of NetApp storage appliances on which we run our Oracle database estate. We use SMO to provide quick clones of databases based on snapshots at the storage layer. We are also big users and fans of the Elastic Stack and use it for all our logging and metrics.

NetApp provide their Unified Manager software which provides us with lots of metrics and graphs, but for us it’s much better to have all our logging and metrics in one place in Elastic. The ability for us to have an overall picture of what’s happening in our entire…

Matt Houghton
