Vector Tile Server using PostGIS

We recently held a hackathon at Sparkgeo where we split into a couple teams and turned an idea we had been toying with, but never actually had the time to implement, into a working product. Joe Burkinshaw, Parmvir Thind, and myself were team MVT-MVP aka Map-Zuckerberg.

Our colleague and CTO, Dustin Sampson, was musing about creating vector tiles on the fly, directly from PostGIS. We thought this was a perfect task for a hackathon so we went about implementing it using a few different technologies. Our team has great experience individually with a variety of technologies, but we wanted to get out of our element and try tech that we haven’t worked with in depth. So each team member picked an area (devops, data transformation, database administration, etc) where we did not have a lot of experience and went about to change that.

In the initial discussions, we decided that we wanted to use PostGIS, MapboxGL and AWS Lambda/API gateway as the main components of our tileserver. This would be more of a backend focused project so we didn’t go too detailed with cartographic style or data presentation. Function over form for this project.

Data!!

Joe Burkinshaw was able to grab some OSM data and load it into PosgreSQL. Using the PostGIS extension and relatively new ST_AsMVT function, he setup a query that returns a mvt tile, given an {x} {y} {z} url parameter.

Initially, we setup a PostGIS instance on AWS RDS. It was super easy to get up and running, however we ran into an issue due to a protobuf C library not being included with the instance. This library is essential for the PostGIS’ vector tile creation function “ST_AsMVT”.

This is a known issue and it may change in the future. Check for any progress here:

Instead what we did was got the PostgreSQL instance up and running on EC2 and just managed it from there. The steps are documented below.

  1. Get an AWS EC2 instance up and running with Ubuntu Server 16.04 as the OS: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/launching-instance.html
  2. Connect to the EC2 instance with SSH. If you have not done this before, see the following steps.
    1. Install AWS Command Line Interface (the bundle): https://docs.aws.amazon.com/cli/latest/userguide/installing.html
    2. Configure AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
    3. Connect to EC2 instance with SSH: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html
  3. A quick way to get PostGIS up and running is with a docker container.
    1. Install Docker on EC2 instance. This guide is useful.
  4. Create a Docker container with PostGIS (based on something from this guide):
docker volume create pg_data 

docker run --name=postgis -d -e POSTGRES_USER=postgres -e POSTGRES_PASS=<password> -e POSTGRES_DBNAME=mvt_mvp -e ALLOW_IP_RANGE=0.0.0.0/0 -p 5432:5432 -v pg_data:/var/lib/postgresql --restart=always kartoza/postgis:9.6-2.4

Great, we have our database setup and ready for use.

Next step is to add some data for testing. We grabbed OSM building data for BC from Geofabrik and used a command line tool to add the data to PostGIS. We ended up using imposm3 to achieve this, but other tools like osm2pgsql and ogr2ogr will work with the source/destination. Now that we have our data set up, we need to provide a means of serving that data as vector tiles. These steps are outlined below.

  1. Add Mapbox PostGIS vector tile utility to PostGIS. This will provide a function called ‘TileBBox’ which is required to determine the extent of tiles from x and y coordinates and zoom level): 
  2. Create a new function in PostGIS (PgAdmin is useful for this). Note that the data is returned as a byte array – this is important for the AWS Lambda side of things:
CREATE OR REPLACE FUNCTION mvt_mvp_buildings(x integer, y integer, zoom integer, out mvt bytea) RETURNS bytea
         AS $$
SELECT ST_AsMVT(q, 'buildings', 4096, 'geom')
         FROM (
                 SELECT
                 id, name, type,
                 ST_AsMVTGeom(
                         osm_buildings.geometry,
                         TileBBox(zoom, x, y),
                         4096,
                         256,
                         false
                 ) geom
                 FROM osm_buildings  
                 WHERE osm_buildings.geometry && TileBBox(zoom, x, y)
                 AND ST_Intersects(geometry, TileBBox(zoom, x, y))
         ) q;
$$ LANGUAGE sql;

And just like that you have a function that returns vector tiles. This is great, we have the backbone of our tileserver!

Unfortunately direct database requests are not the easiest to manage when building a url, and setting up a full blown tileserver can be a bit overkill if your project only requires one layer. We need a slimmer way to serve our data.

Front end

While Joe Burkinshaw was crushing the db admin side of things, Parmvir Thind set up a map for us to view our vector tiles in using Angular 4 and MapboxGL. He’s a whiz at the front end and got that up in no time. Now we can show our beautiful vector tiles!

Enter the Lambda tile server and API Gateway

This process isn’t new. In fact, stamen did a post about it a couple years ago. In their project, they were serving out raster tiles and using lambda to do the styling. The beauty of vector tiles is that the client can control the styling and pretty much all facets of geospatial data manipulation. Lambda for the vector tiles simply acts as a way to set up a very simple geospatial api. The cool thing with the Lambda api is that it can include basic authentication as well. It’s extremely customizable. Stamen’s article gives a great technical overview of building a lambda function for geospatial processing and there are bunch of others as well. Matt Perry’s tutorial and Vincent Sarago’s tutorial to name a few

After Parmvir Thind got the front end up and running, he jumped over and helped me with setting up a lambda function using python 3.5. We set up our connections to Joe Burkinshaw’s database and didn’t have any issues, we could make queries and see data coming back. Perfect.

Not so perfect was the fact that we could not get the mvt binary to come through Lambda and API Gateway without being converted to a string. Every request through API gateway was met with the byte literal prefix ‘b’ attached. Our tiles were not being served as MVT but instead of as string literals. The mapping libraries (we tried all the big ones) could not make heads or tails of this data. Super annoying.

Wha-Wha

The final morning of the hackathon rolled around and we could only present that we almost made it. We had a gameplan, we had a working vector tile creator, what we didn’t have was enough time to implement the serving part of it. .

The hackathon concluded and we went back to working on helping clients implement solutions to their problems.

….
….
….

This project kept nagging at us in the following weeks and so we decided to try a different route. Instead of python we would use Node.js. Parmvir Thind is great with Node and so set about to writing up the program we would need.

Setting up a Lambda Function with Node.js

We needed a way to talk to the Postgres database. Lo and behold, there’s a package for that.

package.json

{
  "name": "node_postgres",
"version": "1.0.0", "description": "node postgres api", "main": "index.js", "scripts": { "test": " exit 1", "deploy": "— zip-file fileb://Lambda-Deployment.zip", "predeploy": "zip -r Lambda-Deployment.zip * -x *.zip *.log" }, "keywords": [ "postgres" ], "author": "Peter Hanssens", "license": "ISC", "dependencies": { "pg": "^6.1.2" } }

Now that we have the ability to talk to Postgres, let’s see if we can get javascript to return the binary data. After adding the code below alongside the package.js file, we did npm install and npm run predeploy to create a .zip file of the lambda function. This zip file is what gets loaded up to lambda. We used Node.js 8 as our runtime.

The function grabs {x} {y} {z} parameters from API gateway and uses those to query the database. The callback returns a base-64 encoded string (this is how javascript deals with binary) that contains the vector tiles. 

index.js

var pg = require('pg');

var b;

exports.handler = function (event, context, callback) {

var conn = postgres://<username>:<password>@<ip>/<db>;
var client = new pg.Client(conn);
client.connect();
var z = event.pathParameters.z.split('.');
var query = client.query('SELECT mvt_mvp_buildings(' + event.pathParameters.x + ',' + event.pathParameters.y + ',' + z[0] + ');');
query.on('row', function (row, result) {
result.addRow(row);
});
query.on('end', function (result) {
b = new Buffer(result.rows[0].mvt_mvp_buildings, 'binary');
var s = b.toString('base64');
callback(null, {
statusCode: 200,
headers: { 'Content-Type': 'application/octet-stream', 'Access-Control-Allow-Origin': '*' },
body: s,
isBase64Encoded: true
});
client.end();
});
};

We pointed the frontend to our API gateway running the lambda function and…

Success!!

We have vectors beings served through Lambda and ST_AsMVT!

This was a fun experiment for all of us as we were exposed to different development areas which we normally do not work in. Out tile server has not been stress tested and is a proof of concept only.

That being said, we had a lot of fun working on this project and hope that you are able to find some value out of our trials and tribulations.