GeoJSON is a great format: easy to read, view, and use. One thing that really stands out, though, is the verbosity of its numbers and the effect that has on file size. Yeah, in rare cases that precision “may” be needed, but I’m pretty sure a length that is precise to, well, far less than a millimeter (example: stream length: 6849.41980435 meters) is never needed. And the other pink elephant in GeoJSON is the number of significant figures used for geometry!
{
"type": "FeatureCollection",
"features": [{
"geometry": {
"type": "Polygon",
"coordinates": [[[-120.37273534694833, 50.67716016936108], [-120.37080246306458, 50.67707673658443], [-120.36562460360317, 50.675414824845106]...
}
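To put a rough number on that verbosity, here is a minimal Python sketch that serializes the three coordinates from the snippet above at full precision and again rounded to 5 decimal places. The 5-decimal cutoff and the `round_coords` helper are my own choices for illustration, not something from any particular tool.

```python
import json

# The three coordinate pairs shown in the snippet above ([longitude, latitude]).
coords = [
    [-120.37273534694833, 50.67716016936108],
    [-120.37080246306458, 50.67707673658443],
    [-120.36562460360317, 50.675414824845106],
]


def round_coords(ring, places=5):
    """Return a copy of the ring with every ordinate rounded to `places` decimals."""
    return [[round(x, places), round(y, places)] for x, y in ring]


full = json.dumps(coords)
short = json.dumps(round_coords(coords))

# Character counts of the serialized coordinates, before and after rounding.
print(len(full), len(short))
```

Five decimal places of a degree is on the order of a metre on the ground, so pick a cutoff that suits your data.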
I’m sure there are lots of techniques to decrease the size of a GeoJSON file for transport, but one of the more popular is TopoJSON. I guess it’s technically not a GeoJSON format, but it’s worth mentioning. It reduces file size by not replicating geometry (simplifying a lot here), similar to the old ESRI Coverage files. But what I wanted to see was this: what if we stuck with the GeoJSON format, ran the geometry through the Encoded Polyline Algorithm (EPA), and then threw the results up on a map to see how much faster it would be, if at all?
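For anyone who hasn’t met the EPA before, here is a minimal Python sketch of the encoder as Google describes it: scale each ordinate to an integer, take the difference from the previous point, and pack the result into printable ASCII five bits at a time. The function names are mine, and this is not the code from shp2json.py. Note that the algorithm expects (latitude, longitude) pairs, while GeoJSON stores [longitude, latitude], so a caller would have to swap the order first.

```python
def _encode_value(value: int) -> str:
    """Encode one signed integer delta as a run of printable ASCII characters."""
    # Shift left one bit and invert negatives so the sign lands in the lowest bit.
    value = ~(value << 1) if value < 0 else (value << 1)
    chars = []
    # Emit 5 bits at a time, lowest first; 0x20 flags "more chunks follow".
    while value >= 0x20:
        chars.append(chr((0x20 | (value & 0x1F)) + 63))
        value >>= 5
    chars.append(chr(value + 63))
    return "".join(chars)


def encode_polyline(points, precision=5):
    """Encode an iterable of (lat, lng) pairs; precision=5 is the usual default."""
    factor = 10 ** precision
    out = []
    prev_lat = prev_lng = 0
    for lat, lng in points:
        lat_i = round(lat * factor)
        lng_i = round(lng * factor)
        # Only the difference from the previous point is stored.
        out.append(_encode_value(lat_i - prev_lat))
        out.append(_encode_value(lng_i - prev_lng))
        prev_lat, prev_lng = lat_i, lng_i
    return "".join(out)


# The worked example from Google's description of the algorithm.
print(encode_polyline([(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]))
```

Because only small deltas are stored after the first point, each vertex usually costs a handful of characters instead of thirty-odd digits.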
How To
First, I need to convert the shapefile to GeoJSON, and to do that I’ll use the shp2json.py script.
pixel:data dustin$ shp2json.py -e Neighbourhoods.shp Neighbourhoods.egeojson
The script does a couple of things: first it transforms the coordinates to WGS84, and second, because of the -e option, it applies the EPA to the geometry. The new geometry structure will look something like this…
{
"type": "FeatureCollection",
"features": [{
"geometry": {
"type": "Polygon",
"coordinates": ["g{htHphu}UPaKjIk_...","}eitHrex}U@sGpDkDnGFfF...", ...] NOT SO PINK ANYMORE!!
},
..
}
Well, now that I have this GeoJSON file, let’s see if it draws! We’ll employ Leaflet to do this, since it makes building a map super easy and has a great plugin called Leaflet.encoded that can read encoded geometry. I’ve also added some extra code to read regular GeoJSON geometry so I could generate some simple metrics to see how things perform. All the code is available on GitHub if you want to take it for a spin!
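Leaflet.encoded does its decoding in JavaScript in the browser. Purely to illustrate what that reverse step involves, here is a minimal Python sketch of an EPA decoder. The `decode_polyline` name is mine and this is not the plugin’s code; it assumes the string was encoded with the default precision of 5 and returns (latitude, longitude) pairs.

```python
def decode_polyline(encoded: str, precision: int = 5):
    """Decode an encoded polyline string into a list of (lat, lng) tuples."""
    factor = 10 ** precision
    points = []
    index = lat = lng = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # one latitude delta, then one longitude delta
            result = shift = 0
            while True:
                chunk = ord(encoded[index]) - 63
                index += 1
                result |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:  # continuation bit not set: last chunk
                    break
            # The lowest bit carries the sign; undo the shift-and-invert step.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lng += deltas[1]
        points.append((lat / factor, lng / factor))
    return points


# Round-trips the worked example from Google's description of the algorithm.
print(decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@"))
```

The extra decoding pass on the client is one plausible reason for draw times to differ between plain and encoded geometry.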
Results
Although the drawing times were slower in my simple tests, load times more than made up for the difference. Below are my test results using three different shapefiles. It would be interesting to see the comparison to TopoJSON ;).
Name | Neighbourhood.shp | Property.shp | Creek.shp |
# of features | 26 | 31,147 | 143 |
Shapefile (in kB) * | 806 | 63,409 | 595 |
GeoJSON GZipped (in kB)** | 707.1 | 17,948 | 551 |
GeoJSON (in kB) | 2,111 | 68,728 | 1,540 |
GeoJSON: Load (ms) | 3,467.5 | 16,116.1 | 1,690.8 |
GeoJSON: Draw (ms) | 187.2 | 10,367.7 | 189 |
GeoJSON Enc. Geom. GZipped (in kB)** | 56 | 3,221.3 | 70 |
GeoJSON Enc. Geom. (in kB) | 110 | 23,750 | 124 |
GeoJSON Enc. Geom.: Load (ms) | 536.1 | 4,599 | 614.6 |
GeoJSON Enc. Geom.: Draw (ms) | 256.1 | 10,421 | 223.3 |
* includes shp, dbf, shx, prj
** GZip command used: gzip -c Neighbourhood.geojson > Neighbourhood.geojson.gz
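If you’d rather measure raw versus gzipped size from Python than from the shell, here is a rough sketch using the standard library’s gzip module. The `sizes_kb` helper and the repeated sample payload are made up for illustration. Python’s `gzip.compress` defaults to a different compression level than the gzip command line, so the numbers may not match the table exactly.

```python
import gzip


def sizes_kb(data: bytes):
    """Return (raw size, gzipped size) in kB, assuming 1 kB = 1024 bytes."""
    return len(data) / 1024, len(gzip.compress(data)) / 1024


# Hypothetical payload: one small GeoJSON geometry repeated many times.
sample = b'{"type": "Point", "coordinates": [-120.37273534694833, 50.67716016936108]}' * 1000

raw_kb, gz_kb = sizes_kb(sample)
print(f"raw: {raw_kb:.1f} kB, gzipped: {gz_kb:.1f} kB")
```

To measure a real file, read it with `open(path, "rb").read()` and pass the bytes to `sizes_kb`.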
Technologies Used
Shp2json script uses:
Map uses: