Convert Data to GeoJSON in a JavaScript App


Geospatial data conversion is well covered by existing tools. In the open-source world, there’s our beloved GDAL, a library and set of command-line utilities that converts data between most geospatial formats. In the proprietary world, there is a suite called FME, which many consider the be-all and end-all of data conversion. FME also has a passionate user base: those who use it love it.

But what if you’re a front-end JavaScript developer who needs to convert data right now, in the browser, without an API, using only the NPM ecosystem? Here, I demonstrate a tool that converts some geospatial formats to GeoJSON. I also go over the libraries used to build this tool, and I provide samples for your own use.

The Data Converter

This data converter is a proof of concept that we built as part of an internal hackathon. With it, you can take geospatial data in various formats and display it on a map for sharing and download.

https://data-converter.sparkgeo.app/

This tool was built by three members of the Sparkgeo team.

Behind the scenes, this is a vanilla JavaScript app assembled with the ParcelJS bundler. For deployment, we host on Netlify for its simplicity. You can see the repository here if you want to jump into the code:

https://github.com/sparkgeo/geohack-drag-drop-map

Converting Data Formats

There are many libraries in the NPM ecosystem that can handle data conversion; we chose ours based on package size and simplicity. One important caveat is that these tools only work with data in the preferred coordinate system of GeoJSON, which is WGS 84 (EPSG:4326).
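Because of that caveat, it can help to sanity-check converted output before putting it on a map. The sketch below is not part of our app; `looksLikeWgs84` and `positionsInRange` are hypothetical helper names. It is only a heuristic: coordinates outside the longitude/latitude ranges are certainly not in degrees, but coordinates inside them are not guaranteed to be WGS 84.

```javascript
// Heuristic check that a GeoJSON object holds lon/lat degrees.
// Hypothetical helper, not taken from the data converter itself.
function positionsInRange(coords) {
  if (!Array.isArray(coords)) return true
  if (typeof coords[0] === 'number') {
    // A single position: [longitude, latitude, ...]
    const [lon, lat] = coords
    return Math.abs(lon) <= 180 && Math.abs(lat) <= 90
  }
  // Nested arrays (lines, rings, multi-geometries): check each level.
  return coords.every((child) => positionsInRange(child))
}

function looksLikeWgs84(node) {
  if (!node) return true // null geometries are allowed in GeoJSON
  switch (node.type) {
    case 'FeatureCollection':
      return node.features.every((f) => looksLikeWgs84(f))
    case 'Feature':
      return looksLikeWgs84(node.geometry)
    case 'GeometryCollection':
      return node.geometries.every((g) => looksLikeWgs84(g))
    default:
      return positionsInRange(node.coordinates)
  }
}
```

A file in a projected system such as UTM typically fails this check right away, since its coordinates are in metres and run into the hundreds of thousands.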

Shapefile

UPDATE: Since writing this article, I have been made aware of a different library for processing shapefiles into GeoJSON, https://github.com/calvinmetcalf/shapefile-js. We haven’t used it in a production app yet, but if you are looking for something to consume shapefiles, this may be what you’re looking for.

Brian, Feb 2022

We start with the shapefile, as it is the most ubiquitous file format in the geospatial industry. There are a couple of tools in the NPM ecosystem that one can reach for, and we went with “shpjs”:

https://www.npmjs.com/package/shpjs

Converting EPSG:4326 shapefiles with this library is as straightforward as the library suggests. But simplicity comes at a cost: this library is expensive in terms of bundle size. Here is how we consumed the library in our app:

import { parseZip } from 'shpjs'

export default class Shp {
  constructor(zipContent) {
    this.zipContent = zipContent
  }

  geojson() {
    return parseZip(this.zipContent)
  }
}

TopoJSON

TopoJSON is a flavour of GeoJSON that encodes topology rather than standalone geometries. This format isn’t as common in front-end development, but it is used for applications such as visualization, or for “parcel fabrics” and other feature sets where the boundaries shared between features matter. You can read more about it here:

https://github.com/topojson/topojson

For converting TopoJSON to GeoJSON, we used “topojson-client”, a library that does this for us:

https://www.npmjs.com/package/topojson-client

This library works as advertised. It is a bit harder to get into the proper mindset than with the Shapefile library above, but it adds less than 100 kB to the build size:

import { feature } from 'topojson-client'

export default class TopoJson {
  constructor(content) {
    this.content = JSON.parse(content)
  }

  geojson() {
    const keys = Object.keys(this.content.objects)
    return feature(this.content, this.content.objects[keys[0]])
  }
}
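One thing to watch for: topojson-client’s `feature` is documented to return a `FeatureCollection` when the object is a GeometryCollection and a single `Feature` otherwise, so callers can’t always assume a collection. A minimal sketch of a normalizer follows; `toFeatureCollection` is a hypothetical helper name, not something in our repository, and the array handling is an assumption for converters that may return several results at once.

```javascript
// Normalize a converter result into a single FeatureCollection.
// Accepts a Feature, a FeatureCollection, a bare geometry,
// or an array of any of those (hypothetical helper).
function toFeatureCollection(input) {
  const items = Array.isArray(input) ? input : [input]
  const features = items.flatMap((item) => {
    if (item.type === 'FeatureCollection') return item.features
    if (item.type === 'Feature') return [item]
    // Anything else is treated as a bare geometry and wrapped.
    return [{ type: 'Feature', properties: {}, geometry: item }]
  })
  return { type: 'FeatureCollection', features }
}
```

With this in place, the mapping code only ever has to handle one shape of output.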

CSV using WKT

For CSV files, we made the assumption that every uploaded CSV file would have one column titled “wkt” that holds the geometry of each feature encoded as “Well-Known Text” (WKT). WKT is a markup format that represents a geometry and its coordinates in a plain string, and is a well-known part of the geospatial lexicon.
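To make the format concrete, here is a toy parser that turns a WKT point such as `POINT (30 10)` into a GeoJSON geometry. This is only an illustration of the mapping between the two formats; `parseWktPoint` is a hypothetical function that handles plain 2D points and nothing else, and it is not how the library we use is implemented.

```javascript
// Toy example: convert a 2D WKT POINT into a GeoJSON Point geometry.
// Returns null for anything it does not recognize.
function parseWktPoint(wkt) {
  const number = '(-?\\d+(?:\\.\\d+)?)'
  const pattern = new RegExp(
    `^\\s*POINT\\s*\\(\\s*${number}\\s+${number}\\s*\\)\\s*$`,
    'i',
  )
  const match = pattern.exec(wkt)
  if (!match) return null
  // WKT lists x then y, which maps to [longitude, latitude] in GeoJSON.
  return { type: 'Point', coordinates: [Number(match[1]), Number(match[2])] }
}
```

Real WKT also covers lines, polygons, multi-geometries and more, which is why we reached for a library instead of writing this by hand.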

Here, we used two libraries. The first, “papaparse”, is a lightweight library that parses CSV into a format that is consumable in JavaScript. You could also parse CSV by hand, given its simple rules (the first row is the header, all values are separated by a delimiter, all features are separated by a new line), but this library helps deal with the edge cases. The second library, “wellknown”, parses WKT into a GeoJSON geometry.

import { Parser as CsvParser } from 'papaparse'
import { parse as parseWkt } from 'wellknown'

export default class CSV {
  constructor(csvContent) {
    this.csvContent = csvContent
  }

  geojson() {
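    // Rows come back from the parser as plain arrays of strings.
    // Note that this expects semicolon-delimited input, which keeps the
    // commas inside WKT strings from being read as column breaks.
    const csvParser = new CsvParser({
      delimiter: ';',
      header: true,
    })
    const parsedCsv = csvParser.parse(this.csvContent)
    // The first row is the header; everything after it is data.
    const headerRow = parsedCsv.data[0]
    const dataRows = parsedCsv.data.slice(1)
    // Find the geometry column, matching "wkt" case-insensitively.
    const wktColIdx = headerRow.findIndex((text) => /^wkt$/i.test(text))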
    let features
    if (wktColIdx > -1) {
      features = dataRows
        .map((dataRow) => {
          return dataRow.length > wktColIdx && dataRow[0] !== ''
            ? {
                type: 'Feature',
                properties: {},
                geometry: parseWkt(dataRow[wktColIdx]),
              }
            : undefined
        })
        .filter((feature) => feature !== undefined)
    } else {
      console.error(
        `CSV processor requires one column headed 'wkt' with WKT geometry`,
      )
      features = []
    }
    return {
      type: 'FeatureCollection',
      features: features,
    }
  }
}
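To see the kind of edge case that makes a CSV parser worth the dependency, consider a comma-delimited row whose WKT value itself contains commas. The sketch below is purely illustrative: `splitCsvLine` is a hypothetical, simplified splitter that only understands double-quoted fields on a single line, and it is not a replacement for papaparse.

```javascript
// A comma-delimited row with a quoted WKT value that contains a comma.
const line = '1,"LINESTRING (30 10, 10 30)"'

// Naive approach: the WKT value is torn into two pieces.
const naive = line.split(',') // 3 pieces instead of 2

// Simplified quote-aware splitter (single line, double quotes only).
function splitCsvLine(text, delimiter = ',') {
  const fields = []
  let field = ''
  let inQuotes = false
  for (let i = 0; i < text.length; i++) {
    const ch = text[i]
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"' // an escaped quote inside a quoted field
        i++
      } else if (ch === '"') {
        inQuotes = false
      } else {
        field += ch
      }
    } else if (ch === '"') {
      inQuotes = true
    } else if (ch === delimiter) {
      fields.push(field)
      field = ''
    } else {
      field += ch
    }
  }
  fields.push(field)
  return fields
}

const careful = splitCsvLine(line) // 2 fields, WKT intact
```

The semicolon delimiter in the class above sidesteps this particular problem, but quoted values, embedded line breaks and stray whitespace are the sort of thing a real parser handles for you.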

GeoPackage

GeoPackage is an OGC committee-built file format for sharing geospatial data, and is a format you will not see often in front-end development; at least compared to Shapefiles, GeoJSON and API responses.

At this time, the only package available is the NGA’s “geopackage”:

https://www.npmjs.com/package/@ngageoint/leaflet-geopackage

This library comes in at a whopping 12.4MB, and it also breaks minification efforts in Webpack. As a result, I would not suggest using it in a production application. If you have to support GeoPackage, you would be better off creating an API endpoint or lambda function that carries out the conversion on the backend using proven tools such as GDAL. But if you really want to use this library to convert GeoPackage to GeoJSON, here is how:

import { GeoPackageAPI } from '@ngageoint/geopackage'

export default class GeoPackage {
  constructor(gpkgContent) {
    this.gpkgContent = new Uint8Array(gpkgContent)
  }

  async geojson() {
    const geoPackage = await GeoPackageAPI.open(this.gpkgContent)
    const featureTables = geoPackage.getFeatureTables()
    let features = []
    featureTables.forEach(function (table) {
      try {
        const geoms = geoPackage.queryForGeoJSONFeaturesInTable(table)
        features = features.concat(geoms)
      } catch (err) {
        console.log('Error reading table ' + table, err)
      }
    })

    return {
      type: 'FeatureCollection',
      features: features,
    }
  }
}
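All four classes above expose the same `geojson()` method, but they expect different input: the Shapefile and GeoPackage classes take binary content, while the TopoJSON and CSV classes take text. A minimal sketch of how an upload handler might choose between them is below. The extension list and the names `FORMATS` and `detectFormat` are assumptions for illustration, not code from our repository.

```javascript
// Map a file extension to a converter and to how the file should be read.
// The extensions chosen here are assumptions (hypothetical lookup table).
const FORMATS = new Map([
  ['zip', { format: 'shapefile', readAs: 'arrayBuffer' }],
  ['topojson', { format: 'topojson', readAs: 'text' }],
  ['csv', { format: 'csv', readAs: 'text' }],
  ['gpkg', { format: 'geopackage', readAs: 'arrayBuffer' }],
])

// Returns the matching entry, or null for unsupported files.
function detectFormat(filename) {
  const extension = filename.split('.').pop().toLowerCase()
  return FORMATS.get(extension) ?? null
}
```

In the browser, `readAs` would decide whether to call `file.arrayBuffer()` or `file.text()` on the dropped `File` before handing the content to the matching class.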