It’s All Big Data, Machine Learning and Platforms

If there were one central theme to this year’s SatSummit, it would be: “machine learning will find all the things, but we need more training data.” It is clear that many teams are finding new and useful ways to leverage enormous amounts of imagery. Against a backdrop of copious compute and ingenious learning machines, the remote sensing community is steadily moving towards a feature extraction utopia. That said, it’s fair to say we are not there yet.

Here are a few trends (in no particular order) to pay particular attention to:

  1. Analysis Ready Data (ARD) is an expectation. We will enjoy more products with simpler data structures, consistent metadata, and common understandings of geography and time. The opportunity to combine disparate sources based on their geography and capture time is hugely compelling. ARD may be more interesting for the commercial sector, which wants to sell more products. However, the government Earth Observation (EO) sector is moving very quickly and deliberately. There seems to be a growing desire to publish data more quickly. A case in point is NASA’s commitment to the Open Data Cube.
  2. Get on board with the SpatioTemporal Asset Catalog (STAC)!
  3. ARD is removing the need for data preparation activities (snort!). Remote sensing people are now investing that time in creating training image chips for machine learning.
  4. In an interesting twist, those same RS people are creating lots of image chips so that in the future they will not have to. Once the machines are “trained” there will be a diminished need to train further. Perhaps this activity will become evolutionary, supporting ML to discern features as they change? Training is a “now” problem.
  5. Time is as difficult to solve as ever. It sounds so easy, but complexity abounds.
  6. Ethical discussions are beginning to emerge. Measuring assets from space could be an invasion of privacy. Is it fair to remotely assess the value of a farm?
  7. Most groups seem to be pointing in a similar direction. Our community is moving towards the goals of better feature extraction and management of enormous amounts of imagery. The cloud is playing an enormous role in this process: distributed, elastic computing is essential to modern remote sensing.
  8. Developing in the open is OK. There is a great deal of open source code now readily available. However, even companies that maintain open source code are creating great businesses around deployment and management.
  9. Data pipelines are very popular but are still often custom built.
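
To make the ARD point about combining disparate sources by geography and capture time a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Scene` record, the `pair_scenes` helper, the one-day tolerance, and the sample coordinates and dates are all made up, and it does not use STAC or any real catalogue API. It simply pairs scenes whose bounding boxes overlap and whose capture times are close.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Scene:
    """A hypothetical, minimal record for one image (not a STAC item)."""
    source: str
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat)
    captured: datetime


def bboxes_overlap(a, b):
    """True when two (min_lon, min_lat, max_lon, max_lat) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def pair_scenes(left, right, max_gap=timedelta(days=1)):
    """Pair scenes from two sources that overlap in space and are close in time."""
    return [
        (l, r)
        for l in left
        for r in right
        if bboxes_overlap(l.bbox, r.bbox)
        and abs(l.captured - r.captured) <= max_gap
    ]


# Made-up sample data: one optical scene and three radar scenes.
optical = [Scene("optical", (10.0, 50.0, 11.0, 51.0), datetime(2018, 9, 1, 10))]
radar = [
    Scene("radar", (10.5, 50.5, 11.5, 51.5), datetime(2018, 9, 1, 18)),   # overlaps, 8 hours apart
    Scene("radar", (10.5, 50.5, 11.5, 51.5), datetime(2018, 9, 10, 18)),  # overlaps, 9 days apart
    Scene("radar", (20.0, 50.0, 21.0, 51.0), datetime(2018, 9, 1, 18)),   # no spatial overlap
]

pairs = pair_scenes(optical, radar)
for opt, rad in pairs:
    print(opt.source, opt.captured, "<->", rad.source, rad.captured)
```

Only the first radar scene is paired with the optical one. Even this toy version hints at why time is hard: it ignores time zones, acquisition duration, and scenes that cross the antimeridian, all of which a real pipeline would need to handle.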

Thanks to the team at SatSummit (Mapbox and Development Seed) for putting on another excellent event!