Notes and Descriptions for the Presentation

Slide 1: The Wood for the Trees

Description

The opening slide uses a photo by K. Mitch Hodge from Unsplash as the background. It shows Tollymore Forest Park in Northern Ireland: a stream surrounded by trees in many shades of green. The title, "The Wood for the Trees", is overlaid on top with the subtitle "Visualising Environmental Data" and the author's name, Patrick Ferris.

Discussion

I'm Patrick Ferris, a Research Assistant with the Energy and Environment Group at the Department of Computer Science and Technology.

This presentation is all about visualisations and how they can be used with environmental data. There will be a strong focus on higher-level questions, such as what it means to visualise something and why we should bother. Later, we will create some simple visualisations in the browser. The goal is not to become experts in some library X for doing visualisations; there is simply not enough time in a 60-minute presentation.

Slide 2: Definitions

Description

Two definitions are shown on screen:

You gotta see it in your mind. Can you picture that?

Dr Teeth and the Electric Mayhem

The act or an example of creating an image, etc. to represent something

Cambridge Dictionary

Discussion

What exactly is a visualisation? In English, we usually mean one of two closely related things: either the act of imagining something in your mind, like thinking of the answer to "Where do you see yourself in 5 years?", or the act of creating a representation of something else.

Slides 3 and 4: Why Visualise and How about now?

Description

The first slide has no images, only text. It shows some CSV text with rows of data for points on a map. Each row has a name, a longitude and a latitude.

The second slide shows the same points, this time plotted on a map of Belfast. This makes it much easier to find the two points that are closest to each other.
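
Without the map, finding the closest pair from the raw rows means computing distances. Here is a minimal sketch, assuming rows shaped like those on the slide (name, longitude, latitude). The coordinates below are made up for illustration and are not the data from the talk, and the function names are hypothetical:

```python
import csv
import io
import math
from itertools import combinations

# Hypothetical rows in the same shape as the slide: name, longitude, latitude.
# The coordinates are illustrative values only, not the data from the talk.
CSV_TEXT = """name,longitude,latitude
A,-5.9301,54.5973
B,-5.9345,54.5844
C,-5.8950,54.6040
D,-5.9320,54.5990
"""

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_pair(csv_text):
    """Return (distance_km, name_a, name_b) for the two nearest points."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    best = None
    # Compare every pair of rows; fine for a handful of points.
    for p, q in combinations(rows, 2):
        d = haversine_km(float(p["longitude"]), float(p["latitude"]),
                         float(q["longitude"]), float(q["latitude"]))
        if best is None or d < best[0]:
            best = (d, p["name"], q["name"])
    return best


distance, first, second = closest_pair(CSV_TEXT)
print(f"Closest pair: {first} and {second}, about {distance:.2f} km apart")
```

The code gets to the answer, but only after choosing a distance formula and comparing every pair, which is exactly the work the map lets a viewer skip.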

Discussion

This is a jumping-off point for considering what we are trying to do when we visualise environmental data (or any data, for that matter). What are the trade-offs? What is the agenda? What have we lost along the way?

Here we have a few important points to make:

Slides 5 and 6: Climate Stripes

Description

These two slides show the same image, but the first has been altered to appear as it might to people with protanopia, a colour-vision deficiency that results from an insensitivity to red light. The image is the well-known climate stripes graphic.

Discussion

Accessibility and context matter. A quick caveat: a lot of the points I'm about to raise are actually handled pretty well by the University of Reading's online visualisation, as we shall see.

The climate stripes show the difference between the average global temperature for each year and the 1971-2000 average. From left to right, the stripes span the years 1850-2021, with differences ranging from around -0.6 to +0.6 degrees Celsius.
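
To make the baseline idea concrete, here is a minimal sketch of how one stripe per year could be derived: subtract the baseline-period mean from each yearly value, then map the difference onto a colour ramp. The temperatures are synthetic, the function names are hypothetical, and the simple blue-white-red ramp is an assumption rather than the palette used in the original graphic:

```python
from statistics import mean

# Synthetic yearly mean temperatures in degrees Celsius, invented for illustration.
TEMPERATURES = {1850: 13.6, 1900: 13.7, 1971: 14.0, 1985: 14.1, 2000: 14.2, 2021: 14.7}


def anomalies(temps_by_year, baseline_start, baseline_end):
    """Difference between each year's value and the mean over the baseline years."""
    baseline = mean(
        t for year, t in temps_by_year.items()
        if baseline_start <= year <= baseline_end
    )
    return {year: t - baseline for year, t in temps_by_year.items()}


def stripe_colour(anomaly, limit=0.6):
    """Map an anomaly to a hex colour: blue for cold, white for zero, red for warm."""
    # Clamp to [-1, 1] so values beyond the limit saturate instead of overflowing.
    x = max(-1.0, min(1.0, anomaly / limit))
    if x < 0:
        r = g = round(255 * (1 + x))
        b = 255
    else:
        r = 255
        g = b = round(255 * (1 - x))
    return f"#{r:02x}{g:02x}{b:02x}"


result = anomalies(TEMPERATURES, 1971, 2000)
for year in sorted(result):
    print(year, f"{result[year]:+.2f}", stripe_colour(result[year]))
```

Note that a ramp relying on red versus blue is precisely the kind of encoding that becomes hard to read with protanopia, and that the colours mean nothing without the baseline period and the limit being stated somewhere.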

The first image tries to show how hard the climate stripes are to read, and to draw any good inference from, for someone with protanopia who has no other context.

The second image is the normal climate stripes, as used on the cover of Greta Thunberg's The Climate Book. The point is that, presented like this, they aren't much better: we don't know the time period, the scale, what the baseline is, and so on.

Finally, the third image solves lots of these problems but would probably not look great on the front cover of a book.

Slide 7: Data Formats

Description

A boring slide with bullet points for a few data formats which are explained in the next discussion section.

Discussion

Environmental data comes in all sorts of weird and wonderful formats. We can't get into all of them, or all of the details, but we'll take a quick tour of the most popular ones.

Images: png, jpeg and tif
JSON: GeoJSON and TopoJSON

JSON is a format for representing structured data. It is often easiest to think of it as a dictionary, i.e. a set of key-value pairs.

GeoJSON is a standardised (by the IETF) format built on top of JSON. It is essentially a schema for representing geospatial data in JSON. Here's an example:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
      "properties": {"name": "A"}
    }
  ]
}
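
As a rough illustration of the dictionary analogy, the sketch below loads a feature collection of this shape with Python's standard json module and pulls out the point features. The helper name point_features is hypothetical:

```python
import json

# A feature collection with the same structure as the example in the notes.
GEOJSON_TEXT = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
      "properties": {"name": "A"}
    }
  ]
}
"""


def point_features(geojson_text):
    """Yield (name, longitude, latitude) for each Point feature."""
    collection = json.loads(geojson_text)
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] == "Point":
            # GeoJSON positions are ordered [longitude, latitude].
            lon, lat = geometry["coordinates"][:2]
            yield feature["properties"].get("name"), lon, lat


points = list(point_features(GEOJSON_TEXT))
print(points)
```

Once parsed, the document is just nested dictionaries and lists, which is why GeoJSON is so convenient to work with in the browser and in scripts.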

TopoJSON optimises GeoJSON by utilising the topological information of the geospatial primitives to store less data. Concretely, if you have two polygons that share a boundary (e.g. the border between Scotland and England) then in TopoJSON this will only be stored once whereas in GeoJSON the points will be stored twice.
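
The saving can be sketched with a hand-written, heavily simplified topology for two adjacent squares. This is an illustration of the arc-sharing idea only, not a TopoJSON parser: real files also carry objects, types and optional quantisation, and the names below are hypothetical:

```python
# Shared line segments ("arcs") are stored once and referenced by index.
ARCS = [
    [(1, 0), (1, 0.25), (1, 0.5), (1, 0.75), (1, 1)],  # arc 0: the shared border
    [(1, 1), (0, 1), (0, 0), (1, 0)],                  # arc 1: rest of the left square
    [(1, 0), (2, 0), (2, 1), (1, 1)],                  # arc 2: rest of the right square
]

# Each ring lists arc indexes; a negative index ~i means "arc i, reversed".
LEFT_RING = [0, 1]
RIGHT_RING = [2, ~0]


def stitch(ring, arcs):
    """Rebuild the full list of positions for a ring from its arcs."""
    positions = []
    for index in ring:
        arc = arcs[index] if index >= 0 else arcs[~index][::-1]
        # Skip the first position of every arc after the first: it repeats
        # the last position of the previous arc.
        positions.extend(arc if not positions else arc[1:])
    return positions


left = stitch(LEFT_RING, ARCS)
right = stitch(RIGHT_RING, ARCS)

# GeoJSON would store both full rings; the topology stores each arc once.
geojson_count = len(left) + len(right)
topojson_count = sum(len(arc) for arc in ARCS)
print(geojson_count, topojson_count)
```

The longer and more detailed the shared border, the bigger the difference between the two counts becomes.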

XML: kml

XML (Extensible Markup Language) is a format for storing arbitrary data. KML, the Keyhole Markup Language, adds some geospatial-specific features such as placemarks and polygons. Google Earth uses KML. Here's an example:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
<Placemark>
  <name>Belfast City</name>
  <description>Belfast City</description>
  <Point>
    <coordinates>-76.3833333,17.8833333,0.</coordinates>
  </Point>
</Placemark>
</Document>
</kml>
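
A document like this can be read with Python's standard xml.etree.ElementTree module. The sketch below is one possible approach, with a hypothetical helper name; it reuses the example values as they appear in these notes. KML lists coordinates as longitude, latitude, then an optional altitude:

```python
import xml.etree.ElementTree as ET

# The example document from the notes, as bytes so the XML declaration is honoured.
KML_BYTES = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
<Placemark>
  <name>Belfast City</name>
  <description>Belfast City</description>
  <Point>
    <coordinates>-76.3833333,17.8833333,0.</coordinates>
  </Point>
</Placemark>
</Document>
</kml>
"""

# Every KML element lives in this namespace, so lookups must be qualified.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def placemarks(kml_bytes):
    """Return (name, longitude, latitude, altitude) for each point placemark."""
    root = ET.fromstring(kml_bytes)
    results = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(
            "kml:Point/kml:coordinates", default="", namespaces=KML_NS
        ).strip()
        if not coords:
            continue  # not a point placemark
        lon, lat, *rest = (float(value) for value in coords.split(","))
        altitude = rest[0] if rest else 0.0
        results.append((name, lon, lat, altitude))
    return results


parsed = placemarks(KML_BYTES)
print(parsed)
```

The namespace handling is the main thing that trips people up: searching for a plain Placemark tag without the namespace prefix finds nothing.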

GML (Geography Markup Language) is another XML-based format. It is very useful for incorporating things like sensor data, given that it has primitives for time, units of measurement and so on.

Shapefiles

Shapefiles are quite tightly coupled with GIS software tools. Instead of diving into that here, feel free to have a look at the Wikipedia page. The high-level summary is that they store vector GIS data.

Slides 9 and 10: Copernicus and Seagrass

Description

The first slide shows data from Copernicus:

Copernicus is the Earth observation component of the European Union’s Space programme, looking at our planet and its environment to benefit all European citizens. It offers information services that draw from satellite Earth Observation and in-situ (non-space) data.

The second slide is a screenshot of the NI Forests App for the seagrass section. It shows seagrass polygons in Strangford Lough coupling high-resolution satellite images with structured data.

Discussion

The Copernicus image shows the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR). PAR is the solar radiation in the spectral range that vegetation uses to produce organic material. FAPAR is the fraction of the PAR reaching Earth that is actually absorbed by plants. From it we can infer things like canopy productivity.

The seagrass data is a mixture of opendatani and information from the Seagrass Project who are doing very cool citizen science. The seagrass restoration handbook is an accessible and interesting read on the subject in the UK.

Slide 11: Demo

A demo was then given including:

Slide 12: Accessibility

Throughout the presentation, matters of accessibility were referred to, and in part these notes try to improve a little on that situation as well. However, it would take at least another presentation to talk properly about accessibility in data visualisation.

I thought I would point to one interesting research paper, Understanding and Improving Information Extraction From Online Geospatial Data Visualizations for Screen-Reader Users, in which the VoxLens tool (an open-source multi-modal solution that improves data visualization accessibility) is extended to include geospatial queries.

One of the original papers found:

Our results show that due to the inaccessibility of online data visualizations, screen-reader users extract information 61.48% less accurately and spend 210.96% more time interacting with online data visualizations compared to non-screen-reader users

Clearly there is a lot of room for improvement and research in this area.

Thanks for reading!