Blog article: Portal v2.4.8 – Release Notes

Why are we doing this release?

Open Data is changing how our backend prepares downloadable files for users. We’re doing this for two reasons:

  • Scalability – we were worried our previous logic would consume too much memory on our servers as our datasets grow in size
  • Shareability – we rebuilt this part of our portal as an extension to CKAN (our portal’s backend platform), meaning other open data organizations using CKAN can freely copy our logic from GitHub

This release makes two changes to how data is presented on the portal:

  1. All “geometry” attributes will be converted to their Multi- counterpart. This helps reduce load on our servers when dealing with inconsistencies in geometric data quality from source data (ex: when Point and MultiPoint data are combined into a single schema)
  2. Shapefiles downloaded from datasets of type “Map” will no longer have column names longer than 10 characters converted to the standard FIELD_1, FIELD_2, etc. nomenclature. Those columns will now get a shortened form of the actual column’s name. As before, we will keep a .txt file, zipped in the shapefile, mapping each shortened name to the full name.
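
To illustrate the first change, here is a minimal Python sketch of promoting a GeoJSON-style geometry to its Multi- counterpart. It is not the code used in our extension; the function name to_multi and the dictionary MULTI_TYPES are hypothetical, and the sketch assumes geometries arrive as GeoJSON-like dictionaries.

```python
import json

# Single-part GeoJSON geometry types and their Multi- counterparts.
MULTI_TYPES = {
    "Point": "MultiPoint",
    "LineString": "MultiLineString",
    "Polygon": "MultiPolygon",
}


def to_multi(geometry):
    """Return the geometry promoted to its Multi- type (hypothetical helper)."""
    geom_type = geometry["type"]
    if geom_type in MULTI_TYPES:
        # A Multi- geometry is a list of single-part coordinate sets,
        # so promotion only needs one extra level of nesting.
        return {
            "type": MULTI_TYPES[geom_type],
            "coordinates": [geometry["coordinates"]],
        }
    # Already a Multi- geometry (or another type): leave it unchanged.
    return geometry


point = {"type": "Point", "coordinates": [-79.38, 43.65]}
print(json.dumps(to_multi(point)))
```

Once every record is a Multi- geometry, a dataset that mixes Point and MultiPoint rows ends up with a single geometry type in its schema.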

Changes to the structure of downloaded data will not affect all datasets immediately – these changes will reach each dataset as it is updated through its existing refresh process.

Release notes

Open Data Toronto CKAN Extension

https://github.com/open-data-toronto/ckan-customization-open-data-toronto
Tag v2.4.8

  • Iotrans module replaced with ckanext-iotrans
  • Python files refactored to follow PEP8 and PEP257
  • Following Python dependencies removed:
    • Pandas
    • Geopandas
    • Shapely
    • Pyproj
    • Numpy
    • Iotrans

Iotrans CKAN Extension

Tag v1.2.3

  • Replaces the iotrans Python module for file format and EPSG conversion
  • Streams data from disk, reducing memory load significantly
  • Is a CKAN extension, meaning it can be used by other CKAN-using organizations freely
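
As a rough illustration of what streaming means here, the Python sketch below converts rows one at a time instead of loading a whole file into memory. It is a generic example of the idea, not the code in ckanext-iotrans; the function names stream_rows and write_json_lines are hypothetical, and io.StringIO objects stand in for files on disk.

```python
import csv
import io
import json


def stream_rows(source):
    """Yield one row at a time as a dict, so only one row is held in memory."""
    for row in csv.DictReader(source):
        yield row


def write_json_lines(rows, target):
    """Write each row as soon as it arrives and return the number written."""
    count = 0
    for row in rows:
        target.write(json.dumps(row) + "\n")
        count += 1
    return count


# In-memory stand-ins for an input file and an output file on disk.
source = io.StringIO("id,name\n1,High Park\n2,Trinity Bellwoods\n")
target = io.StringIO()
written = write_json_lines(stream_rows(source), target)
print(written, "rows converted")
```

Because the generator hands over one row at a time, memory use stays roughly constant no matter how many rows the dataset contains.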

Changes

  • All geometries are considered “Multi” geometries to ensure there are no geometry type conflicts within a spatial dataset schema
  • All “geometry” fields in created CSV files now use square brackets instead of round parentheses
  • All output shapefile ZIPs will:
    • have their column names reduced to the first 7 characters plus an incrementing integer. This ensures output shapefiles will not violate a shapefile’s 10-character column name length limit and will have uniquely named columns that resemble their original names
    • continue to have a .txt file zipped with them mapping the created name to the original name
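
The shortening rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the extension's actual code: the function name shorten_column_names is hypothetical, names of 10 characters or fewer are assumed to be kept as-is, and a single counter is assumed to be shared across all shortened columns.

```python
def shorten_column_names(names, max_len=10, prefix_len=7):
    """Map each original column name to a shapefile-safe name (hypothetical helper)."""
    mapping = {}
    # Names that already fit are kept, so reserve them to avoid collisions.
    used = {name for name in names if len(name) <= max_len}
    counter = 1
    for name in names:
        if len(name) <= max_len:
            mapping[name] = name
            continue
        # First 7 characters of the original name plus an incrementing integer.
        candidate = f"{name[:prefix_len]}{counter}"
        while candidate in used:
            counter += 1
            candidate = f"{name[:prefix_len]}{counter}"
        used.add(candidate)
        mapping[name] = candidate
        counter += 1
    return mapping


columns = ["id", "neighbourhood_name", "neighbourhood_code"]
for original, short in shorten_column_names(columns).items():
    print(short, "<-", original)
```

The returned dictionary holds the same information as the .txt file zipped with each shapefile: every created name paired with its original name.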