Step 4: Maintaining open data
The open data journey does not stop once a dataset is published; divisions are expected to maintain and, where possible, improve their existing open datasets over time.
Roles and responsibilities
| During the maintenance phase, divisions should… | The Open Data Team provides support by… |
Data quality guidelines
To help divisions assess data quality and identify opportunities to improve, every open dataset on the City’s portal is assigned an automated quality score based on five factors:
- Freshness (35%): is the data being updated according to the stated schedule?  
For example, if a dataset is supposed to be updated weekly, but it hasn’t been updated in two months, its score will be reduced. 
- Metadata (35%): has the required metadata for the dataset been provided by the data owner?  
A dataset’s score will be reduced if required metadata fields are empty, or if insufficient metadata has been provided. 
- Accessibility (15%): is the data appropriately tagged so users can easily find it, is it automatically updated, and can it be easily previewed or visualized by users?   
A dataset’s score will be reduced if it lacks appropriate tags, requires manual updates, or is not stored in the Open Data database (which allows files to be easily previewed and accessed in multiple formats). 
- Completeness (10%): is the data, per the City’s policy, exhaustive? Or is data missing or inconsistent?   
A dataset’s score will be reduced if it contains more than 50% null values (a null value indicates that no value was recorded, which is not the same as a value of zero). 
- Usability (5%): is the data organized in a way that can be easily understood by users?   
A dataset’s score will be reduced if fewer than one-fifth of the column names have meaningful English components, or if all of a single column’s values contain “NA” or a similar value. 
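As an illustration only, the completeness and usability checks described above could be approximated before publishing with a short script like the one below. This is a minimal sketch, not the portal's actual scoring code: the function names, the `NA_LIKE` list of placeholder strings, and the choice to treat empty cells as "NA or a similar value" are all assumptions made for the example.

```python
from typing import List, Optional, Sequence

# Assumption: placeholder strings treated as "NA or a similar value".
NA_LIKE = {"", "na", "n/a", "null", "none"}


def null_fraction(rows: Sequence[Sequence[Optional[str]]]) -> float:
    """Return the share of cells that are null (None).

    A cell holding "0" is a real value and is not counted as null.
    """
    cells = [cell for row in rows for cell in row]
    if not cells:
        return 0.0
    return sum(cell is None for cell in cells) / len(cells)


def all_na_columns(header: Sequence[str],
                   rows: Sequence[Sequence[Optional[str]]]) -> List[str]:
    """Return the names of columns in which every value is null or NA-like."""
    flagged = []
    for index, name in enumerate(header):
        values = [row[index] for row in rows]
        if values and all(
            value is None or str(value).strip().lower() in NA_LIKE
            for value in values
        ):
            flagged.append(name)
    return flagged


# Hypothetical sample data: two rows, three columns.
sample_header = ["count", "notes", "status"]
sample_rows = [["0", None, "NA"], ["5", None, "n/a"]]

share_null = null_fraction(sample_rows)                      # 2 of 6 cells are null
empty_columns = all_na_columns(sample_header, sample_rows)   # "notes" and "status"
```

A division could run a check like this on a file before submitting it, then review any column flagged as entirely empty and any dataset where more than half of the cells are null.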
The scores for each individual factor combine to provide an overall score:
- Datasets with a score of 80% or above are gold
- Datasets with a score of 60% to 79% are silver
- Datasets scoring 59% or lower are bronze
 
These scores and the information used to calculate them are shared publicly on each dataset’s page.
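To show how the factor weights and medal thresholds fit together, here is a minimal sketch. It assumes each factor is scored from 0 to 100 and that the overall score is a simple weighted sum; the guidance above gives the weights but does not spell out the exact formula, so treat the calculation as illustrative. The names `WEIGHTS`, `overall_score`, and `medal` are hypothetical.

```python
from typing import Dict

# Weights as listed in the data quality guidelines above.
WEIGHTS: Dict[str, float] = {
    "freshness": 0.35,
    "metadata": 0.35,
    "accessibility": 0.15,
    "completeness": 0.10,
    "usability": 0.05,
}


def overall_score(factor_scores: Dict[str, float]) -> float:
    """Combine per-factor scores (0 to 100) into an overall score (0 to 100).

    Assumption: the overall score is a weighted sum of the factor scores.
    """
    missing = WEIGHTS.keys() - factor_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


def medal(score: float) -> str:
    """Map an overall score to gold, silver, or bronze.

    Assumption: any score below 60 counts as bronze, including
    fractional values between 59 and 60.
    """
    if score >= 80:
        return "gold"
    if score >= 60:
        return "silver"
    return "bronze"


# Hypothetical dataset: perfect on every factor except freshness.
example = {
    "freshness": 0,
    "metadata": 100,
    "accessibility": 100,
    "completeness": 100,
    "usability": 100,
}
example_medal = medal(overall_score(example))
```

Under these assumptions, a dataset that is otherwise perfect but scores zero on freshness would reach roughly 65% and be rated silver, which shows how heavily the update schedule counts toward the overall rating.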
Timeliness guidelines
When an open dataset is published, the owning division must specify an update frequency. This update schedule is included in the dataset’s metadata and communicated to users on the open data portal. For example, the Open Data Team maintains an open dataset of web analytics for the City’s open data portal, and it is updated monthly.
Divisions are responsible for ensuring datasets are updated according to their listed schedule. If a dataset is consistently not updated on schedule, its data quality score will be reduced.
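A division that wants to monitor its own datasets could check each one against its listed schedule with something like the sketch below. The frequency labels and the number of days allowed for each are assumptions made for the example, as is the `days_overdue` helper; the portal may define its schedules differently.

```python
from datetime import date, timedelta

# Assumption: approximate refresh periods for each listed update frequency.
EXPECTED_PERIOD = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),
    "annually": timedelta(days=366),
}


def days_overdue(last_updated: date, frequency: str, today: date) -> int:
    """Return how many days past its expected refresh date a dataset is.

    Returns 0 when the dataset is still within its update window.
    """
    if frequency not in EXPECTED_PERIOD:
        raise ValueError(f"unknown update frequency: {frequency!r}")
    due = last_updated + EXPECTED_PERIOD[frequency]
    return max(0, (today - due).days)


# Hypothetical case: a weekly dataset last updated two months ago.
weekly_lag = days_overdue(date(2024, 1, 1), "weekly", today=date(2024, 3, 1))

# Hypothetical case: a monthly dataset updated two weeks ago.
monthly_lag = days_overdue(date(2024, 2, 15), "monthly", today=date(2024, 3, 1))
```

In the first case the dataset is well past its weekly refresh date; in the second it is still on schedule, so the function returns 0.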
It is best practice for open datasets to be updated automatically and as frequently as possible, to ensure the public has access to the most timely and relevant data.
If there is demand from the public or an identified business need to update a dataset more frequently than its listed schedule, the Open Data Team can help.