Step 1: Identify and prioritize data
The first step in the open data process is to consider the data you collect, analyze and use, and identify potential open datasets. Then you can decide how to prioritize those datasets for publication.
Roles and responsibilities
| When identifying and prioritizing data, divisions should… | The Open Data Team provides support by… | 
|---|---|
  | 
  | 
Flagging open data in staff reports
City Councillors may expect that data referenced in a staff report is also available on the Open Data Portal (see 2021 EX22.13).
For assistance in assessing the appropriateness, value, and priority of data referenced in a staff report, consult our Guidance for Staff Report Writers.
We recommend assessing data that:
- Is City-held;
 - AND substantially informs any “hot” or “major/strategic” report intended for a standing committee.
 
In such cases, contact the Open Data Team and flag “Open Data Implications” in the Agenda Forecasting System.
Where appropriate, the Open Data Team will help develop an accelerated timeline to open the data at the same time as, or shortly after, the report is published. In other instances, the potential open dataset should be added to the division’s data inventory, the contents of which are prioritized annually.
Inventorying data
As part of their participation in the Open Data program, Divisions must create inventories of the datasets held in their trust. These inventories are reviewed annually and updated with new datasets surfaced through divisional activities and communications. The resulting inventories are also made available on the Open Data Portal.
Datasets should not be excluded from divisional inventories based on privacy or confidentiality concerns. The goal is to provide a holistic sense of what data is available, so the City can make informed decisions about which data to prioritize for publication.
Inventories should include key information about the data, including:
- The name and description of the dataset;
 - The dataset’s sensitivity (including whether the source data contains personal or confidential information);
 - The relative value of the data to users and its publishing priority (see the prioritization section for more information);
 - The data’s source system (e.g. the enterprise system in which the data is stored), if applicable;
 - Information about who owns, administers or stewards the data.
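As an illustration only, the inventory fields listed above could be captured in a simple record like the one sketched below. The class name, field names and sample values are all hypothetical; they are not the City's actual inventory template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InventoryEntry:
    """One row in a divisional data inventory (hypothetical field names)."""
    name: str
    description: str
    sensitivity: str                      # "Low", "Medium", "High" or "Critical"
    has_personal_or_confidential: bool    # does the source data contain such information?
    value: str                            # "High", "Medium" or "Low"
    priority: str                         # "P1" to "P4"
    steward: str                          # who owns, administers or stewards the data
    source_system: Optional[str] = None   # only recorded if applicable


# Illustrative values only; this is not a real dataset.
entry = InventoryEntry(
    name="Example dataset",
    description="Placeholder description for illustration",
    sensitivity="Low",
    has_personal_or_confidential=False,
    value="Medium",
    priority="P2",
    steward="Example Division",
)
```

Keeping sensitive datasets in the same structure, rather than leaving them out, matches the guidance that inventories should give a holistic view of what data exists.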
 
Prioritizing open data
The City’s Open Data Policy focuses on quality over quantity. Divisions are not expected to publish high volumes of data, but they are expected to prioritize publishing high-value data, e.g. data that the public wants and is likely to use.
When prioritizing potential open datasets – whether as part of the annual inventorying process or in more ad hoc cases such as staff reports or dashboards – divisions should use their best judgment to rank the data’s value and sensitivity. These rankings can then be used to calculate a dataset’s priority for publication.
This process ensures that datasets that are high value and can be opened easily and responsibly are published first.
Value
Value is a measure of a dataset’s demand and potential for impact, and can be ranked high, medium or low.
| High | 
  | 
| Medium | 
  | 
| Low | 
  | 
Sensitivity
A dataset’s sensitivity is a determination of how suitable the dataset is for public release, and whether review or mitigation is required to safeguard private, confidential or sensitive information.
For additional information on how to assess a dataset’s sensitivity, please consult the City’s Information Protection Classification Standard.
| IM Protection Classification | Description | Sensitivity | Open Data Considerations | 
|---|---|---|---|
| Public | Records that are or can be available to the public without restriction, including any records that can be accessed without a Freedom of Information request or routine disclosure request. | Low | Can be published as open data without review | 
| Routinely Disclosed | Operational and administrative records that can be released without a Freedom of Information request, including any information available in a division’s routine disclosure plan. | Low | Can be published as open data without review |
| Exempt-for-review | Records that are exempt under Part 1 of MFIPPA. For examples, please consult the City’s Information Protection Classification Standard (page 8). | Medium | May be published as open data, but only after review, and only if sufficient mitigations can be put in place |
| Excluded | Sensitive or confidential information that has restrictions on its access. For examples, please consult the City’s Information Protection Classification Standard (page 8). | High | May be published as open data, but only after review, and only if sufficient mitigations can be put in place |
| Personal or Personal Health Information | Recorded information about an identifiable individual. For examples, please consult the City’s Information Protection Classification Standard (page 8). | Critical | Cannot be published as open data unless personal information can be de-identified and/or aggregated, and then only after review by relevant subject matter experts (SMEs) |
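The table above can be read as a simple mapping from classification to sensitivity. The sketch below is one possible encoding; the names `SENSITIVITY_BY_CLASSIFICATION` and `needs_review` are hypothetical, and the review rule (anything above Low needs review) is an interpretation of the "Open Data Considerations" column rather than an official rule.

```python
# IM Protection Classification -> sensitivity ranking, as listed in the table.
SENSITIVITY_BY_CLASSIFICATION = {
    "Public": "Low",
    "Routinely Disclosed": "Low",
    "Exempt-for-review": "Medium",
    "Excluded": "High",
    "Personal or Personal Health Information": "Critical",
}


def needs_review(classification: str) -> bool:
    """Return True if the table calls for review before publication.

    Assumption: only Low-sensitivity classifications can be published
    without review; every other level requires review first.
    """
    return SENSITIVITY_BY_CLASSIFICATION[classification] != "Low"
```

For example, `needs_review("Public")` returns `False`, while `needs_review("Excluded")` returns `True`.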
Restricted data
For the purposes of inventorying and prioritizing data, divisions may also classify a dataset as restricted. This should only be done in cases where:
- The dataset contains information that, if released, could lead to harm to the public or jeopardize City operations;
 - AND where no mitigations are available to effectively safeguard, de-identify or remove that information;
 - OR where mitigations would eliminate the usefulness of the dataset for the public.
 
If a division decides to classify a dataset as restricted, the dataset should still be listed in the division’s inventory and marked as restricted, and a rationale for the restriction must be noted.
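One way to read the criteria above is: the dataset could cause harm, and either no effective mitigation exists or the available mitigation would make the data useless to the public. The sketch below encodes that reading. It is an interpretation, not official logic, and the function and parameter names are hypothetical.

```python
def may_classify_as_restricted(
    could_cause_harm: bool,
    mitigations_available: bool,
    mitigations_preserve_usefulness: bool,
) -> bool:
    """Check the restricted-data criteria under one reading of the list.

    Assumed reading: harm AND (no mitigations OR mitigations remove usefulness).
    """
    if not could_cause_harm:
        return False
    if not mitigations_available:
        return True
    # Mitigations exist; restriction applies only if they would
    # eliminate the dataset's usefulness for the public.
    return not mitigations_preserve_usefulness
```

Under this reading, a harmful dataset that can be de-identified while staying useful would not qualify as restricted and would instead go through the normal sensitivity review.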
Priority
A dataset’s priority is the order in which it should be made available as open data relative to other datasets belonging to the division. A dataset’s initial priority is automatically calculated based on its value and sensitivity.
| Sensitivity | Low value | Medium value | High value |
|---|---|---|---|
| Low | P2 | P2 | P1 |
| Medium | P3 | P2 | P2 |
| High/Critical | P4 | P3 | P2 |
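The matrix above amounts to a lookup keyed on sensitivity and value. A minimal sketch follows, assuming the rankings are stored as plain strings; the names `PRIORITY_MATRIX` and `initial_priority` are hypothetical and do not describe how the City's tooling actually performs the calculation.

```python
# Rows are sensitivity, columns are value, as in the priority matrix.
_HIGH_OR_CRITICAL = {"Low": "P4", "Medium": "P3", "High": "P2"}

PRIORITY_MATRIX = {
    "Low": {"Low": "P2", "Medium": "P2", "High": "P1"},
    "Medium": {"Low": "P3", "Medium": "P2", "High": "P2"},
    "High": _HIGH_OR_CRITICAL,
    "Critical": _HIGH_OR_CRITICAL,  # High and Critical share one row
}


def initial_priority(value: str, sensitivity: str) -> str:
    """Look up a dataset's initial publication priority."""
    return PRIORITY_MATRIX[sensitivity][value]
```

For instance, `initial_priority("High", "Low")` gives `"P1"`: a high-value, low-sensitivity dataset is the first candidate for publication.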
While this provides a simple way to evaluate a dataset’s priority, other factors may affect a division’s ability to publish a dataset.
For example, if it is very easy to publish and maintain a P2 dataset, it may be moved up to P1. Or if there are major concerns about data quality or accuracy for a P1 dataset, it may be wise to adjust it to P2 until those concerns are addressed.
Divisions are only expected to provide an adjusted priority in cases where that adjustment influences their annual publication plan (e.g. when a P1 dataset is adjusted to be a lower priority for that year). If a dataset’s priority is adjusted, a rationale should be included in the division’s inventory.
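As a rough sketch of the adjustment rule described above, an inventory record could carry both the calculated priority and an optional adjusted one, and reject an adjustment that lacks a rationale. The class, field and method names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PriorityRecord:
    dataset: str
    initial: str                     # priority calculated from value and sensitivity
    adjusted: Optional[str] = None   # set only when the division overrides it
    rationale: Optional[str] = None  # expected whenever the priority is adjusted

    def effective(self) -> str:
        """Priority to use in the annual publication plan."""
        return self.adjusted or self.initial

    def validate(self) -> None:
        """Raise ValueError if an adjustment has no rationale."""
        if self.adjusted and self.adjusted != self.initial and not self.rationale:
            raise ValueError(f"{self.dataset}: adjusted priority needs a rationale")


# Illustrative values only.
record = PriorityRecord(
    dataset="Example dataset",
    initial="P1",
    adjusted="P2",
    rationale="Data quality concerns to be resolved first",
)
record.validate()
```

Here `record.effective()` returns `"P2"`, while a record without an adjustment falls back to its initial priority.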
The table below lists some considerations that may influence the decision to move an individual dataset up or down the priority list.
| Factors that may increase priority | Factors that may decrease priority* | 
|---|---|
  | 
  |