Blog article: Exploring the Future of Open Data: Hosting Community-Generated Datasets on Toronto’s Open Data Portal

Written by Toronto’s Open Data Team member – Toronto Urban Fellow Angel Li.
When we think about open data, we often think about government-published datasets—like transit schedules, water quality data, or demographic information. But what if Toronto’s open data portal could go beyond simply offering city-owned data and start hosting datasets created by community organizations, researchers, and academic institutions?
That’s the big idea the Open Data team has been exploring. Bringing community-generated data into the City’s open data portal could lead to positive impacts for civic engagement, innovation, and collaboration.
However, it also comes with challenges that must be carefully considered to ensure it’s done equitably and responsibly.
Learning from Other Cities
For the purpose of this blog, community-generated data refers to data voluntarily created by non-governmental entities, such as community organizations, private entities, and academic institutions, without government direction or oversight.
To get a sense of what’s possible, we looked at how other jurisdictions approach community-generated data. Here are some key insights:
- France has developed a national geo-referenced address database where citizens can report address information, helping improve data quality and accuracy.
- Spain operates a national open data portal that includes datasets from the private sector and academia, not just government agencies.
- Finland allows anyone to upload datasets to its open data portal. While this open model fosters inclusivity, it has also led to challenges with data ownership and content moderation.
- Ottawa and Montreal take a more controlled approach, primarily sharing datasets from organizations that already have formal partnerships with the city.
Each of these models presents different benefits and challenges, and there’s no one-size-fits-all solution. The key takeaway? There’s value in exploring this idea further, but careful planning is necessary to understand if – and how best – it could work in Toronto.
What the Community Thinks
We also spoke with members of Civic Tech Toronto (CTTO), a local group dedicated to civic technology and data projects, to gain insights from those actively involved in creating community-generated data. Their response? Enthusiastic but cautious. Here’s what they highlighted:
Potential Benefits
- Access to Resources: Community groups often lack the resources to maintain and host large datasets. The City’s open data portal could ease this burden.
- More Data, More Collaboration: Easier data sharing is expected to generate more datasets, spark new partnerships and drive innovative projects.
- Greater Visibility: It is believed that the City’s platform could help community-generated datasets reach a wider audience.
- Credibility and Validation: Data owners feel that having their data hosted on the City’s portal could increase its legitimacy, encouraging policymakers to use it in decision-making.
Potential Challenges
- Political Sensitivities: Many of the groups creating community data are advocacy organizations that want the City to adopt a specific policy or approach. It remains an open question whether it’s appropriate for the City to host such data, and how that relationship might work.
- Data Ownership and Control: How much control would community groups have over updating or removing their data?
- Sustainability: Could data owners commit to keeping their datasets current and accurate over time, in line with the City’s approach to its own data?
Moving Forward Responsibly
The idea of making Toronto’s open data portal a place to find information about the City, not just from the City, is interesting, but it requires a thoughtful approach.
Based on our research so far, here are some potential paths forward:
- Begin with Trusted Partners: Should the City decide to host community-generated data, it might be wise to follow other jurisdictions’ lead and start by collaborating with well-established organizations that have strong data governance practices in place. This could include universities or larger NGOs that already have data relationships with the City.
- Develop Clear Inclusion Criteria: Regardless of where a dataset comes from, it’s important that users feel that data on the portal is reasonably accurate and trustworthy. At the same time, the City shouldn’t be assuming responsibility for third-party data. Navigating that starts with developing clear standards and expectations that community-generated data would need to meet to be included on the open data portal.
- Communicate Clearly: Users of the open data portal should be able to easily understand whether a dataset comes from the City or a third party. That way they can decide whether and how best to use the data. If the City does dip its toes into hosting community-generated data, we should clearly differentiate it from City-owned data and include a disclaimer noting that hosting data doesn’t imply endorsement and that third-party organizations are responsible for their data’s quality and content.
What’s Next?
We’re continuing to explore the feasibility of hosting community-generated data on the open data portal. There’s potential value in making the City’s open data ecosystem more inclusive, but it’s essential to strike the right balance between inclusivity, accountability, and utility.
If you have thoughts on this initiative, we’d love to hear from you! How do you see community-generated data playing a role in Toronto’s open data landscape? Reach out to opendata@toronto.ca with any questions or ideas. Let’s keep the conversation going!