Urban Data Centre

Introduction 

The plethora of sources of urban data, be it open city data, IoT data, or data from third parties, presents both opportunities and challenges for researchers and policymakers. The opportunities lie in what the scale, breadth and depth of the data enable practitioners and researchers to achieve; the challenges lie in obtaining usable results given the quality, sparseness, validity, interoperability and relevance of the data.

The mission of the Urban Data Centre (UDC) is to enhance the design, planning and operations of cities by expanding the opportunities presented by data and addressing the challenges inherent in data. To achieve this mission, the Centre will initially pursue five objectives, namely the development of:

  • a Canadian Urban Data Repository, 

  • standards for the representation of urban data, 

  • tools for the management of urban data, 

  • tools to support the analysis and interpretation of urban data, and 

  • a national Urban Data Network.


Work Streams 

WS 1: Canadian Urban Data Repository 

The Canadian Urban Data Repository (CUDR) will be the first open, authoritative and crowd-sourced urban data repository. It will be the premier repository for those who wish to access Canadian urban data. It will provide urban researchers with a vastly broader set of data and data sources, enabling a richer set of analyses. It will support a wide array of Canadian urban research that has hitherto been impossible because of limited access to data. CUDR will also spawn an entire ecosystem of research, development and productization around the issues of data curation, validity, authenticity/integrity, cleaning, trust, relevance, and fusion. In addition, issues of access, ownership and privacy can be explored. It will fuel research in Computer Science, Statistics, Machine Learning, Operations Research, the Social Sciences, and Information Science, to name a few.

 

WS 2: Standards for the representation of urban data 

A common data model enables cities to share information, plan, coordinate, and execute city tasks, and support decision making within and across city services, by providing a precise, unambiguous representation of information and knowledge commonly shared across city services. This requires a clear understanding of the terms used in defining the data, as well as how they relate to one another.

To motivate the need for a standard urban data model, consider the evolution of cities. Cities deliver physical and social services that have traditionally operated as silos. If, in the process of becoming smarter, transportation, social services, utilities, etc. were each to develop their own data models, the result would be smarter silos. To create truly smart cities, data must be shared across these silos, which can only be accomplished through a common data model. For example, “Household” is a category of data commonly used by city services. Members of Households are the source of demand for transportation, housing, education, and recreation. A Household record represents who occupies a home, along with their ages, occupations, workplaces, abilities, etc. Though each city service may gather and/or use different aspects of a Household, much of the data needs to be shared among services.
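As a rough illustration of the Household example, the sketch below shows one shared record being read by two different city services. It is not the Centre's data model: the class names, field names, identifiers and the school-age range are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """One household member; every field name here is illustrative."""
    age: int
    occupation: Optional[str] = None
    work_location: Optional[str] = None  # e.g. a zone identifier (assumed)


@dataclass
class Household:
    """A shared Household record that several city services could read."""
    household_id: str
    dwelling_address: str
    members: List[Person] = field(default_factory=list)


def commuters(household: Household) -> List[Person]:
    """Transportation view: members with a known work location."""
    return [m for m in household.members if m.work_location is not None]


def school_age_members(
    household: Household, low: int = 5, high: int = 17
) -> List[Person]:
    """Education view: members whose age falls in an assumed school-age range."""
    return [m for m in household.members if low <= m.age <= high]


# Made-up sample record, used only to exercise the two views.
example = Household(
    household_id="H-001",
    dwelling_address="123 Example St",
    members=[
        Person(age=41, occupation="nurse", work_location="Zone-A"),
        Person(age=9),
    ],
)

print(len(commuters(example)), len(school_age_members(example)))
```

Both functions read the same record, so neither service needs its own copy of the Household data with its own definitions.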

Our goal is to identify, define, formalize and validate city-level concepts. In particular, our objectives are to: 

  1. Identify candidate concepts and their use cases through a detailed analysis of existing city vocabularies, ontologies and city enterprise software data models.

  2. Reduce the candidate concepts from the different sources to a minimal set of core concepts.

  3. Manage and curate an open process in which Standards Development Organizations (SDOs) and other experts can comment on the core concepts (defined in step 2) and post modifications, use cases and new concepts.

  4. Formalize and evaluate the resulting concepts from step 3.

  5. Promulgate the results through the Standards Council of Canada and the International Organization for Standardization (ISO) so that they will be adopted by cities around the world.

CUDR provides the platform to identify and evaluate alternative ontologies.

 

WS 3: Tools for the management of urban data 

The focus of this work stream is the research and development of tools that support the integration of multi-sourced urban data. Urban data, specifically data sourced from governments, the web and social media, is a morass in which the validity of information varies widely, mirroring both the beliefs and agendas of groups and individuals in society and the messiness of data provided by human and machine sensor nets. The problem we face is how to distinguish valid information from information that is fake or simply incorrect. What is a truly reliable source of valid information? Is it a government? Certainly some governments are more reliable than others. Is it a "trusted source" such as a newspaper? The same holds true for newspapers. The source of the problem lies at the core of our information society: much of the web, and particularly social media, is crowd-sourced, so by its nature it is impossible to enforce any scheme for ensuring information quality and validity. Consequently, we have to rely on evidence "buried" in the web itself to determine the degree of validity of any piece of information.

Predicated on the existence of a shared ontology, as described in work stream 2, this work stream focuses on tools to support integration. 

 

WS 4: Tools to support the analysis of urban data 

This work stream explores software tools to support urban analysis, design, planning and operations. Tools fall into four categories:

  1. Data transformation. Data transformation tools focus on the syntactic and semantic transformation of data so that it can be integrated and consumed by an analysis process.

  2. Data analysis. Data analysis tools include software libraries for statistical analysis, machine learning, and visualization.

  3. Analysis process definition. Analysis process definition tools enable the definition of an analysis process/workflow, specifying the operations to be performed and the flow of data amongst them.

  4. Experiment management. Experiment management tools support the definition and archiving of urban data analysis experiments. The searchable archive includes the datasets used, the analysis process definition, and the analysis results.
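To make the last two categories more concrete, here is a minimal sketch of an analysis process defined as an ordered list of operations, and an experiment record that archives the datasets used, the process definition and the results. All names (`Step`, `AnalysisProcess`, `Experiment`, `run_experiment`, the dataset label) are hypothetical, and the strictly linear flow is a simplification; real workflows may branch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One named operation in an analysis process."""
    name: str
    operation: Callable[[Any], Any]


@dataclass
class AnalysisProcess:
    """An ordered workflow: the output of each step feeds the next."""
    name: str
    steps: List[Step]

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step.operation(data)
        return data


@dataclass
class Experiment:
    """Archive record: datasets used, process definition, and results."""
    datasets: List[str]
    process: AnalysisProcess
    results: Dict[str, Any] = field(default_factory=dict)


def run_experiment(
    datasets: List[str], process: AnalysisProcess, data: Any
) -> Experiment:
    """Run the process on the data and return a record suitable for archiving."""
    output = process.run(data)
    return Experiment(datasets=list(datasets), process=process,
                      results={"output": output})


# Toy input: commute times in minutes, with one missing value.
raw = [30, None, 45, 15]

process = AnalysisProcess(
    name="mean-commute",
    steps=[
        # Data transformation: drop missing values.
        Step("drop-missing", lambda xs: [x for x in xs if x is not None]),
        # Data analysis: arithmetic mean of the remaining values.
        Step("mean", lambda xs: sum(xs) / len(xs)),
    ],
)

experiment = run_experiment(["commute-survey-sample"], process, raw)
print(experiment.results["output"])  # prints 30.0
```

Keeping the process definition inside the experiment record is what would let an archived experiment be searched for and re-run later.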
     

WS 5: National Urban Data Network 

The School of Cities, in partnership with universities and libraries across the country, will facilitate the creation of a National Urban Data Network that will provide researchers and policymakers across Canada with unprecedented access to multi-sourced urban data, potentially leading to revolutionary new insights into how cities function. The network will be composed of curators housed in libraries across Canada. Their role will be both proactive and reactive: proactively searching for new datasets, and reactively responding to requests for data from Canadian researchers. Curators will:

  1. identify sources of urban data,

  2. secure rights to use the data,

  3. annotate the data with metadata covering ownership, usage license, quality, etc., and

  4. deposit the data into the Canadian Urban Data Repository.

The network of curators will be supported by the UDC.
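The curators' annotation step could look something like the sketch below, which records ownership, usage license and a quality note for a dataset and flags any field left blank before deposit. The field names and sample values are assumptions for illustration only; the source does not specify a metadata schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class DatasetMetadata:
    """Curator annotations for one dataset; all field names are illustrative."""
    title: str
    source: str
    owner: str
    usage_license: str
    quality_note: str


def missing_fields(meta: DatasetMetadata) -> List[str]:
    """Return the names of fields left blank, so they can be filled in before deposit."""
    return [name for name, value in asdict(meta).items() if not value.strip()]


# Made-up record; the license has not been secured yet, so it is blank.
record = DatasetMetadata(
    title="Sample transit ridership counts",
    source="hypothetical municipal open data portal",
    owner="Example City",
    usage_license="",
    quality_note="monthly totals; two months missing",
)

print(missing_fields(record))  # prints ['usage_license']
print(json.dumps(asdict(record), indent=2))
```

A check like this would sit between securing rights and depositing the data, so that a dataset without a recorded license does not enter the repository.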

 

Watch this space for updates by Fall 2021