Metadata management principles

Principle

Final | March 2019 | v1.0.0 | PUBLIC | QGCIO

Principles at a glance

  1. Standardised: Agencies select and use existing standard metadata schemas to comprehensively describe their information
  2. Contextualised: Agencies create, maintain or use data dictionaries and/or business glossaries to provide greater understanding and context in relation to their information
  3. Managed: Metadata is managed throughout the information lifecycle
  4. Published: Agencies make metadata accessible and discoverable
  5. Automated: Agencies ensure that metadata capture, modification, management and monitoring is time and resource efficient and employ automated processes where practical

Introduction

Metadata plays an increasingly critical role, not only in managing an agency's internal information, but also as an enabler for sharing and re-use. Decision making, research, analytics, and information sharing all depend on reliable content, context and quality of information, and metadata is the key to providing this information.

The uses of metadata are evolving. At its most basic, metadata is “data about data”, but increasingly it provides a mechanism to do more than simply describe and curate information. In today’s rich and complex information landscape, where technology facilitates the capture of an ever-increasing volume and variety of information, the role that metadata plays in describing this information grows more important, particularly in the context of access and re-use. In the future, accurate, reliable and machine-readable metadata will underpin demands for Machine Learning, Artificial Intelligence and advanced analytics. Implementing effective metadata management will allow departments, organisations and the broader public to extend the use of Queensland Government information to derive insights, deliver services and develop new data-driven opportunities.

These Metadata management principles are closely aligned with the ‘connectivity’ and ‘trust’ priorities of DIGITAL1ST, the Records governance policy, the Information access and use policy and the ICT profiling standard. They should be read in conjunction with the Information principles, in particular the sections outlining implications for agencies.

Purpose

The Metadata management principles are a set of ambitions or values that accountable officers should aspire to when making decisions regarding the creation, management and use of metadata.

These principles provide for a consistent and contemporary approach to metadata for the Queensland Government and will assist agencies in the establishment and maintenance of metadata management practices. Accurate, integral and relevant agency metadata will allow effective discovery, use and re-use of Queensland Government information.

The improved discovery, use and re-use of Queensland Government information that metadata enables is becoming increasingly crucial across a range of related activities. These include agency information and records management, data sharing and analytics practices, and broader information lifecycle governance.

Audience

This document is primarily intended for:

  • Metadata creators
  • Information asset custodians
  • Information management specialists
  • Record keepers and archivists
  • Enterprise architects
  • Information architects
  • System administrators
  • Data analysts and researchers
  • Website developers
  • Metadata consumers.

Applicability

This document provides guidance on metadata management for those departments that currently produce, collect, manage, store, publish or share information. The term information has been used throughout this document in accordance with its meaning as defined in the Queensland Government Chief Information Office glossary.

These principles apply to all Queensland Government departments, and their adoption by other Queensland Government entities is encouraged.

Scope

All information currently produced, collected, managed, stored, published or shared by Queensland Government departments, either by manual or automated processes, which requires the creation and maintenance of associated metadata is in scope for these Metadata management principles.

Principles for metadata management

The Metadata management principles recognise that not all metadata is of equal value, and therefore encourage agencies to take a value-based approach to metadata management. This allows effort and energy to be focused on the management of metadata which has the most business value to the agency, the Queensland Government and ultimately the people of Queensland.

The rationale and implications of the principles in this document are focused on their application specifically to metadata and its management processes and practices.

Standardised: Agencies select and use existing standard metadata schemas to comprehensively describe their information

Rationale

The Queensland Government collects vast amounts of information across a range of agencies, subject areas and service delivery areas. Metadata helps to identify the business value of this information and maximise it through enabling people and systems to find and use information.

Due to considerable variation across subject areas and potential uses of this data, there is no one-size-fits-all approach to metadata management. However, proven and well-established subject-specific schemas are often available, and many of these are widely recognised as industry best practice.

Standard metadata schemas promote consistency and interoperability across multiple data repositories and aid in the transfer of information across agencies. From a user perspective, a metadata schema can allow for comprehensive descriptors of information, which can help users determine if information is fit for purpose, how it can be used and can also help define governance, risk and compliance requirements.

Implications
  • Agencies should ensure that they leverage well established, existing metadata schemas that meet the requirements of their information and business.
  • Agencies should ensure that any implementations of metadata schemas are both human and machine readable to accommodate future uses (such as Artificial Intelligence).
  • Agencies should consider implementing metadata schemas which are openly licensed to maximise their use and re-use.
  • All potential uses and use cases should be explored when selecting an appropriate metadata schema.
  • Agencies should take a user-centric approach and consider user requirements when selecting an appropriate metadata schema.
  • Agency level training and guidance should be directed at ensuring agency staff and systems are able to create, access, use and consume metadata in accordance with the selected schema.
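
As an illustration of the "human and machine readable" implication above, the following is a minimal sketch in Python. It assumes, purely for the example, that an agency has selected the Dublin Core Metadata Element Set as its schema; this document does not mandate any particular schema, and the function names and sample values are hypothetical.

```python
import json

# The 15 element names of the Dublin Core Metadata Element Set.
# Dublin Core is used here only as an example of an existing standard schema.
DUBLIN_CORE_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def validate_record(record):
    """Return a sorted list of keys that are not Dublin Core elements."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)


def to_machine_readable(record):
    """Serialise the record as JSON so that systems can consume it."""
    return json.dumps(record, indent=2, sort_keys=True)


def to_human_readable(record):
    """Render the same record as labelled lines for a person to read."""
    return "\n".join(
        f"{key.capitalize()}: {record[key]}" for key in sorted(record)
    )


# Hypothetical sample values, for illustration only.
example_record = {
    "title": "Example dataset",
    "creator": "Example agency",
    "date": "2019-03-07",
    "rights": "CC BY 4.0",
}
```

The same record feeds both renderings, so the human-readable and machine-readable views cannot drift apart. A misspelt element name (for example "titel") would be reported by validate_record rather than silently accepted.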

Contextualised: Agencies create, maintain or use data dictionaries and/or business glossaries to provide greater understanding and context in relation to their information

Rationale

Because of variations between metadata schemas and differences in the definitions that they use, agencies should consider using data dictionaries and/or business glossaries to support greater understanding of their information. Agencies should commence the establishment of data dictionaries and business glossaries for high-value information with the intent of progressing through to all information. Data dictionaries help to provide data element definitions and explain the structure, relationship and provenance of the data while business glossaries assist users with the comprehension, discovery, use and re-use of information.

The context provided by data dictionaries and business glossaries helps the Queensland Government get the best value from the information it collects by providing specific contextual definitions of terms which help avoid misunderstandings and incorrect data usage.

Contextualising metadata facilitates informed decision making and improves confidence by helping users (both business and technical) to better understand the information and then evaluate whether it is fit for the intended purpose. For those responsible for the creation of metadata, data dictionaries help to ensure quality and consistency of metadata, as well as streamlining the creation process.

Implications
  • Agencies should consult the ICT profiling standard, the Records governance policy and the Information security policy, which may help determine what high-value or high-risk information they collect and manage.
  • The development of data dictionaries and business glossaries should be prioritised for information which has a high business value to an agency, or has a high value to other Queensland Government agencies or stakeholders.
  • Appropriate data dictionaries will contain a complete list of terms, as well as context specific definitions and the relationships between each term (where relevant).
  • Data dictionaries should contain definitions that relate to technical domains while business glossaries should relate to business requirements.
  • Agencies should ensure any data dictionary implementations comply with relevant standards and are both human and machine readable to accommodate future uses (such as artificial intelligence and similar emerging technology).
  • Agencies should consider implementing data dictionaries which are openly licensed to maximise their use and re-use.
  • Depending on the context and type of information, there may need to be agreement between agencies as to which business glossary is most appropriate.
  • Agencies should consider using existing data dictionaries or sharing data dictionaries where appropriate.
  • To ensure their continued relevance and maintain accuracy over time, data dictionaries and business glossaries should be regularly reviewed and updated.
  • Agencies should consult the Information quality framework guideline for further advice on assessing the quality of their information.
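
The implications above describe a data dictionary as a list of terms with context-specific definitions and relationships, held in a form that is both human and machine readable. One possible shape for this is sketched below in Python. The field names, the sample entries and the choice of JSON as the export format are assumptions made for illustration, not requirements of these principles.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DictionaryEntry:
    """One data element in a data dictionary (hypothetical structure)."""
    name: str
    definition: str
    data_type: str
    related: list = field(default_factory=list)  # names of related terms


def undefined_relations(entries):
    """Return related terms that have no entry of their own."""
    names = {entry.name for entry in entries}
    return sorted(
        {term for entry in entries for term in entry.related if term not in names}
    )


def export_json(entries):
    """Export the dictionary as JSON for machine consumption."""
    return json.dumps([asdict(entry) for entry in entries], indent=2)


# Hypothetical sample entries, for illustration only.
EXAMPLE_ENTRIES = [
    DictionaryEntry("suburb", "Name of the locality of an address.", "string",
                    related=["postcode"]),
    DictionaryEntry("postcode", "Four-digit postal code of an address.", "string"),
]
```

Running undefined_relations as part of a regular review is one way to detect relationships that point at terms no longer defined, which supports the implication that dictionaries be regularly reviewed and updated.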

Managed: Metadata is managed throughout the information lifecycle

Rationale

Because metadata is the key to identifying, cataloguing, evaluating, integrating, governing and sharing information, it must be complete and accurate enough to ensure consistency of discovery over time. This will require metadata to be accurately created, actively maintained, updated when changes occur and monitored for inaccuracy and inconsistency (which may be an automated process as per the Automated principle below).

Metadata management should commence with information that relates to key agency and/or Queensland Government priorities and progress through to include all information collected and managed by an agency. Metadata will be contained in various repositories across an agency, such as documents, applications, websites and databases, and at various levels of granularity. The location of data will help determine its value: data whose metadata should receive priority is likely to reside in critical systems that directly support core business operations.

Implications
  • For information that relates to key agency and/or Queensland Government priorities, ensure metadata is captured at the time information is stored, extracted, transformed or exported.
  • Agencies should have documented processes and procedures in place to ensure the capture, quality, accessibility, accuracy and currency of their metadata.
  • Provide specific training for agency staff who are required to create and maintain metadata.
  • Conduct regular metadata health checks, including harvesting existing metadata and correcting and improving it, to ensure its usefulness and currency.
  • Implementing controlled vocabularies and data entry standards, and ensuring business rules are adequately captured, will assist agencies to manage metadata in a consistent manner.
  • Metadata may be a record and therefore may need to be retained in accordance with the Public Records Act 2002 and the Records governance policy.
  • Agencies should consult the Information quality framework guideline for further advice regarding ongoing quality management of their metadata.
  • Agencies should consult the Information asset lifecycle guideline for further information regarding lifecycle phases and their management.
  • Where metadata contains sensitive or personal information, agencies should manage that metadata in accordance with:

Published: Agencies make metadata accessible and discoverable

Rationale

When metadata is published it facilitates the discovery of information and the potential for that information to be shared and reused across agencies, across government and across jurisdictions. Information assets increase in value the more they are used and reused, and this value grows when information is compared and combined with other information assets.

Providing an option to publish metadata to external discovery tools increases the visibility and ultimately the use of Queensland Government information in line with the requirements of the Information access and use (IS33) policy and the Open Data Policy Statement. Publishing metadata helps to increase the value of information by providing visibility of information collected and stored in agency silos to other Queensland Government agencies or external parties. One example of a cross-agency metadata initiative is the QGCIO’s ICT profiling activity which results in the creation of a consolidated Information Asset Register (IAR) containing basic metadata about Queensland Government information assets. The more accurate the metadata contained in the IAR, the more useful and valuable it becomes as a tool to identify and discover Queensland Government information.

Federated metadata discovery and access also supports users who need to locate specific Government information or services, but do not know which agency to approach.

Implications
  • Metadata holdings should be regularly reviewed to maintain currency, accuracy and availability for discovery and use e.g. in the Open Data Portal, the Queensland Spatial Catalogue or any future catalogues.
  • Agency information asset metadata which relates to key agency and/or Queensland Government priorities should be available for harvesting, indexing and use outside the custodial agency.
  • Agencies should ensure that complete and accurate information asset registers are provided as part of the annual QGCIO ICT profiling activity.
  • Where metadata contains sensitive or personal information, agencies will need to evaluate whether to publish such metadata in accordance with:

Automated: Agencies ensure that metadata capture, modification, management and monitoring is time and resource efficient and employ automated processes where practical

Rationale

Metadata is useful for a variety of purposes, but its creation and management can be both time and resource intensive – particularly where manual processes are employed. In addition to focusing effort on high value information assets (as defined in the ICT profiling standard), consideration should also be given to implementing automated processes and tools to facilitate the efficient creation, management and consumption of metadata.

In a complex information landscape that is subject to rapid change and technological advancement, agencies should investigate innovative processes and tools which have the potential to improve both the quality and quantity of metadata available for use and consumption.

Automated solutions can assist agencies to produce and harvest consistent metadata as well as increase efficiencies in relation to ongoing metadata management activities. Automated solutions may also increase opportunities to use and re-use information in innovative ways through the use of emerging data technologies such as machine learning and artificial intelligence.

Implications
  • Automated solutions should be based on clear requirements and an understanding of existing integration methods and tools.
  • Where automated solutions are implemented, appropriate quality control mechanisms should be incorporated into business processes.
  • Examine existing sources of agency metadata including any auto-generated metadata to determine whether they meet current business requirements.
  • Identify opportunities for automation and innovation.
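
As a sketch of what a quality control mechanism within an automated process might look like, the Python example below checks a metadata record for missing required fields and for an overdue review date. The required field names, the 365-day threshold and the record layout are all assumptions chosen for illustration; agencies would substitute their own business rules.

```python
from datetime import date

# Hypothetical required fields; an agency would define its own.
REQUIRED_FIELDS = ("title", "custodian", "last_reviewed")


def check_record(record, today, max_age_days=365):
    """Return a list of quality issues found in one metadata record.

    Dates are expected as ISO 8601 strings (YYYY-MM-DD); a malformed
    date raises ValueError rather than being silently ignored.
    """
    issues = []
    for name in REQUIRED_FIELDS:
        # Treat absent, empty and whitespace-only values as missing.
        if not str(record.get(name, "")).strip():
            issues.append(f"missing: {name}")
    reviewed = record.get("last_reviewed")
    if reviewed:
        age = (today - date.fromisoformat(reviewed)).days
        if age > max_age_days:
            issues.append(f"stale: last reviewed {age} days ago")
    return issues
```

A scheduled job could apply check_record to every harvested record and route any non-empty result to the relevant custodian, so that automation identifies problems while a person remains responsible for correcting them.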

Review

Issue date:      7 March 2019
Next review date: March 2020

This QGEA policy is published within the QGEA and is administered by the Queensland Government Chief Information Office. It was developed by the QGCIO and approved by the Queensland Government Chief Information Officer.

Implementation

These principles come into effect from the issue date.


Last Reviewed: 11 March 2019