Data Governance and Policy Alignment

Purpose

The DCoE Policy and Governance Alignment pillar provides the components needed to manage data internally and externally within an organization. The specific components of any policy should be tailored to fit the organization’s overall environment. As such, this information is intended less as standard boilerplate than as a set of discussion points for developing a customized solution.

Governance and Policy Alignment Best Practices

Establishing effective data policy and governance is an iterative process. Before establishing a framework, it is important to assess the maturity of the data governance program. Data governance maturity can be described in five milestones:
  1. eGovernment
  2. Open government
  3. Data-centric Government
  4. Digital Government
  5. Smart Government




Data policy and governance is the establishment of an accountability framework designed to influence behavior in the valuation, creation, storage, use, archival, and deletion of data. This framework includes the processes, roles, standards, and metrics that help a government or organization achieve its objectives through the use of data. Such a framework is necessary to withstand the inevitable cultural, political, and organizational changes that governments face. Outlined below are four best practices for establishing a data policy and governance framework.

Best Practice #1: Take a holistic approach but start small.

Ensure the strategic objectives of the organization or government are at the heart of the framework by balancing out technical compromises to move the program gradually up the maturity scale, recognizing that this won’t happen overnight. Accept that the establishment of data governance is an iterative process that will continue to evolve and mature with the organization. Start with the people and processes before focusing on the technology.

Best Practice #2: Obtain Executive buy-in

Establishing a data governance framework often requires significant cultural change within an organization, as well as funding to support the processes and tools needed to maintain the framework. Both the cultural shift and the funding require backing from the executive level. Obtaining support from key decision makers who represent your organization’s strategic focus areas or lines of business is pivotal, because they are the people able to influence the executives.

Best Practice #3: Establish quantifiable return on investment

The benefits that an effective data governance program can bring to a government entity or organization are often difficult to see in the short term. This makes it difficult to obtain support and funding for the costs of establishing a program. As such, it is important to build a business case that illustrates the quantifiable benefits in order to get the necessary buy-in and ownership of the program. To build such a business case, focus on the relationship between the business processes and the key data elements that support them. Calculate the costs of manually managing these data elements through repeated and duplicated processes. Identify the opportunities that readily available, quality data brings in terms of generating revenue or optimizing processes through better customer service and insight.
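The cost side of such a business case can be sketched with simple arithmetic. The following Python sketch is illustrative only: the data elements, hours, hourly rate, and program cost are hypothetical placeholders rather than figures from any real program, and the ROI formula (net benefit divided by program cost) is one common convention, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class ManualProcess:
    """A recurring manual task that maintains one key data element (hypothetical)."""
    data_element: str
    hours_per_run: float   # staff hours each time the task is performed
    runs_per_year: int     # how often the task repeats in a year
    departments: int       # how many departments duplicate the same task


def annual_manual_cost(processes, hourly_rate):
    """Total yearly staff cost of repeated and duplicated manual handling."""
    return sum(
        p.hours_per_run * p.runs_per_year * p.departments * hourly_rate
        for p in processes
    )


def simple_roi(annual_benefit, annual_program_cost):
    """Net benefit divided by program cost (one common ROI convention)."""
    return (annual_benefit - annual_program_cost) / annual_program_cost


# All numbers below are placeholders chosen for illustration.
processes = [
    ManualProcess("property address", hours_per_run=2.0, runs_per_year=52, departments=3),
    ManualProcess("business license status", hours_per_run=4.0, runs_per_year=12, departments=2),
]
manual_cost = annual_manual_cost(processes, hourly_rate=40.0)

# Assumes, optimistically, that the program eliminates the entire manual cost.
roi = simple_roi(annual_benefit=manual_cost, annual_program_cost=10_000.0)

print(f"Annual manual cost: ${manual_cost:,.2f}")
print(f"Simple ROI if the program removes that cost: {roi:.1%}")
```

In practice the benefit figure would also include the revenue and service improvements identified above, and the assumption that all manual effort disappears should be replaced with a measured estimate.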

Best Practice #4: Use metrics to measure the progress

Establishing metrics to validate short-term gains is one of the quickest ways to justify program costs, obtain support, and sustain the engagement. However, it is important to include both quick-wins and strategic long-term improvements when selecting measures at the beginning of a project. These metrics need to be quantitative and focused on business values, such as data management cost and number of decisions made. Consider using Socrata’s Open Performance platform to create a data governance KPI dashboard to facilitate the reporting and management of the metrics.

Standard Data Governance

The development of a policy starts with a high-level definition of the goals and objectives of both the organization’s data program and the policy itself. The policy should reference and build upon any related legislation, such as the Freedom of Information Act or Open Records Acts, as well as data retention, storage, and access guidelines. It should also touch on how privacy will be ensured and define whether the policy applies to the entire organization or only to specific departments/agencies.
Statements regarding the use of a web portal as the central medium for the data program and the desire to automate and publish data in its current form should also be considered.
The following example was taken from the City of Mesa, Arizona Open Data policy.  
Another example can be found here, and is reviewed with the Socrata Education team during the Data Governance live course at learn.socrata.com, included with a subscription to Education.
The White House has also developed a detailed data policy, which governs the identification of data, formatting, stewardship, source system accessibility, management and release of government data for all federal agencies within the administrative branch.  

Definitions

An effective data policy is one that is easy for the reader to comprehend. Users of a data policy can range from citizens to the organization’s Chief Data Officer. Therefore, it’s vital that the policy properly explain any terminology that could be open to misinterpretation. The more concise the definition, the less likely there will be misunderstandings about the policy.
The following are a few suggested definitions to include in a policy. As a policy is developed and continually re-evaluated, it’s important to request feedback from various user personas to help ensure there’s a clear understanding of the policy.
  • Application Programming Interface (API)
  • Data or Datasets
  • Legislation references
  • Data (include any legislative references that the definition may be based on)
  • Data Governance Committee (or a similarly named group)
  • Open format
  • Protected Data (include any legislative references that the definition may be based on)
  • Publishable Data (include any legislative references that the definition may be based on)
  • Sensitive Data (include any legislative references that the definition may be based on)

Governance

The data policy should specifically state who will oversee the program, how they are appointed and who they report to.
Governance models usually take one of three forms:

  • Centralized - where the program is managed by the organization’s IT function and is heavily reliant upon that team’s capacity
  • Decentralized - where the program is independently managed by the various departments or agencies, who in most cases have a high degree of technical expertise
  • Hybrid - programs typically run by the organization’s departments/agencies, with the IT function playing a strong advisory role
In addition, some organizations modify that management function through the use of single or multi-layered strategies. Oversight in a single-layered model is typically performed by either an individual (e.g. Chief Data Officer, Chief Information Officer, etc.) or a Data Governance Committee. That individual or Committee is usually responsible for developing overall policy objectives, as well as managing (or even performing) the tactical duties of the program.
Within larger organizations, multi-layered models are often led by a Data Governance Committee that is responsible for developing policy objectives. A separate Data Work Group is then responsible for performing the tactical duties of the program based upon the direction of the Data Governance Committee.
Regardless of the governance model chosen by the organization, this section of the policy discussion should align with the underlying workflow, which in turn depends on the size and complexity of the organization.
A Data Governance Committee typically includes representation from the organization’s administration, information technology, and public information/communications teams. Data Work Groups (for multi-layer models) are usually led by the Chief Data Officer or a designee and include information technology team members as well as representatives (Data Owners) from each of the departments within the organization.

Data Inventory Policy

The data inventory policy should clarify the organization’s intent with regard to the assets included in the ongoing data inventory. It is recommended that the organization develop a comprehensive inventory of all data, as well as a documented submission and approval process for adding data to the inventory. If the organization chooses not to develop a comprehensive inventory, the policy should explain how data is identified for inclusion.
An organization should also consider whether to require that the data inventory be maintained for public consumption.  
The following example was taken from the Montgomery County, Maryland Open Data Manual.

Data Prioritization Policy

In addition to defining what data is maintained on the inventory, the organization should also define how the data is prioritized for publication. A statement regarding the data re-prioritization process should also be included. Key elements associated with how data is prioritized typically include:
  • Demand for the data (both internally and externally)
  • Availability of staff assigned to the dataset
  • Source data accessibility
  • Data complexity and completeness
  • Dataset approval process
As with the decision to make the data inventory available to the public, the policy should also state whether data prioritization decisions will be published.
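One way to make prioritization repeatable is a weighted score over the key elements listed above. The sketch below is a minimal illustration, not a method taken from any of the cited policies: the weights, the 1-5 rating scale, the factor names, and the sample datasets are all assumptions that an organization would replace with its own.

```python
# Hypothetical weights for the key elements listed above; they sum to 1.0.
# "completeness" stands in for "data complexity and completeness", and
# "approval_readiness" for how far along the dataset approval process is.
WEIGHTS = {
    "demand": 0.35,
    "staff_availability": 0.20,
    "source_accessibility": 0.20,
    "completeness": 0.15,
    "approval_readiness": 0.10,
}


def priority_score(ratings):
    """Weighted sum of 1-5 ratings; a higher score means publish sooner."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def rank_datasets(inventory):
    """Return dataset names ordered from highest to lowest priority."""
    return sorted(
        inventory,
        key=lambda name: priority_score(inventory[name]),
        reverse=True,
    )


# Sample inventory entries are invented for illustration.
inventory = {
    "fleet_maintenance": {
        "demand": 2, "staff_availability": 4, "source_accessibility": 3,
        "completeness": 5, "approval_readiness": 2,
    },
    "building_permits": {
        "demand": 5, "staff_availability": 3, "source_accessibility": 4,
        "completeness": 4, "approval_readiness": 5,
    },
}

for name in rank_datasets(inventory):
    print(f"{name}: {priority_score(inventory[name]):.2f}")
```

Re-running the ranking after ratings change gives a simple, documented basis for the re-prioritization process.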
The following example was taken from the Montgomery County, Maryland Open Data Manual.


Data Privacy Policy

As previously stated, the organization’s Data policy should build upon and complement any existing policies - especially regarding privacy. At a minimum, the policy should reference any existing legislation and ensure that it clearly communicates the difference between Publishable and Protected data.
The following example was taken from the Topeka, Kansas Open Data Policy.

Enforcement Standards and Compliance

The organization’s Data program performance should be reviewed annually. That review should measure performance against the program’s stated goals and objectives. Goals and objectives for the next reporting period, as well as recommendations on how to improve the Data program, should also be considered.
The following example was taken from the Baton Rouge, Louisiana Open Data Policy.

Social Media Cadence & Timeline

Social media is a very powerful force for marketing and communicating progress and maturity of a data program.

Purpose

Social Media writ large serves three broad purposes:
  1. Engage the public
  2. Encourage additional data
  3. Create a feedback loop

Successful Social Media

The following are the steps needed to develop successful social media to support your program:
  1. Define goals
  2. Curate content and branding
  3. Define a timeline
  4. Have a bit of fun

Define Goals

Defining the ultimate goal of using Social Media in terms of your data strategy is critical, because it determines whether the effort is judged a success or a failure. Common goals include:
  • A measurable uptick in views or downloads
  • Number of likes, responses, or shares

Curate Content and Brand

In terms of Social Media, content and your brand are the true substance of your presence.

Define Timeline

Your Social Media presence should be consistent, but generally revolve around new data publishing, changes to datasets, or new content. Timing can impact this, too.

Have Fun and Tell a Story

Social Media is about authenticity. Perhaps there is a dataset that is not critical to citizens’ everyday lives, but tells a human story.

Community Engagement

Taking data out of silos doesn’t end with just the data that a particular government owns. It extends to government data from other overlapping jurisdictions.  For example, a resident of Queens is going to care about data from New York State and New York City.
To meet this need, customers can federate their data. Data Federation allows datasets from your domain to appear on other data portals. In the example provided, because New York State and New York City have federated their data, the resident of Queens would be able to see the relevant data they need. For a portal to receive federated data, the federation request must originate from a different data portal.

Setting Researchers up for Success with your Socrata Data

Colleges and Universities are often important stakeholders in communities, from small towns all the way up to the Federal Government. With the right approach, data and an open platform can become vital resources to academic researchers.
Here are two primary objectives governments should keep in mind when attempting to do this:
Longevity:
If an academic uses data to inform their research, it’s important that that data is available for others to scrutinize and use. This enables their findings to be validated and upheld. It also helps other researchers to build upon successful work by their peers.  
This is only possible if the data can reliably be found in the same location. For Socrata customers, this is easy to accomplish through regular maintenance and automated updates of Socrata datasets.
Increasing value:
One of the biggest risks to data on Socrata is that it becomes stale. If a government doesn’t set up an appropriate automation plan to ensure that the most recent data is available, its data will lose value for researchers who need valid and up-to-date information. Additionally, as datasets grow, updating the metadata to reflect all of the data will enrich the value of the dataset, increasing its attractiveness to researchers.
Being able to rely on datasets that will grow in value is extremely important to researchers. Ensure that your Socrata platform is positioned to support this.
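A simple staleness check can support such an automation plan. The following is a minimal sketch under stated assumptions: the catalog is a hand-built list rather than anything fetched from the Socrata platform, and the field names, update frequencies, and allowed gaps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from a dataset's declared update frequency to the
# longest acceptable gap between updates.
MAX_AGE = {
    "daily": timedelta(days=2),
    "weekly": timedelta(days=10),
    "monthly": timedelta(days=45),
}


def stale_datasets(catalog, now):
    """Return names of datasets whose last update is older than allowed."""
    return [
        entry["name"]
        for entry in catalog
        if now - entry["last_updated"] > MAX_AGE[entry["update_frequency"]]
    ]


# Sample catalog entries are invented for illustration.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
catalog = [
    {
        "name": "311_requests",
        "update_frequency": "daily",
        "last_updated": datetime(2024, 5, 31, tzinfo=timezone.utc),
    },
    {
        "name": "restaurant_inspections",
        "update_frequency": "weekly",
        "last_updated": datetime(2024, 4, 1, tzinfo=timezone.utc),
    },
]

print(stale_datasets(catalog, now))
```

A report like this, run on a schedule, gives data owners an early warning before researchers encounter outdated data.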
For additional information, consider reviewing USAID’s Open Data Policy to learn about how they support researchers. Additionally, consider learning about CoreTrustSeal, a highly respected organization that certifies data repositories as reliable for researchers.
An intelligently designed data program will engage a number of different constituencies within your community. These constituencies are usually identified in the persona exercise that you conduct as a part of your Design Session. There are a number of different types of events that will enable you to engage these constituencies, and several strategies that will help you to make these events as successful as possible.

Host/Sponsor Meetups:

A Meetup is a great way to partner with engaged and technical citizens in your community. This isn’t just for digital hubs like San Francisco. Even in small towns, there are civic tech organizations that want to partner with their local government, or with engaged departments at the state or federal level. A Meetup can bring these individuals together to determine how to address a shared community challenge. For example, the Code for Greensboro brigade in North Carolina has used recurring Meetups to make it easier to vote.

Community Involvement:

Most citizens are just regular taxpayers, and are not able to engage with data in a sophisticated way.  However, they can still get a lot of value out of their government’s Socrata data platform. For example, the City of Topeka has developed recurring “Coffee on the Corner” meetings, which give citizens an opportunity to deepen their understanding of Topeka’s data program and make requests for data that they’d like to be able to engage with. These meetings are especially effective because they take place in neighborhoods, as opposed to in City Hall conference rooms.