OrgPedia: Bringing Corporations to Light

ORGPEDIA BRINGING CORPORATIONS TO LIGHT
A project of
•  New York Law School
•  Rensselaer Polytechnic
•  MIT Media Lab
•  NYU Wagner
with support of the Alfred P. Sloan Foundation

WHAT IS ORGPEDIA?
•  OrgPedia will be a comprehensive, open, public data resource and analytic engine for understanding the corporate world. It will collect data about the world's corporations (who they are, who owns them, whom they own, and how and where they operate) and provide a search engine and analytic tools that regulators, researchers, and many others, including corporations themselves, can use to research interrelationships between companies and industries.
•  OrgPedia brings together data from diverse sources while maintaining and displaying data provenance. Data comes from:
   •  Ingesting the datasets that governments (beginning with the US and the UK) hold about the firms they regulate, such as financial, environmental, and labor compliance information, as well as patent filings.
   •  Incorporating other datasets that are licensed for re-use, such as corporate directories from OpenCorporates and Duedil.
   •  Crowdsourcing data from the public by asking people to supply missing data elements and validate other people's submissions.
   •  Using data science techniques, such as social media network analysis, to discover corporations and their relationships.
•  OrgPedia is an open public resource focused on evolving a comprehensive source of data to facilitate cross-industry analysis.

THE SOCIAL STACK: TERRA INCOGNITA
Institutions such as corporations are how we organize our society and ourselves. Yet we have no map of this organizational landscape. OrgPedia's goal is to create a comprehensive data resource and analytic tools for understanding the identity, activities, and ownership structure of corporations.

WHY ORGPEDIA
There are many users for the organization data in the OrgPedia core and its search tools. We are focused primarily on:
•  Investors
•  Corporations
•  Journalists
•  Activists

BUILDING ORGPEDIA DATA CORE

CONSTRUCTING DATA CORE
[Diagram labels: OrgPedia Data Core; Regulatory Data; Other Open Data; Crowdsourced Data; Network Analysis Data]
How We Build It
We build the data core by ingesting data from diverse sources to grow the core over time. We start with regulatory data from national agencies in the United States and the United Kingdom and complement that with other freely licensed data. We use crowdsourcing to fill in gaps in the data. A universal legal entity identifier is not a prerequisite.
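
As an illustration only, the sketch below shows one way records from several sources could be grouped into a single core entry without a universal legal entity identifier, while keeping the provenance of every value. The matching rule (normalized name plus jurisdiction), the suffix list, the field names, and the sample records are all assumptions made for this example; they are not OrgPedia's actual design or data.

```python
import re
from collections import defaultdict

# Hypothetical legal-form suffixes to strip when normalizing names
# (an assumption for this sketch, not OrgPedia's actual rule set).
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co",
            "company", "ltd", "limited", "llc", "plc"}


def normalize_name(name):
    """Lowercase, drop punctuation, and strip trailing legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def build_core(records):
    """Group records by (normalized name, jurisdiction).

    Every value is stored together with the source it came from, so the
    provenance of each data element stays visible after merging.
    """
    core = defaultdict(lambda: defaultdict(list))
    for rec in records:
        key = (normalize_name(rec["name"]), rec["jurisdiction"])
        for field, value in rec["fields"].items():
            core[key][field].append({"value": value, "source": rec["source"]})
    return core


# Made-up sample records from three kinds of sources.
records = [
    {"name": "Acme Widgets, Inc.", "jurisdiction": "US",
     "source": "regulatory", "fields": {"violations": 2}},
    {"name": "ACME WIDGETS INC", "jurisdiction": "US",
     "source": "open_directory", "fields": {"parent": "Acme Holdings"}},
    {"name": "Acme Widgets Ltd", "jurisdiction": "UK",
     "source": "crowdsourced", "fields": {"employees": 120}},
]

core = build_core(records)
for key, fields in core.items():
    print(key, dict(fields))
```

Name-based matching of this kind is deliberately crude and would produce false merges and misses on real data; crowdsourced validation, as described above, is one way such matches could be checked.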

REGULATORY DATA HELPS TO BUILD THE CORE
Through collaboration with government officials, we are working to ingest major regulatory datasets and to identify companies that have been the subject of regulatory action or have been innovative in areas such as energy efficiency or pro-consumer programs. Datasets include:
•  Environmental safety and compliance
•  Environmental and labor facilities databases
•  Workplace safety compliance
•  Financial regulatory filings
•  Patent filings

REGULATORY USERS
Regulatory data will form the core of the data hub and will also be useful to regulatory agencies. For example:
•  Through a Regulatory Dashboard that we plan to develop in 2013, each agency will be able to identify corporations that have violated other agencies' regulations, on the theory that a company that is a bad actor in one arena may merit investigation in another.
•  State governments will be able to look at manufacturing facilities in their states and see an integrated picture of federal data about their corporate practices.
•  Government economic analysts will be able to study data on entire industries, such as electronics manufacturing or biotechnology, that are not under the jurisdiction of any single regulatory agency.
•  By studying nongovernmental data and crowdsourced data in OrgPedia, regulators will be able to go beyond SEC filings to understand corporate hierarchy, ownership, and structure, and the activities of private subsidiaries held by the companies they regulate.
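
To make the cross-agency idea concrete, here is a minimal sketch of the kind of query a Regulatory Dashboard might run: given violation records from several agencies, list the companies that have violations recorded by agencies other than the one asking. The function name, the record layout, the agency labels, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def flagged_elsewhere(violations, agency):
    """Map each company to the other agencies that recorded violations for it.

    Companies whose only violations come from `agency` itself are omitted.
    """
    agencies_by_company = defaultdict(set)
    for v in violations:
        agencies_by_company[v["company"]].add(v["agency"])
    return {
        company: sorted(agencies - {agency})
        for company, agencies in agencies_by_company.items()
        if agencies - {agency}
    }


# Made-up violation records; agency labels are placeholders.
violations = [
    {"company": "Acme Widgets", "agency": "environmental"},
    {"company": "Acme Widgets", "agency": "workplace_safety"},
    {"company": "Bolt Metals", "agency": "financial"},
    {"company": "Cog Works", "agency": "financial"},
    {"company": "Cog Works", "agency": "financial"},
]

# What a financial regulator would see: companies flagged by other agencies.
print(flagged_elsewhere(violations, "financial"))
```

A real dashboard would first need the entity matching described earlier, since the same company may appear under different names in different agencies' datasets.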

CROWDSOURCING COMPLEMENTS DATABASES
In order to enable cross-industry analytics, we can enlist the public to fill gaps in the data company by company.

ONLINE COMMUNITIES
Users need not contribute data directly to OrgPedia. For example, they might contribute to OrgPedia via potential partners such as LinkedIn, Facebook, or activist communities.

WHY WOULD SOMEONE PARTICIPATE
•  In response to a challenge
•  In response to a crisis
•  Out of love and loyalty for a corporation
•  Out of dislike of a corporation
•  To win points and gain status
•  To participate in a community and serve the public good

GAME MECHANICS
We will use demonstrated techniques for encouraging participation.

BUILDING ORGPEDIA DATA EXPLORER

ORGPEDIA DESIGN
[Diagram label: OrgPedia Data Core]
How We Use It
We make the data usable with a search engine, new tools like the Regulatory Compliance Dashboard, and downloadable data for others to use in creating new tools and visualizations.

ORGPEDIA EXPLORER: SEARCH
•  SINGLE ENTITY LOOKUP
•  CROSS-INDUSTRY QUERIES

SPECIALTY SEARCH TOOLS: ORGPEDIA VIOLATIONS DASHBOARD
•  We will build a custom search tool in 2013.
•  The Compliance Dashboard will facilitate search and visualization of compliance data to serve as an early warning detection system for regulators.
•  All Core data will be downloadable so that anyone can build their own tools.

ORGPEDIA TIMELINE
2011-2012
•  During 2011-12, our consortium of university-based designers, engineers and policymakers developed the technical architecture, functioning prototype, wireframes, and mockups to enable us to build and launch OrgPedia v. 1.0. We created partnerships with both the White House and 10 Downing Street to accelerate access to data.
First Quarter 2013
•  We will choose a vendor and begin the build-out and the ingestion of regulatory datasets. In parallel, we will continue to test and refine our crowdsourcing design and begin to forge partnerships with relevant online communities.
•  We will explore partnerships with new government authorities and ingest data from other open sources.
Second Quarter 2013
•  We will launch the initial OrgPedia Data Core and the first challenges to create incentives for initial crowdsourcing.
•  We will continue to workshop and refine our design and display.
Third Quarter 2013
•  In conversation with regulators, we will begin the build-out of specialty search engines.
•  We will endeavor to create a corporate transparency pledge and get major companies to commit to "map themselves" on OrgPedia.
Fourth Quarter 2013
•  We will work with the network science community in parallel to identify strategies for community and pattern detection that could uncover additional organizational data.
2014
•  Based on our experience over the prior year, we can expand our automated and crowdsourcing work to build the Data Hub and improve our search tools. At this point, we can seek the additional support needed to make OrgPedia self-sustaining.

ORGPEDIA IMPACT
If successful, having a comprehensive, open resource about corporations could:
•  Spur the release of more data and the growth of the data core;
•  Lead to greater understanding of regulated industry behavior;
•  Inform the design of more targeted and effective enforcement and more innovative approaches to regulation;
•  Help third-party activists, innovators and researchers develop tools to make corporate performance more transparent.


Beth Simone Noveck
