OrgPedia: Bringing Corporations to Light

OrgPedia will be a comprehensive, open, public data resource and analytic engine for understanding the corporate world. It will collect data about the world’s corporations – who they are, who owns them, who they own, and how and where they operate. It will provide a website, search engine and analytic tools for regulators, researchers, and many others, including corporations themselves, to use this data both to look up information about individual corporations, and also to research interrelationships between companies and industries.
OrgPedia will bring together data from diverse sources while maintaining and displaying data provenance. As a priority, OrgPedia will import data that government agencies (starting with the U.S. and the U.K.) hold about the firms they regulate, such as financial data, environmental, labor, and safety compliance information, and patent filings, most of which have never been integrated. This core data structure will enable OrgPedia to explore adding other information over time, including other data sets that are licensed for re-use, crowdsourced data from the public, and data developed through techniques such as social media network analysis. The result will be a “hub” of comprehensive data about corporations worldwide.
Designed by a consortium of leading technology experts at Rensselaer Polytechnic, MIT, New York Law School and NYU, OrgPedia will be a powerful tool to study the corporate world. It will enable government regulatory agencies to use data about regulated entities more effectively, and will allow researchers in or out of government to import OrgPedia data and analytic tools into their own websites and use OrgPedia to do new analyses and build new applications.
With OrgPedia, researchers will be able to answer questions rapidly that previously would have taken months of work or been impossible to answer. For example, they will use OrgPedia to untangle complex corporate ownership structures or to look across entire industries to see which companies have the best and worst environmental, social, and governance practices. If successful, OrgPedia will spur the release of more data; lead to greater understanding of regulated industry behavior; enable more targeted and effective enforcement and more innovative approaches to regulation; and help researchers gain new insights into regulated markets.

Beth Simone Noveck

October 25, 2012

Transcript

  1. ORGPEDIA: BRINGING CORPORATIONS TO LIGHT
    A project of
    •  New York Law School
    •  Rensselaer Polytechnic
    •  MIT Media Lab
    •  NYU Wagner
    with support of the Alfred P. Sloan Foundation
  2. WHAT IS ORGPEDIA?
    •  OrgPedia will be a comprehensive, open, public data resource and analytic engine for understanding the corporate world. It will collect data about the world’s corporations – who they are, who owns them, who they own, and how and where they operate – and provide a search engine and analytic tools for regulators, researchers, and many others, including corporations themselves, to use this data to research interrelationships between companies and industries.
    •  OrgPedia brings together data from diverse sources while maintaining and displaying data provenance. Data comes from:
        •  Ingesting the datasets that governments (beginning with the US and the UK) hold about the firms they regulate, such as financial, environmental and labor compliance information as well as patent filings.
        •  Incorporating other datasets that are licensed for re-use, such as corporate directories from OpenCorporates and DueDil.
        •  Crowdsourcing data from the public by asking people to supply missing data elements and validate other people’s submissions.
        •  Using data science techniques, such as social media network analysis, to discover corporations and their relationships.
    •  OrgPedia is an open public resource focused on evolving a comprehensive source of data to facilitate cross-industry analysis.
    BRINGING CORPORATIONS TO LIGHT
  3. THE SOCIAL STACK: TERRA INCOGNITA
    Institutions such as corporations are how we organize our society and ourselves. Yet we have no map of this organizational landscape. OrgPedia's goal is to create a comprehensive data resource and analytic tools for understanding the identity, activities, and ownership structure of corporations.
  4. WHY ORGPEDIA
    There are many users for organization data in the OrgPedia core and search tools. We are focused primarily on:
    •  Investors
    •  Corporations
    •  Journalists
    •  Activists
  5. CONSTRUCTING DATA CORE
    Diagram labels: OrgPedia Data Core; Regulatory Data; Other Open Data; Crowdsourced Data; Network Analysis Data
    How We Build It
    We build the data core by ingesting data from diverse sources to grow the core over time. We start with regulatory data from national agencies in the United States and the United Kingdom and complement that with other freely licensed data. We use crowdsourcing to fill in gaps in the data. A universal legal entity identifier is not a prerequisite.
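The idea of growing a core from several sources while keeping provenance, without relying on a universal legal entity identifier, can be sketched roughly as follows. This is a minimal illustration, not OrgPedia's actual design: the `DataCore`, `Entity`, `Fact`, and `normalize` names, the suffix list, and the sample records are all hypothetical, and the name-based match key is a deliberately naive stand-in for real entity resolution.

```python
import re
from dataclasses import dataclass, field

# Hypothetical list of corporate suffixes ignored when matching names.
_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
             "ltd", "limited", "llc", "plc", "co"}


def normalize(name: str) -> str:
    """Build a crude match key: lowercase, drop punctuation and trailing suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while words and words[-1] in _SUFFIXES:
        words.pop()
    return " ".join(words)


@dataclass(frozen=True)
class Fact:
    attribute: str
    value: str
    source: str  # provenance: which dataset or contributor supplied this fact


@dataclass
class Entity:
    name: str
    facts: list = field(default_factory=list)


class DataCore:
    """Toy in-memory core keyed by normalized name (an assumption, not a real identifier)."""

    def __init__(self):
        self._entities = {}

    def ingest(self, name, attribute, value, source):
        """Attach a fact to the matching entity, creating the entity if needed."""
        entity = self._entities.setdefault(normalize(name), Entity(name))
        entity.facts.append(Fact(attribute, value, source))
        return entity

    def lookup(self, name):
        """Return the entity matching `name`, or None if it is unknown."""
        return self._entities.get(normalize(name))


core = DataCore()
core.ingest("Acme Widgets, Inc.", "filing", "annual report", "regulatory_dataset")
core.ingest("ACME WIDGETS INC", "headquarters", "unknown city", "crowdsourced")
for fact in core.lookup("Acme Widgets").facts:
    print(fact.attribute, "=", fact.value, "(from", fact.source + ")")
```

Every fact carries its source, so a display layer could show where each value came from. Matching on a normalized name will both miss and wrongly merge some companies, which is one place where crowdsourced validation could help.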
  6. REGULATORY DATA HELPS TO BUILD THE CORE
    Through collaboration with government officials, we are working to ingest major regulatory datasets, and identify companies that have been the subject of regulatory action or have been innovative in areas such as energy efficiency or pro-consumer programs. Data sets include:
    •  Environmental safety and compliance
    •  Environmental and labor facilities databases
    •  Workplace safety compliance
    •  Financial regulatory filings
    •  Patent filings
  7. REGULATORY USERS
    Regulatory data will form the core of the data hub and will also be useful to regulatory agencies. For example:
    • Through a Regulatory Dashboard that we plan to develop in 2013, each agency will be able to identify corporations that have violated other agencies’ regulations – on the theory that a company that is a bad actor in one arena may merit investigation in another.
    • State governments will be able to look at manufacturing facilities in their states and see an integrated picture of federal data about their corporate practices.
    • Government economic analysts will be able to study data on entire industries, such as electronics manufacturing or biotechnology, that are not under the jurisdiction of any single regulatory agency.
    • By studying nongovernmental data and crowdsourced data in OrgPedia, regulators will be able to go beyond SEC filings to understand corporate hierarchy, ownership, and structure, and the activities of private subsidiaries held by the companies they regulate.
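The cross-agency lookup described for the Regulatory Dashboard could, under simple assumptions, reduce to grouping violation records by company and filtering out the viewing agency's own records. The sketch below is only an illustration: the function name `violators_elsewhere`, the record shape, and the agency and company labels are hypothetical and not taken from the deck.

```python
from collections import defaultdict


def violators_elsewhere(violations, agency):
    """Map each company to the other agencies that recorded violations against it.

    `violations` is an iterable of (company, recording_agency) pairs.
    Companies whose only recorded violations come from `agency` are omitted.
    """
    by_company = defaultdict(set)
    for company, recording_agency in violations:
        by_company[company].add(recording_agency)

    flagged = {}
    for company, agencies in by_company.items():
        others = agencies - {agency}
        if others:
            flagged[company] = sorted(others)
    return flagged


# Hypothetical records; "Agency A" is the agency using the dashboard.
records = [
    ("Acme Widgets", "Agency A"),
    ("Acme Widgets", "Agency B"),
    ("Bolt Works", "Agency A"),
    ("Cog Systems", "Agency C"),
]
print(violators_elsewhere(records, "Agency A"))
```

A real dashboard would also need the entity matching step, since different agencies rarely record a company under exactly the same name.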
  8. CROWDSOURCING COMPLEMENTS DATABASES
    In order to enable cross-industry analytics, we can enlist the public to fill gaps in the data company by company.
  9. ONLINE COMMUNITIES
    Users need not contribute data directly to OrgPedia. For example, they might contribute to OrgPedia via potential partners such as LinkedIn, Facebook or activist communities.
  10. WHY WOULD SOMEONE PARTICIPATE?
    •  In response to a challenge
    •  In response to a crisis
    •  Out of love and loyalty for a corporation
    •  Out of dislike of a corporation
    •  To win points and gain status
    •  To participate in a community and serve the public good
  11. ORGPEDIA DESIGN
    Diagram label: OrgPedia Data Core
    How We Use It
    We make the data usable with a search engine, new tools like the Regulatory Compliance Dashboard, and downloadable data for others to use in creating new tools and visualizations.
  12. SPECIALTY SEARCH TOOLS: ORGPEDIA VIOLATIONS DASHBOARD
    •  We will build a custom search tool in 2013.
    •  The Compliance Dashboard will facilitate search and visualization of compliance data to serve as an early warning detection system for regulators.
    •  All Core data will be downloadable so anyone can build their own.
  13. ORGPEDIA TIMELINE
    2011-2012
    • During 2011-12, our consortium of university-based designers, engineers and policymakers developed the technical architecture, functioning prototype, wireframes, and mockups to enable us to build and launch OrgPedia v. 1.0. We created partnerships with both the White House and 10 Downing Street to accelerate access to data.
    First Quarter 2013
    • We will choose a vendor and begin the build-out and ingestion of regulatory data sets. In parallel, we will continue to test and refine our crowdsourcing design and begin to forge partnerships with relevant online communities.
    • We will explore partnerships with new government authorities and ingest data from other open sources.
    Second Quarter 2013
    • We will launch the initial OrgPedia Data Core and the first challenges to create incentives for initial crowdsourcing.
    • We will continue to workshop and refine our design and display.
    Third Quarter 2013
    • In conversation with regulators, we will begin the build-out of specialty search engines.
    • We will endeavor to create a corporate transparency pledge and get major companies to commit to “map themselves” on OrgPedia.
    Fourth Quarter 2013
    • We will work with the network science community in parallel to identify strategies for community and pattern detection that could uncover additional organizational data.
    2014
    • Based on our experience over the prior year, we can expand our automated and crowdsourcing work to build the Data Hub and improve our search tools. At this point, we can seek the additional support needed to make OrgPedia self-sustaining.
  14. ORGPEDIA IMPACT
    If successful, having a comprehensive, open resource about corporations could:
    •  Spur the release of more data and the growth of the data core;
    •  Lead to greater understanding of regulated industry behavior;
    •  Inform the design of more targeted and effective enforcement and more innovative approaches to regulation;
    •  Help third-party activists, innovators and researchers develop tools to make corporate performance more transparent.