Presented at StrataRX 2012: http://strataconf.com/rx2012/public/schedule/detail/25953
While the entire healthcare community, for decades, has been clamoring for, cajoling, and demanding integration of its IT systems, we’re actually in a pretty elementary stage when it comes to useful, practical, health IT systems integration beyond on-premise and in-building hospital software. Our problem in the industry is not that engineers don’t know how to create the right technology solutions or that somehow we have a big governance problem; while those are certainly issues in certain settings, the real cross-industry issue is much bigger – our approach to integration is decades old, opaque, and rewards closed systems.
For decades, starting in the 50’s through the mid 90’s before the web / Internet came along, systems integration meant that systems had to know about one another in advance, agree on what data they would share, engage in governance meetings, have memoranda of understanding or contracts in place, etc. After the web came along, most of that was thrown out the window because the approach changed to one in which the owner of the data publishes whatever they decide (e.g. through a web server) and whoever wants it is given secure access and comes to get it (e.g. through a browser or HTTP client). This kind of revolutionary approach to systems integration is what the health IT and medical device sectors are sorely lacking and something that ONC can help promote.
Specifically, the following things are holding back integration in healthcare; for each, I note what future EHRs can do about it:
• We don’t support shared identities, single sign on (SSO), and industry-neutral authentication and authorization. Most health IT systems create their own custom logins and identities for their users, including roles, permissions, access controls, etc., stored in an opaque part of their own proprietary database. ONC should mandate that all future EHRs use industry-neutral and well supported identity management technologies so that each system has at least the ability to share identities. Without identity sharing and exchange there can be no easy and secure application integration capabilities no matter how good the formats are. I’m continually surprised how little attention is paid to this cornerstone of application integration. There are very nice open identity exchange protocols, such as SAML, OpenID, and OAuth, as well as open roles and permissions management protocols such as XACML, that make identity and permission sharing possible. Free open source tools such as OpenAM, Apache Directory, OpenLDAP, and Shibboleth, along with drop-in tools from many commercial vendors, make it almost trivial to do identity sharing, SSO, and RBAC.
• We’re more “push data” versus “pull data” focused than is warranted early in projects. A question we commonly ask at the beginning of every integration project is “what data can you send me?” This is called the “push” model, where the system that contains the data is responsible for sending it to all those that are interested (or to some central provider like an HIE). What future EHRs should do is implement syndicated Atom-like feeds (which could contain HL7 or other formats) for all the data they can share and allow anyone who wants it to “subscribe” to the data. This is called the “pull” model, where data holders simply allow secure authenticated subscriptions to their data and do not worry about direct coupling with other apps. If our future EHRs became completely decoupled secure publishers and subscribers of data, then many of our integration problems would go away, just like they did for others using modern internet approaches.
• We’re more “heavyweight industry-specific formats” focused instead of “lightweight or micro formats” focused. Appointment scheduling in the health IT ecosystem is a major source of health IT integration pain (in fact much worse than most other areas). If EHRs just used industry standard iCal/ICS publishing and subscribing we could solve 80% of appointment schedule integration instantly. Think about how your iPad can sync with your Outlook/Exchange server at work: it’s not magic, it’s a basic, industry-neutral, appropriately securable standard that is widely used and widely supported. Another example is the use of HL7 ADTs for patient profile exchanges instead of more common and better supported formats like SAML (which emerged due to industry-neutral user identity and profile exchange requirements). If you’ve ever used your Google account/profile to log into another app on another website you’re using this kind of industry-neutral identity exchange (typically OpenID or OAuth, close relatives of SAML). Again, no magic: it works millions of times a day with “good enough” security and user-controlled privacy.
• Data emitted is not tagged using semantic markup so it’s not shareable by default. Even when we do have full data governance, we do our structured data integration, and then we present information on the screen (usually as HTML), we don’t tag data with proper semantic markup even though it’s basically free to do (little or no extra development is required). Future EHRs must at least generate companion RDFa using industry-neutral schemas for common information (e.g. person data) and create microformats for specific information (e.g. clinical). Using RDFa as a start, EHRs can then start publishing full RDF in the future so it’s easier to discover where certain kinds of metadata can be found without requiring massive registries and other old-style opaque techniques. None of this is technically challenging, insecure, or difficult to implement if we really care about integration and are not just giving it lip service. Google’s recent implementation of its Knowledge Graph is a great example of the utility of this semantic mapping approach. Once even basic microformats are in place with RDFa for authenticated or unauthenticated semantic tagging, we can create SPARQL endpoints for easier-to-understand data.
The Myth of Health Data
An opinionated look at why current health IT systems
integrate poorly and what we can do about it
By Shahid N. Shah, CEO
Who is Shahid?
• 20+ years of software engineering and multi-
discipline complex IT implementations (Gov.,
defense, health, finance, insurance)
• 12+ years of healthcare IT and medical
devices experience (blog at
• 15+ years of technology management
experience (government, non-profit,
• 10+ years as architect, engineer, and
implementation manager on various EMR
and EHR initiatives (commercial and non-
Author of Chapter 13, “You’re
the CIO of your Own Office”
• A deluge of healthcare data is being
created as we digitize biology,
chemistry, and physics.
• Data changes the questions we ask
and it can actually democratize and
improve the science of medicine, if we
• While cures are the only real miracles
of medicine, big data can help solve
intractable problems and lead to more
• Healthcare-focused software
engineering is going to do more harm
than good (industry-neutral is better).
• Applications come and go, data lives
forever. He who owns, integrates,
and uses data wins in the end.
• Never leave your data in the hands
of an application/system vendor.
• There’s nothing special about health
IT data that justifies complex,
expensive, or special technology.
• Spend freely on multiple systems
and integration-friendly solutions.
What you’ll learn today
Let’s stop the hand waving and relying on the government to take care of integration
NEJM believes doctors are trapped
It is a widely accepted myth that medicine requires
complex, highly specialized information-technology (IT)
This myth continues to justify soaring IT costs,
burdensome physician workloads, and stagnation in
innovation — while doctors become increasingly bound
to documentation and communication products that are
functionally decades behind those they use in their
New England Journal of Medicine “Escaping the EHR Trap - The Future of Health IT”, June 2012
What’s creating the “data deluge”?
We’re digitizing biology
Last and past decades This and future decades
Gigabytes and petabytes Petabytes and exabytes
What’s creating the “data deluge”?
•Must be continuously collected
•Difficult today, easier tomorrow
started at $100k
per patient, <$1k
•Can be collected infrequently
•Family history is easy
•Must be continuously collected
•Useful for population health
•Part digital, mostly analog
•Family History is hard
•Business focused data
•Built on fee for service models
•Inward looking and not focused
on clinical benefits
Data changes the questions we ask
Simple visual facts Complex visual facts Complex computable
Implications for scientific discovery
The old way
The new way
We’re in the integration age
Source: Geoffrey Raines, MITRE
We’re not in an
future but an
Recognizable Data Sources
Where is all the data coming from?
Data is hidden everywhere
Clinical trials data
(failed or successful)
Secure Social Patient
SMS, IM, E-mail,
Voice, and Telehealth
Blue Button, HL7,
X.12, HIEs, EHR, and
Patient Family and
More hidden sources of data
(DSS and CPOE)
Unstructured patient data sources
Medical Devices Biomarkers /
Source Self reported by
time from patient
Errors High Medium Low
Time Slow Slow Medium
Reliability Low Medium High
Data size Megabytes Megabytes Megabytes
Data type PDFs, images PDFs, images PDFs, images
Availability Common Common Common Uncommon Uncommon
Structured patient data sources
Medical Devices Biomarkers /
Source Self reported by
Specimens Real-time from
Errors High Medium Low Low Low
Time Slow Slow Medium Fast Slow
Reliability Low Medium High High High
Discrete size Kilobytes Kilobytes Kilobytes Megabytes Gigabytes
Streaming size Gigabytes Gigabytes
Availability Uncommon Common Somewhat
What’s the problem?
What are we doing wrong?
• I only have a few systems
• I know all my data formats
• I know where all my data is
and most of it is valid
• My vendor already knows
how all this works and will
solve my problems
• There are actually hundreds
• There are dozens of formats
you’re not aware of
• Lots of data is missing and
data quality is poor
• Tons of undocumented
databases and sources
• Vendors aren’t incentivized to
Why you can’t just “buy interoperability”
Interoperability of data is an emergent property of your IT environment
Application focus is biggest mistake
Application-focused IT instead of Data-focused IT is causing business problems.
Healthcare Provider Systems
Silos of information exist across
groups (duplication, little sharing)
Poor data integration across
Healthcare Provider Systems
Master Data Management, Entity Resolution, and Data Integration
Improved integration by services
that can communicate between applications
The Strategy: Modernize Integration
Need to get existing applications to share data through modern integration
How do we modernize integration?
Why health IT systems integrate poorly
• Permissions-oriented culture prevents
tinkering and “hacking”
• We don’t let patients drive data
• No scripting or customizing EHRs, lab
• Interoperability isn’t required for
transactions to be completed (e-
• We have “Inside out” architecture, not
• We don't support shared identities,
single sign on (SSO), and industry-
neutral authentication and
• We're too focused on "structured data
integration" instead of "practical app
• We focus more on "pushing" versus
"pulling" data than is warranted early
• We're too focused on heavyweight
industry-specific formats instead of
lightweight or micro formats
Process and people consolidation won’t work in
“For decades, businesses typically have been
rewarded for consolidation around standard
processes and stockpiling assets through
people, technology and goods.
Companies are discovering they need a new
kind of leverage – capability leverage – to
mobilize third parties that can add value.”
Defining and coordinating interactions across a
multitude of organizations is the new way
• Outside-in architecture asks you to think
about your operations and processes as
a collection of business capabilities or
• Each individual service must be analyzed
and packaged to see who can deliver
them best. According to Deloitte, “this
architectural transition requires new skills
from the CIO and the IT organization.
CIOs who anticipate and understand the
opportunity are likely to become much
more effective business partners with
other executive leaders.”
Promote “Outside-in” architecture
The IT department inside your organization cannot possibly do everything you’d like
Source: Deloitte “Outside-in Architecture”
Proprietary identity is hurting us
• Most health IT systems create their own
custom identity, credentialing, and access
management (ICAM) in an opaque part of
a proprietary database.
• We’re waiting for solutions from health IT
vendors but free or commercial industry-
neutral solutions are much better and future
Identity exchange is possible
• Follow National Strategy for Trusted Identities
in Cyberspace (NSTIC)
• Use open identity exchange protocols such as
SAML, OpenID, and OAuth
• Use open roles and permissions-management
protocols, such as XACML
• Consider open source tools such as OpenAM,
Apache Directory, OpenLDAP, Shibboleth, or
• Externalize attribute-based access control
(ABAC) and role-based access control (RBAC)
from clinical systems into enterprise systems
like Active Directory or LDAP
Implement industry-neutral ICAM
Implement shared identities, single sign on (SSO), neutral authentication and authorization
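To make "externalize RBAC and ABAC" concrete, here is a minimal sketch of a clinical app asking an outside policy function for a decision instead of consulting its own user tables. Everything named here is hypothetical: the role names, the permission strings, the `Subject` shape, and the department rule are assumptions for illustration. In practice the policy would live in a directory service or an XACML policy store rather than in a Python dict.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission policy. In a real deployment this would be
# held outside the clinical application (for example in LDAP/Active Directory
# groups or an XACML policy store), not hard-coded here.
ROLE_PERMISSIONS = {
    "physician": {"chart:read", "chart:write", "schedule:read"},
    "scheduler": {"schedule:read", "schedule:write"},
}


@dataclass
class Subject:
    """Identity as asserted by an external identity provider (assumed shape)."""
    user_id: str
    roles: set
    attributes: dict = field(default_factory=dict)


def is_permitted(subject, permission, resource_attrs=None):
    """Return True if the subject may perform `permission` on the resource."""
    # RBAC step: at least one of the subject's roles must grant the permission.
    granted = any(
        permission in ROLE_PERMISSIONS.get(role, set()) for role in subject.roles
    )
    if not granted:
        return False
    # ABAC step (illustrative rule, an assumption): chart access additionally
    # requires the subject's department to match the resource's department
    # whenever the resource declares one.
    resource_attrs = resource_attrs or {}
    department = resource_attrs.get("department")
    if permission.startswith("chart:") and department is not None:
        return subject.attributes.get("department") == department
    return True
```

The point of the design is that the app only calls `is_permitted`; who the user is and what roles they hold come from somewhere the app does not own.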
Dogma is preventing integration
Many think that we shouldn’t integrate
until structured data at detailed machine-
computable levels is available.
The thinking is that because mistakes can
be made with semi-structured or hard to
map data, we should rely on paper, make
users live with missing data, or just make
educated guesses instead.
App-centric sharing is possible
Instead of waiting for HL7 or other structured
data about patients, we can use simple
techniques like HTML widgets to share
"snippets" of our apps.
• Allow applications immediate access to
portions of data they don't already manage.
• Widgets are portions of apps that can be
embedded or "mashed up" in other apps
without tight coupling.
• Blue Button has demonstrated the power of
app integration versus structured data
integration. It provides immediate benefit to
users while the data geeks figure out what
they need for analytics, computations, etc.
App-focused integration is better than nothing
Structured data dogma gets in the way of faster decision support and real solutions
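As a rough illustration of app-centric sharing, the sketch below builds the HTML a host application could drop into its own page to embed a widget served by another system. The `widget_embed` function, the example URL, and the query parameters are all hypothetical; authentication of the embedded request is assumed to be handled by whatever shared identity layer is in place and is not shown.

```python
from html import escape
from urllib.parse import urlencode


def widget_embed(base_url, params, width=400, height=300):
    """Return an <iframe> snippet that embeds a remote widget.

    The host page never sees the widget's underlying data model; it only
    needs a URL, which is what keeps the two applications loosely coupled.
    """
    src = f"{base_url}?{urlencode(params)}"
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{int(width)}" height="{int(height)}" '
        'sandbox="allow-scripts"></iframe>'
    )
```

A host app might call `widget_embed("https://ehr.example.org/widgets/med-list", {"patient": "123"})` and write the result straight into its template.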
Old way to architect:
“What data can you send me?” (push)
The "push" model, where the system that
contains the data is responsible for sending the
data to all those that are interested (or to some
central provider, such as a health information
exchange or HL7 router) shouldn’t be the only
model used for data integration.
Better way to architect:
“What data can I publish safely?” (pull)
• Implement syndicated Atom-like feeds (which
could contain HL7 or other formats).
• Data holders should allow secure
authenticated subscriptions to their data and
not worry about direct coupling with other
• Consider the Open Data Protocol (OData).
• Enable auditing of protected health
information by logging data transfers through
use of syslog and other reliable methods.
• Enable proper access control rules expressed
in standards like XACML.
Pushing data is more expensive than pulling it
We focus more on "pushing" versus "pulling" data than is warranted early in projects
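The pull model can be sketched in a few lines: the data holder publishes an Atom feed, and each subscriber fetches it and keeps only the entries newer than the last one it saw. This is a minimal sketch under stated assumptions. The feed identifiers, the entry dict keys, and the `build_feed` and `entries_since` functions are made up for illustration, the payload is a placeholder rather than a real HL7 message, and secure authenticated transport and audit logging are left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def _el(parent, tag, text=None, **attrs):
    """Append a child element in the Atom namespace and return it."""
    el = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
    if text is not None:
        el.text = text
    return el


def build_feed(feed_id, title, author, entries):
    """Publisher side: serialize entries (a list of dicts) as an Atom feed."""
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    _el(feed, "id", feed_id)
    _el(feed, "title", title)
    _el(feed, "updated", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"))
    _el(_el(feed, "author"), "name", author)
    for e in entries:
        entry = _el(feed, "entry")
        _el(entry, "id", e["id"])
        _el(entry, "title", e["title"])
        _el(entry, "updated", e["updated"])
        # The content could carry HL7 or any other format as text.
        _el(entry, "content", e["payload"], type="text")
    return ET.tostring(feed, encoding="unicode")


def entries_since(feed_xml, last_seen):
    """Subscriber side: return ids of entries updated after `last_seen`.

    Timestamps are assumed to be UTC strings in the same fixed format, so
    plain string comparison orders them correctly.
    """
    ns = {"a": ATOM_NS}
    root = ET.fromstring(feed_xml)
    return [
        entry.findtext("a:id", namespaces=ns)
        for entry in root.findall("a:entry", ns)
        if entry.findtext("a:updated", namespaces=ns) > last_seen
    ]
```

Note that the publisher never needs to know who its subscribers are, which is the decoupling the slide argues for.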
HL7 and X.12 aren’t the only formats
The general assumption is that
formats like HL7, CCD, and X.12 are
the only ways to do data integration
in healthcare but of course that’s
not quite true.
Microsoft Excel & Access, Google
Docs, etc. don’t have live access to
our data in transactional systems
such as EHRs.
Consider industry-neutral protocols
• Consider identity exchange
protocols like SAML for integration
of user profile data and even for
exchange of patient demographics
and related profile information.
• Consider iCalendar/ICS publishing
and subscribing for schedule data.
• Consider microformats like FOAF
and similar formats from
• Consider semantic data formats
like RDF, RDFa, and related family.
Industry-specific formats aren’t always necessary
Reliance on heavyweight industry-specific formats instead of lightweight micro formats is bad
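For schedule data, publishing iCalendar can be as small as the sketch below, which renders one appointment as an ICS document that ordinary calendar clients can subscribe to. The `appointment_to_ics` function, the `PRODID` value, and the example UID are assumptions for illustration. Line folding for long lines and time zone definitions are deliberately left out, so treat this as a starting point rather than a complete implementation.

```python
from datetime import datetime, timezone


def _fmt(dt):
    """Format a timezone-aware datetime as an iCalendar UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def _escape(text):
    """Escape a TEXT value (backslash, semicolon, comma, newline)."""
    return (
        text.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def appointment_to_ics(uid, start, end, summary, stamp=None):
    """Render a single appointment as a minimal VCALENDAR/VEVENT document."""
    stamp = stamp or datetime.now(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//Example Clinic//Scheduling Sketch//EN",  # hypothetical id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{_fmt(stamp)}",
        f"DTSTART:{_fmt(start)}",
        f"DTEND:{_fmt(end)}",
        f"SUMMARY:{_escape(summary)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

What goes into `SUMMARY` is a policy decision: a published schedule should carry only what its subscribers are authorized to see.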
Legacy systems trap valuable data
In many existing contracts, the
vendors of systems that house the
data also ‘own’ the data and it can’t
be easily liberated because the
vendors of the systems actively
prevent it from being shared or are
just too busy to liberate the data.
Semantic markup and tagging is easy
• One easy way to create semantically
meaningful and easier to share and
secure patient data is to have all
HTML tags be generated with
companion RDFa or HTML5 Data
Attributes using industry-neutral
schemas and microformats similar to
the ones defined at Schema.org.
• Google's recent implementation of
its Knowledge Graph is a great
example of the utility of this
semantic mapping approach.
Tag all app data using semantic markup
When data is not tagged using semantic markup, it's not securable or shareable by default
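Here is a small sketch of what "companion RDFa" could look like when an app already renders person data as HTML. It uses RDFa Lite attributes (`vocab`, `typeof`, `property`) with the Schema.org `Person` type. The `person_to_rdfa` function and the choice of properties are assumptions for illustration, and the caller is assumed to pass only valid Schema.org property names.

```python
from html import escape


def person_to_rdfa(person):
    """Render a dict of Schema.org Person properties as RDFa Lite HTML.

    Each key is emitted as a `property` attribute and each value as the
    element text, so the same markup serves both people and machines.
    """
    spans = "".join(
        f'<span property="{escape(prop, quote=True)}">{escape(str(value))}</span>'
        for prop, value in person.items()
    )
    return f'<div vocab="https://schema.org/" typeof="Person">{spans}</div>'
```

For example, `person_to_rdfa({"givenName": "Jane", "familyName": "Doe"})` yields a `div` that a browser displays as plain text and an RDFa-aware crawler reads as a typed record.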
Proprietary data formats limit findability
• Legacy applications only present
through text or windowed
interfaces that can be “scraped”.
• Web-based applications present
other assets but aren’t search
Search engines are great integrators
• Most users need access to
information trapped in existing
applications but sometimes they
don’t need much more than access
that a search engine could easily
• Assume that all pages in an
application, especially web
applications, will be “ingested” by
a securable, protectable, search
engine that can act as the first
method of integration.
Produce data in search-friendly manner
Healthcare fears open source
• Only the government spends more per
user on antiquated software than we do
• There is a general fear that open source
means unsupported software or lower
quality solutions or unwanted security
Open source can save health IT
• Other industries save billions by using
• Commercial vendors give better pricing,
service, and support when they know
they are competing with open source.
• Open source is sometimes more secure,
higher quality, and better supported
than commercial equivalents.
• Don’t dismiss open source, consider it
the default choice and select commercial
alternatives when they are known to be
Rely first on open source, then proprietary
“Free” is not as important as open source; you should pay for software but require openness
• Tooling strategy must be comprehensive. What hardware and
software tools are available to non-technical personnel to encourage
• Formats matter. Are you using entity resolution, master data and
metadata schemas, documenting your data formats, and access
• Incentivize data sharing. What are the rewards for sharing or penalties
for not sharing healthcare data?
• Distribute costs. How are you going to allow data users to contribute
to the storage, archiving, analysis, and management costs?
• Determine utilization. What metrics will you use to determine what’s
working and what’s not?
E-mail [email protected]