About this talk
• Goal: (Hopefully) leave you with a few new ideas on interesting opportunities at the intersection of Big Data and Semantics
• Structure
– Big Data & Analytics
– Semantic Technologies
– The intersection
– Three examples / opportunities
About Me
• 13 years of experience in software-driven innovation
– semantic technologies researcher in the group of Prof. Studer
– manager of a research division concerned with all aspects of information-driven decisions at FZI
– Big Data consultant with codecentric
– (analytics consultant with Daimler TSS)
Better information processing through more consideration for the explicit context of processed elements
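[Figure: graph relating the label "Lebanon" to two different entities. Nodes: Lebanon (label); Lebanon, Country; Beirut, City; 4 Million; Lebanon, NH; USA; Dartmouth Medical S. Edge labels: is label for (twice), has capital, has population, in country, contains.]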
Semantic Technologies by Task
• Discovering Context: understand the meaning of unstructured data (text, images, …)
• Moving Context: transfer meaning of data between systems (RDF, RuleML, …)
• Using Context: inference based on the meaning (inference engines, deductive databases)
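
As a rough illustration of "Using Context", the sketch below stores a few facts from the Lebanon figure as subject/predicate/object triples and derives new "located in" facts by forward chaining. The triple layout, the edge directions read off the figure, the predicate name "located in" and the four rules are assumptions made for this example; this is a toy, not the API of any real inference engine or deductive database.

```python
# Facts as (subject, predicate, object) triples. The edge directions are
# an assumed reading of the Lebanon figure, not confirmed by the slides.
FACTS = {
    ("Lebanon, Country", "has capital", "Beirut, City"),
    ("Lebanon, NH", "in country", "USA"),
    ("Lebanon, NH", "contains", "Dartmouth Medical S."),
}


def infer_located_in(facts):
    """Forward-chain to a fixpoint, deriving 'located in' triples.

    Assumed rules (illustrative only):
      X contains Y                    => Y located in X
      X in country Y                  => X located in Y
      X has capital Y                 => Y located in X
      A located in B, B located in C  => A located in C
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in derived:
            if p == "contains":
                new.add((o, "located in", s))
            elif p == "in country":
                new.add((s, "located in", o))
            elif p == "has capital":
                new.add((o, "located in", s))
        # Transitivity over the 'located in' facts known so far.
        located = [(s, o) for s, p, o in derived if p == "located in"]
        for a, b in located:
            for c, d in located:
                if b == c:
                    new.add((a, "located in", d))
        if not new <= derived:
            derived |= new
            changed = True
    return derived


closure = infer_located_in(FACTS)
for triple in sorted(t for t in closure if t[1] == "located in"):
    print(triple)
```

Because the two "Lebanon" entities are distinct nodes, nothing about Beirut leaks into the USA and nothing about Dartmouth leaks into the country: the explicit context keeps the inferences apart.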
(Some) semantics-relevant challenges in Big Data systems
Ingest, Stage, Transform, Serve
• Integrating data from diverse sources
• Understanding unstructured / polystructured data
• Languages to specify transformations
Through its simplicity, the proliferation of JavaScript, and a good fit to other data structures, JSON has become the standard for data interchange on the web
I believe the linked data techniques that worked for web-scale data integration can offer long-term relief for the enterprise data integration challenge (and that JSON-LD can help in doing this)
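
To make the JSON-LD idea more tangible, here is a minimal sketch in which two systems describe the same customer with different field names, and each document's `@context` maps its local names to shared IRIs so the records can be merged. The records, field names and the `http://example.com/...` identifier are hypothetical. The `expand` helper handles only a flat term-to-IRI context and is a deliberate simplification: it is not the JSON-LD expansion algorithm, whose real output is structured differently.

```python
import json

# Two hypothetical systems describing the same customer with different keys.
crm_doc = {
    "@context": {
        "full_name": "http://schema.org/name",
        "mail": "http://schema.org/email",
    },
    "@id": "http://example.com/customers/42",
    "full_name": "Ada Example",
    "mail": "ada@example.com",
}

billing_doc = {
    "@context": {
        "customer": "http://schema.org/name",
        "phone": "http://schema.org/telephone",
    },
    "@id": "http://example.com/customers/42",
    "customer": "Ada Example",
    "phone": "+1-555-0100",
}


def expand(doc):
    """Replace local keys with the IRIs from a flat @context (simplified)."""
    context = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        # Keywords such as @id are kept; unmapped keys stay unchanged.
        target = key if key.startswith("@") else context.get(key, key)
        expanded[target] = value
    return expanded


def merge_by_id(docs):
    """Merge expanded documents that share the same @id."""
    merged = {}
    for doc in map(expand, docs):
        merged.setdefault(doc["@id"], {}).update(doc)
    return merged


linked = merge_by_id([crm_doc, billing_doc])
print(json.dumps(linked, indent=2))
```

Neither system had to rename its fields; agreement happens in the context, which is what makes the approach attractive for integrating existing enterprise JSON.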
Motivated by the need for agility in data use and the availability of tools to cheaply manage giant amounts of polystructured data, enterprises are moving from a traditional ETL-Data Warehouse architecture … …
However, there is currently a giant gap between the capabilities of companies to directly utilize this heap of polystructured data …
(e.g. Elastic Search + Kibana or
Probabilistic knowledge fusion and Google Knowledge Vault
Dong, Xin Luna, K. Murphy, E. Gabrilovich, G. Heitz, W. Horn, N. Lao, Thomas Strohmann, Shaohua Sun, and Wei Zhang. "Knowledge Vault: A Web-scale approach to probabilistic knowledge fusion." (2014).
I believe some next-generation Big Data leaders will bring Semantics (as in “discovering and using the meaning of heaps of polystructured data”) to many more enterprises
Ingest, Stage, Transform, Serve
• Linked Enterprise Data (with JSON-LD)
• Semantics in the Data Lake
• LP for view definitions (e.g. Cascalog)
connect / download slides at www.vzach.de