Linked Data on Speed

# This Presentation

![QR-Code](img/qr-presentation.en.svg)

---

# Linked Data - An Introduction on Speed

Adrian Gschwend, Zazuko GmbH

---

# Web of Documents

---

# Web of Documents

* Since the 1990s
* A link (URL) for each page
* A page typically represents a document
* Links to other pages
* For humans

---

# URL as unique identifier

---

# URL as unique identifier

* We enter the URL by hand
* Use a search engine
* Or follow links within documents
* The links are not typed,
* so we don't know how the target relates to the current document
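
A minimal sketch of what a machine sees when it follows an HTML link: a target, but no statement of how the two pages relate. The HTML snippet and the example.org URL are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


# Hypothetical page content, not taken from a real site.
html = '<p>Written by <a href="http://example.org/adrian">Adrian</a>.</p>'

collector = LinkCollector()
collector.feed(html)

# The crawler only gets the target; "written by" lives in the human-readable text.
print(collector.links)
```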

---

# Who invented it?

---

# Who invented it?

* The Internet has existed since the late 1970s (TCP/IP stack)
* Mail (SMTP) since 1981
* It grew over time, but not that fast

---

# The Web

---

# The Web

* With the Web came exponential growth
* Thanks to the hyperlink (and hypertext/HTML)

---

# So far so good

---

# Do you speak Chinese?

---

# Do you speak Chinese?

* The semantics of a page are understandable to (non-visually-impaired) people,
* as long as they know the script, language and domain

---

# Semantics is hard for machines

---

# For example Google

---

# For example Google

* Semantics is hard for machines
* Context is often completely missing
* Example: Jaguar the animal, the car or the Mac OS X 10.2 release?
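
A small sketch of why a bare label is not enough for a machine: the same string can point to several distinct things. The identifiers below are hypothetical example.org URIs, not taken from any real dataset.

```python
# Hypothetical identifiers; real datasets would publish their own URIs.
MEANINGS = {
    "Jaguar": [
        "http://example.org/animal/jaguar",
        "http://example.org/car-brand/jaguar",
        "http://example.org/software/mac-os-x-10.2",
    ],
}


def candidates(label):
    """Return every identifier that shares the given label."""
    return MEANINGS.get(label, [])


# One label, three different things: the label alone cannot disambiguate.
for uri in candidates("Jaguar"):
    print(uri)
```

Once each meaning has its own identifier, a statement can refer to exactly one of them.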

---

# Confusion of tongues (confusio linguarum)

---

# Everything is going to be alright with databases?

* Data in silos
* Partially available on the Web
* Confusion of tongues between databases
* Technically as well as with regard to contents (semantics)
* You cannot search and/or link across databases/datasets

---

# How about centralized data management?

Not realistic:
* There is no single source of truth
* Does not reflect federal and pluralistic structures
* Rarely works even in centralized systems
* Inherent barriers with regard to scalability
* Ambitious, both technically and with regard to content

---

# Decentralized approach

Requirements?
* A common exchange format
* Semantics has to be part of the data
* Multilingualism at its core
* One can express any relationship between entities
* Even across data sets and silos
* Decentralized data management/maintenance
* Queries across data sets/silos are possible

---

# Try this!

---

# Linked Data as approach

* RDF as common data model
* Well-known schemas & ontologies as Lingua Franca
* Web (HTTP) as transport
* Links (URIs) as (decentralized) identifiers
* Multilingualism at its core
* SPARQL as standardized query language
* "Agile" data model

---

# Machine-readability

---

# Data model

![A Thing](img/AThing.svg)

---

# RDF data model

![A Thing with URIs](img/AThing2.svg)

---

# RDF data model

* Instead of a document, I describe a single piece of information

Example:

```turtle
<ktk> <givenName> "Adrian" .
<ktk> <familyName> "Gschwend" .
```
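
The same two statements, sketched as plain (subject, predicate, object) tuples. This is only an illustration of the triple idea; the short names mirror the slide, and real RDF would use full URIs and an RDF library.

```python
# Each statement is a (subject, predicate, object) triple.
triples = [
    ("ktk", "givenName", "Adrian"),
    ("ktk", "familyName", "Gschwend"),
]


def objects(graph, subject, predicate):
    """Return all objects stated for a subject/predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(objects(triples, "ktk", "givenName"))  # ['Adrian']
```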

---

# The difference to the previous Web

![URLs vs URIs](img/URL-vs-URI.png)

---

# Technology stack

* W3C standards, like the Web itself
* SPARQL as query language
* Implementation of the stack in all common programming languages

---

# SPARQL

* It’s a real standard & vendors respect it
* W3C standard
* Query language for Linked Data
* Scales (in memory)
* Supports Federated Queries
* Commercial vendors, growing market
* Open Source alternatives

---

# SPARQL Basics

* At home: [SPARQL in 11 minutes](https://www.youtube.com/watch?v=FvGndkpa4K0)
* SELECT for selection, WHERE for conditions
* WHERE typically combines multiple conditions
* In SQL this would be a JOIN
* Without the headache you get by using JOIN
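
A toy sketch of how several WHERE conditions combine: patterns that share a variable are joined on it. This is not a SPARQL engine, and the predicates `worksFor` and `name` are assumed names for illustration.

```python
def match(graph, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern.

    Terms starting with "?" are variables; everything else must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(graph, rest, candidate)


graph = [
    ("ktk", "givenName", "Adrian"),
    ("ktk", "familyName", "Gschwend"),
    ("ktk", "worksFor", "zazuko"),
    ("zazuko", "name", "Zazuko GmbH"),
]

# Roughly: SELECT ?first ?company WHERE {
#   ?person givenName ?first . ?person worksFor ?org . ?org name ?company }
where = [
    ("?person", "givenName", "?first"),
    ("?person", "worksFor", "?org"),
    ("?org", "name", "?company"),
]

results = [(b["?first"], b["?company"]) for b in match(graph, where)]
print(results)  # [('Adrian', 'Zazuko GmbH')]
```

The shared variables `?person` and `?org` do the work that an explicit JOIN condition would do in SQL.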

---

# Historicized Municipality Register (Historisiertes Gemeindeverzeichnis)

* Via Federal Statistical Office, [Data & docs](http://www.bfs.admin.ch/bfs/portal/de/index/infothek/nomenklaturen/blank/blank/gem_liste/02.html)
* eCH-0071 Standard
* Cantons, Districts and Municipalities
* Namespace: `PREFIX gont: <https://gont.ch/>`
* With the prefix declared, names can be written in short form
* Classes: `gont:Municipality`, `gont:District`, `gont:Canton`
* Details: https://github.com/zazuko/fso-lod/tree/master/doc/eCH0071
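
A short sketch of what the prefix declaration does: a short form such as `gont:Canton` is shorthand for the namespace plus the local name. The helper `expand` is a hypothetical name for illustration; SPARQL engines do this expansion internally.

```python
PREFIXES = {"gont": "https://gont.ch/"}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as gont:Canton into a full IRI."""
    prefix, _, local = curie.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


print(expand("gont:Municipality"))  # https://gont.ch/Municipality
```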

---

# SPARQL by Example

* [See Github documentation](https://github.com/zazuko/fso-lod/blob/master/doc/eCH0071/sparql.md)

---

# Benefits

* Re-use outside of its initial use case
* Data is published in the domain of the data owner (like a website)
* Access is visible (logging)
* Complex queries are possible by using SPARQL
* ➞ Answers as CSV, XML, JSON, RDF
* The data is the API
* ➞ Answers in Text, XML, JSON
* No problems like API versioning & maintenance (API hell)
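
A sketch of how a client could ask a SPARQL endpoint for a specific answer format through the HTTP `Accept` header. The endpoint URL is a placeholder, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; replace with the SPARQL endpoint of the data owner.
ENDPOINT = "https://example.org/sparql"

# Media types for SELECT results from the W3C SPARQL 1.1 result formats.
RESULT_FORMATS = {
    "json": "application/sparql-results+json",
    "xml": "application/sparql-results+xml",
    "csv": "text/csv",
}


def build_request(query, fmt="json"):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": RESULT_FORMATS[fmt]})


request = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 10", fmt="csv")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would execute it against a real endpoint.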

---

# Conclusion

* Web of Data instead of Web of Documents
* Same technology stack
* Open standards
* Semantic interoperability through vocabularies & ontologies
* Decentralized, like the Web
* Web Scale Database

Built on an agile data model

---

# Linked Open Data Cloud

---

# Linked Open Data Cloud

* See [lod-cloud.net](http://lod-cloud.net/)
* Contains the biggest public Linked Open Data repositories
* Linked Open Vocabularies [LOV](http://lov.okfn.org/dataset/lov/) for common vocabularies
* [Prefix.cc](http://prefix.cc/) for prefixes & shortcuts

---

# Wikidata

* Wikipedia for raw data
* Looong story around/against RDF & Linked Data
* Now with SPARQL endpoint!
* Prefix: `http://www.wikidata.org/entity/Qxyz` for Linked Data
* Example query for airport codes & labels: https://t.co/gyWQ7MRzL6
* Also supports [geospatial queries](http://addshore.com/2016/05/geospatial-search-for-wikidata-query-service/)
* Lots of cool new stuff every month!
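
A tiny sketch of the entity prefix in use: a Wikidata item ID is appended to the namespace to form its Linked Data URI. The helper `entity_uri` is a hypothetical name, and `Q42` is only an example ID.

```python
import re

ENTITY_NAMESPACE = "http://www.wikidata.org/entity/"


def entity_uri(qid):
    """Return the Linked Data URI for a Wikidata item ID such as "Q42"."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return ENTITY_NAMESPACE + qid


print(entity_uri("Q42"))  # http://www.wikidata.org/entity/Q42
```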

---

# Google search

* SEO, search engine optimization
* Today via schema.org
* Completely Linked Data
* JSON-LD or RDFa serialization

![Find movies](img/movie-google.png)
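
A minimal sketch of schema.org markup in JSON-LD, as it might be embedded in a movie page. The movie title and director name are placeholders, not real data.

```python
import json

# Illustrative values only; the film and director are placeholders.
movie = {
    "@context": "https://schema.org",
    "@type": "Movie",
    "name": "Example Movie",
    "director": {"@type": "Person", "name": "Jane Doe"},
}

# Typically embedded in a page inside <script type="application/ld+json">.
markup = json.dumps(movie, indent=2)
print(markup)
```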

---

# Offshore Leaks as Linked Data

Two different data sets:

* https://titanpad.com/gdVduv14jD
* https://rawgit.com/Ontotext-AD/leaks/master/README.html

---

# Thanks! Questions?

---

# Contact

adrian.gschwend@zazuko.com

http://www.zazuko.com

http://twitter.com/linkedktk

![Zazuko GmbH](img/logo_color_letter.svg)

---

# References: Standards

* [RDF 1.1 Primer](http://www.w3.org/TR/rdf11-primer/)
* [Turtle Serialization](http://www.w3.org/TR/turtle/)
* [JSON for Linking Data](http://json-ld.org/) (JSON-LD)
* [SPARQL 1.1 Query Language](http://www.w3.org/TR/sparql11-query/)

---

# Books

* Learning SPARQL, [Homepage](http://www.learningsparql.com/), [Safari Books Online](http://proquest.safaribooksonline.com/book/web-development/rdf/9781449371449)
* Semantic Web for the Working Ontologist [Homepage](http://workingontologist.org/), [Safari Books Online](http://proquest.safaribooksonline.com/book/web-design-and-development/9780123859655)

---

# Programming languages
* JavaScript: [RDF-Ext](https://github.com/rdf-ext/rdf-ext), [alternatives](https://www.w3.org/community/rdfjs/wiki/Comparison_of_RDFJS_libraries)
* JavaScript visualization: [Uduvudu](https://github.com/uduvudu/uduvudu), [RDF2h](https://github.com/rdf2h/rdf2h)
* Java: [Apache Jena](http://jena.apache.org/), [Eclipse RDF4J](http://rdf4j.org/) (formerly known as Sesame)
* C#/.NET: [dotNetRDF](http://dotnetrdf.org/)
* Ruby: [Ruby RDF Project](https://ruby-rdf.github.io/)
* Python: [RDFLib](https://github.com/RDFLib)
* C: [Redland RDF Libraries](http://librdf.org/)
* PHP: [EasyRdf](http://www.easyrdf.org/) (Google for more)
* Perl: [Perl and RDF](http://www.perlrdf.org/) & [Attean](https://github.com/kasei/attean)