
Trends and Transients 2017


Each year there are more new technologies to keep track of, more ways to organise
your life and your company’s information, more ways to communicate. This session
will introduce you to new and potentially over-hyped technologies, discuss
older, overlooked technologies, and entertain you at the same time. Our expert
speakers will debate the current issues, giving you the benefit of their wide
experience and differing points of view, so you can decide for yourself which
technologies will meet your needs and which are a waste of your time.

This course is chaired by Lauren Wood and taught by Ann Wrightson, Irina Bolychevsky, Peter Krautzberger, and Steve Neale.

Classes for 2017

The Trends and Transients course runs on

Mathematics on the web

Taught by Peter Krautzberger.

We will discuss current trends for handling mathematical content in XML
workflows, with a primary focus on web production.

In particular, we will cover

  • an overview of common formats for authoring/storing math content
  • MathML and the role it plays in today’s web
  • tools and techniques for rendering mathematics in a web context
  • enhancing accessibility of math content for the web
  • future directions for math on the web

Prior experience with MathML or other formats is helpful but not necessary.

Interoperability in healthcare standards

Taught by Ann Wrightson.

Over the last decade or so there has been strong & often noisy competition
between standards for system-to-system communications in the health sector.
“The wonderful thing about standards is that there are so many of them”
(variously attributed to Andrew Tanenbaum, Patricia Seybold, Grace Hopper &
others) has become a byword, and until quite recently the usual response
from standards pundits, at least in public, was some variant of “Obviously,
everyone should use this standard!”. Now there’s a different approach emerging
out of eHealth collaboration in Europe, recognizing that competition between
standards is unwinnable, wasteful and takes attention away from the key shared
problem of safe interoperability.

In this session, you will learn about the latest thinking on interoperability at
scale for clinical communications, how & why many standards end up competing
in the same space, & how all that plays out with XML, JSON & FHIR (the
latest-generation HL7 standard).

XML, Related Formats, and Linked Open Data in Corpus Linguistics and Natural
Language Processing

Taught by Steve Neale.

XML has a long association with the representation of linguistic data, being
well-suited to both the structure and the unpredictability of language. It
allows data — whether at the paragraph, sentence or token (word) level
— to be organised in a well-formed, structured manner; at the same time,
a range of syntactic and semantic features can be represented as attributes,
seamlessly and flexibly used to describe structured elements where appropriate.
In fields such as corpus linguistics and natural language processing (NLP), XML
has been used to represent a range of corpora from large-scale endeavours such
as the British National Corpus (BNC) to more focused contributions such
as the semantically-annotated SemCor.
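
As a rough illustration of tokens carrying linguistic features as attributes, here is a minimal Python sketch. The element and attribute names (`s`, `w`, `pos`, `lemma`) and the sample sentence are assumptions made up for this example; they are not taken from the BNC, SemCor or any particular corpus schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one sentence (<s>) holding word tokens (<w>),
# each annotated with a part-of-speech tag and a lemma as attributes.
SAMPLE = """
<text>
  <s n="1">
    <w pos="DET" lemma="the">The</w>
    <w pos="NOUN" lemma="cat">cats</w>
    <w pos="VERB" lemma="sleep">slept</w>
  </s>
</text>
""".strip()


def tokens_with_pos(xml_text, pos):
    """Return (surface form, lemma) pairs for tokens with the given POS tag."""
    root = ET.fromstring(xml_text)
    return [
        (w.text, w.get("lemma"))
        for w in root.iter("w")
        if w.get("pos") == pos
    ]


nouns = tokens_with_pos(SAMPLE, "NOUN")
print(nouns)
```

The structure (text, sentence, token) lives in the element hierarchy, while the annotation lives in attributes, so a query can filter on either without changing the markup.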

Corpora are generally built with linguistic principles in mind — designed
according to appropriate and balanced reflections on demographics, contexts and
genres — and with XML being well-suited to query and information
retrieval, it is no surprise that it remains a popular format for corpus design.
However, ontology-centric formats such as RDF and OWL — both of which can
be represented as XML — are now widely used to deliver huge knowledge
bases such as DBpedia as linked open data (LOD). These ontology-based formats,
better suited to representing the specific relationships between entities than
traditional XML schemas, have driven fresh approaches for NLP that allow more
semantically-oriented queries to be made on large-scale and often unstructured
textual data.
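
To give a feel for why relationship-centred data suits this kind of query, the sketch below models statements as subject-predicate-object triples using plain Python tuples. It is a toy stand-in, not RDF or OWL itself: the `ex:` identifiers and the data are invented for illustration, and real work would normally use an RDF library and a query language such as SPARQL.

```python
# Toy triples (subject, predicate, object). The identifiers are
# illustrative assumptions, not real DBpedia resources.
TRIPLES = [
    ("ex:Cardiff", "rdf:type", "ex:City"),
    ("ex:Cardiff", "ex:locatedIn", "ex:Wales"),
    ("ex:Swansea", "rdf:type", "ex:City"),
    ("ex:Swansea", "ex:locatedIn", "ex:Wales"),
    ("ex:Welsh", "ex:spokenIn", "ex:Wales"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def cities_in(triples, region):
    """Combine two patterns: things located in `region` that are also cities."""
    located = {s for s, _, _ in match(triples, p="ex:locatedIn", o=region)}
    cities = {s for s, _, _ in match(triples, p="rdf:type", o="ex:City")}
    return sorted(located & cities)


print(cities_in(TRIPLES, "ex:Wales"))
```

Because every fact has the same three-part shape, a new kind of relationship is just another triple, with no change to a document structure.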

The talk will begin with an overview of XML in the context of linguistic data,
focusing on the qualities that make it ideal for representing structured
information about language and how it is being used in a current project,
CorCenCC: The National Corpus of Contemporary Welsh (Corpws Cenedlaethol
Cymraeg Cyfoes). Next, focus will shift to RDF and OWL, and
how they better represent the kinds of semantic relationships that are difficult
to cater for with traditional XML schemas. Finally, current trends in NLP will be
explored, with an emphasis on how and where XML is being used in relation to
linked open data.

Decentralising data silos and monopolies

Taught by Irina Bolychevsky.

The information age is transforming how we work, live and interact. We are all
increasingly dependent on digital services to travel around, decide where to
eat, learn, share info and maintain our public identity.

And yet, digital services are increasingly provided by a diminishing group of
super monopolies, whose real customers are advertisers. Or else, services spring
up, only to disappear or change once acquired, our data lost.

This session will be about the motivations behind the redecentralise movement,
some projects leading the charge and what role openness (in data, code and
standards) can play. We’ll talk about what models of decentralisation exist and
which are relevant here and explore what parallels we can draw with the open
data movement.