Design considerations and infrastructure implications

A number of key decisions need to be made based on the use cases for the deployment. The following sections outline the main decisions, which are reflected in the deployment configurations listed in the next section.

  1. Ontoserver instances
  2. Authorisation server
  3. Syndication server
  4. Persistent storage

Ontoserver instances

Ontoserver can operate as both a read/write FHIR terminology authoring server, and as a read-only terminology query server. Depending upon the size of the organisation and the split of these two activities, one or more Ontoserver instances may be desirable.

With resource-level security and authorisation server features, a single read/write authoring terminology server is typically sufficient for an organisation.

For organisations that only consume terminology and do not author it, a read/write authoring terminology server may be unnecessary; one or more read-only instances may be all that is required.

For organisations with content authoring needs and a desire for strong control and separation between authoring and production use of terminology, multiple Ontoserver instances can be used to separate the authoring from the use functions. A staging terminology server can be added to test out deployments of content in a production-like environment before publishing the content on a production endpoint for downstream use.

High availability for a read-only Ontoserver cluster is usually best achieved by registering more than one instance with a load balancer, and hosting the instances in multiple geographically diverse locations.

This provides redundancy if an instance fails or needs to be taken down for maintenance. It also protects against a network failure at any one location due to a power outage, human error, or a disaster scenario.

Another advantage of having multiple instances is that maintenance and updates can be performed on individual instances without causing downtime for users.

Because Ontoserver is distributed as a Docker image, Docker-based architectures such as Swarm, Kubernetes and Amazon ECS can also ease the task of designing redundancy into the system.
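As an illustrative sketch only (the image tag, replica count, and database settings are assumptions, not a recommended configuration), a Docker Compose file deployed to Swarm could run several read-only instances spread across nodes:

```yaml
# Illustrative only -- consult the Ontoserver documentation for real
# image locations, environment variables, and licensing requirements.
version: "3.8"
services:
  ontoserver:
    image: quay.io/aehrc/ontoserver:latest   # assumed image reference
    deploy:
      replicas: 3                   # redundancy if one instance fails
      placement:
        max_replicas_per_node: 1    # spread instances across hosts
    ports:
      - "8443:8443"                 # Swarm's routing mesh balances requests
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example    # placeholder secret
```

With Swarm's ingress routing mesh, requests to the published port are distributed across the replicas, and a replica lost with its node is rescheduled elsewhere.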

Ontoserver has been designed to facilitate high performance read operations on large code systems. Terminology data is stored in an index for fast searches, lookups and code validation. Response time varies greatly depending on the type of query Ontoserver is asked to fulfil. The FHIR API and FHIR Terminology Service API provide a very rich set of operations that can range in cost from very cheap to long-running. Performance is also dependent upon the size of the code system.

Ontoserver has a configuration parameter called ontoserver.fhir.too.costly.threshold. This places a limit on the number of results returned from a ValueSet expansion, which can be a costly operation when the underlying CodeSystem has a large number of codes.
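For example (the value shown is illustrative, not a recommendation), the parameter can be set as an application property; since Ontoserver is a Spring Boot application, the equivalent environment variable form under Spring Boot's relaxed binding should also work:

```properties
# Illustrative value only -- tune to your deployment's needs.
ontoserver.fhir.too.costly.threshold=50000

# Or, assuming Spring Boot relaxed binding, as a container environment
# variable: ONTOSERVER_FHIR_TOO_COSTLY_THRESHOLD=50000
```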

The simplest way to increase the capacity of a single Ontoserver instance is to place an HTTP cache in front of it. Ontoserver supports this well by setting the appropriate HTTP headers to inform any caches between it and the user which requests are cacheable.

There are a number of HTTP cache implementations which can work well in front of Ontoserver with minimal configuration, such as Nginx and Varnish.
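As an illustrative sketch of the Nginx approach (the cache path, sizes, ports, and upstream address are all assumptions; TLS termination is omitted for brevity):

```nginx
# Illustrative Nginx reverse-proxy cache in front of Ontoserver.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=onto:10m max_size=1g;

server {
    listen 8080;
    location / {
        proxy_pass https://ontoserver:8443;   # assumed upstream address
        proxy_cache onto;
        # No cache overrides are set here, so Nginx honours the
        # Cache-Control/Expires headers Ontoserver emits -- only responses
        # Ontoserver marks as cacheable are stored.
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```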

If a deployment uses the terminology server in a read-only fashion (e.g. searches, lookups, code validation), then horizontal scaling is also an option. Multiple Ontoserver instances can be set up behind a load balancer, which shares requests between the instances, reducing the load on any one instance.
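As a sketch using Nginx as the load balancer (the hostnames are placeholders), horizontal scaling is a matter of declaring several upstream instances:

```nginx
# Illustrative: round-robin load balancing across read-only instances.
upstream ontoserver_readonly {
    server onto1.internal:8443;   # placeholder hostnames
    server onto2.internal:8443;
    server onto3.internal:8443;
}

server {
    listen 8080;
    location / {
        proxy_pass https://ontoserver_readonly;
    }
}
```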

Authorisation server

Some organisations may choose to use an Ontoserver instance on an internal secured network and not require a complex authorisation model.

However, for organisations that need to provide more open endpoints and/or need to apply nuanced levels of access, an authorisation server will be required. The role of an authorisation server is to issue bearer tokens which Ontoserver can inspect to determine whether the bearer of that token is authorised to perform the operation requested.
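As a purely illustrative example (the claim names and scope values are assumptions drawn from common OAuth 2.0 practice and the SMART on FHIR scope syntax, not a statement of Ontoserver's actual authorisation model), the decoded payload of such a bearer token might look like:

```json
{
  "iss": "https://auth.example.org/realms/terminology",
  "sub": "service-account-reporting",
  "aud": "ontoserver",
  "exp": 1700000000,
  "scope": "system/CodeSystem.read system/ValueSet.read"
}
```

Ontoserver inspects claims like these to decide whether the bearer may perform the requested operation.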

These bearer tokens can be issued by any server capable of embedding the right claims for the authorisations Ontoserver expects; however, configuring this can be a challenge. It is also possible that an organisation's user directory, even where it can issue tokens, cannot be configured flexibly enough to manage the authorisations desired for Ontoserver.

For this reason, Ontocloak, an enhanced version of Keycloak, has been created to operate as an authorisation server in a solution deployment.

While it can manage identities and authentication for users itself, Ontocloak is intended to be used with an organisation’s existing OpenID Connect or SAML identity providers for identity and authentication.

Ontocloak provides the ability to then configure the appropriate level of access to Ontoserver and/or Atomio endpoints for those identities, and enables the SMART on FHIR authorisation flow for Snapper, Shrimp and OntoCommand clients (and other SMART on FHIR clients).

Ontocloak can also be configured to integrate with an external SAML or OpenID Connect identity provider, or to federate identities from a Kerberos or Active Directory / LDAP source.

If authorisation is required in the solution, use of Ontocloak is recommended as it simplifies configuration.

Syndication server

A syndication server is a useful addition to a deployment if an organisation:

  • Creates binary indexes for SNOMED CT and/or LOINC
  • Wishes to stage content from internal or external sources other than in an authoring server where active editing is done
  • Wishes to have recorded release history from production publications
  • Uses ephemeral production read-only instances which require a static source of truth to rebuild from.

Atomio has been designed for this purpose. It can host multiple Atom syndication feeds and associated artefacts which Ontoserver can read, and has an API to manipulate the feed and entry content.

Persistent storage


Persistent storage is required for the authoring (read/write) server to record the evolution of the content being developed.

This can be limited to the Postgres database in use by Ontoserver, as the filesystem content can be rebuilt from the database content and/or reacquired from upstream syndication sources depending upon the content.

However, it is usually advisable to back up and be able to restore this filesystem as well: recovery will be faster, and some upstream syndication sources may no longer have all the required assets, depending upon who manages them.
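A minimal sketch of persisting both stores with named Docker volumes (the service names, image references, and the Ontoserver filesystem path are assumptions):

```yaml
# Illustrative: named volumes for the authoring server's two stores.
services:
  ontoserver:
    image: quay.io/aehrc/ontoserver:latest
    volumes:
      - onto-data:/var/onto                   # assumed index/filesystem path
  db:
    image: postgres:15
    volumes:
      - pg-data:/var/lib/postgresql/data      # the authoritative store

volumes:
  onto-data:   # back up for faster recovery
  pg-data:     # must be backed up and restorable
```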

Decisions around recovery points and potential data loss must be made for each individual deployment, based on its requirements and acceptable risk.

The syndication server must also have persistent storage.

It acts as a store of the current and previous release points allowing the production and staging servers to use transient storage. Therefore its data must be preserved.

However, given that it serves a low volume of requests compared to Ontoserver, its storage does not need to be fast. Recovery points are logically taken when release candidates are created.

The staging and production servers can use persistent or transient storage.

Transient storage can be used because in the three deployment models discussed, release state is stored persistently by either the authoring server or the syndication server. Staging and Production Ontoserver instances can rebuild their state when they are started as per the content deployment process.

However, if many versions of a large code system such as SNOMED CT are in use, the release content can become very large and take a long time to load into the staging and/or production servers when they are started. In this case it may be useful to use persistent storage for the production server, so that if the container is lost (for example, because the Docker node hosting it goes down) a new Ontoserver container can be created in its place very quickly.

Zero-downtime deployments are possible, and are achieved through load-balancing infrastructure as for many other applications.

The process is to bring up a new Ontoserver container and wait for it to load all the content configured in its preload syndication feed. Once the new instance is ready for service, it is swapped into production in place of the existing one. Once existing connections have been drained from the old container, it can be terminated.
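The swap itself can be as simple as repointing the load balancer and reloading it. An illustrative Nginx sketch (hostnames are placeholders):

```nginx
# Before the swap: traffic goes to the existing (blue) container.
upstream ontoserver_prod {
    server onto-blue.internal:8443;
    # Once the new (green) container has finished preloading, edit this
    # block to point at onto-green.internal:8443 and run `nginx -s reload`.
    # The reload lets in-flight requests complete on the old workers,
    # draining the old container before it is terminated.
}
```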