Atomio Config

The following describes various configuration aspects of Atomio.

Atomio

Syndication server for terminology servers (especially Ontoserver).

Purpose

Atomio was created to host terminology content in one or more syndication feeds for use by terminology servers. It publishes those feeds using an extension to the Atom Syndication Feed Format, described at https://www.healthterminologies.gov.au/specs/v2/conformant-server-apps/syndication-api/syndication-feed.

API documentation

Atomio hosts its own API documentation. Wherever Atomio is deployed, the Swagger UI is available at /swagger-ui.html and renders the OpenAPI 3 documentation that Atomio serves at /v3/api-docs. Both describe the API of the running Atomio version.

For an example, see https://synd.ontoserver.csiro.au/swagger-ui.html and https://synd.ontoserver.csiro.au/v3/api-docs.

Health checks and information

By default, Atomio also exposes

a healthcheck endpoint at /actuator/health which is useful for checking the instance's health and readiness, and
an information endpoint at /actuator/info which is useful to determine the exact Atomio version which is deployed.

The healthcheck endpoint is particularly useful when configuring container orchestration tools such as Kubernetes, or monitoring dashboards.
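As an illustration of how a monitoring script might use the healthcheck endpoint, the following Python sketch polls /actuator/health and interprets the response. It assumes the standard Spring Boot Actuator response shape (a JSON object with a top-level "status" field that is "UP" when healthy); the function names and the base URL in the usage note are hypothetical.

```python
import json
import urllib.error
import urllib.request


def is_healthy(health_body: str) -> bool:
    """Return True when an actuator health response body reports status UP."""
    try:
        payload = json.loads(health_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def check_instance(base_url: str, timeout_seconds: float = 5.0) -> bool:
    """Fetch <base_url>/actuator/health and report whether the instance is UP."""
    url = base_url.rstrip("/") + "/actuator/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # An unhealthy instance may answer with an error status and a JSON body.
        body = error.read().decode("utf-8", errors="replace")
    except OSError:
        # Connection refused, DNS failure, timeout and similar network errors.
        return False
    return is_healthy(body)
```

For example, check_instance("http://localhost:8080") would return True only if an instance at that (assumed) address answers with a status of UP.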

Version 2.0.0 upgrade

Atomio 2.0.0 upgrades to Spring Boot 3.2 which, among many dependency updates, includes an upgrade to H2. The version change between H2 1.x and H2 2.x requires a database migration.

For those using Postgres (configuration explained below) this migration is unnecessary. For those using the default H2 database, the binary format has changed, so the database content must be exported and reimported into a new H2 database.

To simplify this process, a Docker image has been created which runs the migration. It performs the following steps:

Copies the current database to a backup location
Exports the database to a zip file in the backup location named .export..zip
Deletes the current database and creates a new blank H2 database at the same location
Imports the .export..zip file

The image name is quay.io/aehrc/atomio-h2-migration:1.0.0

There are a number of environment variables you can use to control this process.

Variable	Description	Default for Atomio
DATABASE_URL	The database URL passed to Atomio	jdbc:h2:/workspace/atomio/db
USERNAME	Database username to connect to the database	sa
PASSWORD	Database password to connect to the database	password

If the above environment variables are not set, the migration will not run.

The process to perform the upgrade is:

Shut down the existing version 1.x Atomio server
Start the atomio-h2-migration image in place of the Atomio image, with the above environment variables set appropriately. The container must have access to the same disk mount as Atomio so that it can reach the H2 database.
Start the new version 2.x Atomio server

This process backs up the existing Atomio version 1 database to a directory called migration-backup, created alongside the database. For example, if the database URL is jdbc:h2:/workspace/atomio/db, the backup is at /workspace/atomio/migration-backup. If a rollback is required, the content of this directory can be restored and the Atomio version 1 image started.
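The relationship between the database URL and the backup location can be sketched in Python as follows. This is only an illustration of the example above, not the migration image's own logic: it assumes an embedded, file-based H2 URL, and the function name is hypothetical.

```python
from pathlib import PurePosixPath

H2_URL_PREFIX = "jdbc:h2:"


def migration_backup_dir(database_url: str) -> str:
    """Derive the migration-backup directory that sits next to an H2 database file."""
    if not database_url.startswith(H2_URL_PREFIX):
        raise ValueError(f"not an H2 JDBC URL: {database_url}")
    db_path = database_url[len(H2_URL_PREFIX):]
    # Drop any ";SETTING=value" options appended to the URL.
    db_path = db_path.split(";", 1)[0]
    # H2 also accepts an explicit "file:" prefix for embedded databases.
    if db_path.startswith("file:"):
        db_path = db_path[len("file:"):]
    return str(PurePosixPath(db_path).parent / "migration-backup")


print(migration_backup_dir("jdbc:h2:/workspace/atomio/db"))
```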

Configuration

All configuration is done via Spring properties. These may be set as system properties, or passed through to the Docker container as environment variables using Spring Boot's relaxed binding.
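As a rough guide to the environment variable names, Spring Boot's documented convention is to replace dots with underscores, remove dashes, and convert to uppercase. The Python sketch below applies that convention and builds the matching docker run arguments; the helper names are hypothetical and the property values are placeholders.

```python
def to_env_var(property_name: str) -> str:
    """Convert a Spring property name to its environment variable form."""
    return property_name.replace(".", "_").replace("-", "").upper()


def docker_env_args(properties: dict) -> list:
    """Build '-e NAME=value' arguments for docker run from Spring properties."""
    args = []
    for name, value in properties.items():
        args.extend(["-e", f"{to_env_var(name)}={value}"])
    return args


print(to_env_var("spring.datasource.url"))
print(to_env_var("atomio.security.issuer-uri"))
print(docker_env_args({"atomio.base.url": "https://synd.example.org"}))
```

The first call yields SPRING_DATASOURCE_URL and the second ATOMIO_SECURITY_ISSUERURI.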

NOTE: if you wish to have Atomio clone entries or feeds from remote sources, you need to set atomio.client.urlWhitelist, described below.

Volume mounts

By default (this can be overridden as described below), Atomio writes its database and all downloaded artefacts into

/workspace/atomio

This directory can be volume mounted to an appropriate location for persistent storage.

Properties

The following are the default configuration items in the container, any of which may be overridden.

Spring settings

spring.datasource.url=jdbc:h2:/workspace/atomio/db
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=password
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
spring.jpa.hibernate.ddl-auto=update
server.error.include-message=always
spring.servlet.multipart.maxFileSize=3221225472
spring.servlet.multipart.maxRequestSize=3221225472
spring.jpa.open-in-view=true

As seen above, by default an H2 database on disk will be used. There is no need to override these values unless you require a different configuration.

PostgreSQL support

To use PostgreSQL as Atomio's database, enable the postgres profile, which replaces the H2-specific properties above (driver class, dialect, etc.). Then set spring.datasource.url, spring.datasource.username, and spring.datasource.password appropriately for the PostgreSQL database you are using.

For example:

spring.profiles.active=postgres
spring.datasource.url=jdbc:postgresql://localhost/atomio
spring.datasource.username=username
spring.datasource.password=password

Storage self test

atomio.scheduled.storage.test.enabled=true
atomio.scheduled.storage.test.skip.sha.check=false
atomio.scheduled.storage.test.cron=0 0 0 * * *

These parameters control the server's storage self test. With the default values above, every night at midnight the server validates each file referenced by an entry against that entry's length and SHA256.

Because SHA256 calculation is expensive but length checking is cheap, the SHA256 check can be turned off. Simple length checking is still a useful sanity check that verifies the file still exists and is plausible.

The schedule can be changed by specifying the required cron expression. Bear in mind that SHA256 checking of a large number of large files takes a while, so it pays to keep the frequency relatively low.
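The default value 0 0 0 * * * has six fields rather than the five of a classic Unix crontab. Assuming the property follows Spring's cron format (an assumption based on Atomio being a Spring application), the fields are second, minute, hour, day of month, month and day of week. The hypothetical Python helper below simply labels the fields so an expression can be sanity checked before deployment.

```python
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Label the six whitespace-separated fields of a Spring-style cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields but found {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# The default: second 0, minute 0, hour 0, every day.
print(describe_cron("0 0 0 * * *"))
# A lower-frequency alternative: 02:00 every Sunday.
print(describe_cron("0 0 2 * * SUN"))
```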

This feature can be disabled by changing atomio.scheduled.storage.test.enabled to false.

Any errors detected by this feature are written to the server's log as error messages, so log monitoring is required to identify issues.
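The kind of check the self test performs can be sketched in Python as below. This is not Atomio's implementation, only a minimal illustration of validating a file against an expected length and SHA256, with the SHA256 step optional; the function and parameter names are hypothetical.

```python
import hashlib
import os


def check_artefact(path, expected_length, expected_sha256, skip_sha_check=False):
    """Return a list of problems found for one file (an empty list means it passed)."""
    if not os.path.isfile(path):
        return [f"missing file: {path}"]
    problems = []
    actual_length = os.path.getsize(path)
    if actual_length != expected_length:
        problems.append(f"length {actual_length} does not match expected {expected_length}")
    if not skip_sha_check:
        digest = hashlib.sha256()
        with open(path, "rb") as handle:
            # Read in 1 MiB chunks so large artefacts are never held in memory whole.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected_sha256.lower():
            problems.append("SHA256 does not match")
    return problems
```

The length comparison only needs file metadata, whereas the SHA256 comparison reads every byte, which is why skipping it makes the test much cheaper.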

Security

atomio.security.audience=atomio
atomio.security.hsts=true
atomio.security.enabled=false
atomio.security.anonymousFeedRead=false

By default, application security is turned off. That is, the server doesn't require authentication or authorisation for any of its operations.

This can be changed by setting atomio.security.enabled to true. This enables token based security and requires configuration for the server to validate token signatures.

The preferred way to do this is to set the issuer URI (this example follows Keycloak's URL patterns)

atomio.security.issuer-uri=https://some.host/auth/realms/realm-name

The server will then discover on start up the certificates required for signature validation and the issuer value to check in the tokens. This will work for OAuth 2.0 or OIDC well-known configuration using Spring's discovery methods.

If issuer well-known discovery doesn't work or can't be used, JWKS can be used instead. When the JWKS URL is specified as follows, the server gets the key to use directly from the authorisation server. The following example is from Keycloak

atomio.security.jwk-set-uri=https://some.host/auth/realms/realm-name/protocol/openid-connect/certs

The issuer URI or JWK URI configuration is preferred because it gracefully handles changes to the authorisation server's signing certificate. However, neither works unless the SSL certificate used by the authorisation server is valid.

In terms of the security itself, when turned on the server will require tokens to have

an audience of "atomio" or whatever value is configured into atomio.security.audience
"SYND_READ" as an authority in the token to perform any GET operations
"SYND_WRITE" as an authority in the token to perform any POST, PUT, or DELETE operations

atomio.security.hsts can probably be set to false where Atomio has a proxy server in front of it (which should be most deployments), because HSTS should then be the proxy's responsibility.

If security is enabled, Atomio requires an appropriately authorised bearer token when a feed, entry or artefact is requested. However, there are circumstances where it is convenient to have Atomio openly advertise the feeds that it has and the entries in those feeds. Setting atomio.security.anonymousFeedRead to true enables this mode: GET requests to list all feeds or to get a specific feed's Atom XML are accepted without authorisation, while all other requests (such as downloading an artefact) still require authorisation as defined above.
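The token requirements listed earlier in this section (audience, SYND_READ and SYND_WRITE) can be summarised in a short Python sketch. It is an illustration only and does not cover the anonymous feed read mode. How the audience and authorities are extracted from the token is left out, and two behaviours are assumptions not stated here: that SYND_WRITE does not imply read access, and that HTTP methods other than GET, POST, PUT and DELETE are rejected.

```python
READ_METHODS = {"GET"}
WRITE_METHODS = {"POST", "PUT", "DELETE"}


def is_authorised(method, token_audiences, token_authorities, expected_audience="atomio"):
    """Apply the documented audience and authority rules to one request."""
    # The token must carry the configured audience (atomio.security.audience).
    if expected_audience not in token_audiences:
        return False
    method = method.upper()
    if method in READ_METHODS:
        return "SYND_READ" in token_authorities
    if method in WRITE_METHODS:
        return "SYND_WRITE" in token_authorities
    # Assumption: methods not mentioned in the documentation are refused.
    return False


print(is_authorised("GET", ["atomio"], ["SYND_READ"]))
print(is_authorised("POST", ["atomio"], ["SYND_READ"]))
```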

Authorisation auto discovery

If security is enabled (atomio.security.enabled=true), Atomio supports providing clients with authorisation discovery metadata at

/.well-known/smart-configuration
/.well-known/openid-configuration

If atomio.security.issuer-uri is set, Atomio will attempt to proxy the issuer's OpenID configuration at /.well-known/smart-configuration and /.well-known/openid-configuration. This is the preferred approach and requires minimal configuration.

If atomio.security.issuer-uri cannot be used (e.g. it does not work with the authorisation server being used), or proxying the issuer's OpenID configuration does not work (e.g. the authorisation server does not support standard metadata locations), the following properties allow minimal manual configuration.

atomio.security.smartConfiguration.authorisationEndpointUrl
atomio.security.smartConfiguration.tokenEndpointUrl
atomio.security.smartConfiguration.grantTypesSupported

For example

atomio.security.smartConfiguration.authorisationEndpointUrl=https://my.auth.server/auth
atomio.security.smartConfiguration.tokenEndpointUrl=https://my.auth.server/token
atomio.security.smartConfiguration.grantTypesSupported=authorization_code,implicit,client_credentials,refresh_token

If any of these atomio.security.smartConfiguration properties is configured, all of them must be configured. Together they form the minimal set for SMART on FHIR.
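The all-or-nothing rule can be checked before deployment with a small script. The Python sketch below is a hypothetical pre-flight check, not part of Atomio: it takes the intended properties as a dictionary and reports which smartConfiguration properties are missing when only some of them are set.

```python
SMART_PROPERTIES = (
    "atomio.security.smartConfiguration.authorisationEndpointUrl",
    "atomio.security.smartConfiguration.tokenEndpointUrl",
    "atomio.security.smartConfiguration.grantTypesSupported",
)


def missing_smart_properties(config: dict) -> list:
    """Return the smartConfiguration properties still to be set, if any were set at all."""
    configured = [name for name in SMART_PROPERTIES if config.get(name)]
    if not configured:
        # None are set, which is valid: manual configuration is simply not used.
        return []
    return [name for name in SMART_PROPERTIES if not config.get(name)]


partial = {
    "atomio.security.smartConfiguration.tokenEndpointUrl": "https://my.auth.server/token",
}
print(missing_smart_properties(partial))
```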

CORS

The following is the default configuration for CORS

atomio.cors.allowedOriginPatterns=
atomio.cors.allowedHeaders=X-Requested-With,Origin,Content-Type,Accept,Authorization,Access-Control-Allow-Headers
atomio.cors.allowedMethods=PUT,POST,GET,DELETE,OPTIONS
atomio.cors.exposeHeaders=Cache-Control,Content-Language,Content-Type,Expires,Last-Modified,Pragma
atomio.cors.maxAge=600

Any of these properties can be overridden by redefining them with different values. atomio.cors.allowedOriginPatterns supports a list of patterns as defined here

Base URL

atomio.base.url=

This setting controls the base URL of the links in the Atom syndication format responses generated by the server. By default this setting is blank, which tells the server to derive the base URL from the request it receives. This is usually correct unless the server is behind a proxy.

Therefore if using a proxy in front of the server, this should be set to the base URL from which clients will be requesting.

Storage

atomio.artefact.storage.path=/workspace/atomio/artefacts

As mentioned above, the server uses /workspace/atomio/artefacts inside the container to store its artefacts. This parameter changes that location. However, it is more likely that the setting will be left as is and the location volume mounted into the container from some external storage.

Download URL prefix whitelist

For security reasons, Atomio needs to be provided a whitelist of URL prefixes of acceptable locations to download content from. This is to prevent someone requesting Atomio clone a feed or entry with file or internal URLs (e.g. intranet) to gain access to private/internal content. Values passed in must be valid URLs and will be used to determine if a URL begins with one of these whitelisted prefixes before downloading content. Multiple URL prefixes can be provided in a comma separated list.

atomio.client.urlWhitelist=http://foo.bar,https://another.url/some/limited/subpath

By default, Atomio does not whitelist any URLs for download, not even its own external URLs.
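The prefix check described above can be sketched in Python as follows. This is an illustration under the assumption that matching is a plain "URL begins with prefix" comparison, as the description suggests; it is not Atomio's code, and the function names are hypothetical.

```python
def parse_whitelist(value: str) -> list:
    """Split a comma separated atomio.client.urlWhitelist value into prefixes."""
    return [prefix.strip() for prefix in value.split(",") if prefix.strip()]


def is_download_allowed(url: str, whitelist: list) -> bool:
    """Allow a download only when the URL begins with a whitelisted prefix."""
    return any(url.startswith(prefix) for prefix in whitelist)


whitelist = parse_whitelist("http://foo.bar,https://another.url/some/limited/subpath")
print(is_download_allowed("https://another.url/some/limited/subpath/feed.xml", whitelist))
print(is_download_allowed("file:///etc/passwd", whitelist))
# With an empty whitelist (the default) nothing is allowed.
print(is_download_allowed("http://foo.bar/feed.xml", []))
```

If matching is a plain prefix comparison like this, a prefix without a trailing slash such as http://foo.bar also matches http://foo.bar.example.org, so ending each prefix with a slash is the safer choice.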

Download timeouts

atomio.artefact.download.connection.timeout=60000
atomio.artefact.download.read.timeout=600000

These timeouts are used when the server is downloading remote artefacts when cloning another syndication feed. The settings are quite generous because usually the duration of this process is less important than success.

Sentry

Atomio can send error diagnostic messages to Sentry if you have an account. This is a useful way to monitor the deployed application for failures and to spot repeated occurrences of common failures.

The settings are

sentry.dsn
sentry.environment
sentry.servername

Atomio will automatically set sentry.release to the version being run.