MDQ (Metadata Query) in the UK federation

The UK federation metadata aggregate has grown large enough that some metadata consumers are running into scaling problems. We have therefore developed an alternative approach in which an entity requests metadata for individual entities as and when required, instead of periodically downloading a single aggregate file that contains metadata for every entity the federation publishes. This is our MDQ (Metadata Query) service.

Problem statement

An entity (IdP or SP) can be described by SAML metadata, which contains all the technical details required for other entities to interoperate with it, for example certificates, endpoints, security algorithms, and logos. An IdP and SP wishing to interoperate will, in the simplest case, exchange metadata bilaterally. In the UK's research and education sector, the UK federation has traditionally collated metadata from each of the entities in the federation and published that metadata in a single metadata aggregate.

However, as the UK federation membership has grown, and our reach has become global through the eduGAIN metadata exchange, our aggregate file now contains a few thousand entities and is over 60 MB in size (as of August 2021). Several problems are becoming apparent:

  • The memory requirement for entities consuming our metadata aggregate is increasing, and out-of-memory errors are difficult to diagnose.
  • A new aggregate must be generated if any entity updates its metadata. Entities download the new aggregate even if they do not interoperate with any of the entities that have changed.
  • The federation operator has significant network load (60 MB file x 2000 entities downloading it = 120 GB network traffic per updated aggregate).
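
The arithmetic in the last bullet can be checked with a short Python sketch. The 60 MB file size and 2000 downloading entities are the figures quoted above; the use of decimal units (1 GB = 1000 MB) is an assumption.

```python
# Rough network load per published aggregate, using the figures quoted above.
# Assumption: decimal units, i.e. 1 GB = 1000 MB.
AGGREGATE_SIZE_MB = 60       # size of one metadata aggregate
DOWNLOADING_ENTITIES = 2000  # entities that each fetch the new aggregate


def traffic_gb(size_mb: float, downloads: int) -> float:
    """Total traffic in GB when `downloads` clients each fetch `size_mb` MB."""
    return size_mb * downloads / 1000


print(traffic_gb(AGGREGATE_SIZE_MB, DOWNLOADING_ENTITIES))  # 120.0
```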

Mechanisms are already in place that partially alleviate some of these problems: for example, we publish updated metadata once per day, and we support downloading of compressed metadata. However, these are at best temporary measures.

What is MDQ?

MDQ is a mechanism that allows entities to request only the metadata they need, as they need it.

Traditionally, your entity has been configured to regularly download and verify the whole metadata aggregate. In the MDQ model, you configure your entity to request metadata from the UK federation MDQ server using the MDQ protocol.

Benefits and risks

The major benefit to deployers is a much lower memory footprint for your entity. We do not have definitive figures, although we have anecdotal evidence that a Shibboleth IdP can run with a Java heap size of 500 MB, compared to the current recommendation of a minimum heap size of 1.5 GB when consuming large metadata aggregates like the UK federation metadata.

Another benefit of the MDQ protocol is that metadata for an entity is available at a specific URL. Entities that primarily do bilateral SAML metadata exchanges often request a URL for metadata, and you can use the MDQ server as a stable base URL. You can also use the URL to pre-fetch metadata for entities with which you cannot risk any loss of service.
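
To illustrate what "a specific URL" looks like, the Python sketch below builds per-entity request URLs. It assumes the request layout described in the MDQ protocol specification (the base URL, then entities/, then either the percent-encoded entityID or a {sha1} hash of it). The helper names and the example entityID are hypothetical, and the URL forms that the UK federation server accepts should be confirmed against its own documentation.

```python
import hashlib
from urllib.parse import quote

# Base URL as given in the configuration examples in this document.
MDQ_BASE_URL = "http://mdq.ukfederation.org.uk/"


def mdq_entity_url(entity_id: str, base_url: str = MDQ_BASE_URL) -> str:
    """Return the URL for one entity, using the percent-encoded entityID."""
    # safe="" makes quote() encode "/" and ":" as well.
    return base_url.rstrip("/") + "/entities/" + quote(entity_id, safe="")


def mdq_entity_url_sha1(entity_id: str, base_url: str = MDQ_BASE_URL) -> str:
    """Return the URL for one entity, using the {sha1} identifier form."""
    digest = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
    return base_url.rstrip("/") + "/entities/" + quote("{sha1}" + digest, safe="")


# Hypothetical entityID, for illustration only.
print(mdq_entity_url("https://idp.example.org/idp/shibboleth"))
# http://mdq.ukfederation.org.uk/entities/https%3A%2F%2Fidp.example.org%2Fidp%2Fshibboleth
```

A URL built this way could be fetched ahead of time to pre-fetch metadata for the partners you depend on most.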

The major risk is that you move from a situation where your entity downloads metadata in the background, which allows retries if the metadata server is unavailable, to a situation where you query metadata just-in-time. To mitigate this risk, the UK federation has deployed an MDQ server with several mechanisms for resilience. However, we cannot guarantee delivery.

Another risk is that, by using an MDQ service, you indicate to the provider of that service which entities you interoperate with.

A risk specific to IdP operators is that your existing attribute release policies may implicitly reference a metadata aggregate, so we recommend that you review these before moving to MDQ.

In particular, a Shibboleth IdP which has a PolicyRequirementRule that includes a type of InEntityGroup (v3.2.0 and later), saml:AttributeRequesterInEntityGroup or saml:InEntityGroup will implicitly reference a metadata aggregate with an EntitiesDescriptor container. The UK federation's MDQ service does not use this concept, sending only a single EntityDescriptor for each query, so the effect of using one of those types of PolicyRequirementRules is that attributes referenced in that rule will not be released.

Please talk to the UK federation support team if you want to discuss how to re-write your attribute release policies.

The InCommon per-entity Metadata Working Group final report details much more about the operational aspects of an MDQ server.

New signing key

Metadata from the MDQ server is signed using a different key to our metadata aggregates. The certificate, along with other information, is available from http://mdq.ukfederation.org.uk/. When you download the certificate, please remember to check its fingerprint by phoning the UK federation helpdesk.
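
To have a fingerprint ready to compare during that check, you can compute one locally. The following Python sketch uses only the standard library. The function names are hypothetical, and SHA-256 is an assumption, so use whichever digest the helpdesk quotes.

```python
import hashlib
import ssl


def cert_fingerprint(pem_text: str, algorithm: str = "sha256") -> str:
    """Return a colon-separated, upper-case fingerprint of a PEM certificate."""
    # The fingerprint is a digest of the DER encoding, not of the PEM text.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.new(algorithm, der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_file(path: str, algorithm: str = "sha256") -> str:
    """Read a PEM certificate from `path` and return its fingerprint."""
    with open(path, encoding="ascii") as handle:
        return cert_fingerprint(handle.read(), algorithm)
```

For example, fingerprint_file("ukfederation-mdq.pem") would return the fingerprint of a certificate saved under the file name used in the configuration examples in this document.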

Configuration example for Shibboleth IdP

Version 3 of the Shibboleth IdP supports the MDQ protocol. Here is an example of how to configure your IdP to use the UK federation MDQ server:

    <!-- UK federation MDQ service -->
    <MetadataProvider id="ukfMDQ" xsi:type="DynamicHTTPMetadataProvider">
        <!-- Verify the signature on the root element (i.e., the EntityDescriptor element) -->
        <MetadataFilter xsi:type="SignatureValidation" requireSignedRoot="true"
                certificateFile="%{idp.home}/credentials/ukfederation-mdq.pem" />

        <!-- Require a validUntil XML attribute no more than 30 days into the future -->
        <MetadataFilter xsi:type="RequiredValidUntil" maxValidityInterval="P30D" />

        <!-- The MetadataQueryProtocol element specifies the base URL for the query protocol -->
        <MetadataQueryProtocol>http://mdq.ukfederation.org.uk/</MetadataQueryProtocol>
    </MetadataProvider>

Please note that the requireSignedRoot attribute on the SignatureValidation filter was added in v3.2.0. You cannot use this configuration with an older version of the IdP because the MetadataResolverService will fail to load. If this affects you, we recommend that you upgrade your IdP to a supported version. Do not remove the SignatureValidation filter, or you will open your IdP to MITM (man-in-the-middle) attacks. If you are unable to upgrade to a supported version of the IdP, you can use the requireSignedMetadata attribute instead, although it is deprecated and will be removed at some point.

Also note that if you are an IdP operator whose attribute release policies previously depended on a metadata aggregate tagged with groupID="http://ukfederation.org.uk", or the equivalent for metadata aggregate files from other federations (see Benefits and risks above), you may find the method for tagging per-entity metadata described in the Internet2 Metadata Distribution Service documentation useful.

Configuration example for Shibboleth SP

This example configures a Shibboleth V3 SP to use the UK federation MDQ service for all entities. The SP will query the MDQ server when it needs metadata for a specific IdP, and will cache the result. Reference documentation for the MDQ plugin is available on the Shibboleth wiki.

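        <!-- UK federation MDQ service -->
        <!-- ignoreTransport: trust rests on the metadata signature, not on the transport -->
        <MetadataProvider type="MDQ" id="ukf-mdq" ignoreTransport="true" cacheDirectory="ukf-mdq"
            baseUrl="http://mdq.ukfederation.org.uk/">
            <!-- Verify the signature using the MDQ signing certificate -->
            <MetadataFilter type="Signature" certificate="ukfederation-mdq.pem"/>
            <!-- Require a validUntil XML attribute; maxValidityInterval is in seconds -->
            <MetadataFilter type="RequireValidUntil" maxValidityInterval="8640000"/>
        </MetadataProvider>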
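
The two examples express maxValidityInterval differently: the IdP filter takes an ISO 8601 duration (P30D), while the SP filter, as far as the Shibboleth SP documentation describes it, takes a number of seconds. The Python sketch below converts between the two; the function names are hypothetical. It also shows that the SP value above corresponds to 100 days rather than 30. Whether that difference is intentional is not stated here.

```python
from datetime import timedelta


def days_to_seconds(days: int) -> int:
    """Seconds in a whole number of days, for the SP's maxValidityInterval."""
    return int(timedelta(days=days).total_seconds())


def seconds_to_days(seconds: int) -> float:
    """Days represented by an SP maxValidityInterval value."""
    return seconds / timedelta(days=1).total_seconds()


print(days_to_seconds(30))       # 2592000, the equivalent of the IdP's P30D
print(seconds_to_days(8640000))  # 100.0, the value used in the SP example
```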

The V2 SP is capable of using an MDQ server, although it does not have the type="MDQ" option available. You can find V2-specific configuration instructions on the Shibboleth wiki, although as V2 has been declared End-of-Life, our advice is to upgrade to a supported version before making configuration changes.

Configuration for non-Shibboleth software

Please contact your software vendor.