DNS caching problem in Shibboleth Service Provider (libcurl)

This problem was first noticed in July and August 2010, in three calls to the UK federation help desk.

Symptoms

  1. The calls came from IdP operators who reported that their IdP was not releasing attributes to some SPs. Further investigation revealed that the IdP owners had been altering DNS records for their servers as part of a development process.
  2. The SP logs record SAML query exceptions for requests to the Attribute Authority at ERROR level and, crucially, the IP address logged for the IdP does not match the actual IP address of the IdP server.
  3. Restarting shibd on the SP fixes the problem.
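
The second symptom can be checked by comparing the address recorded in the SP log with what DNS currently returns for the IdP hostname. The following is a minimal Python sketch of that comparison, not a supported tool. The function names, the hostname and the addresses are all hypothetical, and the logged address has to be copied from the SP log by hand.

```python
import ipaddress
import socket


def resolve_addresses(hostname):
    """Return the set of IP addresses DNS currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Drop any IPv6 zone suffix ("%eth0") so the strings parse cleanly later.
    return {info[4][0].split("%")[0] for info in infos}


def is_stale(logged_ip, current_ips):
    """True if the address from the SP log is not among the current DNS answers."""
    logged = ipaddress.ip_address(logged_ip)
    return all(logged != ipaddress.ip_address(ip) for ip in current_ips)


def check(hostname, logged_ip, resolver=resolve_addresses):
    """Resolve hostname now and report whether logged_ip looks out of date."""
    return is_stale(logged_ip, resolver(hostname))


# Example call (hypothetical values, performs a real DNS lookup):
#   check("idp.example.ac.uk", "192.0.2.10")
```

A result of True is consistent with the SP holding an out-of-date lookup, but it is not proof: the IdP may legitimately publish several addresses, or the log entry may predate an unrelated change.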

Reasoning

Investigation of these cases, and a search of relevant web resources including the Shibboleth mailing lists, suggests that versions of libcurl before 7.20 cache DNS lookups and do not refresh them. If the DNS is altered so that an IdP endpoint resolves to a different IP address, the SP notes a DNS discrepancy and does not connect to the IdP. The SP uses libcurl for some connections but not all, so the hostname/IP address discrepancy affects some operations and not others. Restarting the SP clears the cache and fixes the problem in that particular instance, but the problem will reappear the next time a client IdP changes its IP address without changing its server name.
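
The effect of a lookup cache that is never refreshed can be illustrated with a toy model. This Python sketch is only an illustration of the behaviour described above. It is not libcurl code and does not reflect how libcurl is implemented; the class name, hostname and addresses are invented for the example.

```python
class NonRefreshingCache:
    """Toy resolver cache whose entries never expire (a model, not libcurl code)."""

    def __init__(self, resolver):
        self._resolver = resolver  # callable: hostname -> IP address string
        self._entries = {}

    def lookup(self, hostname):
        # Only the first lookup for a hostname reaches DNS; later ones reuse the entry.
        if hostname not in self._entries:
            self._entries[hostname] = self._resolver(hostname)
        return self._entries[hostname]

    def clear(self):
        # Stands in for restarting shibd, which discards the cached lookups.
        self._entries.clear()


# Hypothetical DNS data using documentation-only names and addresses.
dns = {"idp.example.org": "192.0.2.10"}
cache = NonRefreshingCache(lambda host: dns[host])

first = cache.lookup("idp.example.org")   # answer is cached on first use
dns["idp.example.org"] = "192.0.2.20"     # the IdP operator changes the DNS record
stale = cache.lookup("idp.example.org")   # cache still returns the old address
cache.clear()                             # equivalent of restarting the SP
fresh = cache.lookup("idp.example.org")   # the new address is picked up
```

In this model the stale address persists until clear() is called, which matches the observation that only a restart of shibd restores service.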

The bug in libcurl has been reported and is fixed in version 7.20 and later (http://curl.haxx.se/changes.html). Version 7.20 was released in February 2010, so even recently installed SPs are unlikely to include it.

We do not know how many SPs are affected.

Solutions

For any IdPs currently affected, the problem can be resolved by restarting the SP's shibd service, which clears the cache. This is only a temporary fix: the problem will reappear the next time a client IdP changes its IP address without changing the server name.

The SP operator can update libcurl to at least 7.20.
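
To decide whether an update is needed, an operator can compare the installed libcurl version against 7.20. The sketch below parses a version string of the form printed by `curl --version` (which includes a `libcurl/X.Y.Z` token) and compares it with 7.20.0. The function names are hypothetical, and the check rests on an assumption: the curl command-line tool may be linked against a different libcurl from the one shibd loads, so the result should be confirmed against the library shibd actually uses.

```python
import re

FIXED_IN = (7, 20, 0)  # first libcurl release with the fix, per the change log


def parse_libcurl_version(version_text):
    """Extract (major, minor, patch) from text containing 'libcurl/X.Y.Z'."""
    match = re.search(r"libcurl/(\d+)\.(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no libcurl version found in: %r" % version_text)
    return tuple(int(part) for part in match.groups())


def needs_update(version_text):
    """True if the reported libcurl version is older than 7.20.0."""
    return parse_libcurl_version(version_text) < FIXED_IN


# Example with a sample string (an illustration, not output from a real host):
#   needs_update("curl 7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0")
```

Versions are compared as integer tuples rather than as strings, so that 7.9.8 is correctly treated as older than 7.20.0.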

This problem can be avoided at the IdP end by not using the DNS to change the IP address of the IdP server. If the IdP needs to be moved to a different physical server then it is preferable to allocate the IP address of the original server to the new server.

References

curl developer blog

curl change log

'shibboleth-users' discussion on DNS Caching Duration