Long term:
* Add a mechanism by which the cache (which does not exist at the time of writing this note) can notify the NSAS that something has changed (an address, a new nameserver, etc.). Because the cache has access to the data and knows when it changes (it updates its own structures), it is the best place for this. It will cache even data such as the authority and additional sections. It will notify us somehow (we will need to tell it when). The changes we need to know about are: when the set of nameservers or the set of addresses for a nameserver changes, and when an NS record or a nameserver's A or AAAA record is explicitly removed from the cache.
* Optimisation: allow at most two outstanding queries on the network (but fetch everything from the cache right away). The limit can be implemented by counting the packets on the network, with a maximum of 4 (each query is 2 of them, A and AAAA); when the count drops to 2, another query can be sent.
* Add the cache cookies/contexts.
* Logging.
* Remove LRU from the nameserver entries; drop them when they are no longer referenced by any zone entry. This will remove duplicates, keep the RTTs longer and provide access to everything that exists. This is tricky, though, because we need to be thread safe. A possible solution seems to be to use weak_ptr inside the hash_table instead of shared_ptr, catch the exception inside get() (and getOrAdd()) and delete the dead pointer.
* A better way to dispatch all callbacks in a list is needed. We take them out of the list and dispatch them one by one. This is wrong because when an exception happens inside a callback, we lose the ones not dispatched yet. What should be done in this situation anyway? Put them back? Will anybody still call them? Take them out one by one? Or recommend that, if the result is really needed, destruction of a callback that has not been called yet be treated as failure? Make that the default (e.g. signal failure on destruction, or call the failure function from the destructor)?
* Give the zone entry hash table multiple LRU lists, each one covering a part of the slots. This will avoid lock contention while still staying close to the theoretical LRU behaviour (statistically, each part should be accessed about as often as the others). It might be a good idea to encapsulate the LRUs in the hash table directly (or to create a class holding both the hash table and the LRU lists).
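
The weak_ptr idea from the nameserver-entry item above could look roughly like the following. This is only a sketch under assumptions, not the NSAS code: WeakTable and this NameserverEntry are hypothetical stand-ins, std:: smart pointers and std::unordered_map are used in place of the real hash_table, and a single mutex guards the whole table.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a nameserver entry (not the real NSAS class).
struct NameserverEntry {
    explicit NameserverEntry(std::string n) : name(std::move(n)) {}
    std::string name;
    unsigned rtt_ms = 0;
};

// Hypothetical table that holds only weak references. An entry stays alive
// as long as some zone entry holds a shared_ptr to it; no LRU is involved.
class WeakTable {
public:
    // Returns the live entry, or a null pointer if it is missing or dead.
    std::shared_ptr<NameserverEntry> get(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = table_.find(key);
        if (it == table_.end()) {
            return nullptr;
        }
        try {
            // Constructing a shared_ptr from an expired weak_ptr throws.
            return std::shared_ptr<NameserverEntry>(it->second);
        } catch (const std::bad_weak_ptr&) {
            table_.erase(it);  // delete the dead pointer
            return nullptr;
        }
    }

    // Returns the live entry, creating a fresh one if none exists.
    std::shared_ptr<NameserverEntry> getOrAdd(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = table_.find(key);
        if (it != table_.end()) {
            // lock() is the non-throwing alternative to the constructor.
            if (auto live = it->second.lock()) {
                return live;
            }
            table_.erase(it);
        }
        auto fresh = std::make_shared<NameserverEntry>(key);
        table_[key] = fresh;
        return fresh;
    }

    // Number of slots currently stored, including not yet reaped dead ones.
    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return table_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::weak_ptr<NameserverEntry>> table_;
};
```

In this sketch dead slots are reaped lazily, only when the same key is looked up again, so a real implementation would probably also need some periodic sweep. get() uses the throwing constructor to mirror the note; getOrAdd() shows that lock() achieves the same without an exception.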