123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366 |
- Data Source Library Classes
- ===========================
- About this document
- -------------------
- This memo describes major classes used in the data source library,
- mainly focusing on handling in-memory cache with consideration of the
- shared memory support. It will give an overview of the entire design
- architecture and some specific details of how these classes are expected
- to be used.
- Before reading, the higher level inter-module protocol should be understood:
- http://bind10.isc.org/wiki/SharedMemoryIPC
- Overall relationships between classes
- -------------------------------------
- The following diagram shows major classes in the data source library
- related to in-memory caches and their relationship.
- image::overview.png[Class diagram showing overview of relationships]
- Major design decisions of this architecture are:
- * Keep each class as concise as possible, each focusing on one or
- small set of responsibilities. Smaller classes are generally easier
- to understand (at the cost of understanding how they work in the
- "big picture" of course) and easier to test.
- * On a related point, minimize dependency to any single class. A
- monolithic class on which many others are dependent is generally
- difficult to maintain because you'll need to ensure a change to the
- monolithic class doesn't break anything on any other classes.
- * Use polymorphism for any "fluid" behavior, and hide specific details
- under abstract interfaces so implementation details won't be
- directly referenced from any other part of the library.
- Specifically, the underlying memory segment type (local, mapped, and
- possibly others) and the source of in-memory data (master file or
- other data source) are hidden via a kind of polymorphism.
- * Separate classes directly used by applications from classes that
- implement details. Make the former classes as generic as possible,
- agnostic about implementation specific details such as the memory
- segment type (or, ideally and where possible, whether it's for
- in-memory cache or the underlying data source).
- The following give a summarized description of these classes.
- * `ConfigurableClientList`: The front end to application classes. An
- application that uses the data source library generally maintains
- one or more `ConfigurableClientList` object (usually one per RR
- class, or when we support views, probably one per view). This class
- is a container of sets of data source related classes, providing
- accessor to these classes and also acting as a factory of other
- related class objects. Note: Due to internal implementation
- reasons, there is a base class for `ConfigurableClientList` named
- `ClientList` in the C++ version, and applications are expected to
- use the latter. But conceptually `ConfigurableClientList` is an
- independent value class; the inheritance is not for polymorphism.
- Note also that the Python version doesn't have the base class.
- * `DataSourceInfo`: this is a straightforward tuple of set of class
- objects corresponding to a single data source, including
- `DataSourceClient`, `CacheConfig`, and `ZoneTableSegment`.
- `ConfigurableClientList` maintains a list of `DataSourceInfo`, one
- for each data source specified in its configuration.
- * `DataSourceClient`: The front end class to applications for a single
- data source. Applications will get a specific `DataSourceClient`
- object by `ConfigurableClientList::find()`.
- `DataSourceClient` itself is a set of factories for various
- operations on the data source such as lookup or update.
- * `CacheConfig`: library internal representation of in-memory cache
- configuration for a data source. It knows which zones are to be
- cached and where the zone data (RRs) should come from, either from a
- master file or other data source. With this knowledge it will
- create an appropriate `LoadAction` object. Note that `CacheConfig`
- isn't aware of the underlying memory segment type for the in-memory
- data. It's intentionally separated from this class (see the
- conciseness and minimal-dependency design decisions above).
- * `ZoneTableSegment`: when in-memory cache is enabled, it provides
- memory-segment-type independent interface to the in-memory data.
- This is an abstract base class (see polymorphism in the design
- decisions) and inherited by segment-type specific subclasses:
- `ZoneTableSegmentLocal` and `ZoneTableSegmentMapped` (and possibly
- others). Any subclass of `ZoneTableSegment` is expected to maintain
- the specific type of `MemorySegment` object.
- * `ZoneWriter`: a frontend utility class for applications to update
- in-memory zone data (currently it can only load a whole zone and
- replace any existing zone content with a new one, but this should be
- extended so it can handle partial updates).
- Applications will get a specific `ZoneWriter`
- object by `ConfigurableClientList::getCachedZoneWriter()`.
- `ZoneWriter` is constructed with `ZoneableSegment` and `LoadAction`.
- Since these are abstract classes, `ZoneWriter` doesn't have to be
- aware of "fluid" details. It's only responsible for "somehow" preparing
- `ZoneData` for a new version of a specified zone using `LoadAction`,
- and installing it in the `ZoneTable` (which can be accessed via
- `ZoneTableSegment`).
- * `DataSourceStatus`: created by `ConfigurableClientList::getStatus()`,
- a straightforward tuple that represents some status information of a
- specific data source managed in the `ConfigurableClientList`.
- `getStatus()` generates `DataSourceStatus` for all data sources
- managed in it, and returns them as a vector.
- * `ZoneTableAccessor`, `ZoneTableIterator`: frontend classes to get
- access to the conceptual "zone table" (a set of zones) stored in a
- specific data source. In particular, `ZoneTableIterator` allows
- applications to iterate over all zones (by name) stored in the
- specific data source.
- Applications will get a specific `ZoneTableAccessor`
- object by `ConfigurableClientList::getZoneTableAccessor()`,
- and get an iterator object by calling `getIterator` on the accessor.
- These are abstract classes and provide unified interfaces
- independent from whether it's for in-memory cached zones or "real"
- underlying data source. But the initial implementation only
- provides the in-memory cache version of subclass (see the next
- item).
- * `ZoneTableAccessorCache`, `ZoneTableIteratorCache`: implementation
- classes of `ZoneTableAccessor` and `ZoneTableIterator` for in-memory
- cache. They refer to `CacheConfig` to get a list of zones to be
- cached.
- * `ZoneTableHeader`, `ZoneTable`: top-level interface to actual
- in-memory data. These were separated based on a prior version of
- the design (http://bind10.isc.org/wiki/ScalableZoneLoadDesign) where
- `ZoneTableHeader` may contain multiple `ZoneTable`s. It's
- one-to-one relationship in the latest version (of implementation),
- so we could probably unify them as a cleanup.
- * `ZoneData`: representing the in-memory content of a single zone.
- `ZoneTable` contains (zero, one or) multiple `ZoneData` objects.
- * `RdataSet`: representing the in-memory content of (data of) a single
- RRset.
- `ZoneData` contains `RdataSet`s corresponding to the RRsets stored
- in the zone.
- * `LoadAction`: a "polymorphic" functor that implements loading zone
- data into memory. It hides from its user (i.e., `ZoneWriter`)
- details about the source of the data: master file or other data
- source (and perhaps some others). The "polymorphism" is actually
- realized as different implementations of the functor interface, not
- class inheritance (but conceptually the effect and goal is the
- same). Note: there's a proposal to replace `LoadAction` with
- a revised `ZoneDataLoader`, although the overall concept doesn't
- change. See Trac ticket #2912.
- * `ZoneDataLoader` and `ZoneDataUpdater`: helper classes for the
- `LoadAction` functor(s). These work independently from the source
- of data, taking a sequence of RRsets objects, converting them
- into the in-memory data structures (`RdataSet`), and installing them
- into a newly created `ZoneData` object.
- Sequence for auth module using local memory segment
- ---------------------------------------------------
- In the remaining sections, we explain how the classes shown in the
- previous section work together through their methods for commonly
- intended operations.
- The following sequence diagram shows the case for the authoritative
- DNS server module to maintain "local" in-memory data. Note that
- "auth" is a conceptual "class" (not actually implemented as a C++
- class) to represent the server application behavior. For the purpose
- of this document that should be sufficient. The same note applies to
- all examples below.
- image::auth-local.png[Sequence diagram for auth server using local memory segment]
- 1. On startup, the auth module creates a `ConfigurableClientList`
- for each RR class specified in the configuration for "data_sources"
- module. It then calls `ConfigurableClientList::configure()`
- for the given configuration of that RR class.
- 2. For each data source, `ConfigurableClientList` creates a
- `CacheConfig` object with the corresponding cache related
- configuration.
- 3. If in-memory cache is enabled for the data source,
- `ZoneTableSegment` is also created. In this scenario the cache
- type is specified as "local" in the configuration, so a functor
- creates `ZoneTableSegmentLocal` as the actual instance.
- In this case its `ZoneTable` is immediately created, too.
- 4. `ConfigurableClientList` checks if the created `ZoneTableSegment` is
- writable. It is always so for "local" type of segments. So
- `ConfigurableClientList` immediately loads zones to be cached into
- memory. For each such zone, it first gets the appropriate
- `LoadAction` through `CacheConfig`, then creates `ZoneWriter` with
- the `LoadAction`, and loads the data using the writer.
- 5. If the auth module receives a "reload" command for a cached zone
- from other module (xfrin, an end user, etc), it calls
- `ConfigurableClientList::getCachedZoneWriter` to load and install
- the new version of the zone. The same loading sequence takes place
- except that the user of the writer is the auth module.
- Also, the old version of the zone data is destroyed at the end of
- the process.
- Sequence for auth module using mapped memory segment
- ----------------------------------------------------
- This is an example for the authoritative server module that uses
- mapped type memory segment for in-memory data.
- image::auth-mapped.png[Sequence diagram for auth server using mapped memory segment]
- 1. The sequence is the same to the point of creating `CacheConfig`.
- 2. But in this case a `ZoneTableSegmentMapped` object is created based
- on the configuration of the cache type. This type of
- `ZoneTableSegment` is initially empty and isn't even associated
- with a `MemorySegment` (and therefore considered non-writable).
- 3. `ConfigurableClientList` checks if the zone table segment is
- writable to know whether to load zones into memory by itself,
- but as `ZoneTableSegment::isWritable()` returns false, it skips
- the loading.
- 4. The auth module gets the status of each data source, and notices
- there's a `WAITING` state of segment. So it subscribes to the
- "Memmgr" group on a command session and waits for an update
- from the memory manager (memmgr) module. (See also the note at the
- end of the section)
- 5. When the auth module receives an update command from memmgr, it
- calls `ConfigurableClientList::resetMemorySegment()` with the command
- argument and the segment mode of `READ_ONLY`.
- Note that the auth module handles the command argument as mostly
- opaque data; it's not expected to deal with details of segment
- type-specific behavior. If the reset fails, auth aborts (as there's
- no clear way to handle the failure).
- 6. `ConfigurableClientList::resetMemorySegment()` subsequently calls
- `reset()` method on the corresponding `ZoneTableSegment` with the
- given parameters.
- In the case of `ZoneTableSegmentMapped`, it creates a new
- `MemorySegment` object for the mapped type, which internally maps
- the specific file into memory.
- memmgr is expected to have prepared all necessary data in the file,
- so all the data are immediately ready for use (i.e., there
- shouldn't be any explicit load operation).
- 7. When a change is made in the mapped data, memmgr will send another
- update command with parameters for new mapping. The auth module
- calls `ConfigurableClientList::resetMemorySegment()`, and the
- underlying memory segment is swapped with a new one. The old
- memory segment object is destroyed. Note that
- this "destroy" just means unmapping the memory region; the data
- stored in the file are intact. Again, if mapping fails, auth
- aborts.
- 8. If the auth module happens to receive a reload command from other
- module, it could call
- `ConfigurableClientList::getCachedZoneWriter()`
- to reload the data by itself, just like in the previous section.
- In this case, however, the writability check of
- `getCachedZoneWriter()` fails (the segment was created as
- `READ_ONLY` and is non-writable), so loading won't happen.
- NOTE: While less likely in practice, it's possible that the same auth
- module uses both "local" and "mapped" (and even others) type of
- segments for different data sources. In such cases the sequence is
- either the one in this or previous section depending on the specified
- segment type in the configuration. The auth module itself isn't aware
- of per segment-type details, but changes the behavior depending on the
- segment state of each data source at step 4 above: if it's `WAITING`,
- it means the auth module needs help from memmgr (that's all the auth
- module should know; it shouldn't be bothered with further details such
- as mapped file names); if it's something else, the auth module doesn't
- have to do anything further.
- Sequence for memmgr module initialization using mapped memory segment
- ---------------------------------------------------------------------
- This sequence shows the common initialization sequence for the
- memory manager (memmgr) module using a mapped type memory segment.
- This is a mixture of the sequences shown in Sections 2 and 3.
- image::memmgr-mapped-init.png[]
- 1. Initial sequence is the same until the application module (memmgr)
- calls `ConfigurableClientList::getStatus()` as that for the
- previous section.
- 2. The memmgr module identifies the data sources whose in-memory cache
- type is "mapped". (Unlike other application modules, the memmgr
- should know what such types means due to its exact responsibility).
- For each such data source, it calls
- `ConfigurableClientList::resetMemorySegment` with the READ_WRITE
- mode and other mapped-type specific parameters. memmgr should be
- able to generate the parameters from its own configuration and
- other data source specific information (such as the RR class and
- data source name).
- 3. The `ConfigurableClientList` class calls
- `ZoneTableSegment::reset()` on the corresponding zone table
- segment with the given parameters. In this case, since the mode is
- READ_WRITE, a new `ZoneTable` will be created (assuming this is a
- very first time initialization; if there's already a zone table
- in the segment, it will be used).
- 4. The memmgr module then calls
- `ConfigurableClientList::getZoneTableAccessor()`, and calls the
- `getIterator()` method on it to get a list of zones for which
- zone data are to be loaded into the memory segment.
- 5. The memmgr module loads the zone data for each such zone. This
- sequence is the same as shown in Section 2.
- 6. On loading all zone data, the memmgr module sends an update command
- to all interested modules (such as auth) in the segment, and waits
- for acknowledgment from all of them.
- 7. Then it calls `ConfigurableClientList::resetMemorySegment()` for
- this data source with almost the same parameter as step 2 above,
- but with a different mapped file name. This will make a swap of
- the underlying memory segment with a new mapping. The old
- `MemorySegment` object will be destroyed, but as explained in the
- previous section, it simply means unmapping the file.
- 8. The memmgr loads the zone data into the newly mapped memory region
- by repeating the sequence shown in step 5.
- 9. The memmgr repeats all this sequence for data sources that use
- "mapped" segment for in-memory cache. Note: it could handle
- multiple data sources in parallel, e.g., while waiting for
- acknowledgment from other modules.
- Sequence for memmgr module to reload a zone using mapped memory segment
- -----------------------------------------------------------------------
- This example is a continuation of the previous section, describing how
- the memory manager reloads a zone in mapped memory segment.
- image::memmgr-mapped-reload.png[]
- 1. When the memmgr module receives a reload command from other module,
- it calls `ConfigurableClientList::getCachedZoneWriter()` for the
- specified zone name. This method checks the writability of
- the segment, and since it's writable (as memmgr created it in the
- READ_WRITE mode), `getCachedZoneWriter()` succeeds and returns
- a `ZoneWriter`.
- 2. The memmgr module uses the writer to load the new version of zone
- data. There is nothing specific to mapped-type segment here.
- 3. The memmgr module then sends an update command to other modules
- that would share this version, and waits for acknowledgment from
- all of them.
- 4. On getting acknowledgments, the memmgr module calls
- `ConfigurableClientList::resetMemorySegment()` with the parameter
- specifying the other mapped file. This will swap the underlying
- `MemorySegment` with a newly created one, mapping the other file.
- 5. The memmgr updates this segment, too, so the two files will contain
- the same version of data.
|