This directory holds the first release of the logging system.


Basic Ideas
===========
The BIND-10 logging system merges two ideas:

* A hierarchical logging system similar to that used in Java (i.e. log4j)
* Separation of message use from message text


Hierarchical Logging System
===========================
When a program writes a message to the logging system, it does so using an
instance of the Logger class.  As well as performing the write of the message,
the logger identifies the source of the message: different sources can write
to different destinations and can log different severities of messages.
For example, the "cache" logger could write messages of DEBUG severity or
above to a file while all other components write messages of "INFO" severity
or above to the Syslog file.

The loggers are hierarchical in that each logger is the child of another
logger.  The top of the hierarchy is the root logger, which does not have
a parent.  The point of the hierarchy is that unless a logger is explicitly
assigned an attribute (such as severity of message being logger), it picks
it up from the parent.  (In BIND-10, there is the root logger (named after
the program) and every other logger is a child of that.)  So in the example
above, the INFO/Syslog attributes could be associated with the root logger
while the DEBUG/file attributes are associated with the "cache" logger.


Separation of Messages Use from Message Text
============================================
By separating the use of the message from the text associated with this -
in essence, defining message text in an external file - it is possible to
replace the supplied text of the messages with a local language version.

Each message is identified by an identifier e.g. "LOG_WRITE_ERROR".
Within the program, this is the symbol passed to the logging system.
The logger system uses the symbol as an index into a dictionary to
retrieve the message associated with it (e.g. "unable to open %s for
input").  It then substitutes any message parameters (in this example,
the name of the file where the write operation failed) and logs it to
the destination.

In BIND-10, a the default text for each message is linked into the
program.  Each program is able to read a locally-defined message file
when it starts, updating the stored definitions with site-specific text.
When the message is logged, the updated text is output. However, the
message identifier is always included in the output so that the origin
of the message can be identified even if the text has been changed.


Using The System
================
The steps in using the system are:

1. Create a message file.  This defines messages by an identification - a
   mnemonic for the message, the convention being that these are a few
   words separated by underscores - and text that explains the message in
   more detail.  The file is described in more detail below.

   Ideally the file should have a file type of ".mes".

2. Run it through the message compiler to produce the .h and .cc files.  This
   step should be included in the build process. (For an example of how to
   do this, see src/lib/nsas/Makefile.am.) During development though, the
   message compiler (found in the "src/lib/log/compiler" directory) will need
   to be run manually.  The only argument is the name of the message file: it
   will produce as output two files in the default directory, having the same
   name as the input file but with file types of ".h" and ".cc".

3. Include the .h file in your source code to define message symbols, and
   make sure that the .cc file is compiled and linked into your program -
   static initialization will add the symbols to the global dictionary.

4. Declare loggers in your code and use them to log messages.  This is
   described in more detail below.

5. To set the debug level and run-time message file, call initLogger (declared
   in logger_support.h) in the main program unit.


Message Files
=============

File Contents and Format
------------------------
A message file is a file containing message definitions.  Typically there
will be one message file for each component that declares message symbols.
An example file could be:

-- BEGIN --

# Example message file

$NAMESPACE isc::log

% LOG_UNRECOGNISED_DIRECTIVE line %1: unrecognised directive '%2'
A line starting with a dollar symbol was found, but the first word on the line
(shown in the message) was not a recognised message compiler directive.

% LOG_WRITE_ERROR error writing to %1: %2
The specified error was encountered by the message compiler when writing to
the named output file.

-- END --

Points to note:
* Leading and trailing space are trimmed from the line.  Although the above
  example has every line starting at column 1, the lines could be indented
  if desired.

* Blank lines are ignored.

* Lines starting with "#" are comments are are ignored.  Comments must be on
  a line by themselves - inline comments will be interpreted as part of the
  text of the line.

* Lines starting $ are directives.  At present, just one directive is
  recognised:

  * $NAMESPACE, which has one argument: the namespace in which the symbols are
    created.  In the absence of a $NAMESPACE directive, symbols will be put in
    the anonymous namespace.

* Message lines.  These start with a "%" and are followed by the message
  identification and the message text, the latter including zero or more
  replacement tokens, e.g.

     % LOG_WRITE_ERROR error writing to %1: %2

  * There may be zero or more spaces between the leading "%" and the message
    identification (which, in the example above, is the string
    "LOG_WRITE_ERROR").

  * The message identification can be any string of letters, digits and
    underscores, but should not start with a digit.  The convention adopted
    in BIND 10 is for the first component (before the first underscore) to be
    a string indicating the origin of the message, and the remainder to
    describe the message.  So in the example above, the LOG_ indicates that
    the error originated from the logging library and the "WRITE_ERROR"
    indicates that there was a problem in a write operation.

  * The rest of the line - from the first non-space character to the
    last non- space character - is taken exactly for the text
    of the message. There are no restrictions on what characters may
    be in this text, other than they be printable.  (This means that
    both single-quote (') and double-quote (") characters are allowed.)
    The message text may include replacement tokens (the strings "%1",
    "%2" etc.).  When a message is logged, these are replaced with the
    arguments passed to the logging call: %1 refers to the first argument,
    %2 to the second etc.  Within the message text, the placeholders
    can appear in any order and placeholders can be repeated. Otherwise,
    the message is printed unmodified.

* Remaining lines indicate an explanation for the preceding message. These
  are intended to be processed by a separate program and used to generate
  an error messages manual.  They are ignored by the message compiler.

Message Compiler
----------------
The message compiler is a program built in the src/log/compiler directory.
It is invoked by the command:

    message [-h] [-v] -p] <message-file>

("-v" prints the version number and exits; "-h" prints brief help text.)  The
compiler produces source files for C++ and Python.

C++ Files
---------
Without the "-p" option, the message compiler processes the message file
to produce two files:

1) A C++ header file (called <message-file-name>.h) that holds lines of
the form:

   namespace <namespace> {
   extern const isc::log::MessageID LOG_WRITE_ERROR;
      :
   }

The symbols define the keys in the global message dictionary, with the
namespace enclosing the symbols set by the $NAMESPACE directive.

(This is the reason for the restriction on message identifiers - they
have to be valid C++ symbol names.)

2) A C++ source file (called <message-file-name>.cc) that holds the definitions
of the global symbols and code to insert the symbols and messages into the map.

Symbols are defined to be equal to strings holding the identifier, e.g.

   extern const isc::log::MessageID LOG_WRITE_ERROR = "LOG_WRITE_ERROR";

(The implementation allows symbols to be compared.  However, use of strings
should not be assumed - a future implementation may change this.)

In addition, the file declares an array of identifiers/messages and an object
to add them to the global dictionary:

    namespace {
    const char* values[] = {
        identifier1, text1,
        identifier2, text2,
        :
        NULL
    };

    const isc::log::MessageInitializer initializer(values);
    }

The constructor of the MessageInitializer object retrieves the singleton
global Dictionary object (created using standard methods to avoid the
"static initialization fiasco") and adds each identifier and text to it.
A check is made as each is added; if the identifier already exists, it is
added to "overflow" vector; the vector is printed to the main logging output
when logging is finally enabled (to indicate a programming error).

Python Files
------------
If the "-p" option is given, the compiler produces a Python module defining
the messages.  The format of this is:

import isc.log
    :
LOG_WRITE_ERROR = isc.log.create_message("LOG_WRITE_ERROR",
        "error writing to %1 : %2")

(The definition is output on one line - it is split across two lines in this
document for readability.)

The module can be imported into other Python code, and messages logged
in a similar way to C++ using the Python logging library.

Using the Logging - C++
=======================
1. Build message header file and source file as describe above.

2. The main program unit must include a call to isc::log::initLogger()
   (described in more detail below) to set the logging severity, debug log
   level, and external message file:

   a) The logging severity is one of the enum defined in logger.h, i.e.

        isc::log::DEBUG
        isc::log::INFO
        isc::log::WARN
        isc::log::ERROR
        isc::log::FATAL
        isc::log::NONE

   b) The debug log level is only interpreted when the severity is
      DEBUG and is an integer ranging from 0 to 99.  0 should be used
      for the highest-level debug messages and 99 for the lowest-level
      (and typically more verbose) messages.

   c) The external message file.  If present, this is the same as a
      standard message file, although it should not include any
      directives. (A single directive of a particular type will be
      ignored; multiple directives will cause the read of the file to
      fail with an error.)

   The settings remain in effect until the logging configuration is read,
   and so provide the default logging during program initialization.

3. Declare a logger through which the message will be logged.

       isc::log::Logger logger("name");

   The string passed to the constructor is the name of the logger (it
   can be any string) and is used when configuring it.  Loggers with
   the same name share the same configuration.

4. Issue logging calls using supplied macros in "log/macros.h", e.g.

       LOG_ERROR(logger, LOG_WRITE_ERROR).arg("output.txt");

   (The macros are more efficient that calls to the methods on the logger
   class: they avoid the overhead of evaluating the parameters to arg()
   if the settings are such that the message is not going to be output.)

Using the Logging - Python
==========================
1. Build message module as describe above.

2. The main program unit must include a call to isc.log.init()
   (described in more detail below) to set the to set the logging
   severity, debug log level, and external message file:

   a) The logging severity is one of the strings:

        DEBUG
        INFO
        WARN
        ERROR
        FATAL
        NONE

   b) The debug log level is only interpreted when the severity is
      DEBUG and is an integer ranging from 0 to 99.  0 should be used
      for the highest-level debug messages and 99 for the lowest-level
      (and typically more verbose) messages.

   c) The external message file.  If present, this is the same as a
      standard message file, although it should not include any
      directives. (Any that are there will be ignored.)

   The settings remain in effect until the logging configuration is read,
   and so provide the default logging during program initialization.

3. Declare a logger through which the message will be logged.

       isc.log.Logger logger("name")

   The string passed to the constructor is the name of the logger (it
   can be any string) and is used when configuring it.  Loggers with
   the same name share the same configuration.

4. Issue calls to the logging methods:

       logger.error(LOG_WRITE_ERROR, "output.txt");

Logging Initialization
======================
In all cases, if an attempt is made to use a logging method before the logging
has been initialized, the program will terminate with a LoggingNotInitialized
exception.

C++
---
Logging Initialization is carried out by calling initLogger().  There are two
variants to the call, one for use by production programs and one for use by
unit tests.

Variant #1, Used by Production Programs
---------------------------------------
void isc::log::initLogger(const std::string& root,
                          isc::log::Severity severity = isc::log::INFO,
                          int dbglevel = 0, const char* file = NULL,
                          bool buffer = false);

This is the call that should be used by production programs:

root
Name of the program (e.g. "b10-auth").  This is also the name of the root
logger and is used when configuring logging.

severity
Default severity that the program will start logging with.  Although this may
be overridden when the program obtains its configuration from the configuration
database, this is the severity that it used until then.  (This may be set by
a command-line parameter.)

dbglevel
The debug level used if "severity" is set to isc::log::DEBUG.

file
The name of a local message file.  This will be read and its definitions used
to replace the compiled-in text of the messages.

buffer
If set to true, initial log messages will be internally buffered, until the
first time a logger specification is processed. This way the program can
use logging before even processing its logging configuration. As soon as any
specification is processed (even an empty one), the buffered log messages will
be flushed according to the specification. Note that if this option is used,
the program SHOULD call one of the LoggerManager's process() calls (if you
are using the built-in logging configuration handling in ModuleCCSession,
this is automatically handled). If the program exits before this is done,
all log messages are dumped in a raw format to stdout (so that no messages
get lost).

Variant #2, Used by Unit Tests
------------------------------
    void isc::log::initLogger()

This is the call that should be used by unit tests.  In this variant, all the
options are supplied by environment variables. (It should not be used for
production programs to avoid the chance that the program operation is affected
by inadvertently-defined environment variables.)

The environment variables are:

B10_LOGGER_ROOT
Sets the "root" for the unit test.  If not defined, the name "bind10" is used.

B10_LOGGER_SEVERITY
The severity to set for the root logger in the unit test.  Valid values are
"DEBUG", "INFO", "WARN", "ERROR", "FATAL" and "NONE".  If not defined, "INFO"
is used.

B10_LOGGER_DBGLEVEL
If B10_LOGGER_SEVERITY is set to "DEBUG", the debug level.  This can be a
number between 0 and 99, and defaults to 0.

B10_LOGGER_LOCALMSG
If defined, points to a local message file.  The default is not to use a local
message file.

B10_LOGGER_DESTINATION
The location to which log message are written.  This can be one of:

   stdout               Message are written to stdout
   stderr               Messages are written to stderr
   syslog[:facility]    Messages are written to syslog.  If the optional
                        "facility" is used, the messages are written using
                        that facility.  (This defaults to "local0" if not
                        specified.)
   Anything else        Interpreted as the name of a file to which output
                        is appended.  If the file does not exist, a new one
                        is opened.

In the case of "stdout", "stderr" and "syslog", they must be written exactly
as is - no leading or trailing spaces, and in lower-case.

Python
------
To be supplied


Severity Guidelines
===================
When using logging, the question arises, what severity should a message
be logged at?  The following is a suggestion - as always, the decision
must be made in the context of which the message is logged.

One thing that should always be borne in mind is whether the logging
could be used as a vector for a DOS attack.  For example, if a warning
message is logged every time an invalid packet is received, an attacker
could simply send large numbers of invalid packets.  (Of course, warnings
could be disabled (or just warnings for that that particular logger),
but nevertheless the message is an attack vector.)

FATAL
-----
The program has encountered an error that is so severe that it cannot
continue (or there is no point in continuing).  When a fatal error
has been logged, the program will usually exit immediately (or shortly
afterwards) after dumping some diagnostic information.

ERROR
-----
Something has happened such that the program can continue but the
results for the current (or future) operations cannot be guaranteed to
be correct, or the results will be correct but the service is impaired.
For example, the program started but attempts to open one or more network
interfaces failed.

WARN
----
An unusual event  happened.  Although the program will continue working
normally, the event was sufficiently out of the ordinary to warrant
drawing attention to it.  For example, at program start-up a zone was
loaded that contained no resource records,

INFO
----
A normal but significant event has occurred that should be recorded,
e.g. the program has started or is just about to terminate, a new zone
has been created, etc.

DEBUG
-----
This severity is only enabled on for debugging purposes.  A debug level is
associated with debug messages, level 0 (the default) being for high-level
messages and level 99 (the maximum) for the lowest level.  How the
messages are distributed between the levels is up to the developer.
So if debugging the NSAS (for example), a level 0 message might record
the creation of a new zone, a level 10 recording a timeout when trying
to get a nameserver address, but a level 50 would record every query for
an address. (And we might add level 70 to record every update of the RTT.)

Note that like severities, levels are cumulative; so if level 25 is
set as the debug level, all debug levels from 0 to 25 will be output.
In fact, it is probably easier to visualise the debug levels as part of
the severity system:

    FATAL                High
    ERROR
    WARN
    INFO
    DEBUG level 0
    DEBUG level 1
       :
    DEBUG level 99       Low

When a particular severity is set, it - and all severities and/or debug
levels above it - will be logged.

To try to ensure that the information from different modules is roughly
comparable for the same debug level, a set of standard debug levels has
been defined for common type of debug output.  However, modules are free
to set their own debug levels or define additional ones.

Logging Sources v Logging Severities
------------------------------------
When logging events, make a distinction between events related to the
server and events related to DNS messages received.  Caution needs to
be exercised with the latter as, if the logging is enabled in the normal
course of events, such logging could be a denial of service vector. For
example, suppose that the main authoritative service logger were to
log both zone loading and unloading as INFO and a warning message if
it received an invalid packet. An attacker could make the INFO messages
unusable by flooding the server with malformed packets.

There are two approaches to get round this:

a) Make the logging of packet-dependent events a DEBUG-severity message.
DEBUG is not enabled by default, so these events will not be recorded
unless DEBUG is specifically chosen.

b) Record system-related and packet-related messages via different loggers
(e.g.  in the example given, server events could be logged using the
logger "auth" and packet-related events at that level logged using the
logger "pkt-auth".)  As the loggers are independent and the severity
levels independent, fine-tuning of what and what is not recorded can
be achieved.


Notes
=====
The message compiler is written in C++ (instead of Python) because it
contains a component that reads the message file.  This component is used
in both the message compiler and the server; in the server it is used
when the server starts up (or when triggered by a command) to read in
a message file to overwrite the internal dictionary.  Writing it in C++
means there is only one piece of code that does this functionality.