
[master] Merge branch 'trac3800' (stats, control channel docs)

Tomek Mrugalski 10 years ago
parent
commit
7ce8ca5603
4 changed files with 406 additions and 27 deletions
  1. 146 2
      doc/guide/ctrl-channel.xml
  2. 9 0
      doc/guide/dhcp4-srv.xml
  3. 246 18
      doc/guide/stats.xml
  4. 5 7
      src/lib/config/command-socket.dox

+ 146 - 2
doc/guide/ctrl-channel.xml

@@ -8,8 +8,152 @@
 
   <chapter id="ctrl-channel">
     <title>Management API</title>
-    <para>
-      @todo: To be implemented in ticket 3880.
+
+    <para>A classic approach to daemon configuration assumes that
+    the server's configuration is stored in the configuration files
+    and when the configuration is changed, the daemon is restarted.
+    This approach has the significant disadvantage of introducing periods
+    of downtime, when client traffic is not handled. Another risk
+    is that if the new configuration is invalid for whatever reason,
+    the server may refuse to start, which will further extend the
+    downtime period, until the issue is resolved.</para>
+
+    <para>To avoid such problems, both the DHCPv4 and DHCPv6 servers
+    support a mechanism that allows on-line reconfiguration
+    without requiring server shutdown.
+    Both servers can be instructed to open a control socket, which
+    is a communication channel. The server is able to receive
+    commands on that channel, act on them and report back status.
+    While the set of commands supported in Kea 0.9.2 is limited,
+    the number is expected to grow over time.</para>
+
+    <para>Currently the only supported type of control channel
+    is the UNIX stream socket. For details on how to configure it, see
+    <xref linkend="dhcp4-ctrl-channel" /> and <xref
+    linkend="dhcp6-ctrl-channel" />. It is likely that support
+    for other control channel types will be added in the future.
     </para>
 
+    <section id="ctrl-channel-syntax">
+    <title>Data syntax</title>
+    <para>Communication over control channel is conducted using JSON
+    structures. If configured, Kea will open a socket and will listen
+    for any incoming connections. A process connecting to this socket
+    is expected to send JSON commands structured as follows:
+
+<screen>
+{
+    "command": "foo",
+    "arguments": {
+	"param1": "value1",
+	"param2": "value2",
+	...
+    }
+}
+</screen>
+
+    <command>command</command> is the name of the command to execute and
+    is mandatory. <command>arguments</command> is a map of parameters
+    required to carry out the given command.  The exact content and
+    format are command specific.</para>
+
+    <para>The server will process the incoming command and then send a
+    response of the form:
+<screen>
+{
+    "result": 0|1,
+    "text": "textual description",
+    "arguments": {
+	"argument1": "value1",
+	"argument2": "value2",
+	...
+    }
+}
+</screen>
+    <command>result</command> indicates the outcome of the command. A value of 0
+    means success while any non-zero value designates an error. Currently 1 is
+    used as a generic error, but additional error codes may be added in the
+    future. The <command>text</command> field typically appears when result is
+    non-zero and contains a description of the error encountered, but it may
+    also appear for successful results, depending on the command.
+    <command>arguments</command> is a map of additional data values returned by
+    the server, specific to the command issued. The map is always present, even
+    if it contains no data values.</para>
+    </section>
+
+    <section id="ctrl-channel-client">
+    <title>Using control channel</title>
+
+    <para>ISC does not provide a client for using the control channel.  The primary
+    reason is the expectation that the entity using the control channel
+    is typically an IPAM or similar network management/monitoring software, which
+    may have quite varied expectations regarding the client and is even likely to
+    be written in a language other than C or C++. Therefore we only provide
+    examples of how one can take advantage of the API.</para>
+
+    <para>The easiest way is to use a tool called <command>socat</command>,
+    which is available from the <ulink url="http://www.dest-unreach.org/socat/">socat
+    homepage</ulink> and is also widely packaged in Linux and BSD
+    distributions. Once Kea is started, one can connect to the control
+    interface using the following command:
+<screen>
+$ socat UNIX:/path/to/the/kea/socket -
+</screen>
+where <command>/path/to/the/kea/socket</command> is the path specified in the
+<command>Dhcp4/control-socket/socket-name</command> parameter in the Kea
+configuration file.</para>
+
+    <para>It is also easy to open a UNIX socket programmatically. An example of
+    such a simplistic client written in C is available in the Kea Developer's
+    Guide, chapter Control Channel Overview, section Using Control Channel.</para>
+
+    </section>
+
+    <section id="commands-common">
+      <title>Commands supported by both DHCPv4 and DHCPv6 servers</title>
+
+      <section id="command-list-commands">
+      <title>list-commands command</title>
+
+      <para>
+	<emphasis>list-commands</emphasis> command retrieves a list of all
+	commands supported by the server. It does not take any arguments.
+	An example command may look like this:
+<screen>
+{
+    "command": "list-commands",
+    "arguments": { }
+}
+</screen>
+      </para>
+      <para>
+	The server will respond with a list of all supported commands. The
+	arguments element will be a list of strings. Each string will convey
+	one supported command.
+      </para>
+    </section> <!-- end of command-list-commands -->
+
+    <section id="command-shutdown">
+      <title>shutdown command</title>
+
+      <para>
+	<emphasis>shutdown</emphasis> command instructs the server to initiate
+	its shutdown procedure. It is the equivalent of sending a SIGTERM signal
+	to the process. This command does not take any arguments.  An example
+	command may look like this:
+<screen>
+{
+    "command": "shutdown",
+    "arguments": { }
+}
+</screen>
+      </para>
+      <para>
+	The server will respond with a confirmation that the shutdown procedure
+	has been initiated.
+      </para>
+    </section> <!-- end of command-shutdown -->
+
+    </section> <!-- end of commands supported by both servers -->
+
   </chapter>

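The request and response structures documented in ctrl-channel.xml above can be exercised with a few lines of Python. This is a minimal sketch, not an ISC-provided client: the function names (build_command, parse_response, send_command) are hypothetical, and the socket path is whatever was set in the control-socket configuration. The text does not say whether the server closes the connection after replying, so the sketch assumes one JSON document per command and stops reading as soon as the buffer parses.

```python
import json
import socket


def build_command(name, arguments=None):
    """Serialize a control-channel command as a JSON string."""
    return json.dumps({"command": name, "arguments": arguments or {}})


def parse_response(raw):
    """Decode a response; raise if "result" is missing or non-zero."""
    response = json.loads(raw)
    if response.get("result", 1) != 0:
        raise RuntimeError(response.get("text", "unspecified error"))
    return response.get("arguments", {})


def send_command(socket_path, name, arguments=None, timeout=5.0):
    """Send one command over a UNIX stream socket and return its arguments."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(build_command(name, arguments).encode("utf-8"))
        buffer = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break  # peer closed the connection
            buffer += chunk
            try:
                json.loads(buffer.decode("utf-8"))
                break  # assumption: one complete JSON document per reply
            except ValueError:
                continue  # reply is still incomplete, keep reading
    return parse_response(buffer.decode("utf-8"))
```

With a running server, send_command("/path/to/the/kea/socket", "list-commands") would return the arguments element of the reply. Raising on a non-zero result is a design choice of this sketch, so that callers cannot silently ignore an error.
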
+ 9 - 0
doc/guide/dhcp4-srv.xml

@@ -2888,6 +2888,15 @@ appropiately -->
         Communication over control channel is conducted using JSON structures.
         See the Control Channel section in the Kea Developer's Guide for more details.
       </para>
+
+      <para>DHCPv4 server supports <command>statistic-get</command>,
+      <command>statistic-reset</command>, <command>statistic-remove</command>,
+      <command>statistic-get-all</command>, <command>statistic-reset-all</command>
+      and <command>statistic-remove-all</command>, specified in
+      <xref linkend="command-stats"/>. It also supports
+      <command>list-commands</command> and <command>shutdown</command>,
+      specified in <xref linkend="command-list-commands" /> and
+      <xref linkend="command-shutdown" />, respectively.</para>
     </section>
 
     <section id="dhcp4-std">

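The list of commands added to dhcp4-srv.xml above can be checked against a live server by comparing it with the reply to list-commands. The following is a minimal sketch under stated assumptions: the helper name missing_commands and the constant EXPECTED_DHCP4_COMMANDS are hypothetical, and the reply is assumed to carry the supported commands as a list of strings in "arguments", as described for list-commands.

```python
import json

# Commands the DHCPv4 documentation above says the server supports.
EXPECTED_DHCP4_COMMANDS = {
    "statistic-get",
    "statistic-reset",
    "statistic-remove",
    "statistic-get-all",
    "statistic-reset-all",
    "statistic-remove-all",
    "list-commands",
    "shutdown",
}


def missing_commands(list_commands_reply):
    """Return the documented commands absent from a list-commands reply."""
    reply = json.loads(list_commands_reply)
    if reply.get("result") != 0:
        raise RuntimeError(reply.get("text", "list-commands failed"))
    supported = set(reply.get("arguments", []))
    return sorted(EXPECTED_DHCP4_COMMANDS - supported)
```

An empty return value means every documented command was advertised by the server.
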
+ 246 - 18
doc/guide/stats.xml

@@ -4,30 +4,258 @@
 <!ENTITY mdash  "&#x2014;" >
 ]>
 
-  <!-- Note: Please use the following terminology:
-       - daemon - one process (e.g. kea-dhcp4)
-       - component - one piece of code within a daemon (e.g. libdhcp or hooks)
-       - server - currently equal to daemon, but the difference will be more
-                  prominent once we add client or relay support
-       - logger - one instance of isc::log::Logger
-       - structure - an element in config file (e.g. "Dhcp4")
-
-       Do not use:
-       - module => daemon
-    -->
-
 <chapter id="stats">
   <title>Statistics</title>
 
   <section>
     <title>Statistics Overview</title>
-    
+
+    <para>Both Kea DHCP servers have supported statistics gathering since
+    version 0.9.2-beta. A working DHCP server encounters various events
+    that can cause certain statistics to be collected. For
+    example, a DHCPv4 server may receive a packet (the pkt4-received
+    statistic increases by one) that after parsing is identified as a
+    DHCPDISCOVER (pkt4-discover-received). The server processes it and
+    decides to send a DHCPOFFER representing its answer (the pkt4-offer-sent
+    and pkt4-sent statistics increase by one). Such events happen
+    frequently, so it is not uncommon for the statistics to have
+    values in the high thousands. They can serve as an easy and powerful
+    tool for observing a server's and network's health. For example,
+    if the pkt4-received statistic stops growing, it means that the
+    clients' packets are not reaching the server.</para>
+
+    <para>There are four types of statistics:
+    <itemizedlist>
+      <listitem>
+	<simpara><emphasis>integer</emphasis> - this is the most common type.  It
+	is implemented as a 64-bit integer (int64_t in C++), so it can hold any
+	value between -2^63 and 2^63-1.</simpara>
+      </listitem>
+      <listitem>
+	<simpara><emphasis>floating point</emphasis> - this type is intended to
+	store floating-point values. It is implemented as the C++ double type.
+	</simpara>
+      </listitem>
+      <listitem>
+	<simpara><emphasis>duration</emphasis> - this type is intended for
+	recording time periods. It uses boost::posix_time::time_duration type,
+	which stores hours, minutes, seconds and microseconds.</simpara>
+      </listitem>
+      <listitem>
+	<simpara><emphasis>string</emphasis> - this type is intended for
+	recording statistics in textual form. It uses the C++ std::string type.
+	</simpara>
+      </listitem>
+    </itemizedlist>
+    </para>
+
     <para>
-      TODO: Describe statistics collection here as part of ticket #3800.
-      For DHCPv4 specific statistics, see <xref linkend="dhcp4-stats"/>.
-      For DHCPv6 specific statistics, see TODO.
-      For DDNS specific statistics, see TODO.
+      During normal operation, DHCPv4 and DHCPv6 servers gather statistics.
+      For a list of DHCPv4 and DHCPv6 statistics, see <xref
+      linkend="dhcp4-stats"/> and <xref linkend="dhcp6-stats"/>, respectively.
+    </para>
+
+    <para>
+      To extract data from the statistics module, the control channel can be
+      used. See <xref linkend="ctrl-channel" /> for details. It is possible to
+      retrieve a single statistic or all statistics, reset statistics (i.e. set
+      them to a neutral value, typically zero) or even completely remove a single
+      statistic or all statistics. See <xref linkend="command-stats"/> for a list of
+      statistics-oriented commands.
     </para>
   </section>
-</chapter>
 
+  <section id="stats-lifecycle">
+    <title>Statistics Lifecycle</title>
+    <para>
+      It is useful to understand how the Statistics Manager module works. When
+      the server starts operation, the manager is empty and does not have any
+      statistics. When <command>statistic-get-all</command> is executed, an
+      empty list is returned. Once the server performs an operation that causes
+      a statistic to change, the related statistic will be created. In general,
+      once a statistic has been recorded, it is kept in the manager until it is
+      explicitly removed by <command>statistic-remove</command> or
+      <command>statistic-remove-all</command>, or until the server is shut
+      down. Per-subnet statistics are explicitly removed when reconfiguration
+      takes place.
+    </para>
+    <para>
+      Statistics are considered run-time properties, so they are not retained
+      after server restart.
+    </para>
+    <para>
+      Removing a statistic that is updated frequently makes little sense as it
+      will be re-added when the server code next records that statistic.
+      The <command>statistic-remove</command> and
+      <command>statistic-remove-all</command> commands are intended to remove
+      statistics that are not expected to be observed in the near future. For
+      example, a misconfigured device in a network may cause clients to report
+      duplicate addresses, so the server will report increasing values of
+      pkt4-decline-received. Once the problem is found and the device is
+      removed, the system administrator may want to remove the
+      pkt4-decline-received statistic, so it won't be reported anymore. If a
+      duplicate address is detected ever again, the server will add this
+      statistic back.
+    </para>
+  </section>
+
+  <section id="command-stats">
+    <title>Commands for Manipulating Statistics</title>
+    <para>
+      There are several commands defined that can be used for accessing (-get),
+      resetting to zero or a neutral value (-reset) or even removing a statistic
+      completely (-remove). The difference between reset and remove is somewhat
+      subtle.  The reset command sets the value of the statistic to zero or a neutral
+      value. After this operation, the statistic will have a value of 0 (integer),
+      0.0 (float), 0h0m0s0us (duration) or "" (string). When asked for, the statistic
+      will be returned with that value. The remove command deletes
+      a statistic completely, so the statistic will not be reported anymore. Please
+      note that the server code may add it back if there's a reason to record
+      it.
+    </para>
+
+    <section id="command-statistic-get">
+      <title>statistic-get command</title>
+
+      <para>
+	<emphasis>statistic-get</emphasis> command retrieves a single
+	statistic. It takes a single string parameter called
+	<command>name</command> that specifies the statistic name.  An example
+	command may look like this:
+<screen>
+{
+    "command": "statistic-get",
+    "arguments": {
+	"name": "<userinput>pkt4-received</userinput>"
+    }
+}
+</screen>
+      </para>
+      <para>
+	The server will respond with details of the requested statistic, with result
+	set to 0 indicating success and the specified statistic as the value of
+	"arguments" parameter. If the requested statistic is not found, the response
+	will contain an empty map, i.e. only { } as argument, but the status
+	code will still be set to success (0).
+      </para>
+    </section> <!-- end of command-statistic-get -->
+
+    <section id="command-statistic-reset">
+      <title>statistic-reset command</title>
+
+      <para>
+	<emphasis>statistic-reset</emphasis> command sets the specified statistic
+	to its neutral value: 0 for integer, 0.0 for float, 0h0m0s0us for time
+	duration and "" for string type. It takes a single string parameter
+	called <command>name</command> that specifies the statistic name.  An
+	example command may look like this:
+<screen>
+{
+    "command": "statistic-reset",
+    "arguments": {
+	"name": "<userinput>pkt4-received</userinput>"
+    }
+}
+</screen>
+      </para>
+      <para>
+	If the specified statistic is found and the reset was successful, the
+	server will respond with a status of 0, indicating success, and an empty
+	parameters field. If an error is encountered (e.g. requested statistic
+	was not found), the server will return a status code of 1 (error)
+	and the text field will contain the error description.
+      </para>
+    </section> <!-- end of command-statistic-reset -->
+
+    <section id="command-statistic-remove">
+      <title>statistic-remove command</title>
+
+      <para>
+	<emphasis>statistic-remove</emphasis> command attempts to delete a single
+	statistic. It takes a single string parameter called
+	<command>name</command> that specifies the statistic name.  An example
+	command may look like this:
+<screen>
+{
+    "command": "statistic-remove",
+    "arguments": {
+	"name": "<userinput>pkt4-received</userinput>"
+    }
+}
+</screen>
+      </para>
+      <para>
+	If the specified statistic is found and its removal was successful, the
+	server will respond with a status of 0, indicating success, and an empty
+	parameters field. If an error is encountered (e.g. requested statistic
+	was not found), the server will return a status code of 1 (error)
+	and the text field will contain the error description.
+      </para>
+    </section> <!-- end of command-statistic-remove -->
+
+    <section id="command-statistic-get-all">
+      <title>statistic-get-all command</title>
+
+      <para>
+	<emphasis>statistic-get-all</emphasis> command retrieves all statistics
+        recorded. An example command may look like this:
+<screen>
+{
+    "command": "statistic-get-all",
+    "arguments": { }
+}
+</screen>
+      </para>
+      <para>
+	The server will respond with details of all recorded statistics, with result
+	set to 0 indicating that it iterated over all statistics (even when
+	the total number of statistics is zero).
+      </para>
+    </section> <!-- end of command-statistic-get-all -->
+
+    <section id="command-statistic-reset-all">
+      <title>statistic-reset-all command</title>
+
+      <para>
+	<emphasis>statistic-reset-all</emphasis> command sets all statistics to
+	their neutral values: 0 for integer, 0.0 for float, 0h0m0s0us for time
+	duration and "" for string type. An example command may look like this:
+<screen>
+{
+    "command": "statistic-reset-all",
+    "arguments": { }
+}
+</screen>
+      </para>
+      <para>
+	If the operation is successful, the server will respond with a status of
+	0, indicating success and an empty parameters field. If an error is
+	encountered, the server will return a status code of 1 (error) and the text
+	field will contain the error description.
+      </para>
+    </section> <!-- end of command-statistic-reset-all -->
+
+    <section id="command-statistic-remove-all">
+      <title>statistic-remove-all command</title>
+
+      <para>
+	<emphasis>statistic-remove-all</emphasis> command attempts to delete all
+	statistics. An example command may look like this:
+<screen>
+{
+    "command": "statistic-remove-all",
+    "arguments": { }
+}
+</screen>
+      </para>
+      <para>
+	If the removal of all statistics was successful, the server will respond
+	with a status of 0, indicating success and an empty parameters field. If
+	an error is encountered, the server will return a status code of 1 (error)
+	and the text field will contain the error description.
+      </para>
+    </section> <!-- end of command-statistic-remove-all -->
+
+  </section>
+
+</chapter>

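The lifecycle and command semantics described in stats.xml above (empty at start-up, created on first record, reset versus remove, success with an empty map for a missing statistic on get) can be modelled in a few lines. This is a toy in-memory sketch for illustration only, not Kea's Statistics Manager: the class name ToyStatsManager, its record method and the error texts are hypothetical, only integer statistics are modelled, and the plain name-to-value map returned in "arguments" is a placeholder because the text above does not specify the exact shape of the returned data.

```python
class ToyStatsManager:
    """Toy model of the statistics lifecycle described above (not Kea code)."""

    def __init__(self):
        self._stats = {}  # the manager is empty when the server starts

    def record(self, name, delta=1):
        # A statistic is created the first time the server records it.
        self._stats[name] = self._stats.get(name, 0) + delta

    def handle(self, command, arguments=None):
        name = (arguments or {}).get("name")
        if command == "statistic-get":
            # A missing statistic yields an empty map, but still result 0.
            found = {name: self._stats[name]} if name in self._stats else {}
            return {"result": 0, "arguments": found}
        if command == "statistic-get-all":
            return {"result": 0, "arguments": dict(self._stats)}
        if command in ("statistic-reset", "statistic-remove"):
            if name not in self._stats:
                return {"result": 1, "text": "statistic not found",
                        "arguments": {}}
            if command == "statistic-reset":
                self._stats[name] = 0  # neutral value for an integer
            else:
                del self._stats[name]  # no longer reported at all
            return {"result": 0, "arguments": {}}
        if command == "statistic-reset-all":
            self._stats = dict.fromkeys(self._stats, 0)
            return {"result": 0, "arguments": {}}
        if command == "statistic-remove-all":
            self._stats.clear()
            return {"result": 0, "arguments": {}}
        return {"result": 1, "text": "unknown command", "arguments": {}}
```

In this model, a reset statistic is still returned by statistic-get with the value 0, whereas a removed statistic disappears until record is called for it again, which mirrors the distinction drawn in the text.
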
+ 5 - 7
src/lib/config/command-socket.dox

@@ -20,16 +20,14 @@
 In many cases it is useful to manage certain aspects of the DHCP servers
 while they are running. In Kea, this may be done via the Control Channel.
 Control Channel allows an external entity (e.g. a tool run by a sysadmin
-or a script) to issue commands to the server which can influence the its
+or a script) to issue commands to the server which can influence its
 behavior or retreive information from it. Several notable examples
-envisioned are: reconfiguration, statistics retrival and manipulation,
+envisioned are: reconfiguration, statistics retrieval and manipulation,
 and shutdown.
-@note Currently, only the DHCPv4 component supports Control Channel.
-@todo: Update this text once Control Channel support in DHCPv6 is added.
 
 Communication over Control Channel is conducted using JSON structures.
-Currently (Kea 0.9.2) the only supported communication channel is UNIX stream
-sockets, but additional types may be added in the future.
+As of Kea 0.9.2, the only supported communication channel is UNIX stream
+socket, but additional types may be added in the future.
 
 If configured, Kea will open a socket and will listen for any incoming
 connections. A process connecting to this socket is expected to send JSON
@@ -47,7 +45,7 @@ commands structured as follows:
 @endcode
 
 - command - is the name of command to execute and is mandatory.
-- arguments - it may be absent, contain a single parameter or a map or parameters
+- arguments - contains a single parameter or a map of parameters
 required to carry out the given command.  The exact content and format is command specific.
 
 The server will process the incoming command and then send a response of the form: