[2893] Updated documentation for BPF.

Marcin Siodelski 11 years ago
parent
commit
4cd87e8b45

+ 3 - 3
doc/guide/libdhcp.xml

@@ -41,9 +41,9 @@
 
       <para>DHCPv4 requires special raw socket processing to send and receive
       packets from hosts that do not have IPv4 address assigned yet. Support
-      for this operation is implemented on Linux only, so it is likely that
-      DHCPv4 component will not work in certain cases on systems other than
-      Linux.</para>
+      for this operation is implemented on Linux, FreeBSD, NetBSD and OpenBSD.
+      It is likely that the DHCPv4 component will not work in certain cases on
+      other systems.</para>
     </section>
 
 <!--

+ 5 - 0
src/lib/dhcp/iface_mgr.cc

@@ -176,6 +176,11 @@ bool Iface::delSocket(const uint16_t sockfd) {
 
 void
 Iface::resizeReadBuffer(const size_t new_size) {
+    // Do nothing if the new size is equal to the current size.
+    if (new_size == read_buffer_size_) {
+        return;
+    }
+
     read_buffer_size_ = new_size;
     read_buffer_ = static_cast<uint8_t*>(realloc(read_buffer_,
                                                  read_buffer_size_));

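The early return added to `Iface::resizeReadBuffer` can be illustrated in isolation. The following is a minimal sketch, not the Kea implementation: `ReadBuffer` is a hypothetical stand-in for the buffer-owning part of `Iface`, and the allocation-failure check and the zero-size branch are assumptions added here that do not appear in the hunk above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the buffer-owning part of Iface.
class ReadBuffer {
public:
    ReadBuffer() : data_(nullptr), size_(0) {}
    ~ReadBuffer() { std::free(data_); }
    ReadBuffer(const ReadBuffer&) = delete;
    ReadBuffer& operator=(const ReadBuffer&) = delete;

    // Reallocates the buffer; does nothing if the size is unchanged.
    void resize(const std::size_t new_size) {
        if (new_size == size_) {
            return;
        }
        if (new_size == 0) {
            // Assumption: treat zero as "release the buffer", because
            // realloc() with size 0 is implementation-defined.
            std::free(data_);
            data_ = nullptr;
            size_ = 0;
            return;
        }
        void* p = std::realloc(data_, new_size);
        if (p == nullptr) {
            // Assumption: report failure instead of losing the old buffer.
            throw std::bad_alloc();
        }
        data_ = static_cast<std::uint8_t*>(p);
        size_ = new_size;
        assert((data_ == nullptr) == (size_ == 0));
    }

    std::uint8_t* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::uint8_t* data_;
    std::size_t size_;
};
```

Calling `resize` twice with the same size leaves the pointer untouched, which is the behavior the new guard provides.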
+ 15 - 5
src/lib/dhcp/iface_mgr.h

@@ -125,6 +125,16 @@ struct SocketInfo {
 /// Iface structure represents network interface with all useful
 /// information, like name, interface index, MAC address and
 /// list of assigned addresses
+///
+/// This class also holds the pointer to the socket read buffer.
+/// Functions reading from the socket may utilize this buffer to store the
+/// data being read from the socket. The advantage of using the
+/// pre-allocated buffer is that the buffer is allocated only once, rather
+/// than on every read. In addition, some OS specific code (e.g. BPF)
+/// may require use of fixed-size buffers. The size of such a buffer is
+/// returned by the OS kernel when the socket is opened. Hence, it is
+/// convenient to allocate the buffer when the socket is being opened and
+/// utilize it throughout the lifetime of the socket.
 class Iface {
 public:
 
@@ -352,7 +362,7 @@ public:
         return (read_buffer_size_);
     }
 
-    /// @brief Resizes the socket read buffer.
+    /// @brief Reallocates the socket read buffer.
     ///
     /// @param new_size New size of the buffer.
     void resizeReadBuffer(const size_t new_size);
@@ -416,11 +426,11 @@ public:
     /// be opened on this interface.
     bool inactive6_;
 
-    /// @brief Buffer holding the data read from the socket.
+private:
+
+    /// @brief Pointer to the buffer holding the data read from the socket.
     ///
-    /// This buffer may be pre-allocated when the socket on the interface
-    /// is being opened. The functions which read the data from the socket
-    /// may use this buffer as a storage for the data being read.
+    /// See the @c Iface class description for details.
     uint8_t* read_buffer_;
 
     /// @brief Allocated size of the read buffer.

+ 5 - 0
src/lib/dhcp/iface_mgr_bsd.cc

@@ -145,6 +145,11 @@ bool IfaceMgr::os_receive4(struct msghdr& /*m*/, Pkt4Ptr& /*pkt*/) {
 
 void
 IfaceMgr::setMatchingPacketFilter(const bool /* direct_response_desired */) {
+    // Ignore whether the direct response is desired or not. Even if the
+    // direct response is not desired, we don't want to use PktFilterInet
+    // because BSD doesn't support binding to a device and listening
+    // to broadcast traffic on individual interfaces. So, the datagram socket
+    // supported by PktFilterInet is not really usable for DHCP.
     setPacketFilter(PktFilterPtr(new PktFilterBPF()));
 }
 

+ 13 - 17
src/lib/dhcp/libdhcp++.dox

@@ -159,32 +159,28 @@ address.
 Kea supports the use of raw sockets to create a complete Data-link/IP/UDP/DHCPv4
 stack. By creating each layer of the outgoing packet, the Kea logic has full
 control over the frame contents and it may bypass the use of ARP to inject the
-link layer address into the frame. The raw socket is bound to a specific interface,
-not to the IP address/UDP port. Therefore, the system kernel doesn't have
-means to verify that Kea is listening to the DHCP traffic on the specific address
-and port. This has two major implications:
+link layer address into the frame.
+
+The low level operations on raw sockets are implemented within the "packet
+filtering" classes derived from @c isc::dhcp::PktFilter. The implementation
+of these classes is specific to the operating system. On Linux the
+@c isc::dhcp::PktFilterLPF is used. On BSD systems the
+@c isc::dhcp::PktFilterBPF is used.
+
+The raw sockets are bound to a specific interface, not to the IP address/UDP port.
+Therefore, the system kernel doesn't have means to verify that Kea is listening
+to the DHCP traffic on the specific address and port. This has two major implications:
 - It is possible to run another DHCPv4 server instance which will bind a socket to the
 same address and port.
 - An attempt to send a unicast message to the DHCPv4 server will result in ICMP
 "Port Unreachable" message being sent by the kernel (which is unaware that the
 DHCPv4 service is actually running).
-In order to overcome these issues, the isc::dhcp::PktFilterLPF opens a
+
+In order to overcome these issues, the packet filtering classes open a
 regular IP/UDP socket which coexists with the raw socket. The socket is referred
 to as "fallback socket" in the Kea code. All packets received through this socket
 are discarded.
 
-In general, the use of datagram sockets is preferred over raw sockets.
-For convenience, the switchable Packet Filter objects are used to manage
-sockets for different purposes. These objects implement the socket opening
-operation and sending/receiving messages over this socket. For example:
-the isc::dhcp::PktFilterLPF object opens a raw socket.
-The isc::dhcp::PktFilterLPF::send and isc::dhcp::PktFilterLPF::receive
-methods encode/decode full data-link/IP/UDP/DHCPv4 stack. The
-isc::dhcp::PktFilterInet supports sending and receiving messages over
-the regular IP/UDP socket. The isc::dhcp::PktFilterInet should be used in all
-cases when an application using the libdhcp++ doesn't require sending
-DHCP messages to a device which doesn't have an address yet.
-
 @section libdhcpPktFilter6 Switchable Packet Filters for DHCPv6
 
 The DHCPv6 implementation doesn't suffer from the problems described in \ref

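The libdhcp++.dox text in this hunk describes building every layer of the outgoing packet so that the client's link-layer address can be written directly instead of being resolved through ARP. The following is a minimal sketch of that idea, not the code used by the packet filter classes: `buildFrame`, `ipChecksum` and `MacAddr` are hypothetical names, the TTL value is arbitrary, and the UDP checksum is left at zero (which IPv4 permits) to keep the example short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using MacAddr = std::array<std::uint8_t, 6>;

// Appends a 16-bit value in network byte order.
void put16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Appends a 32-bit value in network byte order.
void put32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    put16(out, static_cast<std::uint16_t>(v >> 16));
    put16(out, static_cast<std::uint16_t>(v & 0xFFFF));
}

// Internet checksum (ones' complement sum of 16-bit words).
std::uint16_t ipChecksum(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (len & 1) {
        sum += static_cast<std::uint32_t>(data[len - 1]) << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return static_cast<std::uint16_t>(~sum & 0xFFFF);
}

// Builds Ethernet + IPv4 + UDP headers around the payload. The destination
// hardware address is taken from the caller, so no ARP lookup is involved.
std::vector<std::uint8_t> buildFrame(const MacAddr& dst_hw, const MacAddr& src_hw,
                                     std::uint32_t src_ip, std::uint32_t dst_ip,
                                     std::uint16_t src_port, std::uint16_t dst_port,
                                     const std::vector<std::uint8_t>& payload) {
    assert(payload.size() <= 65535u - 28u);
    std::vector<std::uint8_t> f;
    // Data-link layer: destination, source, EtherType (IPv4).
    f.insert(f.end(), dst_hw.begin(), dst_hw.end());
    f.insert(f.end(), src_hw.begin(), src_hw.end());
    put16(f, 0x0800);
    // Network layer: 20-byte IPv4 header without options.
    const std::size_t ip_off = f.size();
    const std::uint16_t udp_len = static_cast<std::uint16_t>(8 + payload.size());
    f.push_back(0x45);                                   // version 4, IHL 5
    f.push_back(0x00);                                   // TOS
    put16(f, static_cast<std::uint16_t>(20 + udp_len));  // total length
    put16(f, 0);                                         // identification
    put16(f, 0);                                         // flags, fragment offset
    f.push_back(64);                                     // TTL (arbitrary here)
    f.push_back(17);                                     // protocol: UDP
    put16(f, 0);                                         // checksum placeholder
    put32(f, src_ip);
    put32(f, dst_ip);
    const std::uint16_t csum = ipChecksum(&f[ip_off], 20);
    f[ip_off + 10] = static_cast<std::uint8_t>(csum >> 8);
    f[ip_off + 11] = static_cast<std::uint8_t>(csum & 0xFF);
    // Transport layer: UDP header with checksum 0 (not computed).
    put16(f, src_port);
    put16(f, dst_port);
    put16(f, udp_len);
    put16(f, 0);
    f.insert(f.end(), payload.begin(), payload.end());
    return f;
}
```

A server reply would pass the hardware address received in the client's DHCP message as `dst_hw`, which is the step a regular datagram socket cannot perform for a client that has no IP address yet.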
+ 54 - 6
src/lib/dhcp/pkt_filter_bpf.h

@@ -22,12 +22,45 @@
 namespace isc {
 namespace dhcp {
 
-/// @brief Packet handling class using Berkeley Packet Filtering
+/// @brief Packet handling class using Berkeley Packet Filtering (BPF)
 ///
-/// This class provides methods to send and recive DHCPv4 messages using raw
-/// sockets and Berkeley Packet Filtering. It is used by @c isc::dhcp::IfaceMgr
-/// to send DHCPv4 messages to the hosts which don't have an IPv4 address
-/// assigned yet.
+/// The BPF is supported on the BSD-like operating systems. It allows for access
+/// to low level layers of the inbound and outbound packets. This is specifically
+/// useful when the DHCP server is allocating a new address to the client.
+///
+/// The response being sent to the client must include the HW address in the
+/// datalink layer. When the regular datagram socket is used the kernel will
+/// determine the HW address of the destination using ARP. In the case when
+/// the DHCP server is allocating the new address for the client the ARP can't
+/// be used because it requires the destination to have the IP address.
+///
+/// The DHCP server utilizes HW address sent by the client in the DHCP message
+/// and stores it in the datalink layer of the outbound packet. The BPF provides
+/// the means for crafting the whole packet (including datalink and network
+/// layers) and injecting the hardware address of the client.
+///
+/// The DHCP server receiving the messages sent from the directly connected
+/// clients to the broadcast address must be able to determine the interface
+/// on which the message arrives. The Linux kernel provides the SO_BINDTODEVICE
+/// socket option which allows for binding the socket to the particular
+/// interface. This option is not implemented on the BSD-like operating
+/// systems. This implies that there may be only one datagram socket listening
+/// to broadcast messages and this socket would receive the traffic on all
+/// interfaces. This effectively precludes the server from identifying the
+/// interface on which the packet arrived. The BPF resolves this problem.
+/// The BPF device (socket) can be attached to the selected interface using
+/// the ioctl function.
+///
+/// In a nutshell, the BPF device is created by opening the file /dev/bpf%d
+/// where %d is a number. The BPF device is configured by issuing ioctl
+/// commands listed here: http://www.freebsd.org/cgi/man.cgi?bpf(4).
+/// The specific configuration used by Kea DHCP server is described in
+/// the documentation of @c PktFilterBPF::openSocket.
+///
+/// Use of BPF requires Kea to encode and decode the datalink and network
+/// layer headers. Currently Kea supports encoding and decoding Ethernet
+/// frames on physical interfaces and pseudo headers received on the local
+/// loopback interface.
 class PktFilterBPF : public PktFilter {
 public:
 
@@ -42,7 +75,22 @@ public:
 
     /// @brief Open primary and fallback socket.
     ///
-    /// @param iface Interface descriptor.
+    /// This method opens the BPF device and applies the following
+    /// configuration to it:
+    /// - attach the device to the specified interface
+    /// - set filter program to receive DHCP messages encapsulated in UDP
+    /// packets
+    /// - set immediate mode, which causes the read function to return
+    /// immediately rather than wait for the whole read buffer to be filled
+    /// by the kernel (to avoid hangs)
+    ///
+    /// It also obtains the following configuration from the kernel:
+    /// - major and minor version of the BPF (and checks if it is valid)
+    /// - length of the buffer to be used to receive the data from the socket
+    ///
+    /// @param iface Interface descriptor. Note that the function (re)allocates
+    /// the socket read buffer according to the buffer size returned by the
+    /// kernel.
     /// @param addr Address on the interface to be used to send packets.
     /// @param port Port number.
     /// @param receive_bcast Configure socket to receive broadcast messages
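The `openSocket` documentation above mentions a filter program that accepts only DHCP messages encapsulated in UDP packets. The actual BPF program is not part of this diff, so the following is only a sketch of the checks such a filter would plausibly perform, written as a plain C++ predicate rather than as BPF instructions. The function name `isDhcpUdpFrame` and the exact set of checks (EtherType, IP protocol, fragment offset, UDP destination port) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the Ethernet frame carries an unfragmented IPv4/UDP
// datagram addressed to the given UDP port (e.g. 67 for a DHCPv4 server).
bool isDhcpUdpFrame(const std::uint8_t* frame, std::size_t len,
                    std::uint16_t port) {
    assert(frame != nullptr || len == 0);
    const std::size_t eth_len = 14;
    if (len < eth_len + 20) {
        return false;                    // too short for Ethernet + IPv4
    }
    if (frame[12] != 0x08 || frame[13] != 0x00) {
        return false;                    // EtherType is not IPv4
    }
    const std::uint8_t* ip = frame + eth_len;
    if ((ip[0] >> 4) != 4) {
        return false;                    // not IP version 4
    }
    const std::size_t ihl = static_cast<std::size_t>(ip[0] & 0x0F) * 4;
    if (ihl < 20 || len < eth_len + ihl + 8) {
        return false;                    // bad header length or no UDP header
    }
    if (ip[9] != 17) {
        return false;                    // protocol is not UDP
    }
    const std::uint16_t frag =
        static_cast<std::uint16_t>(((ip[6] << 8) | ip[7]) & 0x1FFF);
    if (frag != 0) {
        return false;                    // not the first fragment
    }
    const std::uint8_t* udp = ip + ihl;
    const std::uint16_t dport =
        static_cast<std::uint16_t>((udp[2] << 8) | udp[3]);
    return dport == port;
}
```

In the real packet filter this decision is made by the kernel, so frames that do not match never reach the read buffer. The predicate only makes the matching rules explicit.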