Parcourir la source

[2903] logged other errors from asio.

JINMEI Tatuya il y a 12 ans
Parent
commit
719a1cb238

+ 48 - 0
src/lib/asiodns/asiodns_messages.mes

@@ -53,6 +53,11 @@ The asynchronous I/O code encountered an error when trying to send data to
 the specified address on the given protocol.  The number of the system
 error that caused the problem is given in the message.
 
+% ASIODNS_SYNC_UDP_CLOSE_SOCKET_FAIL_ON_STOP failed to close a
+This is the same to ASIODNS_UDP_CLOSE_SOCKET_FAIL_ON_STOP but happens
+on the "synchronous UDP server", mainly used for the authoritative DNS
+server daemon.
+
 % ASIODNS_TCP_ACCEPT_FAIL failed to accept TCP DNS connection: %1
 Accepting a TCP connection from a DNS client failed due to an error
 that could happen but should be rare.  The reason for the error is
@@ -64,6 +69,44 @@ restrictive on this limitation, so consider adjusing the limit using
 a tool such as ulimit.  If you see other types of errors too often,
 there may be something overlooked; please file a bug report in that case.
 
+% ASIODNS_TCP_CLOSE_ACCEPTOR_FAIL failed to close listening TCP socket: %1
+A TCP DNS server tried to close a listening TCP socket (for accepting
+new connections) as a step of stopping itself, but failed to do that.
+This is generally an unexpected event and so is logged as an error.
+
+% ASIODNS_TCP_CLOSE_FAIL failed to close DNS/TCP socket with a client: %1
+A TCP DNS server tried to close a TCP socket used to communicate with a
+client, but it failed to do that.  This is expected to be very rare but
+we've seen it in real deployment.
+
+% ASIODNS_TCP_CLOSE_SOCKET_FAIL_ON_STOP failed to close a DNS/TCP socket while stopping the server: %1
+A TCP DNS server tried to close a TCP socket (which is either actively
+communicating with a client or is already unused) as a step of
+stopping itself, but failed to do that.  This is generally an
+unexpected event and so is logged as an error.
+
+% ASIODNS_TCP_GETREMOTE_FAIL failed to get remote address of a DNS TCP connection: %1
+A TCP DNS server tried to get the address and port of a remote client
+on a connected socket but failed.  It's expected to be rare but can
+still happen.  See also ASIODNS_TCP_READLEN_FAIL.
+
+% ASIODNS_TCP_READDATA_FAIL failed to get DNS data on a TCP socket: %1
+A TCP DNS server tried to read a DNS message (that follows a 2-byte
+length field) but failed.  It's expected to be rare but can still happen.
+See also ASIODNS_TCP_READLEN_FAIL.
+
+% ASIODNS_TCP_READLEN_FAIL failed to get DNS data length on a TCP socket: %1
+A TCP DNS server tried to get the length field of a DNS message (the first
+2 bytes of a new chunk of data) but failed.  This is generally expected to
+be rare but can still happen, e.g, due to an unexpected reset of the
+connection.  A specific reason for the failure is included in the log
+message.
+
+% ASIODNS_TCP_WRITE_FAIL failed to send DNS message over a TCP socket: %1
+A TCP DNS server tried to send a DNS message to a remote client but
+failed.  It's expected to be rare but can still happen.  See also
+ASIODNS_TCP_READLEN_FAIL.
+
 % ASIODNS_UDP_ASYNC_SEND_FAIL Error sending UDP packet to %1: %2
 The low-level ASIO library reported an error when trying to send a UDP
 packet in asynchronous UDP mode. This can be any error reported by
@@ -75,6 +118,11 @@ If you see a single occurrence of this message, it probably does not
 indicate any significant problem, but if it is logged often, it is probably
 a good idea to inspect your network traffic.
 
+% ASIODNS_UDP_CLOSE_SOCKET_FAIL_ON_STOP failed to close a DNS/UDP socket while stopping the server: %1
+A UDP DNS server tried to close its UDP socket as a step of stopping
+itself, but failed to do that.  This is generally an unexpected event
+and so is logged as an error.
+
 % ASIODNS_UDP_RECEIVE_FAIL failed to receive UDP DNS packet: %1
 Receiving a UDP packet from a DNS client failed due to an error that
 could happen but should be very rare.  The server still keeps

+ 8 - 4
src/lib/asiodns/sync_udp_server.cc

@@ -118,10 +118,10 @@ SyncUDPServer::handleRead(const asio::error_code& ec, const size_t length) {
         // Good, there's an answer.
         socket_->send_to(asio::const_buffers_1(output_buffer_->getData(),
                                                output_buffer_->getLength()),
-                         sender_, 0, ecode_);
-        if (ecode_) {
+                         sender_, 0, ec_);
+        if (ec_) {
             LOG_ERROR(logger, ASIODNS_UDP_SYNC_SEND_FAIL).
-                      arg(sender_.address().to_string()).arg(ecode_.message());
+                      arg(sender_.address().to_string()).arg(ec_.message());
         }
     }
 
@@ -147,8 +147,12 @@ SyncUDPServer::stop() {
     /// for it won't be scheduled by io service not matter it is
     /// submit to io service before or after close call. And we will
     /// get bad_descriptor error.
-    socket_->close(ecode_); // pass error_code just to avoid getting exception.
+    socket_->close(ec_);
     stopped_ = true;
+    if (ec_) {
+        LOG_ERROR(logger, ASIODNS_SYNC_UDP_CLOSE_SOCKET_FAIL_ON_STOP).
+            arg(ec_.message());
+    }
 }
 
 void

+ 1 - 1
src/lib/asiodns/sync_udp_server.h

@@ -153,7 +153,7 @@ private:
     bool stopped_;
     // Placeholder for error code object.  It will be passed to ASIO library
     // to have it set in case of error.
-    asio::error_code ecode_;
+    asio::error_code ec_;
     // The callback functor for internal asynchronous read event.  This is
     // stateless (and it will be copied in the ASIO library anyway), so
     // can be const

+ 27 - 6
src/lib/asiodns/tcp_server.cc

@@ -162,6 +162,8 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
         CORO_YIELD async_read(*socket_, asio::buffer(data_.get(),
                               TCP_MESSAGE_LENGTHSIZE), *this);
         if (ec) {
+            LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_READLEN_FAIL).
+                arg(ec.message());
             return;
         }
 
@@ -169,11 +171,12 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
         /// to allow inline variable declarations.)
         CORO_YIELD {
             InputBuffer dnsbuffer(data_.get(), length);
-            uint16_t msglen = dnsbuffer.readUint16();
+            const uint16_t msglen = dnsbuffer.readUint16();
             async_read(*socket_, asio::buffer(data_.get(), msglen), *this);
         }
-
         if (ec) {
+            LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_READDATA_FAIL).
+                arg(ec.message());
             return;
         }
 
@@ -184,6 +187,8 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
         // all these calls to "new".)
         peer_.reset(new TCPEndpoint(socket_->remote_endpoint(ec)));
         if (ec) {
+            LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_GETREMOTE_FAIL).
+                arg(ec.message());
             return;
         }
 
@@ -250,13 +255,23 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
         // (though we have nothing further to do, so the coroutine
         // will simply exit at that time).
         CORO_YIELD async_write(*socket_, bufs, *this);
+        if (ec) {
+            LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_WRITE_FAIL).
+                arg(ec.message());
+        }
 
         // All done, cancel the timeout timer. if it throws, consider it fatal.
         timeout_->cancel();
 
         // TODO: should we keep the connection open for a short time
         // to see if new requests come in?
-        socket_->close(ec);     // ignore any error on close()
+        socket_->close(ec);
+        if (ec) {
+            // close() should be unlikely to fail, but we've seen it fail once,
+            // so we log the event.
+            LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_CLOSE_FAIL).
+                arg(ec.message());
+        }
     }
 }
 
@@ -269,18 +284,24 @@ TCPServer::asyncLookup() {
 }
 
 void TCPServer::stop() {
-    // passing error_code to avoid getting exception; we simply ignore any
-    // error on close().
     asio::error_code ec;
 
     /// we use close instead of cancel, with the same reason
     /// with udp server stop, refer to the udp server code
 
     acceptor_->close(ec);
+    if (ec) {
+        LOG_ERROR(logger, ASIODNS_TCP_CLOSE_ACCEPTOR_FAIL).arg(ec.message());
+    }
+
     // User may stop the server even when it hasn't started to
-    // run, in that that socket_ is empty
+    // run, in that case socket_ is empty
     if (socket_) {
         socket_->close(ec);
+        if (ec) {
+            LOG_ERROR(logger, ASIODNS_TCP_CLOSE_SOCKET_FAIL_ON_STOP).
+                arg(ec.message());
+        }
     }
 }
 /// Post this coroutine on the ASIO service queue so that it will

+ 4 - 2
src/lib/asiodns/udp_server.cc

@@ -327,8 +327,6 @@ UDPServer::asyncLookup() {
 /// Stop the UDPServer
 void
 UDPServer::stop() {
-    // passing error_code to avoid getting exception; we simply ignore any
-    // error on close().
     asio::error_code ec;
 
     /// Using close instead of cancel, because cancel
@@ -340,6 +338,10 @@ UDPServer::stop() {
     /// submit to io service before or after close call. And we will
     //  get bad_descriptor error.
     data_->socket_->close(ec);
+    if (ec) {
+        LOG_ERROR(logger, ASIODNS_UDP_CLOSE_SOCKET_FAIL_ON_STOP).
+            arg(ec.message());
+    }
 }
 
 /// Post this coroutine on the ASIO service queue so that it will