Browse Source

[2562] in processNotify, catch the "no recpt" case and log it at debug level.

the behavior itself is the same, but it suppresses noise level in case
it's intentional configuration.  descriptions of other log messages in
this method were not really correct, so they were fixed, too.

also, it now asserts xfrin_session_ is non null; this should never be null
in the production code (even tests don't set it to null right now).
JINMEI Tatuya 12 years ago
parent
commit
0a2f3129eb
2 changed files with 43 additions and 14 deletions
  1. 25 4
      src/bin/auth/auth_messages.mes
  2. 18 10
      src/bin/auth/auth_srv.cc

+ 25 - 4
src/bin/auth/auth_messages.mes

@@ -374,12 +374,33 @@ XFRIN (Transfer-in) process.  It is issued during server startup is an
 indication that the initialization is proceeding normally.
 
 % AUTH_ZONEMGR_COMMS error communicating with zone manager: %1
-This is a debug message output during the processing of a NOTIFY request.
+This is an internal error during the processing of a NOTIFY request.
 An error (listed in the message) has been encountered whilst communicating
 with the zone manager. The NOTIFY request will not be honored.
+This may be some temporary failure, but is generally an unexpected
+event and is quite likely a bug.  It's probably worth filing a report.
 
 % AUTH_ZONEMGR_ERROR received error response from zone manager: %1
-This is a debug message output during the processing of a NOTIFY
-request. The zone manager component has been informed of the request,
+The zone manager component has been informed of the request,
 but has returned an error response (which is included in the message). The
-NOTIFY request will not be honored.
+NOTIFY request will not be honored.  As of this writing, this can only
+happen due to a bug inside the Zonemgr implementation.  Zonemgr itself
+may log more detailed cause of this, and these are probably worth
+filing a bug report.
+
+% AUTH_ZONEMGR_NOTEXIST received NOTIFY but Zonemgr does not exist
+This is a debug message produced by the authoritative server when it
+receives a NOTIFY message but the Zonemgr component is not running at
+that time.  Not running Zonemgr is completely valid for, e.g., primary
+only servers, so this is not necessarily a problem.  If this message
+is logged even if Zonemgr is supposed to be running, it's encouraged
+to check other logs to identify why that happens.  It may or may not
+be a real problem (for example, if it's immediately after the system
+startup, it's possible that Auth has started up and is running but
+Zonemgr is not yet).  Even if this is indeed an unexpected case,
+Zonemgr should normally be restarted by the Init process, so unless
+this repeats too often it may be negligible in practice (still it's
+worth filing a bug report).  In any case, the authoritative server
+simply drops the NOTIFY message; if it's a temporary failure or
+delayed startup, subsequently resent messages will eventually reach
+Zonemgr.

+ 18 - 10
src/bin/auth/auth_srv.cc

@@ -789,18 +789,15 @@ AuthSrvImpl::processNotify(const IOMessage& io_message, Message& message,
         return (true);
     }
 
-    // In the code that follows, we simply ignore the notify if any internal
-    // error happens rather than returning (e.g.) SERVFAIL.  RFC 1996 is
-    // silent about such cases, but there doesn't seem to be anything we can
-    // improve at the primary server side by sending an error anyway.
-    if (xfrin_session_ == NULL) {
-        LOG_DEBUG(auth_logger, DBG_AUTH_DETAIL, AUTH_NO_XFRIN);
-        return (false);
-    }
-
     LOG_DEBUG(auth_logger, DBG_AUTH_DETAIL, AUTH_RECEIVED_NOTIFY)
         .arg(question->getName()).arg(question->getClass()).arg(remote_ep);
 
+    // xfrin_session_ should have been set and never be replaced except in
+    // tests; otherwise it's an internal bug.  assert() may be too strong,
+    // but processMessage() will catch all exceptions, so there's no better
+    // way.
+    assert(xfrin_session_);
+
     const string remote_ip_address = remote_ep.getAddress().toText();
     static const string command_template_start =
         "{\"command\": [\"notify\", {\"zone_name\" : \"";
@@ -821,7 +818,18 @@ AuthSrvImpl::processNotify(const IOMessage& io_message, Message& message,
         xfrin_session_->group_recvmsg(env, answer, false, seq);
         int rcode;
         parsed_answer = parseAnswer(rcode, answer);
-        if (rcode != 0) {
+        if (rcode == CC_REPLY_NO_RECPT) {
+            // This can happen when Zonemgr is not running.  When we support
+            // notification-based membership framework, we should check if it's
+            // supposed to be running and shouldn't even send the command if
+            // not.  Until then, we log this event at the debug level as we
+            // don't know whether it's a real trouble or intentional
+            // configuration.  (Also, when it's done, maybe we should simply
+            // propagate the exception and return SERVFAIL to suppress further
+            // NOTIFY).
+            LOG_DEBUG(auth_logger, DBG_AUTH_DETAIL, AUTH_ZONEMGR_NOTEXIST);
+            return (false);
+        } else if (rcode != CC_REPLY_SUCCESS) {
             LOG_ERROR(auth_logger, AUTH_ZONEMGR_ERROR)
                       .arg(parsed_answer->str());
             return (false);