|
@@ -0,0 +1,382 @@
|
|
|
+The IPC protocol
|
|
|
+================
|
|
|
+
|
|
|
+While the cc-protocol.txt describes the low-level primitives, here we
|
|
|
+describe how the whole IPC should work and how to use it.
|
|
|
+
|
|
|
+Definitions
|
|
|
+-----------
|
|
|
+
|
|
|
+system::
|
|
|
+ The system that moves data between the users and does bookkeeping.
|
|
|
+ In our current implementation, it is implemented as the MsgQ daemon,
|
|
|
+ which the users connect to and it routes the data.
|
|
|
+user::
|
|
|
+ Usually a process; generally an entity that wants to communicate
|
|
|
+ with the other users.
|
|
|
+session::
|
|
|
+ Session is the interface by which the user communicates with the
|
|
|
+ system. Single user may have multiple sessions, a session belongs to
|
|
|
+ single user.
|
|
|
+message::
|
|
|
+ A data blob sent by one user. The recipient might be the system
|
|
|
+ itself, other session or set of sessions (called group, see below,
|
|
|
+ it is possibly empty). Message is either a response or an original
|
|
|
+ message (TODO: Better name?).
|
|
|
+group::
|
|
|
+ A named set of sessions. Conceptually, all the possible groups
|
|
|
+ exist, there's no explicit creation and deletion of groups.
|
|
|
+session id::
|
|
|
+ Unique identifier of a session. It is not reused for the whole
|
|
|
+ lifetime of the system. Historically called `lname` in the code.
|
|
|
+undelivery signal::
|
|
|
+ While sending an original message, a client may request an
|
|
|
+ undelivery signal. If the recipient specification yields no
|
|
|
+ sessions to deliver the message to, the system informs user about
|
|
|
+ the situation.
|
|
|
+sequence number::
|
|
|
+ Each message sent through the system carries a sequence number. The
|
|
|
+ number should be unique per sender. It can be used to pair a
|
|
|
+ response to the original message, since the response specifies which
|
|
|
+ sequence number had the message it response to. Even responses and
|
|
|
+ messages not expecting answer have their sequence number, but it is
|
|
|
+ generally unused.
|
|
|
+non-blocking operation::
|
|
|
+ Operation that will complete without waiting for anything.
|
|
|
+fast operation::
|
|
|
+ Operation that may wait for other process, but only for a very short
|
|
|
+ time. Generally, this includes communication between the user and
|
|
|
+ system, but not between two clients. It can be expected to be fast
|
|
|
+ enough to use this inside an interactive session, but may be too
|
|
|
+ heavy in the middle of query processing, for example. Every
|
|
|
+ non-blocking operation is considered fast.
|
|
|
+
|
|
|
+The session
|
|
|
+-----------
|
|
|
+
|
|
|
+The session interface allows for several operations interacting with
|
|
|
+the system. In the code, it is represented by a class.
|
|
|
+
|
|
|
+Possible operations include:
|
|
|
+
|
|
|
+Opening a session::
|
|
|
+ The session is created and connects to the system. This operation is
|
|
|
+ fast. The session receives session id from the system.
|
|
|
+
|
|
|
+Group management::
|
|
|
+ A user may subscribe (become member) of a group, or unsubscribe from
|
|
|
+ a group. These are fast operations.
|
|
|
+
|
|
|
+Send::
|
|
|
+ A user may send a message, addressed to the system, or other
|
|
|
+ session(s). This operation is expected to be non-blocking
|
|
|
+ (current implementation is based on assumption of how OS handles the
|
|
|
+ sends, which may need to be revisited if it turns out to be false).
|
|
|
+
|
|
|
+Receive synchronously::
|
|
|
+ User may wait for an incoming message in blocking mode. It is
|
|
|
+ possible to specify the kind of message to wait for, either original
|
|
|
+ message or response to a message. This interface has a timeout.
|
|
|
+
|
|
|
+Receive asynchronously::
|
|
|
+ Similar to previous, but non-blocking. It terminates immediately.
|
|
|
+ The user provides a callback that is invoked when the requested
|
|
|
+ message arrives.
|
|
|
+
|
|
|
+Terminate::
|
|
|
+ A session may be terminated. No more messages are sent or received
|
|
|
+ over it, the session is automatically unsubscribed from all the
|
|
|
+ groups. This operation is non-blocking. A session is terminated
|
|
|
+ automatically if the user exits.
|
|
|
+
|
|
|
+Assumptions
|
|
|
+-----------
|
|
|
+
|
|
|
+We assume reliability and order of delivery. Messages sent from user A
|
|
|
+to B are all delivered unchanged in original order as long as B
|
|
|
+exists.
|
|
|
+
|
|
|
+All above operations are expected to always succeed. If there's an
|
|
|
+error reported, it should be considered fatal and user should
|
|
|
+exit. In case a user still wants to continue, the session must be
|
|
|
+considered terminated and a new one must be created. Care must be
|
|
|
+taken not to use any information obtained from the previous session,
|
|
|
+since the state in other users and the system may have changed during
|
|
|
+the reconnect.
|
|
|
+
|
|
|
+Addressing
|
|
|
+----------
|
|
|
+
|
|
|
+Addressing happens in three ways:
|
|
|
+
|
|
|
+By group name::
|
|
|
+ The message is routed to all the sessions subscribed to this group.
|
|
|
+ It is legal to address an empty group; such message is then
|
|
|
+ delivered to no sessions.
|
|
|
+By session ID::
|
|
|
+ The message is sent to the single session, if it is still alive.
|
|
|
+By an alias::
|
|
|
+ A session may have any number of aliases - well known names. Only
|
|
|
+ single session may hold given alias (but it is not yet enforced by
|
|
|
+ the system). The message is delivered to the one session owning the
|
|
|
+ alias, if any. Internally, the aliases are implemented as groups
|
|
|
+ with single subscribed session, so it is the same as the first
|
|
|
+ option on the protocol level, but semantically it is different.
|
|
|
+
|
|
|
+The system
|
|
|
+----------
|
|
|
+
|
|
|
+The system performs these goals:
|
|
|
+
|
|
|
+ * Maintains the open sessions and allows creating new ones.
|
|
|
+ * Keeps information about groups and which sessions are subscribed to
|
|
|
+ which group.
|
|
|
+ * Routes the messages between users.
|
|
|
+
|
|
|
+Also, the system itself is a user of the system. It can be reached by
|
|
|
+the alias `Msgq` and provides following high-level services (see
|
|
|
+below):
|
|
|
+
|
|
|
+Notifications about sessions::
|
|
|
+ When a session is opened to the system or when a session is
|
|
|
+ terminated, a notification is sent to interested users. The
|
|
|
+ notification contains the session ID of the session in question.
|
|
|
+ The termination notification is probably more useful (if a user
|
|
|
+ communicated with a given session before, it might be interested it
|
|
|
+ is no longer available), the opening notification is provided mostly
|
|
|
+ for completeness.
|
|
|
+Notifications about group subscriptions::
|
|
|
+ When a session subscribes to a group or unsubscribes from a group, a
|
|
|
+ notification is sent to interested users. The notification contains
|
|
|
+ both the session ID of the session subscribing/unsubscribing and
|
|
|
+ name of the group. This includes notifications about aliases (since
|
|
|
+ aliases are groups internally).
|
|
|
+Commands to list sessions::
|
|
|
+ There's a command to list session IDs of all currently opened sessions
|
|
|
+ and a command to list session IDs of all sessions subscribed to a
|
|
|
+ given group. Note that using these lists might need some care, as
|
|
|
+ the information might be outdated at the time it is delivered to the
|
|
|
+ user.
|
|
|
+
|
|
|
+User shows interest in notifications about sessions and group
|
|
|
+subscriptions by subscribing to a group with well-known name (as with
|
|
|
+any notification).
|
|
|
+
|
|
|
+Note that due to implementation details, the `Msgq` alias is not yet
|
|
|
+available during early stage of the bootstrap of bind10 system. This
|
|
|
+means some very core services can't rely on the above services of the
|
|
|
+system. The alias is guaranteed to be working before the first
|
|
|
+non-core module is started.
|
|
|
+
|
|
|
+Higher-level services
|
|
|
+---------------------
|
|
|
+
|
|
|
+While the system is able to send any kind of data, the payload sent by
|
|
|
+users in bind10 is structured data encoded as JSON. The messages sent
|
|
|
+are of three general types:
|
|
|
+
|
|
|
+Command::
|
|
|
+ A message sent to single destination, with the undeliverable
|
|
|
+ signal turned on and expecting an answer. This is a request
|
|
|
+ to perform some operation on the recipient (it can have side effects
|
|
|
+ or not). The command is identified by a name and it can have
|
|
|
+ parameters. A command with the same name may behave differently (or
|
|
|
+ have different parameters) on different receiving users.
|
|
|
+Reply::
|
|
|
+ An answer to the `Command`. It is sent directly to the session where
|
|
|
+ the command originated from, does not expect further answer and the
|
|
|
+ undeliverable notification is not set. It either confirms the
|
|
|
+ command was run successfully and contains an optional result, or
|
|
|
+ notifies the sender of failure to run the command. Success and
|
|
|
+ failure differ only in the payload sent through the system, not in
|
|
|
+ the way it is sent. The undeliverable signal is failure
|
|
|
+ reply sent by the system on behalf of the missing recipient.
|
|
|
+Notification::
|
|
|
+ A message sent to any number of destinations (eg. sent to a group),
|
|
|
+ not expecting an answer. It notifies other users about an event or
|
|
|
+ change of state.
|
|
|
+
|
|
|
+Details of the higher-level
|
|
|
+---------------------------
|
|
|
+
|
|
|
+While there are libraries implementing the communication in convenient
|
|
|
+way, it is useful to know what happens inside.
|
|
|
+
|
|
|
+The notifications are probably the simplest. Users interested in
|
|
|
+receiving notifications of some family subscribe to corresponding
|
|
|
+group. Then, a client sends a message to the group. For example, if
|
|
|
+clients `receiver-A` and `receiver-B` want to receive notifications
|
|
|
+about changes to zone data, they'd subscribe to the
|
|
|
+`Notifications/ZoneUpdates` group. Then, other client (let's say
|
|
|
+`XfrIn`, with session ID `s12345`) would send something like:
|
|
|
+
|
|
|
+ s12345 -> Notifications/ZoneUpdates
|
|
|
+ {"notification": ["zone-update", {
|
|
|
+ "class": "IN",
|
|
|
+ "origin": "example.org.",
|
|
|
+ "serial": 123456
|
|
|
+ }]}
|
|
|
+
|
|
|
+Both receivers would receive the message and know that the
|
|
|
+`example.org` zone is now at version 123456. Note that multiple users
|
|
|
+may produce the same kind of notification. Also, single group may be
|
|
|
+used to send multiple notification names (but they should be related;
|
|
|
+in our example, the `Notifications/ZoneUpdates` could be used for
|
|
|
+`zone-update`, `zone-available` and `zone-unavailable` notifications
|
|
|
+for change in zone data, configuration of new zone in the system and
|
|
|
+removal of a zone from configuration).
|
|
|
+
|
|
|
+Sending a command to single recipient is slightly more complex. The
|
|
|
+sending user sends a message to the receiving one, addressed either by
|
|
|
+session ID or by an alias (group to which at most one session may be
|
|
|
+subscribed). The message contains the name of the command and
|
|
|
+parameters. It is sent with the undeliverable signals turned on.
|
|
|
+The user also starts a timer (with reasonably long timeout). The
|
|
|
+sender also subscribes to notifications about terminated sessions or
|
|
|
+unsubscription from the alias group.
|
|
|
+
|
|
|
+The receiving user gets the message, runs the command and sends a
|
|
|
+response back, with the result. The response has the undeliverable
|
|
|
+signal turned off and it is marked as response to the message
|
|
|
+containing the command. The sending user receives the answer and pairs
|
|
|
+it with the command.
|
|
|
+
|
|
|
+There are several things that may go wrong.
|
|
|
+
|
|
|
+* There might be an error on the receiving user (bad parameters, the
|
|
|
+ operation failed, the recipient doesn't know command of that name).
|
|
|
+ The receiving side sends the response as previous, the only
|
|
|
+ difference is the content of the payload. The sending user is
|
|
|
+ notified about it, without delays.
|
|
|
+* The recipient user doesn't exist (either the session ID is wrong or
|
|
|
+ terminated already, or the alias is empty). The system sends a
|
|
|
+ failure response and the sending user knows immediately the command
|
|
|
+ failed.
|
|
|
+* The recipient disconnects while processing the command (possibly
|
|
|
+ crashes). The sender gets a notification about disconnection or
|
|
|
+ unsubscription from the alias group and knows the answer won't come.
|
|
|
+* The recipient ``blackholes'' the command. It receives it, but never
|
|
|
+ answers. The timeout in sender times out. As this is a serious
|
|
|
+ programmer error in the recipient and should be rare, the sender
|
|
|
+ should at least log an error to notify about the case.
|
|
|
+
|
|
|
+One example would be asking the question of life, universe and
|
|
|
+everything (all the examples assume the sending user is already
|
|
|
+subscribed to the notifications):
|
|
|
+
|
|
|
+ s12345 -> DeepThought
|
|
|
+ {"command": ["question", {
|
|
|
+ "what": ["Life", "Universe", "*"]
|
|
|
+ }]}
|
|
|
+ s23456 -> s12345
|
|
|
+ {"reply": [0, 42]}
|
|
|
+
|
|
|
+The deep thought had an alias. But the answer is sent from its session
|
|
|
+ID. The `0` in the reply means ``success''.
|
|
|
+
|
|
|
+Another example might be asking for some data at a bureau and getting
|
|
|
+an error:
|
|
|
+
|
|
|
+ s12345 -> Burreau
|
|
|
+ {"command": ["provide-information", {
|
|
|
+ "about": "me",
|
|
|
+ "topic": "taxes"
|
|
|
+ }]}
|
|
|
+ s23456 -> s12345
|
|
|
+ {"reply": [1, "You need to fill in other form"]}
|
|
|
+
|
|
|
+And, in this example, the sender is trying to reach an non-existent
|
|
|
+session. The `msgq` here is not the alias `Msgq`, but a special
|
|
|
+``phantom'' session ID that is not listed anywhere.
|
|
|
+
|
|
|
+ s12345 -> s0
|
|
|
+ {"command": ["ping"]}
|
|
|
+ msgq -> s12345
|
|
|
+ {"reply": [-1, "No such recipient"]}
|
|
|
+
|
|
|
+Last, an example when the other user disconnects while processing the
|
|
|
+command.
|
|
|
+
|
|
|
+ s12345 -> s23456
|
|
|
+ {"command": ["shutdown"]}
|
|
|
+ msgq -> s12345
|
|
|
+ {"notification": ["disconnected", {
|
|
|
+ "lname": "s23456"
|
|
|
+ }]}
|
|
|
+
|
|
|
+The system does not support sending a command to multiple users
|
|
|
+directly. It can be accomplished as this:
|
|
|
+
|
|
|
+* The sending user calls a command on the system to get list of
|
|
|
+ sessions in given group. This is command to alias, so it can be done
|
|
|
+ by the previous way.
|
|
|
+* After receiving the list of session IDs, multiple copies of the
|
|
|
+ command are sent by the sending user, one to each of the session
|
|
|
+ IDs.
|
|
|
+* Successes and failures are handled the same as above, since these
|
|
|
+ are just single-recipient commands.
|
|
|
+
|
|
|
+So, this would be an example with unhelpful war council.
|
|
|
+
|
|
|
+ s12345 -> Msgq
|
|
|
+ {"command": ["get-subscriptions", {
|
|
|
+ "group": "WarCouncil"
|
|
|
+ }]}
|
|
|
+ msgq -> s12345
|
|
|
+ {"reply": [0, ["s1", "s2", "s3"]]}
|
|
|
+ s12345 -> s1
|
|
|
+ {"command": ["advice", {
|
|
|
+ "topic": "Should we attack?"
|
|
|
+ }]}
|
|
|
+ s12345 -> s2
|
|
|
+ {"command": ["advice", {
|
|
|
+ "topic": "Should we attack?"
|
|
|
+ }]}
|
|
|
+ s12345 -> s3
|
|
|
+ {"command": ["advice", {
|
|
|
+ "topic": "Should we attack?"
|
|
|
+ }]}
|
|
|
+ s1 -> s12345
|
|
|
+ {"reply": [0, true]}
|
|
|
+ s2 -> s12345
|
|
|
+ {"reply": [0, false]}
|
|
|
+ s3 -> s12345
|
|
|
+ {"reply": [1, "Advice feature not implemented"]}
|
|
|
+
|
|
|
+Users
|
|
|
+-----
|
|
|
+
|
|
|
+While there's a lot of flexibility for the behaviour of a user, it
|
|
|
+usually comes to something like this (during the lifetime of the
|
|
|
+user):
|
|
|
+
|
|
|
+* The user starts up.
|
|
|
+* Then it creates one or more sessions (there may be technical reasons
|
|
|
+ to have more than one session, such as threads, but it is not
|
|
|
+ required by the system).
|
|
|
+* It subscribes to some groups to receive notifications in future.
|
|
|
+* It binds to some aliases if it wants to be reachable by others by a
|
|
|
+ nice name.
|
|
|
+* It invokes some start-up commands (to get the configuration, for
|
|
|
+ example).
|
|
|
+* During the lifetime, it listens for notifications and answers
|
|
|
+ commands. It also invokes remote commands and sends notifications
|
|
|
+ about things that are happening.
|
|
|
+* Eventually, the user terminates, closing all the sessions it had
|
|
|
+ opened.
|
|
|
+
|
|
|
+Known limitations
|
|
|
+-----------------
|
|
|
+
|
|
|
+It is meant mostly as signalling protocol. Sending millions of
|
|
|
+messages or messages of several tens of megabytes is probably a bad
|
|
|
+idea. While there's no architectural limitation with regards of the
|
|
|
+number of transferred messages and the maximum size of message is 4GB,
|
|
|
+the code is not optimised and it would probably be very slow.
|
|
|
+
|
|
|
+We currently expect the system not to be at heavy load. Therefore, we
|
|
|
+expect the system to keep up with users sending messages. The
|
|
|
+libraries write in blocking mode, which is no problem if the
|
|
|
+expectation is true, as the write buffers will generally be empty and
|
|
|
+the write wouldn't block, but if it turns out it is not the case, we
|
|
|
+might need to reconsider.
|