The IPC protocol
================

While cc-protocol.txt describes the low-level primitives, here we
describe how the whole IPC should work and how to use it.

Assumptions
-----------

We assume the low-level protocol keeps the ordering of messages. That
is, if A sends messages 1 and 2 to B, they get delivered in the same
order as they were sent. However, if A sends message 1 to B and 2 to
C, the order in which B and C get them, or the order in which they
answer, is not defined.

We also assume that the delivery is reliable. If B gets a message from
A, it can be sure that all previous messages were delivered too. If A
sends a message to B, either B gets the message, or A or B is
disconnected during the attempt.

Also, we expect that messages don't get damaged or modified on their
way.

On an unrecoverable error, the client should abort completely. Errors
like EINTR or a short read/write are recoverable, since there is a
clear way to continue without losing any messages; errors like a
connection reset are unrecoverable. If the client deems it better to
reconnect, it must assume anything might have happened in the meantime
and start communication from scratch, discarding any knowledge
gathered from the previous connection (configuration, addresses of
other clients, etc).

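
The split between recoverable and unrecoverable errors could look
roughly like the following read helper. This is only a sketch, not
the code used by the libraries: the names `recv_exact` and
`Unrecoverable` are made up for illustration, and a blocking stream
socket is assumed.

```python
import socket


class Unrecoverable(Exception):
    """The connection is lost; all cached state must be discarded."""


def recv_exact(sock, length):
    """Read exactly `length` bytes, continuing after short reads."""
    chunks = []
    remaining = length
    while remaining > 0:
        try:
            chunk = sock.recv(remaining)
        except InterruptedError:
            # EINTR: nothing was lost, simply try again.
            continue
        except OSError as error:
            # Connection reset and similar: no safe way to continue.
            raise Unrecoverable(str(error)) from error
        if not chunk:
            # The peer closed the connection in the middle of a message.
            raise Unrecoverable("connection closed")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller that catches `Unrecoverable` would either abort or reconnect
and rebuild everything it knew from scratch.
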
Addressing
----------

We can specify the recipient in two different ways:

* Directly. Each connected client has a unique address. A message
  addressed to that address is sent only to that one client.
* By a group. A client might subscribe to any number of groups.
  When a message is sent to the group, all clients subscribed to the
  group receive it. It is legal to send to an empty group.

[NOTE]
If a group may contain multiple recipients, sending a message that
expects an answer to the group is discouraged, because it is not known
how many answers are to come. See below for details on one-to-many
communication.

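
As an illustration of the two addressing modes, here is a toy
in-memory model of how recipients might be resolved. It is not the
daemon's implementation; `ToyRouter` and its methods are hypothetical
names used only for this sketch.

```python
class ToyRouter:
    """In-memory stand-in for the daemon's routing table (hypothetical)."""

    def __init__(self):
        self.clients = set()   # unique addresses of connected clients
        self.groups = {}       # group name -> set of subscribed addresses

    def connect(self, address):
        self.clients.add(address)

    def subscribe(self, address, group):
        self.groups.setdefault(group, set()).add(address)

    def recipients(self, name, group=False):
        """Resolve a direct address or a group name to a set of clients."""
        if group:
            # Sending to an empty or unknown group is legal: nobody gets it.
            return set(self.groups.get(name, set()))
        # A direct address reaches exactly one client, if it is connected.
        return {name} if name in self.clients else set()
```
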
Feedback from the IPC system
----------------------------

The IPC system generates some additional information to aid the
communicating clients.

Undeliverable notification::
  If the client requests it (by a per-message flag) and the set of
  recipients specified is empty (either because the connection
  ID/lname is not connected or because the addressed group is empty),
  the daemon sends an answer message to notify the sender about the
  situation. However, since a recipient that does exist can still
  take a long time to answer, clients that need high availability
  should not wait for the answer in a blocking way.

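
The rule for the undeliverable notification can be sketched as a small
function. This is an assumption-laden illustration: the function name
`route` and the shape of the error payload are invented here and do
not describe the real wire format.

```python
def route(sender, payload, recipients, want_answer=False):
    """Return the (destination, payload) pairs the daemon would emit."""
    if recipients:
        return [(dest, payload) for dest in sorted(recipients)]
    if want_answer:
        # Nobody to deliver to: tell the sender instead of staying silent.
        # The error format below is hypothetical.
        return [(sender, {"error": "undeliverable", "original": payload})]
    # Empty recipient set without the flag: the message is simply dropped.
    return []
```
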
Notifications about connections and disconnections::
  The system generates notifications about the following events:
  * Client connected (sent with the lname of the client)
  * Client disconnected (sent with the lname of the client)
  * Client subscribed (sent with the name of the group and the lname
    of the client)
  * Client unsubscribed (sent with the name of the group and the lname
    of the client)

List of group members::
  The daemon provides a command to list the lnames of clients
  subscribed to a given group, and the lnames of all connections.

Communication paradigms
-----------------------

Event notifications
~~~~~~~~~~~~~~~~~~~

Sometimes, an event that may be interesting to other parts of the
system happens. The originating module may not know which other
modules are interested in that kind of event, nor whether any module
at all wants to know about it. With such an event, the originating
module does not need any feedback.

For each kind or family of notifications, there's a group. Everybody
interested in that family of notifications subscribes to the group.
When the event happens, it is sent (broadcast) to the group, without
requiring an answer.

[NOTE]
Care should be taken to avoid race conditions. Imagine one module
provides some kind of state (let's say it's the configuration manager
and the configuration is the shared state). The other modules use
notifications to update their copy when the configuration changes
(e.g. when the configuration changes, the configuration manager sends
a notification with a description of the change).

The correct order is to first subscribe to the notifications and then
request the whole configuration. If it was done the other way around,
there would be a short time between the request and the subscription
when an update to the state could happen without the module noticing.
When subscribing first, a notification could come before the initial
version is known, or arrive even though the initial version already
includes the change, but both cases are possible to handle, while a
missed update is not.

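
One possible way to handle the two harmless cases is sketched below.
It rests on an assumption that this document does not state: that
every notification and the full configuration both carry a
monotonically increasing version number. The class `ConfigMirror` and
its method names are hypothetical.

```python
class ConfigMirror:
    """Local copy of shared state kept current by notifications."""

    def __init__(self):
        self.version = None    # None until the initial snapshot arrives
        self.data = {}
        self._early = []       # updates received before the snapshot

    def on_notification(self, version, key, value):
        if self.version is None:
            # Snapshot not known yet: remember the update for later.
            self._early.append((version, key, value))
        elif version > self.version:
            self.data[key] = value
            self.version = version
        # Otherwise the snapshot already includes this change; ignore it.

    def on_snapshot(self, version, data):
        self.version = version
        self.data = dict(data)
        # Replay buffered updates in arrival order (ordering is assumed
        # to be preserved by the low-level protocol).
        early, self._early = self._early, []
        for update in early:
            self.on_notification(*update)
```

The module would subscribe, then request the configuration, feeding
notifications to `on_notification` and the reply to `on_snapshot`.
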
One-to-one RPC call
~~~~~~~~~~~~~~~~~~~

Sometimes, a process needs to call a remote function (or command) in
another process. Examples are asking the configuration manager for the
current configuration or asking it to change it, asking a single
process to terminate, etc.

The recipient may be a singleton group (e.g. the command manager;
there must be exactly one in a running system, and the group is used
just as a stable name for the process) or an lname received by means
of other communication (like a previous subscribe notification).

A command message (containing the parameters, name of the command,
etc) is sent, with the want-answer flag set. The other side processes
the command and sends a result or error back.

If the recipient does not exist, the daemon sends an error right away.
There are still two ways this may fail to provide an answer:

* The receiving module reads the command, but does not provide an
  answer. Clearly, such a module is broken. There should be some
  (long) timeout for this situation, and loud logging to get it fixed.
* The receiving module terminated at the exact time when the daemon
  tried to send to it, or crashed while handling the command.
  Therefore the sender listens for disconnect or unsubscription
  notifications (depending on whether the command was sent by lname or
  by group name), and if the recipient disconnects, the sender knows
  it should not expect the answer any more.

Waiting for the answer asynchronously is preferred.

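
The bookkeeping on the sender's side might be sketched as follows.
This is only an illustration under assumptions: `PendingCalls`, the
sequence numbers used to match answers, and the callback signature are
all hypothetical and not part of any existing library.

```python
import logging
import time


class PendingCalls:
    """Tracks commands sent with the want-answer flag (hypothetical)."""

    def __init__(self, timeout=60.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._pending = {}  # seq -> (recipient, deadline, callback)

    def sent(self, seq, recipient, callback):
        deadline = self._clock() + self._timeout
        self._pending[seq] = (recipient, deadline, callback)

    def on_answer(self, seq, result):
        entry = self._pending.pop(seq, None)
        if entry is not None:
            entry[2]("answer", result)

    def on_gone(self, recipient):
        """Disconnect or unsubscription notification for `recipient`."""
        gone = [s for s, e in self._pending.items() if e[0] == recipient]
        for seq in gone:
            self._pending.pop(seq)[2]("gone", None)

    def check_timeouts(self):
        now = self._clock()
        late = [s for s, e in self._pending.items() if e[1] <= now]
        for seq in late:
            recipient, _, callback = self._pending.pop(seq)
            # A module that reads a command but never answers is broken.
            logging.error("no answer from %s to command %s", recipient, seq)
            callback("timeout", None)
```

The callbacks let the caller continue with other work while waiting,
with `check_timeouts` driven from the event loop of the application.
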
One-to-many RPC call
~~~~~~~~~~~~~~~~~~~~

Sometimes it is necessary to send a command to a bunch of modules at
once, usually all members of a group that can contain any number of
clients.

This is done by requesting the members of the group from the daemon
and then sending a one-to-one RPC call to each of them, tracking them
separately.

[NOTE]
It might happen that the list of group members changes between the
time it was requested and the time the commands are sent. If a client
gets disconnected, the sender gets an undeliverable error back from
the daemon. If anything else happens (a client unsubscribes, connects,
or subscribes), that client must explicitly synchronise to the state
anyway, because we could have sent the commands before the change
actually happened and it would look the same to the client.

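
The fan-out could be sketched like this. The sketch is simplified on
purpose: `list_members` and `send_command` are placeholders for the
daemon's member-list command and a one-to-one RPC call, and the
undeliverable error is modelled as an immediate `False` return even
though in practice it would arrive as a separate message.

```python
def call_group(list_members, send_command, group, command):
    """Send `command` to every current member of `group` individually.

    Returns the set of lnames from which an answer is still expected.
    Both callables are hypothetical stand-ins, see the text above.
    """
    waiting = set()
    for lname in list_members(group):
        if send_command(lname, command):
            waiting.add(lname)
        # An undeliverable recipient disconnected after the list was
        # made; no answer will come from it, so it is not tracked.
    return waiting
```

Each lname in the returned set is then tracked like any other
one-to-one call, including timeouts and disconnect notifications.
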
[WARNING]
It would look better to first request the list of group members and
then send the command to the group, using the list only to track the
answers. But that is prone to race conditions: if there is any change
between the request for the member list and sending the command, the
actual recipients don't match the list, and the sender could get more
answers than expected or could wait for an answer from a module that
no longer exists.

Known limitations
-----------------

The IPC is meant mostly as a signalling protocol. Sending millions of
messages or messages of several tens of megabytes is probably a bad
idea. While there's no architectural limitation with regard to the
number of transferred messages or their sizes, the code is not
optimised and it would probably be very slow.

We currently expect the system not to be under heavy load. Therefore,
we expect the daemon to keep up with clients sending messages. The
libraries write in blocking mode, which is no problem if the
expectation holds, as the write buffers will generally be empty and
the write won't block. If it turns out not to be the case, we might
need to reconsider.
  147. expectation is true, as the write buffers will generally be empty and
  148. the write wouldn't block, but if it turns out it is not the case, we
  149. might need to reconsider.