The IPC protocol
================

While the cc-protocol.txt describes the low-level primitives, here we
describe how the whole IPC should work and how to use it.

Definitions
-----------

system::
  The system that moves data between the users and does bookkeeping.
  In our current implementation, it is implemented as the MsgQ daemon,
  which the users connect to and which routes the data.

user::
  Usually a process; generally an entity that wants to communicate
  with the other users.

session::
  The interface by which a user communicates with the system. A
  single user may have multiple sessions, but a session belongs to a
  single user.

message::
  A data blob sent by one user. The recipient might be the system
  itself, another user or a set of users (possibly empty). A message
  is either a response or an original message (TODO: Better name?).

group::
  A named set of sessions. Conceptually, all possible groups exist;
  there is no explicit creation or deletion of groups.

session id::
  Unique identifier of a session. It is not reused for the whole
  lifetime of the system. Historically called `lname` in the code.

undelivery notification::
  While sending an original message, a user may request an
  undelivery notification. If the recipient specification yields no
  sessions to deliver the message to, the system informs the user
  about the situation.

sequence number::
  Each message sent through the system carries a sequence number. The
  number should be unique per sender. It can be used to pair a
  response with the original message, since the response specifies
  the sequence number of the message it responds to. Responses and
  messages not expecting an answer carry a sequence number too, but
  it is generally unused.

non-blocking operation::
  An operation that completes without waiting for anything.

fast operation::
  An operation that may wait for another process, but only for a very
  short time. Generally, this includes communication between the user
  and the system, but not between two users. It can be expected to be
  fast enough to use inside an interactive session, but may be too
  heavy in the middle of query processing, for example. Every
  non-blocking operation is considered fast.

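
The pairing of responses with original messages by sequence number can
be sketched as follows. This is only an illustration in Python: the
`PendingRequests` class and its method names are hypothetical and are
not part of the real libraries.

```python
import itertools


class PendingRequests:
    """Pairs responses with original messages by sequence number."""

    def __init__(self):
        # Sequence numbers only need to be unique per sender, so a
        # simple per-session counter is enough for this sketch.
        self._next_seq = itertools.count(1)
        self._pending = {}

    def register(self, payload):
        """Assign a sequence number to an outgoing original message."""
        seq = next(self._next_seq)
        self._pending[seq] = payload
        return seq

    def match(self, in_reply_to):
        """Return the original payload a response refers to, or None."""
        return self._pending.pop(in_reply_to, None)


pending = PendingRequests()
seq_a = pending.register({"command": ["ping"]})
seq_b = pending.register({"command": ["shutdown"]})
```

A response carrying `seq_b` as the number it responds to would be
looked up with `pending.match(seq_b)`; a second lookup of the same
number finds nothing, because each original is paired at most once.
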
The session
-----------

The session interface allows for several operations interacting with
the system. In the code, it is represented by a class.

Possible operations include:

Opening a session::
  The session is created and connects to the system. This operation is
  fast. The session receives its session ID from the system.

Group management::
  A user may subscribe to (become a member of) a group, or unsubscribe
  from a group.

Send::
  A user may send a message, addressed to the system or to other
  user(s). This operation is generally expected to be non-blocking
  (but this relies on the assumption of OS buffering and the system
  not being overloaded).

Receive synchronously::
  A user may wait for an incoming message in blocking mode. It is
  possible to specify the kind of message to wait for, either an
  original message or a response to a message. This interface has a
  timeout.

Receive asynchronously::
  Similar to the previous, but non-blocking. It returns immediately.
  The user provides a callback that is invoked when the requested
  message arrives.

Terminate::
  A session may be terminated. No more messages are sent or received
  over it, and the session is automatically unsubscribed from all
  groups. A session is terminated automatically if the user exits.

Assumptions
-----------

We assume reliability and order of delivery. Messages sent from user A
to B are all delivered unchanged in original order as long as B
exists.

All of the above operations are expected to always succeed. If an
error is reported, it should be considered fatal and the user should
exit. In case a user still wants to continue, the session must be
considered terminated and a new one must be created. Care must be
taken not to use any information obtained from the previous session,
since the state in other users and the system may have changed during
the reconnect.

Addressing
----------

Addressing happens in three ways:

By group name::
  The message is routed to all the sessions subscribed to this group.
  It is legal to address an empty group; such a message is then
  delivered to no sessions.

By session ID::
  The message is sent to the single session, if it is still alive.

By an alias::
  A session may have any number of aliases (well-known names). Only a
  single session may hold a given alias (but this is not yet enforced
  by the system). The message is delivered to the one session owning
  the alias, if any. Internally, the aliases are implemented as groups
  with a single subscribed session, so it is the same as the first
  option on the protocol level, but semantically it is different.

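
To make the three addressing modes concrete, here is a toy in-memory
model in Python. It is a sketch, not the MsgQ implementation:
`ToyRouter` and its methods are invented for illustration, it assumes
session IDs and group names never collide, and it enforces the
one-holder rule for aliases even though the text above says the system
does not do so yet.

```python
from collections import defaultdict


class ToyRouter:
    """Resolves a recipient specification to a list of session IDs."""

    def __init__(self):
        self.sessions = set()
        self.groups = defaultdict(set)  # group name -> session IDs

    def open_session(self, session_id):
        self.sessions.add(session_id)

    def subscribe(self, session_id, group):
        self.groups[group].add(session_id)

    def claim_alias(self, session_id, alias):
        # An alias is a group with at most one subscribed session.
        if self.groups[alias] - {session_id}:
            raise ValueError("alias already held: " + alias)
        self.groups[alias].add(session_id)

    def terminate(self, session_id):
        # A terminated session is unsubscribed from all groups.
        self.sessions.discard(session_id)
        for members in self.groups.values():
            members.discard(session_id)

    def resolve(self, recipient):
        if recipient in self.sessions:
            return [recipient]  # addressed by session ID
        # Group or alias; an empty group yields no sessions.
        return sorted(self.groups.get(recipient, ()))


router = ToyRouter()
for sid in ("s1", "s2", "s3"):
    router.open_session(sid)
router.subscribe("s1", "WarCouncil")
router.subscribe("s2", "WarCouncil")
router.claim_alias("s3", "DeepThought")
```

With this setup, `router.resolve("WarCouncil")` yields two sessions,
`router.resolve("DeepThought")` yields one, and an unknown group
resolves to an empty list, which is the case where an undelivery
notification would be generated if requested.
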
The system
----------

The system performs these tasks:

 * Maintains the open sessions and allows creating new ones.
 * Keeps information about groups and which sessions are subscribed to
   which group.
 * Routes the messages between users.

Also, the system itself is a user of the system. It can be reached by
the alias `Msgq` and provides the following high-level services (see
below):

Notifications about sessions::
  When a session is opened to the system or when a session is
  terminated, a notification is sent to interested users. The
  notification contains the session ID of the session in question.
  The termination notification is probably more useful (if a user
  communicated with a given session before, it might be interested to
  learn that the session is no longer available); the opening
  notification is provided mostly for completeness.

Notifications about group subscriptions::
  When a session subscribes to a group or unsubscribes from a group, a
  notification is sent to interested users. The notification contains
  both the session ID of the session subscribing/unsubscribing and
  the name of the group.

Commands to list sessions::
  There's a command to list the session IDs of all currently open
  sessions and a command to list the session IDs of all sessions
  subscribed to a given group. Note that using these lists needs some
  care, as the information might be outdated by the time it is
  delivered to the user.

Note that in the early stages of startup (before the configuration
manager's session is opened), the `Msgq` alias is not yet available.

Higher-level services
---------------------

While the system is able to send any kind of data, the payload sent by
users in bind10 is structured data encoded as JSON. The messages sent
are of three general types:

Command::
  A message sent to a single destination, with the undelivery
  notification turned on and expecting an answer. This is a request
  to perform some operation on the recipient (it can have side effects
  or not). The command is identified by a name and it can have
  parameters. A command with the same name may behave differently (or
  have different parameters) on different receiving users.

Reply::
  An answer to the `Command`. It is sent directly to the session where
  the command originated, does not expect a further answer and does
  not have the undelivery notification set. It either confirms the
  command was run successfully and contains an optional result, or
  notifies the sender of failure to run the command. Success and
  failure differ only in the payload sent through the system, not in
  the way it is sent. The undelivery notification is a failure reply
  sent by the system on behalf of the missing recipient.

Notification::
  A message sent to any number of destinations (e.g. sent to a group),
  not expecting an answer. It notifies other users about an event or
  change of state.

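
The three payload types can be built with a few lines of Python. The
JSON shapes follow the examples in this document; the helper names
(`make_command`, `make_reply`, `make_notification`, `kind_of`) are
made up for this sketch and are not the API of the real libraries.

```python
import json


def _body(first, second):
    # The second element (parameters or result) is optional.
    return [first] if second is None else [first, second]


def make_command(name, params=None):
    return {"command": _body(name, params)}


def make_reply(code, value=None):
    # In the examples of this document, code 0 means success.
    return {"reply": _body(code, value)}


def make_notification(name, params=None):
    return {"notification": _body(name, params)}


def kind_of(payload):
    """Tell which of the three message types a payload is."""
    for kind in ("command", "reply", "notification"):
        if kind in payload:
            return kind
    raise ValueError("not a command, reply or notification")


# Encode a command for the wire and decode it again.
wire = json.dumps(make_command("question",
                               {"what": ["Life", "Universe", "*"]}))
decoded = json.loads(wire)
```

Note that the undelivery notification flag and the recipient are not
part of the JSON payload; they belong to the routing information
handled by the system.
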
Details of the higher-level services
------------------------------------

While there are libraries implementing the communication in a
convenient way, it is useful to know what happens inside.

The notifications are probably the simplest. Users interested in
receiving notifications of some family subscribe to the corresponding
group. Then, a user sends a message to the group. For example, if
users `receiver-A` and `receiver-B` want to receive notifications
about changes to zone data, they'd subscribe to the
`Notifications/ZoneUpdates` group. Then, another user (let's say
`XfrIn`, with session ID `s12345`) would send something like:

  s12345 -> Notifications/ZoneUpdates
  {"notification": ["zone-update", {
      "class": "IN",
      "origin": "example.org.",
      "serial": 123456
  }]}

Both receivers would receive the message and know that the
`example.org` zone is now at version 123456. Note that multiple users
may produce the same kind of notification. Also, a single group may be
used to send multiple notification names (but they should be related;
in our example, the `Notifications/ZoneUpdates` group could be used
for `zone-update`, `zone-available` and `zone-unavailable`
notifications, for a change in zone data, configuration of a new zone
in the system and removal of a zone from configuration, respectively).

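
On the receiving side, a user subscribed to such a group has to look
at the notification name, since one group may carry several of them.
A minimal Python sketch of that dispatch follows; the
`NotificationDispatcher` class and the `record_zone_update` handler
are hypothetical names used only for illustration.

```python
class NotificationDispatcher:
    """Routes notification payloads to handlers by notification name."""

    def __init__(self):
        self._handlers = {}

    def on(self, name, handler):
        self._handlers[name] = handler

    def dispatch(self, payload):
        name, *rest = payload["notification"]
        params = rest[0] if rest else {}
        handler = self._handlers.get(name)
        if handler is None:
            return False  # a name this user does not care about
        handler(params)
        return True


serials = {}


def record_zone_update(params):
    # Remember the latest serial seen for each (origin, class) pair.
    serials[(params["origin"], params["class"])] = params["serial"]


dispatcher = NotificationDispatcher()
dispatcher.on("zone-update", record_zone_update)

handled = dispatcher.dispatch({"notification": ["zone-update", {
    "class": "IN",
    "origin": "example.org.",
    "serial": 123456
}]})
ignored = dispatcher.dispatch({"notification": ["zone-available", {
    "class": "IN",
    "origin": "example.net."
}]})
```
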
Sending a command to a single recipient is slightly more complex. The
sending user sends a message to the receiving one, addressed either by
session ID or by an alias (a group to which at most one session may be
subscribed). The message contains the name of the command and its
parameters. It is sent with the undelivery notification turned on.
The sender also starts a timer (with a reasonably long timeout) and
subscribes to notifications about terminated sessions or
unsubscription from the alias group.

The receiving user gets the message, runs the command and sends a
response back, with the result. The response has the undelivery
notification turned off and it is marked as a response to the message
containing the command. The sending user receives the answer and pairs
it with the command.

There are several things that may go wrong.

 * There might be an error on the receiving user (bad parameters, the
   operation failed, the recipient doesn't know a command of that
   name). The receiving side sends the response as before; the only
   difference is the content of the payload. The sending user is
   notified about it without delay.
 * The recipient user doesn't exist (either the session ID is wrong or
   already terminated, or the alias is empty). The system sends a
   failure response and the sending user knows immediately that the
   command failed.
 * The recipient disconnects while processing the command (possibly
   crashes). The sender gets a notification about the disconnection or
   the unsubscription from the alias group and knows the answer won't
   come.
 * The recipient ``blackholes'' the command. It receives it, but never
   answers. The sender's timer expires. As this is a serious
   programmer error in the recipient and should be rare, the sender
   should at least log an error to report the case.

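
The outcomes listed above can be told apart by looking at what the
sender receives while waiting. The following Python sketch assumes
the payload shapes used in the examples of this document (reply code
`0` for success, `-1` for the failure reply generated by the system,
and a `disconnected` notification carrying an `lname`). The `Outcome`
enum and `classify_outcome` function are invented for illustration,
and unsubscription from an alias group is not handled here.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    REMOTE_ERROR = "error reported by the recipient"
    NO_RECIPIENT = "recipient does not exist"
    DISCONNECTED = "recipient disconnected"
    TIMEOUT = "no answer before the timer expired"


def classify_outcome(recipient, message):
    """Classify what the sender got while waiting for an answer.

    `message` is a reply payload, a notification payload, or None
    when the timer expired. Returns None for unrelated messages, in
    which case the sender keeps waiting.
    """
    if message is None:
        return Outcome.TIMEOUT
    if "reply" in message:
        code = message["reply"][0]
        if code == 0:
            return Outcome.SUCCESS
        if code == -1:  # assumed marker of a system-generated failure
            return Outcome.NO_RECIPIENT
        return Outcome.REMOTE_ERROR
    if "notification" in message:
        body = message["notification"]
        params = body[1] if len(body) > 1 else {}
        if body[0] == "disconnected" and params.get("lname") == recipient:
            return Outcome.DISCONNECTED
    return None
```

In the `TIMEOUT` case the caller would log an error, as recommended
above, since it points to a bug in the recipient.
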
One example would be asking the question of life, the universe and
everything (all the examples assume the sending user is already
subscribed to the notifications):

  s12345 -> DeepThought
  {"command": ["question", {
      "what": ["Life", "Universe", "*"]
  }]}
  s23456 -> s12345
  {"reply": [0, 42]}

Deep Thought was addressed by its alias, but the answer is sent from
its session ID. The `0` in the reply means ``success''.

Another example might be asking for some data at a bureau and getting
an error:

  s12345 -> Bureau
  {"command": ["provide-information", {
      "about": "me",
      "topic": "taxes"
  }]}
  s23456 -> s12345
  {"reply": [1, "You need to fill in other form"]}

And, in this example, the sender is trying to reach a non-existent
session:

  s12345 -> s0
  {"command": ["ping"]}
  msgq -> s12345
  {"reply": [-1, "No such recipient"]}

Last, an example where the other user disconnects while processing the
command:

  s12345 -> s23456
  {"command": ["shutdown"]}
  msgq -> s12345
  {"notification": ["disconnected", {
      "lname": "s23456"
  }]}

The system does not support sending a command to multiple users
directly. It can be accomplished like this:

 * The sending user calls a command on the system to get the list of
   sessions in the given group. This is a command sent to an alias, so
   it can be done as described above.
 * After receiving the list of session IDs, the sending user sends
   multiple copies of the command, one to each of the session IDs.
 * Successes and failures are handled the same as above, since these
   are just single-recipient commands.

So, this would be an example with an unhelpful war council:

  s12345 -> Msgq
  {"command": ["get-subscriptions", {
      "group": "WarCouncil"
  }]}
  msgq -> s12345
  {"reply": [0, ["s1", "s2", "s3"]]}
  s12345 -> s1
  {"command": ["advice", {
      "topic": "Should we attack?"
  }]}
  s12345 -> s2
  {"command": ["advice", {
      "topic": "Should we attack?"
  }]}
  s12345 -> s3
  {"command": ["advice", {
      "topic": "Should we attack?"
  }]}
  s1 -> s12345
  {"reply": [0, true]}
  s2 -> s12345
  {"reply": [0, false]}
  s3 -> s12345

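
The fan-out steps above can be sketched in Python as follows. The
`send_to_group` function is hypothetical, and `call` stands for
whatever single-recipient command helper the user has. To keep the
sketch runnable, `fake_call` plays the role of the system and of the
recipients; it assumes `s3` has terminated after the list was
produced, which illustrates why the list may be outdated by the time
it is used.

```python
def send_to_group(group, command, call):
    """Send `command` to every session subscribed to `group`.

    `call(recipient, payload)` sends one single-recipient command and
    returns the reply payload.
    """
    listing = call("Msgq", {"command": ["get-subscriptions",
                                        {"group": group}]})
    code, members = listing["reply"]
    if code != 0:
        raise RuntimeError("could not list group: %r" % (members,))
    # One copy of the command per session ID; each reply is kept
    # separately, since they are plain single-recipient commands.
    return {sid: call(sid, command) for sid in members}


def fake_call(recipient, payload):
    # Stand-in for the real transport, for demonstration only.
    if recipient == "Msgq":
        return {"reply": [0, ["s1", "s2", "s3"]]}
    answers = {"s1": [0, True], "s2": [0, False]}
    return {"reply": answers.get(recipient, [-1, "No such recipient"])}


replies = send_to_group(
    "WarCouncil",
    {"command": ["advice", {"topic": "Should we attack?"}]},
    fake_call,
)
```
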
Known limitations
-----------------

The IPC is meant mostly as a signalling protocol. Sending millions of
messages or messages of several tens of megabytes is probably a bad
idea. While there's no architectural limit on the number of
transferred messages and the maximum size of a message is 4GB, the
code is not optimised and it would probably be very slow.

We currently expect the system not to be under heavy load. Therefore,
we expect the system to keep up with users sending messages. The
libraries write in blocking mode, which is no problem if the
expectation holds, as the write buffers will generally be empty and
the write won't block. If it turns out not to be the case, we might
need to reconsider.