  1. protocol version 0x536b616e
  2. DATA 0x01
  3. HASH 0x02
  4. LIST 0x03
  5. NULL 0x04
  6. TYPE_MASK 0x0f
  7. LENGTH_32 0x00
  8. LENGTH_16 0x10
  9. LENGTH_8 0x20
  10. LENGTH_MASK 0xf0
  11. MESSAGE ENCODING
  12. ----------------
  13. When decoding, the entire message length must be known. If this is
  14. transmitted over a raw stream such as TCP, this is usually encoded
  15. with a 4-byte length followed by the message itself. If some other
  16. wrapping is used (say as part of a different message structure) the
  17. length of the message must be preserved and included for decoding.
  18. The first 4 bytes of the message are the protocol version encoded
  19. directly as a 4-byte value. Immediately following this is a HASH
  20. element. The length of the hash element is the remainder of the
  21. message after subtracting 4 bytes for the protocol version.
  22. This initial HASH is intended to be used by the message routing system
  23. if one is in use.
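The framing described above can be sketched in Ruby as follows. This is a minimal illustration, not the reference implementation: the method names frame_message and unframe_message are hypothetical, and the sketch assumes the 4-byte length prefix counts the protocol version plus the hash body but not the prefix itself, which is how the example later in this document counts it.

```ruby
# Hypothetical helper names; only the byte layout comes from the text above.
PROTOCOL_VERSION = 0x536b616e

# Frame an already-encoded top-level HASH body for a raw stream such as TCP:
# 4-byte length, 4-byte protocol version, then the hash body.
def frame_message(hash_body)
  message = [PROTOCOL_VERSION].pack("N") + hash_body.b
  [message.bytesize].pack("N") + message
end

# Recover the top-level HASH body from one framed message.
def unframe_message(wire)
  wire = wire.b
  raise ArgumentError, "short frame" if wire.bytesize < 8
  length = wire.byteslice(0, 4).unpack1("N")
  raise ArgumentError, "length too small for a version" if length < 4
  raise ArgumentError, "truncated message" if wire.bytesize < 4 + length
  version = wire.byteslice(4, 4).unpack1("N")
  raise ArgumentError, "unexpected protocol version" unless version == PROTOCOL_VERSION
  # Everything after the version is the top-level HASH data area.
  wire.byteslice(8, length - 4)
end
```

Both integers use the "N" pack directive (32-bit, big-endian), which matches the network byte order this protocol uses for binary integers.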
  24. ITEM TYPES
  25. ----------
  26. There are four basic types encoded in this protocol: a simple data
  27. blob (DATA), a tag-value series (HASH), an ordered list (LIST), and
  28. a NULL type (which is used internally to encode DATA types which are
  29. empty and can be used to indicate existence without data in a hash).
  30. Each item can be of any type, so a hash of hashes and hashes of lists
  31. are typical.
  32. All multi-byte integers which are encoded in binary are in network
  33. byte order.
  34. ITEM ENCODING
  35. -------------
  36. Each item is preceded by a single byte which describes that item.
  37. This byte contains the item type and item length encoding:
  38. Thing             Length    Description
  39. ----------------  --------  ------------------------------------
  40. TyLen             1 byte    Item type and length encoding
  41. Length            variable  Item data blob length
  42. Item Data         variable  Item data blob
  43. The TyLen field combines the item's data type with the encoding used
  44. for the item's length. The length bytes are encoded depending on the
  45. length of the data portion, and the smallest length encoding that can
  46. hold it should be used. Note that this length compression is used just
  47. for data compactness. It is wasteful to encode the most common length
  48. (8-bit length) as 4 bytes, so this method allows one byte to be used
  49. rather than 4, three of which are nearly always zero.
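As an illustration of the TyLen scheme, the Ruby sketch below builds and parses an item header. The constant values are the ones listed at the top of this document; the constant names with the ITEM_ prefix and the method names encode_header and decode_header are hypothetical. It assumes LENGTH_8, LENGTH_16, and LENGTH_32 select 1, 2, and 4 length bytes respectively, and it does not cover NULL items.

```ruby
ITEM_DATA   = 0x01
ITEM_HASH   = 0x02
ITEM_LIST   = 0x03
TYPE_MASK   = 0x0f
LENGTH_32   = 0x00
LENGTH_16   = 0x10
LENGTH_8    = 0x20
LENGTH_MASK = 0xf0

# Build the TyLen byte followed by the smallest length field that fits.
def encode_header(type, length)
  raise ArgumentError, "negative length" if length.negative?
  if length <= 0xff
    [type | LENGTH_8, length].pack("CC")
  elsif length <= 0xffff
    [type | LENGTH_16, length].pack("Cn")
  elsif length <= 0xffff_ffff
    [type | LENGTH_32, length].pack("CN")
  else
    raise ArgumentError, "length does not fit in 32 bits"
  end
end

# Parse a header; returns [type, data_length, header_size_in_bytes].
def decode_header(bytes)
  bytes = bytes.b
  tylen = bytes.getbyte(0)
  raise ArgumentError, "empty input" if tylen.nil?
  width, fmt =
    case tylen & LENGTH_MASK
    when LENGTH_8  then [1, "C"]
    when LENGTH_16 then [2, "n"]
    when LENGTH_32 then [4, "N"]
    else raise ArgumentError, "unknown length encoding"
    end
  raise ArgumentError, "truncated length" if bytes.bytesize < 1 + width
  [tylen & TYPE_MASK, bytes.byteslice(1, width).unpack1(fmt), 1 + width]
end
```

For an 11-byte DATA item this yields the two bytes 0x21 0x0b, so the common short case costs one length byte instead of four.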
  50. HASH
  51. ----
  52. This is a tag/value pair where each tag is an opaque unique blob and
  53. the data elements are of any type. Hashes are not encoded in any
  54. specific tag or item order.
  55. The length of the HASH's data area is processed for tag/value pairs
  56. until the entire area is consumed. Running out of data prematurely
  57. indicates an incorrectly encoded message.
  58. The data area consists of repeated items:
  59. Thing             Length    Description
  60. ----------------  --------  ------------------------------------
  61. Tag Length        1 byte    The length of the tag
  62. Tag               variable  The tag name
  63. Item              variable  Encoded item
  64. The Tag Length field is always one byte, which limits the tag name to
  65. 255 bytes maximum. A tag length of zero is invalid.
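The tag rules can be sketched in Ruby like this. The method names append_hash_pair and read_tag are hypothetical, and the item passed in is assumed to be already encoded (TyLen byte, length bytes, data).

```ruby
# Append one tag/value pair to a HASH data area (a mutable binary string).
def append_hash_pair(area, tag, encoded_item)
  tag = tag.b
  unless (1..255).cover?(tag.bytesize)
    raise ArgumentError, "tag must be 1 to 255 bytes, got #{tag.bytesize}"
  end
  area << [tag.bytesize].pack("C") << tag << encoded_item.b
end

# Read the tag that starts at byte offset `pos`.
# Returns [tag, offset of the item that follows it].
def read_tag(area, pos)
  tag_length = area.getbyte(pos)
  raise ArgumentError, "ran out of data" if tag_length.nil?
  raise ArgumentError, "zero-length tag" if tag_length.zero?
  if pos + 1 + tag_length > area.bytesize
    raise ArgumentError, "tag runs past the data area"
  end
  [area.byteslice(pos + 1, tag_length), pos + 1 + tag_length]
end
```

A caller would start with an empty binary string such as "".b, append each pair, and then wrap the finished area in a HASH item header.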
  66. LIST
  67. ----
  68. A LIST is a list of items encoded and decoded in a specific order.
  69. The order is chosen entirely by the source during encoding.
  70. The length of the LIST's data is consumed by the ITEMs it contains.
  71. Running out of room prematurely indicates an incorrectly encoded
  72. message.
  73. The data area consists of repeated items:
  74. Thing           Length    Description
  75. --------------  --------  ----------------------------------------
  76. Item            variable  Encoded item
  77. DATA
  78. ----
  79. A DATA item is a simple blob of data. No further processing of this
  80. data is performed by this protocol on these elements.
  81. The data blob is the entire data area. The data area can be 0 or more
  82. bytes long.
  83. It is typical to encode integers as strings rather than binary
  84. integers. However, so long as both sender and recipient agree on the
  85. format of the data blob itself, any blob encoding may be used.
  86. NULL
  87. ----
  88. This data element indicates no data is actually present. This can be
  89. used to indicate that a tag is present in a HASH but no data is
  90. actually at that location, or in a LIST to indicate empty item
  91. positions.
  92. There is no data portion of this type, and the encoded length is
  93. ignored and is always zero.
  94. Note that this is different from a DATA element with a zero length.
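The four item types can be tied together in a small Ruby round-trip sketch. It is an illustration under stated assumptions rather than the reference implementation: method names are hypothetical, a NULL is written as the single byte 0x04 with no length bytes (as in the example in the next section), non-string values such as integers are encoded as their string form, and decoded tags and DATA values come back as binary strings.

```ruby
ITEM_DATA   = 0x01
ITEM_HASH   = 0x02
ITEM_LIST   = 0x03
ITEM_NULL   = 0x04
TYPE_MASK   = 0x0f
LENGTH_32   = 0x00
LENGTH_16   = 0x10
LENGTH_8    = 0x20
LENGTH_MASK = 0xf0
LENGTH_FORMATS = { LENGTH_8 => [1, "C"], LENGTH_16 => [2, "n"], LENGTH_32 => [4, "N"] }.freeze

# TyLen byte plus the smallest length field that fits.
def item_header(type, length)
  if length <= 0xff then [type | LENGTH_8, length].pack("CC")
  elsif length <= 0xffff then [type | LENGTH_16, length].pack("Cn")
  else [type | LENGTH_32, length].pack("CN")
  end
end

# HASH data area: repeated <tag length, tag, item>.
def encode_hash_body(hash)
  pairs = hash.map do |tag, value|
    tag = tag.to_s.b
    raise ArgumentError, "tag must be 1 to 255 bytes" unless (1..255).cover?(tag.bytesize)
    [tag.bytesize].pack("C") + tag + encode_item(value)
  end
  pairs.join.b
end

def encode_item(value)
  return [ITEM_NULL].pack("C") if value.nil? # assumption: type byte only
  type, body =
    case value
    when Hash  then [ITEM_HASH, encode_hash_body(value)]
    when Array then [ITEM_LIST, value.map { |v| encode_item(v) }.join.b]
    else            [ITEM_DATA, value.to_s.b]
    end
  item_header(type, body.bytesize) + body
end

# Decode one item at byte offset `pos`; returns [value, next offset].
def decode_item(buf, pos = 0)
  tylen = buf.getbyte(pos)
  raise ArgumentError, "ran out of data" if tylen.nil?
  pos += 1
  type = tylen & TYPE_MASK
  return [nil, pos] if type == ITEM_NULL
  width, fmt = LENGTH_FORMATS.fetch(tylen & LENGTH_MASK) do
    raise ArgumentError, "bad length encoding"
  end
  raise ArgumentError, "truncated length" if pos + width > buf.bytesize
  length = buf.byteslice(pos, width).unpack1(fmt)
  pos += width
  raise ArgumentError, "truncated data" if pos + length > buf.bytesize
  body = buf.byteslice(pos, length)
  value =
    case type
    when ITEM_DATA then body
    when ITEM_HASH then decode_hash_body(body)
    when ITEM_LIST then decode_list_body(body)
    else raise ArgumentError, "unknown item type #{type}"
    end
  [value, pos + length]
end

def decode_list_body(body)
  items = []
  pos = 0
  while pos < body.bytesize
    item, pos = decode_item(body, pos)
    items << item
  end
  items
end

def decode_hash_body(body)
  pairs = {}
  pos = 0
  while pos < body.bytesize
    tag_length = body.getbyte(pos)
    raise ArgumentError, "zero-length tag" if tag_length.zero?
    tag = body.byteslice(pos + 1, tag_length)
    raise ArgumentError, "truncated tag" if tag.bytesize != tag_length
    value, pos = decode_item(body, pos + 1 + tag_length)
    pairs[tag] = value
  end
  pairs
end
```

With these assumptions, encode_item(nil) produces the single byte 0x04, while encode_item("") produces the two bytes 0x21 0x00, which shows how a NULL differs from a zero-length DATA element on the wire.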
  95. EXAMPLE
  96. -------
  97. This is Ruby syntax, but should be clear enough for anyone to read.
  98. Example data encoding:
  99. {
  100. "from" => "sender@host",
  101. "to" => "recipient@host",
  102. "seq" => 1234,
  103. "data" => {
  104. "list" => [ 1, 2, nil, "this" ],
  105. "description" => "Fun for all",
  106. },
  107. }
  108. Wire-format:
  109. In this format, strings are not shown in hex, but are included "like
  110. this." Descriptions are written (like this.)
  111. Message Length: 0x67 (103 bytes)
  112. Protocol Version: 0x53 0x6b 0x61 0x6e
  113. (remaining length: 99 bytes)
  114. 0x04 "from" 0x21 0x0b "sender@host"
  115. 0x02 "to" 0x21 0x0e "recipient@host"
  116. 0x03 "seq" 0x21 0x04 "1234"
  117. 0x04 "data" 0x22 0x2d (HASH, 45 bytes)
  118.     0x04 "list" 0x23 0x0d (LIST, 13 bytes)
  119.         0x21 0x01 "1"
  120.         0x21 0x01 "2"
  121.         0x04 (NULL)
  122.         0x21 0x04 "this"
  123.     0x0b "description" 0x21 0x0b "Fun for all"
  124. MESSAGE ROUTING
  125. ---------------
  126. The message routing daemon uses the top-level hash to contain routing
  127. instructions and additional control data. Not all of these are
  128. required for various control message types; see the individual
  129. descriptions for more information.
  130. Tag       Description
  131. --------  ----------------------------------------
  132. msg       Sender-supplied data
  133. from      Sender's identity
  134. group     Group name this message is being sent to
  135. instance  Instance in this group
  136. repl      If present, this message is a reply
  137. seq       Sequence number, used in replies
  138. to        Recipient or "*" for no specific receiver
  139. type      "send" for a channel message
  140. "type" is a DATA element, which indicates to the message routing
  141. system what the purpose of this message is.
  142. Get Local Name (type "getlname")
  143. --------------------------------
  144. Upon connection, this is the first message to be sent to the control
  145. daemon. It will return the local name of this client. Each
  146. connection gets its own unique local name, and local names are never
  147. repeated. They should be considered opaque strings, in a format
  148. useful only to the message routing system. They are used in replies
  149. or to send to a specific destination.
  150. To request the local name, the only element included is the
  151. "type" => "getlname"
  152. tuple. The response is also a simple, single tuple:
  153. "lname" => "UTF-8 encoded local name blob"
  154. Until this message is sent, no other types of messages may be sent on
  155. this connection.
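As a sketch of what the "getlname" request looks like on a raw stream, the Ruby below encodes a flat hash of short DATA values and frames it with a 4-byte length and the protocol version. The method name encode_flat_message is hypothetical, and the helper deliberately handles only string values shorter than 256 bytes.

```ruby
PROTOCOL_VERSION = 0x536b616e

# Encode a flat hash whose values are all short DATA strings, then frame it
# as <4-byte length><4-byte version><top-level hash data area>.
def encode_flat_message(fields)
  pairs = fields.map do |tag, value|
    tag = tag.b
    value = value.b
    raise ArgumentError, "tag must be 1 to 255 bytes" unless (1..255).cover?(tag.bytesize)
    raise ArgumentError, "value too long for this sketch" if value.bytesize > 0xff
    # 0x21 is DATA (0x01) combined with LENGTH_8 (0x20).
    [tag.bytesize].pack("C") + tag + [0x21, value.bytesize].pack("CC") + value
  end
  message = [PROTOCOL_VERSION].pack("N") + pairs.join.b
  [message.bytesize].pack("N") + message
end

request = encode_flat_message("type" => "getlname")
```

Under these assumptions the request is 23 bytes long: a 4-byte length, the 4-byte version, and a 15-byte pair for "type". The reply would carry an "lname" tag in the same layout.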
  156. Regular Group Messages (type "send")
  157. ------------------------------------
  158. When sending a message:
  159. "msg" is the sender supplied data. It is encoded as per its type.
  160. It is a required field, but may be the NULL type if not needed.
  161. In OpenReg, this was another wire format message, stored as an
  162. ITEM_DATA. This was done to make it easy to decode the routing
  163. information without having to decode arbitrary application-supplied
  164. data, but rather treat this application data as an opaque blob.
  165. "from" is a DATA element, and its value is a UTF-8 encoded sender
  166. identity. It MUST be the "local name" supplied by the message
  167. routing system upon connection. The message routing system will
  168. enforce this, but will not add it. It is a required field.
  169. "group" is a DATA element, and its value is the UTF-8 encoded group
  170. name this message is being transmitted to. It is a required field for
  171. all messages of type "send".
  172. "instance" is a DATA element, and its value is the UTF-8 encoded
  173. instance name, with "*" meaning all instances.
  174. "repl" is the sequence number being replied to, if this is a reply.
  175. "seq" is a unique identity per client. That is, the <lname, seq>
  176. tuple must be unique over the lifetime of the connection, or at least
  177. over the lifetime of the expected reply duration.
  178. "to" is a DATA element, and its value is a UTF-8 encoded recipient
  179. identity. This must be a specific recipient name or "*" to indicate
  180. "all listeners on this channel." It is a required field.
  181. When a message of type "send" is received by the client, all the data
  182. is used as above. This indicates a message of the given type was
  183. received.
  184. A client does not see its own transmissions. (XXXMLG Need to check this)
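The field rules for a "send" message can be summarized in a small Ruby sketch that builds the top-level hash before it is encoded. The class name SendBuilder is hypothetical, as is the choice to generate "seq" from a per-connection counter and to store "seq" and "repl" as strings; the text only requires that the <lname, seq> tuple be unique.

```ruby
# Builds top-level hashes for messages of type "send" on one connection.
class SendBuilder
  def initialize(lname)
    @lname = lname # the local name returned by "getlname"
    @seq = 0
  end

  def build(group:, msg: nil, to: "*", instance: "*", repl: nil)
    raise ArgumentError, "group is required" if group.to_s.empty?
    raise ArgumentError, "to is required" if to.to_s.empty?
    @seq += 1 # assumption: a simple counter keeps <lname, seq> unique
    message = {
      "type" => "send",
      "from" => @lname,
      "group" => group,
      "instance" => instance,
      "to" => to,
      "seq" => @seq.to_s,
      "msg" => msg, # required, but nil (NULL) is allowed
    }
    message["repl"] = repl.to_s unless repl.nil?
    message
  end
end
```

For example, SendBuilder.new(lname).build(group: "example-group") yields a broadcast to all instances and all listeners with a NULL "msg"; "example-group" is a placeholder name.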
  185. Group Subscriptions (type "subscribe")
  186. --------------------------------------
  187. A subscription requires the "group", "instance", and a flag to
  188. indicate the subscription type ("subtype"). If instance is "*" the
  189. instance name will be ignored when deciding whether to forward a
  190. message to this client.
  191. "subtype" is a DATA element, and contains "normal" for normal channel
  192. subscriptions, "meonly" for only those messages on a channel with the
  193. recipient specified exactly as the local name, or "promisc" to receive
  194. all channel messages regardless of other filters. As its name
  195. implies, "normal" is for typical subscriptions, and "promisc" is
  196. intended for channel message debugging.
  197. There is no response to this message.
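A subscription request could be assembled as in the Ruby sketch below. The method name subscribe_message and the default of "normal" are assumptions made for illustration; the three subtype values are the ones listed above.

```ruby
SUBTYPES = %w[normal meonly promisc].freeze

# Build the top-level hash for a "subscribe" message.
def subscribe_message(group:, instance: "*", subtype: "normal")
  raise ArgumentError, "group is required" if group.to_s.empty?
  unless SUBTYPES.include?(subtype)
    raise ArgumentError, "unknown subtype #{subtype.inspect}"
  end
  {
    "type" => "subscribe",
    "group" => group,
    "instance" => instance,
    "subtype" => subtype,
  }
end
```

Since the router sends no response, a client has no direct confirmation that a subscription was accepted, so validating the subtype before sending is a cheap safeguard.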
  198. Group Unsubscribe (type "unsubscribe")
  199. --------------------------------------
  200. The fields to be included are "group" and "instance" and have the same
  201. meaning as a "subscribe" message.
  202. There is no response to this message.
  203. Statistics (type "stats")
  204. -------------------------
  205. Request statistics from the message router. No other fields are
  206. included in the request.
  207. The response contains a single element "stats" which is an opaque
  208. element. This is used mostly for debugging, and its format is
  209. specific to the message router. In general, some method to simply
  210. dump raw messages would produce something useful during debugging.