Messaging system
DISCLAIMER: This document describes the messaging system. I don’t know all of its internals, so the knowledge described here is sparse and may be inaccurate.
High-level view
To use the messaging layer, you need a Session. Sessions are connected to each other and form an undirected graph that we call a galaxy of sessions. A galaxy of sessions has a list of services; the service with ID 1 is the ServiceDirectory, which holds that list.
The ServiceDirectory runs on a session which is the entry point for other sessions to connect. When a session exposes a service, it listens on some other port so that other sessions can connect to it when they need to contact the service.
                  +-------------+
        +---------+   Service   |
        |         |  Directory  |
        |         +------+------+
        |                |
        |                |
        |                |        +-------------+
        |                |        |             |
        |                +--------+ SomeService |
        |                |        |             |
        |                |        +-------------+
        |                |
+-------+------+          |        +--------------+
|   Client    |          +--------+ OtherService |
|   session   +-------------------+              |
+-------------+                   +--------------+
There is no real difference between a client session and a service session. In this diagram, it is just that the Client session does not expose any service, but apart from that, it is a normal session. A session can call other services whether it hosts services or not.
Address system
There is a three-part address system in qimessaging used to call functions, trigger events, etc.
Services have an associated service ID. A service can give out one or more objects (by returning them). When a service gives out an object, that object gets the same service ID but a different object ID.
For example, the service directory is always service 1. If you have a service 2, that service is addressed as 2.1. If that service gives you an object, that object will have ID 2, so its address is 2.2; the next one will have ID 3. The object ID 1 is always reserved for the service itself.
There is a third part on addresses which is the function ID. Every function, event and property has an ID. User functions usually start at ID 100. IDs below 100 are hidden and reserved for qimessaging’s usage.
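The three-part address can be pictured with a small struct. This is only a sketch: the names `Address`, `targetsServiceItself`, `isReservedFunction` and `toString`, as well as the 32-bit field widths, are assumptions made for the example and are not taken from qimessaging.

```cpp
#include <cstdint>
#include <string>

// Hypothetical three-part address; names and field widths are assumptions.
struct Address {
    std::uint32_t service;   // 1 is always the ServiceDirectory
    std::uint32_t object;    // 1 is always the service itself
    std::uint32_t function;  // functions, events and properties all have an ID
};

const std::uint32_t kMainObjectId = 1;           // reserved for the service itself
const std::uint32_t kFirstUserFunctionId = 100;  // user functions usually start here

// True when the address targets the service itself rather than an object it gave out.
inline bool targetsServiceItself(const Address& a) {
    return a.object == kMainObjectId;
}

// True when the function ID lies in the range reserved for qimessaging's own usage.
inline bool isReservedFunction(const Address& a) {
    return a.function < kFirstUserFunctionId;
}

// Formats the address as Service.Object.Function.
inline std::string toString(const Address& a) {
    return std::to_string(a.service) + "." + std::to_string(a.object) + "." +
           std::to_string(a.function);
}
```

With this layout, the first user function of service 2 would be written `2.1.100`, and the same function ID on the first object given out by that service would be `2.2.100`.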
Exchanging objects
It is possible to exchange objects between a session and a service by returning them from a function call or by passing them as arguments. When doing that, the object is wrapped by qimessaging objects.
You can see in the diagram below how an object is exposed from the server on the right to the client on the left. Keep in mind that what I call client or server in this section has no real importance as they can be swapped. You can pass objects from service to client or from client to service and you will reach the state described in this diagram.
                      Network
                         +
                         |
                         |
+------------------+     |    +-----------------+
|                  |     |    |                 |
|   RemoteObject   +----------+   BoundObject   |
|                  |     |    |                 |
+---------^--------+     |    +--------+--------+
          |              |             |
          |              |             |
+---------+--------+     |    +--------v--------+
|                  |     |    |                 |
|    AnyObject     |     |    |    AnyObject    |
|                  |     |    |                 |
+------------------+     |    +-----------------+
                         |
                         |
                         +
When the object is exposed from the server, it will be wrapped in a BoundObject that will own a reference to the AnyObject to keep it alive.
When the object is received by the client, you get an AnyObject which points to a DynamicObject.
If the same object is returned twice, for example, two BoundObjects will be created and two RemoteObjects will be created on the remote side. They will be independent and have different addresses.
As said before, BoundObject owns a reference to the AnyObject so that the RemoteObject can still contact it later.
There are three cases which cause that reference to be dropped:
- When the RemoteObject dies, it will call the hidden method terminate(), which will ultimately destroy the BoundObject and release the reference.
- When the connection with the client is lost (legitimately or because of an error), all BoundObjects exposed to it will be destroyed, releasing their reference to the AnyObject.
- When the service is unregistered, all addresses will become invalid, thus the BoundObjects will also be destroyed, releasing their reference.
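The ownership rule and the three release cases above can be sketched with `std::shared_ptr`. Everything in this block is a stand-in: `AnyObjectStub`, `BoundObjectStub` and `ExposedObjects` are hypothetical names, and keying the table by (client, object ID) is an assumption, not how qimessaging necessarily stores its BoundObjects.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

struct AnyObjectStub {};  // stand-in for the real AnyObject

// Stand-in for BoundObject: it owns a reference that keeps the object alive.
struct BoundObjectStub {
    std::shared_ptr<AnyObjectStub> target;
};

// Hypothetical table of objects a service has exposed to its clients.
class ExposedObjects {
public:
    typedef std::pair<int, std::uint32_t> Key;  // (client ID, object ID)

    void expose(int client, std::uint32_t objectId, std::shared_ptr<AnyObjectStub> obj) {
        bound_[Key(client, objectId)] = BoundObjectStub{std::move(obj)};
    }

    // Case 1: the RemoteObject died and called terminate().
    void terminate(int client, std::uint32_t objectId) {
        bound_.erase(Key(client, objectId));
    }

    // Case 2: the connection with one client was lost.
    void clientDisconnected(int client) {
        for (auto it = bound_.begin(); it != bound_.end();) {
            if (it->first.first == client) it = bound_.erase(it);
            else ++it;
        }
    }

    // Case 3: the service was unregistered, so every address becomes invalid.
    void serviceUnregistered() { bound_.clear(); }

private:
    std::map<Key, BoundObjectStub> bound_;
};
```

In all three cases the only thing that happens is that the stub holding the `shared_ptr` is destroyed; the wrapped object itself goes away only if nobody else still holds a reference to it.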
Sending an object through a RemoteObject or BoundObject
both inherit from
which is a class
that has a list of
. They both inherit from it because you also
-s when sending an object as an
does not have an ID to refer to (it is not a service),
when sending an object through it, it must create an address. The service ID of
the address it forges is the one of the service it sends the object to. To avoid
having two objects with the same IDs on the service, it starts counting object
IDs from 2^31.
I don’t know what happens when object IDs overflow, probably bad things that they only talk about in books.
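To make the ID split concrete, here is a minimal sketch of a counter that forges object IDs starting at 2^31. It assumes object IDs are unsigned 32-bit integers, which the text above does not state; `ForgedObjectIdAllocator` and `isForgedId` are hypothetical names.

```cpp
#include <cstdint>

const std::uint32_t kFirstForgedObjectId = std::uint32_t(1) << 31;  // 2^31

// Hypothetical allocator for object IDs forged by the sending side.
// Assumption: object IDs are unsigned 32-bit integers.
class ForgedObjectIdAllocator {
public:
    // Returns the next forged ID; the first one is 2^31.
    std::uint32_t next() { return next_++; }

private:
    std::uint32_t next_ = kFirstForgedObjectId;
};

// True when an ID falls in the upper half of the range, i.e. was forged by a sender.
inline bool isForgedId(std::uint32_t id) { return id >= kFirstForgedObjectId; }
```

Under that 32-bit assumption, such a counter would wrap around to 0 after 2^31 allocations, because unsigned arithmetic in C++ wraps by definition. From then on the forged IDs would land in the range used by the service's own objects, which is one plausible form the "bad things" could take.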
Lower-level view
In the end, qimessaging all comes down to the exchange of
message is a structure with different fields:
- a magic number to identify qimessaging packets
- an ID and a type which will be detailed later
- a destination address with Service.Object.Function
- a variable-size signature to describe the payload of the message
- a buffer which contains the payload
The last two parts are the only variable-size parts of the message; the rest belongs to the fixed-size header.
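The field list above maps naturally onto a fixed-size header followed by two variable-size members. The sketch below is illustrative only: the field order, the integer widths and the magic value are assumptions, and the real definition lives in the qimessaging sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Fixed-size part of a message. Field order and widths are assumptions.
struct MessageHeader {
    std::uint32_t magic;     // identifies qimessaging packets
    std::uint32_t id;        // message ID
    std::uint8_t type;       // message type
    std::uint32_t service;   // destination address: Service
    std::uint32_t object;    //                      Object
    std::uint32_t function;  //                      Function
};

// A whole message: fixed-size header plus the two variable-size parts.
struct Message {
    MessageHeader header;
    std::string signature;              // describes the payload
    std::vector<std::uint8_t> payload;  // buffer containing the payload
};

// Placeholder magic used by this example only; it is NOT the real value.
const std::uint32_t kExampleMagic = 0x12345678;

// Cheap sanity check a receiver could run before trusting the rest of the header.
inline bool hasExpectedMagic(const Message& m) {
    return m.header.magic == kExampleMagic;
}

// Number of bytes taken by the variable-size parts of the message.
inline std::size_t variableSize(const Message& m) {
    return m.signature.size() + m.payload.size();
}
```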
Message types and IDs
Messages have different types described in message.hpp. Message IDs are always incremented when sending a new request, for example a call (and maybe other types).
Then the response to that call may be Reply, Error or Canceled. In whichever case, the response will have the same ID as the Call message to associate them together.
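Matching a response to its call by ID can be sketched as a table of pending calls. `PendingCalls`, `ResponseType` and the handler signature are hypothetical and only illustrate the idea; this is not how qimessaging is known to implement it.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

enum class ResponseType { Reply, Error, Canceled };

// Hypothetical table associating outgoing Call IDs with response handlers.
class PendingCalls {
public:
    typedef std::function<void(ResponseType, const std::string&)> Handler;

    // Registers a handler and returns the ID to put in the outgoing Call message.
    // The ID is incremented for every new request.
    std::uint32_t registerCall(Handler onResponse) {
        const std::uint32_t id = ++lastId_;
        pending_[id] = std::move(onResponse);
        return id;
    }

    // Dispatches a Reply, Error or Canceled message to the call with the same ID.
    // Returns false when no pending call has that ID.
    bool onResponse(std::uint32_t id, ResponseType type, const std::string& payload) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return false;
        Handler handler = std::move(it->second);
        pending_.erase(it);  // a call is answered at most once
        handler(type, payload);
        return true;
    }

private:
    std::uint32_t lastId_ = 0;
    std::map<std::uint32_t, Handler> pending_;
};
```

Because the lookup is by ID, responses may arrive in any order and still reach the right caller.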
Exchanging messages
At the lowest level, we use Boost.Asio to handle the TCP sockets. This is
wrapped in the class
which inherits from the virtual class
. It is possible to implement the messaging over something else
by specializing that class, for example to use UNIX sockets, pipes on the file
system (why not!), etc.
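The idea of swapping the transport can be sketched with an abstract interface and a trivial in-memory implementation. `Transport` and `LoopbackTransport` are hypothetical names invented for this example; they are not the classes the paragraph above refers to, and the real interface is certainly richer (asynchronous, signal-based).

```cpp
#include <deque>
#include <string>

// Hypothetical transport interface: anything that can carry bytes both ways.
class Transport {
public:
    virtual ~Transport() {}
    // Queues bytes for the peer. Returns false on failure.
    virtual bool send(const std::string& bytes) = 0;
    // Fetches the next chunk of received bytes. Returns false if nothing is available.
    virtual bool receive(std::string& out) = 0;
};

// In-memory transport used as a stand-in for TCP, UNIX sockets or pipes:
// whatever is sent can be received back, in order.
class LoopbackTransport : public Transport {
public:
    bool send(const std::string& bytes) override {
        queue_.push_back(bytes);
        return true;
    }

    bool receive(std::string& out) override {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

private:
    std::deque<std::string> queue_;
};
```

Code written against the abstract interface does not need to know which concrete transport it talks to, which is the property that makes alternative transports possible.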
I don’t know the exact path of messages through the classes, but for a call,
first there is a
signal triggered in
is a class that receives that signal (probably
) and forwards it to the
concerned object by calling its
method. Then the object can handle
the call and reply later by doing a send on the socket.
If the message is a Reply, I think the
is directly connected to
and handles the messages. There is a
class somewhere that probably does something.
Middle-level view (serialization)
which, as the name suggests,
is a context associated to a stream. It is used to hold various information to
know how to serialize things.
For example, when sending an object, we send the whole
with it. But
when getting a service twice, the
(which is a heavy structure) is
still the same and there is no need to send it again. For that, there is a
cache system and we keep a bit of information in the
which is “Have we already sent this
stream?”. That way we avoid sending it twice, and the remote side does not
expect to receive it a second time.
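The "have we already sent this?" bookkeeping can be sketched as a per-stream set of identifiers. `SentCache` and `encodeDescription` are hypothetical names, and the textual `full:`/`ref:` encoding is made up for the example; it only shows the send-once behaviour, not the real wire format.

```cpp
#include <cstdint>
#include <set>
#include <string>

// Hypothetical per-stream context remembering which heavy descriptions were already sent.
class SentCache {
public:
    // Returns true the first time an ID is seen on this stream, false afterwards.
    bool markSent(std::uint32_t descriptionId) {
        return sent_.insert(descriptionId).second;
    }

private:
    std::set<std::uint32_t> sent_;
};

// Emits the full description the first time, and only a short reference afterwards.
inline std::string encodeDescription(SentCache& ctx, std::uint32_t id,
                                     const std::string& fullDescription) {
    if (ctx.markSent(id)) {
        return "full:" + std::to_string(id) + ":" + fullDescription;
    }
    return "ref:" + std::to_string(id);
}
```

The cache is tied to one stream on purpose: a different peer has never seen the description, so it must receive the full version once on its own stream.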
The other thing it is used for is capabilities. Capabilities are there to
support protocol evolutions. For example, when the
cache system was
added, a new capability was added so that we can identify processes using the
old protocol and not use that system with them. These capabilities are
associated with a remote host, so they are stored in the
Serialization is done by the class
class is a visitor that has access to the
and visits the type to
serialize. It fills in a buffer with its associated signature so that they can
be sent in the
Deserialization is done in a very similar way. When a message is received, the
associated signature is used to instantiate an empty
with the type
interfaces corresponding to it. This is done in
visitor visits this empty value and fills it in with
the data from the buffer.
What has gone bad (and why shared memory is half broken)?
This technique is not the best one (IMHO). Values are default initialized and then updated, and this pattern seems broken to me.
When implementing shared memory, we did not change the signature protocol and we
handle shared buffers the same way as normal buffers in signatures. When
deserializing, because of that, we need to change the type of the
while we read the message buffer. It is only at that moment that we know that
the message does not contain the buffer but only shared memory coordinates and
that we must use the shared memory type interface for the
As it is implemented right now, the function deserialize updates the AnyReference and replaces its type.
There is a problem with that when the shared buffer is part of a tuple
) which has a static list of subtypes. This type interface
is instantiated from the signature of the message, and thus with a normal
BufferTypeInterface. When we reach the point where we need to instantiate the
shared buffer type interface, it is too late to replace the one in the structure
which was created at the beginning of the deserialization.
My solution to this, that I find more elegant and that solves the problem, is to
rewrite that part and instantiate the whole tree only once with the correct
values (not default initializing and then assigning). This implies writing
another visitor to visit the signature directly and not reusing the one on
. The system would instantiate the type interfaces from the bottom-up.
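A rough sketch of what bottom-up construction could look like is given below. It is not the proposed implementation: the toy signature grammar (`i` for int, `s` for string, `r` for a buffer, parentheses for a tuple), the `TypeNode` structure and the `nextBufferIsShared` callback (standing in for "look at the message buffer to find out whether this buffer is really shared memory") are all assumptions made for the example.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy type node standing in for a type interface.
struct TypeNode {
    std::string kind;  // "int", "string", "buffer", "shared-buffer" or "tuple"
    std::vector<std::unique_ptr<TypeNode>> children;  // only used by tuples
};

inline std::unique_ptr<TypeNode> makeNode(const std::string& kind) {
    std::unique_ptr<TypeNode> node(new TypeNode);
    node->kind = kind;
    return node;
}

// Builds the node for the element starting at sig[pos] and advances pos past it.
// Leaves are created with their final kind, and a tuple is created only after
// all of its children exist, so nothing has to be replaced afterwards.
inline std::unique_ptr<TypeNode> buildType(const std::string& sig, std::size_t& pos,
                                           const std::function<bool()>& nextBufferIsShared) {
    if (pos >= sig.size()) throw std::invalid_argument("unexpected end of signature");
    const char code = sig[pos++];
    switch (code) {
    case 'i': return makeNode("int");
    case 's': return makeNode("string");
    case 'r': return makeNode(nextBufferIsShared() ? "shared-buffer" : "buffer");
    case '(': {
        std::vector<std::unique_ptr<TypeNode>> children;
        while (pos < sig.size() && sig[pos] != ')') {
            children.push_back(buildType(sig, pos, nextBufferIsShared));
        }
        if (pos >= sig.size()) throw std::invalid_argument("missing ')'");
        ++pos;  // consume ')'
        std::unique_ptr<TypeNode> tuple = makeNode("tuple");
        tuple->children = std::move(children);
        return tuple;
    }
    default: throw std::invalid_argument("unknown type code");
    }
}
```

The point of the sketch is the ordering: the decision between a normal and a shared buffer is taken when the leaf is created, so the enclosing tuple receives the right child from the start.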