Basic principles of Wayland

Wayland is a client-server protocol. Clients that want to display graphics on the screen (for example, apps that want their graphical interface shown) connect to and talk to the server. The server is also called the Wayland compositor, because it puts the clients' contents (for example, several windows) together to form a single output that is displayed on a screen.

The compositor may also apply transformations to the clients' contents. For example, it may scale windows or even rotate them in 3D space to implement some kind of Overview mode, or apply effects like "wobbly windows". That's none of the clients' concern, however: they are unaware of any transformations that the compositor applies to the contents they present.

Wayland has several layers. The wire format specifies exactly how pieces of data are serialized, transmitted and deserialized. These details are not important for us, since we're going to use a library that abstracts them away.

The wire format also specifies that this communication happens over a Unix domain stream socket. The socket is located in the filesystem at a path like /run/user/1000/wayland-0 (or, more accurately, $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY). The server creates the socket, typically at start-up. Since clients are usually forked off by the server, it sets the WAYLAND_DISPLAY environment variable for them, so that they know where to connect.
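To make the path rule concrete, here is a minimal sketch of how a client might work out where the socket lives. The function name wayland_socket_path is made up for this example. The fallback to "wayland-0" when WAYLAND_DISPLAY is unset and the handling of an absolute WAYLAND_DISPLAY are assumptions about how client libraries commonly behave, not something stated above.

```python
import posixpath


def wayland_socket_path(env):
    """Return the socket path a client would connect to.

    `env` is a mapping of environment variables (for example os.environ).
    """
    # Assumption: an unset WAYLAND_DISPLAY falls back to "wayland-0".
    display = env.get("WAYLAND_DISPLAY", "wayland-0")
    # Assumption: an absolute WAYLAND_DISPLAY is used as-is.
    if posixpath.isabs(display):
        return display
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if not runtime_dir:
        raise RuntimeError("XDG_RUNTIME_DIR is not set")
    # The rule from the text: $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY
    return posixpath.join(runtime_dir, display)


if __name__ == "__main__":
    import os
    print(wayland_socket_path(os.environ))
```

Taking the environment as a parameter rather than reading os.environ directly keeps the function easy to try out with made-up values.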

Next, Wayland specifies the object model. Wayland is object-oriented: all communication between a client and a server is expressed as method calls on objects. These objects are just a handy abstraction and do not actually exist anywhere - though the client and the server may want to keep some metadata about them.

Object methods are of two kinds: requests and events. Requests are methods called by the client, whereas events are issued by the server. For example, a wl_pointer object has the wl_pointer.set_cursor request, which changes the cursor image, and the wl_pointer.motion event, which reports pointer movement.

Both requests and events can carry additional data (arguments) with them, just like a function or method call in a programming language. For example, the wl_pointer.motion event has a timestamp and two coordinates as its arguments.

However, neither kind of method can have a response or a return value. That's right, there's no mechanism for returning anything from a method call in Wayland. To see why, it's important to understand three related concepts:

Wayland is message-based. Method calls are transmitted as messages between the client and the server (the direction, naturally, depends on whether the method is a request or an event). A message contains the ID of the object, the opcode (basically, a statically known ID) of the method, and the arguments. How exactly the contents of messages are laid out is specified as a part of the wire format.
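Although a library normally hides the layout, a small sketch may help make "object ID, opcode, arguments" tangible. It follows the header layout as described in the Wayland protocol documentation: a 32-bit object ID, then a 32-bit word holding the total message size in its upper 16 bits and the opcode in its lower 16 bits, all in the host's byte order. The helper names, the object ID 7, and the choice of opcode 2 for wl_pointer.motion are assumptions made for illustration.

```python
import struct


def to_fixed(value):
    """Convert a number to signed 24.8 fixed point (the wire type 'fixed')."""
    return int(round(value * 256))


def encode_message(object_id, opcode, payload=b""):
    """Prefix an already-serialized argument payload with the 8-byte header."""
    if len(payload) % 4:
        raise ValueError("payload must be padded to a multiple of 4 bytes")
    size = 8 + len(payload)  # total size in bytes, header included
    # "=" selects the host's byte order with standard 32-bit sizes.
    return struct.pack("=II", object_id, (size << 16) | opcode) + payload


def decode_header(data):
    """Return (object_id, opcode, size) from the first 8 bytes of a message."""
    object_id, size_and_opcode = struct.unpack_from("=II", data)
    return object_id, size_and_opcode & 0xFFFF, size_and_opcode >> 16


# Hypothetical values: object ID 7 stands for some wl_pointer object, and
# opcode 2 is assumed to be wl_pointer.motion (the third event listed for
# that interface). Arguments: time, x and y as 24.8 fixed-point numbers.
args = struct.pack("=Iii", 1000, to_fixed(10.5), to_fixed(20.25))
message = encode_message(7, 2, args)
```

Here decode_header(message) gives back the object ID, the opcode and the total size, which is all a receiver needs in order to find the right handler and know how many bytes of arguments follow.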

What's important is that sending a message doesn't block the client (or the server) from performing further work, including sending further messages. In other words, there's no roundtrip delay. Efficiently avoiding roundtrip delays was one of the design goals of Wayland, as they can significantly slow down the communication, and Wayland is designed to be blazing fast and efficient.

As a result, Wayland is asynchronous. When you call a method, you can proceed immediately, but you don't get a return value or a confirmation back. If a method logically requires a response, it is usually implemented in terms of another method call that carries the response in its arguments. For example, a client may specify that it wants a window full-screened with the wl_shell_surface.set_fullscreen request. The compositor replies with the wl_shell_surface.configure event, passing the new window dimensions (i.e. the screen size) as its arguments. This scheme, obviously, involves a roundtrip delay, and seemingly has no benefit over potential "native" return values.

The important thing to understand here is that the logic implemented by the client and the server should be asynchronous too. The delay between set_fullscreen and configure is not necessarily just the small processing time (or the possibly much larger network latency, if the communication is somehow forwarded over a network). The server may ask the user whether they want to allow the client to become full-screen, and only issue the configure event after (and if) the user agrees. Generally, the client should treat the set_fullscreen request as an indication that it would like to be full-screened in the near future and continue to work normally as if nothing happened. It should react to configure events at any point of operation in the same way (e.g. by resizing), no matter what caused them.
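The set_fullscreen/configure exchange can be sketched as a toy simulation. This is not real Wayland code: ToyCompositor and ToyClient are made-up classes, messages are plain Python values passed through in-memory queues, and the user_allows flag stands in for whatever policy the compositor applies.

```python
from collections import deque


class ToyCompositor:
    def __init__(self, screen_size):
        self.screen_size = screen_size
        self.pending = deque()  # requests received but not yet acted on
        self.outbox = deque()   # events waiting to be delivered to the client

    def receive(self, request):
        self.pending.append(request)

    def process(self, user_allows=True):
        """Act on queued requests, possibly much later than they arrived."""
        while self.pending:
            request = self.pending.popleft()
            if request == "set_fullscreen" and user_allows:
                self.outbox.append(("configure", self.screen_size))


class ToyClient:
    def __init__(self, compositor, size):
        self.compositor = compositor
        self.size = size

    def request_fullscreen(self):
        # Fire and forget: nothing is returned and nothing is awaited.
        self.compositor.receive("set_fullscreen")

    def dispatch_events(self):
        # React to configure the same way regardless of what caused it.
        while self.compositor.outbox:
            name, args = self.compositor.outbox.popleft()
            if name == "configure":
                self.size = args


if __name__ == "__main__":
    compositor = ToyCompositor((1920, 1080))
    client = ToyClient(compositor, (640, 480))
    client.request_fullscreen()
    client.dispatch_events()   # nothing has arrived yet, size is unchanged
    compositor.process()       # the compositor decides at some later point
    client.dispatch_events()
    print(client.size)
```

The client never waits between request_fullscreen and dispatch_events. If the compositor calls process(user_allows=False), no configure event ever arrives and the client simply keeps its old size.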

Another common pattern is a function that initializes and returns a new object. This could be implemented with the same request/event pair approach. But since Wayland objects do not carry any data inside, the only thing the client and the server must agree on is the new object's ID. Naively, the server would pass the ID to the client; this, again, requires a round-trip delay. That is why in Wayland it's the client who passes the new object's ID to the server, as one of the arguments to the request that logically creates the object. This way, there's no waiting or blocking required, and the client can proceed immediately, including making requests on the new object.
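The following toy model illustrates why client-chosen IDs remove the round trip. It is only a sketch under simplifying assumptions: the classes are made up, "create" is a hypothetical generic constructor request (real interfaces have their own specific requests that take a new ID argument), and the new surface is created directly on object 1 for brevity.

```python
class ToyServer:
    """Keeps only a map of known object IDs to interface names."""

    def __init__(self):
        self.objects = {1: "wl_display"}  # object 1 exists from the start
        self.handled = []

    def handle(self, object_id, method, *args):
        if object_id not in self.objects:
            raise KeyError(f"unknown object {object_id}")
        if method == "create":  # hypothetical generic constructor request
            new_id, interface = args
            self.objects[new_id] = interface
        self.handled.append((object_id, method))


class ToyClient:
    def __init__(self):
        self.next_id = 2    # the client picks IDs itself, counting upwards
        self.outgoing = []  # messages queued but not yet sent

    def create(self, parent_id, interface):
        new_id = self.next_id
        self.next_id += 1
        # The chosen ID travels to the server as an ordinary argument.
        self.outgoing.append((parent_id, "create", new_id, interface))
        return new_id       # usable right away, no reply needed

    def request(self, object_id, method, *args):
        self.outgoing.append((object_id, method, *args))

    def flush(self, server):
        for message in self.outgoing:
            server.handle(*message)
        self.outgoing.clear()


server = ToyServer()
client = ToyClient()
surface = client.create(1, "wl_surface")
client.request(surface, "attach")  # queued before the server has seen "create"
client.flush(server)
```

Because messages are processed in the order they were sent, the server has already registered the new ID by the time the attach request on it arrives, so both messages can go out in a single batch.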

Less frequently, the same techniques are applied the other way around. Sometimes there's an event/request pair (for example, wl_shell_surface.ping and wl_shell_surface.pong); and when the server creates a new object, it sends its ID to the client as an event argument.

Lastly, Wayland specifies concrete types of objects (called interfaces) and the methods that can be called on them, including all those mentioned above. This is known as the Wayland core protocol. For example, there is the wl_surface interface, and it has the wl_surface.attach request.

Unlike the layers we discussed before, the core protocol is designed to be extensible, i.e. it's possible to add new interfaces to it. There are several extensions out there, the best known being xdg-shell, which we'll talk about in the next section.
