Monday, January 3, 2011

Observations on HelenOS device drivers framework

Some time ago we merged Lenka's new Device Drivers Framework (DDF henceforth) into HelenOS mainline. When I finally read through her thesis and had a closer look at the code, I promised to do a writeup of my observations. Here it is. A bit of a long read though, be warned.

Let's start with a quick overview. Lenka's contribution can be broken down into several parts. First there is the devman server which (a) manages the physical device topology, driver task life cycle, driver attaching and detaching, and (b) maintains a 'logical' device topology (devices grouped by their class).

Second, Lenka proposed that any IPC communication between two drivers, or between an application and a driver, should be strictly defined in terms of function calls, and that the client- and server-side IPC glue code should be kept strictly separate in a library to be reused by all servers and clients using such code.

Third, when mapping the function calls to IPC, Lenka reserved the first field of the message for an interface id. This allowed her to communicate with a driver using several different protocols, namely a protocol for talking to the driver as such (create new instance) and for talking to the device itself (e.g. read/write, etc.).

As you can see, Lenka took on quite a challenge and tried to cover an area that extends quite far past mere device management. I will not try to cover it all here; if you are interested, go ahead and read her thesis. I will only focus here on parts that I found particularly interesting, where I found room for improvement or that inspired me somehow.

Summary of my observations. I have some practical concerns about the DDF, as it is implemented now:
  • Problematic support of pseudo-devices (e.g. file_bd)
  • Problematic support of 'multi-function' devices (e.g. i8042)

I believe these need to be addressed before we can convert our current drivers to the new driver model. Also, what I don't like is:
  • Making devices 'special' and making them use different communication than the rest of the system.

There are also two topics which I haven't seen mentioned in the thesis although they are highly relevant:
  • parallelism in device enumeration
  • persistent device naming

I have some ideas what could be handled better or differently, which I will outline below.

IPC communication suggestions

The problem

Lenka faced a practical problem when communicating with a driver. Sometimes she needed to control the driver as such using a generic driver protocol (e.g. add_device operation), and sometimes she needed to talk to the driver using a device-specific protocol (e.g. char_read, char_write).

For this reason Lenka came up with the idea that a driver could implement multiple different protocols. She reserves the first field of an IPC message for the protocol ID, moving the method ID effectively to the second field of the message.
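The layout described above can be sketched in C as follows. This is only an illustration, not the actual HelenOS code: msg_t, the six-field message size, the interface and method ID values, and the function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: a fixed array of word-sized fields. */
enum { MSG_ARGS = 6 };

typedef struct {
	uintptr_t args[MSG_ARGS];
} msg_t;

/* Hypothetical interface (protocol) IDs. */
enum { IFACE_DRIVER = 1, IFACE_CHAR = 2 };

/* Hypothetical method IDs; they are only unique within one interface. */
enum { DRIVER_ADD_DEVICE = 1 };
enum { CHAR_READ = 1, CHAR_WRITE = 2 };

/* Field 0 carries the interface ID, field 1 the method ID. */
msg_t msg_make(uintptr_t iface, uintptr_t method, uintptr_t arg)
{
	msg_t m = { { iface, method, arg } };
	return m;
}

/* Server side: select the interface first, then the method.
 * Returns 0 if the request is understood, -1 otherwise. */
int dispatch(const msg_t *m)
{
	assert(m != NULL);
	switch (m->args[0]) {
	case IFACE_DRIVER:
		return m->args[1] == DRIVER_ADD_DEVICE ? 0 : -1;
	case IFACE_CHAR:
		return (m->args[1] == CHAR_READ ||
		    m->args[1] == CHAR_WRITE) ? 0 : -1;
	default:
		return -1;
	}
}
```

Note that one whole message field is spent on the interface ID, which leaves one field fewer for the actual arguments.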

Lenka envisaged another use of the multiple interface support, namely that a device could support multiple device-specific protocols and that the interfaces and their API stubs could be reused (in a duck-type manner) and combined, possibly for semantically completely different (but syntactically equivalent) sets of operations.

The first thing that I don't like at all is that here device drivers are made somehow special and they use a communication protocol completely different from the rest of the system services. There is no reason for that; it will just make communication between different types of tasks incompatible.

Better server task model

The proposed mechanism looks useful and something similar could/should be used for all IPC communications, not just to the device drivers!

Recently Martin Děcký pushed some changes that start embedding the interface ID (in the same sense as Lenka uses it) in the first IPC message field, along with the method ID, so that it can be used in general IPC communication.

Great! This is way better for two reasons:
  1. It is generic, applies to all IPC communication equally
  2. It is more space-conscious, does not consume an entire argument
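Sharing one field between the interface ID and the method ID could look roughly like the sketch below. The 16/16 bit split and the imethod_* names are assumptions chosen for the example; I am not claiming this is the split used in the actual changes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split: upper 16 bits interface ID, lower 16 bits method ID. */
#define IFACE_SHIFT  16
#define METHOD_MASK  0xffffu

/* Pack both IDs into a single message field. */
uint32_t imethod_pack(uint32_t iface, uint32_t method)
{
	assert(iface <= 0xffffu && method <= METHOD_MASK);
	return (iface << IFACE_SHIFT) | method;
}

/* Extract the interface ID from a packed field. */
uint32_t imethod_iface(uint32_t word)
{
	return word >> IFACE_SHIFT;
}

/* Extract the method ID from a packed field. */
uint32_t imethod_method(uint32_t word)
{
	return word & METHOD_MASK;
}
```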

Still, I think we can do much better than that. (I already outlined this on the mailing list, but I'll repeat it here.)

First, embedding the interface ID in every message is wasteful, since it stays the same for the duration of an exchange, at the minimum.

Second, I think the problem being solved stems from the fact that we are trying to control different entities within the same server task. So, instead of trying to talk different protocols to the same entity, we are, in fact, trying to talk to different entities (each speaking different protocol, incidentally).

The model of a server task can be described in terms of object-oriented programming:
  • Lenka says: A server task is an object that implements one or more interfaces.
  • Jiří says: A server task is a collection of objects, each belonging to (exactly one) class (classes form a true inheritance hierarchy).

If we consider the driver to be a collection of objects (driver control object, device control object(s)), it also solves the problem for the case of the device driver.

Another way to look at this is that we add another level of addressing. We can imagine that we access the inside of the server task via different points of access. We can call them ports in analogy with TCP/IP ports (or RM ISO/OSI SAPs) which allow talking to different entities within the same IP interface (IP node, IP address).

One of the benefits of ports is that when a server provides access to a set of resources of some type (e.g. file, device) we can map each individual resource to a port. Then we don't need a special identifier (file descriptor, device descriptor) since the individual resource is already identified by the port (port ID). So we save one message field plus the framework actually understands that this is an identifier.
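To make the idea of mapping resources to ports a bit more concrete, here is a minimal sketch. Everything in it (server_t, file_t, the fixed-size port table, the function names) is hypothetical and only meant to show that the port ID alone is enough to find the resource.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_PORTS = 8 };

/* Hypothetical resource type served by the task. */
typedef struct {
	const char *name;
	size_t size;
} file_t;

/* The server task as a collection of objects, one per port. */
typedef struct {
	file_t *ports[MAX_PORTS];
} server_t;

void server_init(server_t *s)
{
	for (int i = 0; i < MAX_PORTS; i++)
		s->ports[i] = NULL;
}

/* Publish a resource on the first free port; returns the port ID or -1. */
int port_open(server_t *s, file_t *f)
{
	assert(f != NULL);
	for (int i = 0; i < MAX_PORTS; i++) {
		if (s->ports[i] == NULL) {
			s->ports[i] = f;
			return i;
		}
	}
	return -1;
}

/* A request addressed to a port already identifies the resource,
 * so no separate descriptor argument is needed. */
file_t *port_resolve(const server_t *s, int port_id)
{
	if (port_id < 0 || port_id >= MAX_PORTS)
		return NULL;
	return s->ports[port_id];
}
```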

Port references can be optionally made unforgeable (which is a very useful property). Ports can be implemented in different ways and have numerous other benefits which I will not detail here, if only so as not to scare you off ;-)

Implementation note: To allow the async framework to embed some information (any of interface ID, port ID, virtual path ID) automatically into a request, slight changes to the async API are needed (when making a request we should pass a pointer to an exchange instead of an integer/phone ID). Nevertheless, the conversion of applications and servers is straightforward and with a split API it can be implemented gradually, if needed.

Better use of message space

When talking to a port, we'll talk the same protocol the entire time. Therefore, we can establish that right upon connecting. Not only do we eliminate redundancy, but we can also tell straight away that someone does not speak the expected protocol (better than recognizing it upon the first operation).
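A minimal sketch of checking the protocol once, at connection time, might look like this. The conn_t type, the protocol constants and both functions are made up for illustration; in particular, the sketch only tracks whether the handshake succeeded and does not model real IPC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical protocol identifiers. */
typedef enum { PROTO_DRIVER = 1, PROTO_CHAR = 2 } proto_t;

typedef struct {
	proto_t proto;   /* fixed for the lifetime of the connection */
	bool connected;
} conn_t;

/* The client states which protocol it expects; a mismatch with the
 * protocol spoken by the port is detected here, before any request. */
bool conn_open(conn_t *c, proto_t port_proto, proto_t expected)
{
	assert(c != NULL);
	c->proto = port_proto;
	c->connected = (port_proto == expected);
	return c->connected;
}

/* Requests carry just a method ID; the protocol is implied by the
 * connection. Returns 0 on success, -1 on failure. */
int conn_request(const conn_t *c, int method)
{
	if (!c->connected || method <= 0)
		return -1;
	return 0;
}
```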

So rather than embedding interface ID in the message, we can embed the port ID (having established the protocol spoken by a port already).

Still, an entire exchange (see async sessions) will always talk to just one port, so this still seems a little wasteful.

We could instead embed something that would allow us to identify different exchanges and multiplex parallel operations onto a single physical IPC connection.

A good way to implement that seems to be to embed a virtual path ID. Virtual paths would be set up and torn down by the async framework (session code) on the client side. Each virtual path would be assigned an ID. The server would create a fibril to service each virtual path (within each connection). The session management code would no longer create (kernel-based) IPC connections for concurrent communication, but instead it would create (userspace-based) virtual paths. This provides an alternative (although not necessarily a better one?) to the current implementation, which creates multiple physical connections.

The session API combines with this in the following way:
  • A client initiates a session to a server port.
  • On the session the client performs exchanges as usual.
  • Internally the async framework creates as many parallel virtual paths (within the physical connection) as necessary.
  • The async framework on the server side creates a handler fibril for each incoming virtual path.
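The steps above can be sketched as a small multiplexer. This is a rough model under several assumptions: conn_t, vpath_t and the vpath_* functions are hypothetical, the table has a fixed size, and a per-path counter stands in for the handler fibril that a real server would create.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_VPATHS = 4 };

typedef struct {
	bool open;
	int handled;   /* requests served; stands in for a handler fibril */
} vpath_t;

/* One physical connection carrying several virtual paths. */
typedef struct {
	vpath_t vpaths[MAX_VPATHS];
} conn_t;

void conn_init(conn_t *c)
{
	for (int i = 0; i < MAX_VPATHS; i++) {
		c->vpaths[i].open = false;
		c->vpaths[i].handled = 0;
	}
}

/* Client side: set up a virtual path for a new parallel exchange.
 * Returns the virtual path ID, or -1 if none is free. */
int vpath_setup(conn_t *c)
{
	for (int i = 0; i < MAX_VPATHS; i++) {
		if (!c->vpaths[i].open) {
			c->vpaths[i].open = true;
			c->vpaths[i].handled = 0;
			return i;
		}
	}
	return -1;
}

/* Client side: tear the virtual path down when the exchange ends. */
void vpath_teardown(conn_t *c, int id)
{
	assert(id >= 0 && id < MAX_VPATHS);
	c->vpaths[id].open = false;
}

/* Server side: route a request to the handler of the virtual path
 * whose ID is embedded in the message. Returns 0 or -1. */
int vpath_dispatch(conn_t *c, int id)
{
	if (id < 0 || id >= MAX_VPATHS || !c->vpaths[id].open)
		return -1;
	c->vpaths[id].handled++;
	return 0;
}
```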

Device model suggestions

Current model

In the current model we have two types of device nodes: nexus nodes and leaf nodes. A nexus node ('bus device') consists of several objects:
  • 1 driver control object (generic driver protocol)
  • 1 nexus (bus) device control object (device-specific protocol)
  • n device attachment points (bus-specific protocol)

A leaf node consists of:
  • 1 driver control object (generic driver protocol)
  • 1 leaf device control object (device-specific protocol)

Important to note is that the driver control object and device control objects always coexist (you can't have one without the other).


The biggest problem I see in the current model is that it requires a 1:1 relationship between a physical device (something to which you can attach a driver) and its logical functions (services provided by the device), at least for leaf drivers (you can't have one without the other).


Let's call a pseudo-driver any driver whose life-cycle is not naturally controlled by bus enumeration. An example is file_bd whose instance is naturally created when the user types in some command.

If we wanted to port file_bd to the current DDF we would need to also add some special pseudo nexus driver (possibly a special one for file_bd) which the application would tell to attach a new child. Then we would need to somehow communicate the parameters to the child and...aargh!

While the plain devmap-style drivers can be modified to speak a protocol compatible with the devman-style drivers, they are not fully fledged devices. We can't add them to device classes, so if we obtain a list of devices in some class, these pseudo-devices won't be included.

Multifunction device

Let's call a multifunction device any device which behind one physical interface provides access to multiple logical functions.

Buses can be considered multifunction devices (they expose multiple device attachment points). Our concern, however, is multifunction leaf devices. A typical example is i8042 (which provides two character interfaces) or a (partitioned) disk (which provides multiple block interfaces).

If we wanted to port these drivers to the current DDF, we would need to implement them in terms of a nexus device plus a leaf (connector) device. Even if we implemented both the nexus and connector device inside a single driver binary, it is just an unnecessary complication. The connector device would do nothing but forward IPC communication to the nexus device. Why clutter device drivers with completely generic code?

Proposed solution

I propose a more uniform device model, where each device (nexus or leaf) consists of:
  • 1 driver object
  • 1 (physical) device object
  • n logical functions

Each logical function can be either an external / leaf function (exposed to clients outside of the DDF) or an internal / non-leaf function to which the DDF can attach more levels of the device hierarchy.

The choice of whether a logical function is external or internal could be up to the (nexus) device driver or, possibly, it could be made to depend on other factors (such as whether a device driver could be attached). Mixed-mode (leaf and nexus) drivers would be possible, if desired.

More importantly, multifunction leaf devices would be easy to implement (just change the flavor of the functions from internal to external and voilà). No need for implementing connector device nodes in such a driver.
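The proposed model can be sketched as a data structure. This is a minimal illustration under my own assumptions: dev_node_t, fun_t, the fixed limit on functions, and the function names are hypothetical and not part of the existing DDF code.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_FUNS = 4 };

typedef enum {
	FUN_INTERNAL,   /* attachment point: the DDF may attach a child here */
	FUN_EXTERNAL    /* leaf function: exposed to clients outside the DDF */
} fun_flavor_t;

typedef struct {
	const char *name;
	fun_flavor_t flavor;
} fun_t;

/* One node: 1 driver object, 1 device object, n logical functions. */
typedef struct {
	const char *driver;   /* stands in for the driver object */
	const char *device;   /* stands in for the (physical) device object */
	fun_t funs[MAX_FUNS];
	size_t fun_count;
} dev_node_t;

void dev_init(dev_node_t *d, const char *driver, const char *device)
{
	d->driver = driver;
	d->device = device;
	d->fun_count = 0;
}

/* Add a logical function of the given flavor; returns 0 or -1. */
int dev_add_function(dev_node_t *d, const char *name, fun_flavor_t flavor)
{
	assert(name != NULL);
	if (d->fun_count >= MAX_FUNS)
		return -1;
	d->funs[d->fun_count].name = name;
	d->funs[d->fun_count].flavor = flavor;
	d->fun_count++;
	return 0;
}

/* Count the functions of one flavor. */
size_t dev_count(const dev_node_t *d, fun_flavor_t flavor)
{
	size_t n = 0;
	for (size_t i = 0; i < d->fun_count; i++) {
		if (d->funs[i].flavor == flavor)
			n++;
	}
	return n;
}
```

With this shape, a multifunction leaf device such as i8042 would simply be one node with two FUN_EXTERNAL functions, while a bus would be one node with several FUN_INTERNAL functions.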

Also we can make it possible for any server to call the DDF and say 'hey, I'm a pseudo driver' and to create logical functions. This enables us to construct (leaf) pseudo-device drivers in a simple and straightforward manner. These don't need to be included in the physical device tree (equivalent of UN*X /devices). Only their logical functions are included in the logical device function name space (equivalent of UN*X /dev).

An interesting problem is the existence of pseudo nexus devices. A pseudo nexus device must be included in the physical device hierarchy (because its children are not pseudo devices in the sense described above). If we choose to allow pseudo nexus drivers, we will need, in the end, some 'pseudo' node in the device tree as a root for them. Still, we will have the benefit of manual device lifetime management. I am not sure whether pseudo nexus devices are useful or not.

As a side note, the operation add logical function is (conceptually) more generic than add child because it does not imply what gets connected to the attachment point (logical function). In a multi-pathed device scenario the device node is not attached directly to the parent node. Instead for some bus B there is a multiplexer which connects to each B host adapter node. All device nodes then attach to this multiplexer. (So their connection to the bus host adapter node is indirect).

Providing device services

Currently devman manages device classes. A device driver will register a device in some class and then applications can e.g. list all devices in some class.

This is not necessarily bad, although it might be an unnecessary complication. If we implement location services in the CORBA sense (naming and trading service), they will provide the same functionality for all IPC services.

(Explanation of what I mean here by LS in the CORBA sense: Any server can register a (non-singleton) service, specifying its name (generated hierarchical unambiguous (string) identifier) and type (hierarchical name of a class/type of the service). Then we can list all services, services by class, etc.)
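A toy version of such a registry could look like the sketch below. It is only an illustration: registry_t, the function names, the fixed table size and the use of '/' as the hierarchy separator in type names are my assumptions, and the example service names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_SERVICES = 16 };

typedef struct {
	const char *name;   /* hierarchical, unambiguous identifier */
	const char *type;   /* hierarchical class/type of the service */
} service_t;

typedef struct {
	service_t entries[MAX_SERVICES];
	size_t count;
} registry_t;

void registry_init(registry_t *r)
{
	r->count = 0;
}

/* Register a service; names must be unique. Returns 0 or -1. */
int registry_register(registry_t *r, const char *name, const char *type)
{
	assert(name != NULL && type != NULL);
	if (r->count >= MAX_SERVICES)
		return -1;
	for (size_t i = 0; i < r->count; i++) {
		if (strcmp(r->entries[i].name, name) == 0)
			return -1;
	}
	r->entries[r->count].name = name;
	r->entries[r->count].type = type;
	r->count++;
	return 0;
}

/* Nonzero if type equals cls or lies below it in the hierarchy. */
int type_is_a(const char *type, const char *cls)
{
	size_t n = strlen(cls);
	return strncmp(type, cls, n) == 0 &&
	    (type[n] == '\0' || type[n] == '/');
}

/* Count services by class, including its subclasses. */
size_t registry_count_by_type(const registry_t *r, const char *cls)
{
	size_t n = 0;
	for (size_t i = 0; i < r->count; i++) {
		if (type_is_a(r->entries[i].type, cls))
			n++;
	}
	return n;
}
```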

Now it is perfectly possible to export services provided by some device via the location services instead of implementing a special registry in devman.

So, in the scenario described above, there would seem to be little added value in using a special devman class database, instead of the more generic (and possibly smarter) location services.


I described what I see as important limitations of the current HelenOS DDF. Lenka's work inspired me with many interesting ideas, which I described. I proposed a number of improvements, either to the DDF itself or to IPC. The proposals form a coherent vision, but many of the individual ideas are to a degree independent and thus they can be implemented individually and gradually.

Thanks for reading.
