Notes on Project Oberon
I recently read the first part of the Project Oberon book. Two things in it caught my attention, and I want to share my thoughts about them here.
On abstractions
The document has this to say about them:
Abstraction is indeed the key of any modularization, and without modularization every hope of being able to guarantee reliability and correctness vanishes.
On a first read, this seems logical, and I'd guess it does to most people writing software. However, I've been thinking about abstractions for a while, and I've concluded that abstractions actually prevent us from guaranteeing reliability and correctness at the level of the whole system.
The key property of an abstraction is that it hides certain implementation details, so people can work with the underlying components through a standard interface (that of the abstraction). The document also agrees with this:
Every abstraction inherently hides details, namely those from which it abstracts.
This means that as long as we view a system only through its abstractions, we can indeed guarantee some degree of reliability and correctness.
However, we can't say the same about the whole system. It is built from components that can also act in ways not covered by their abstractions (I'd guess almost all components behave like this). This means that if you actually want to guarantee the reliability and correctness of the entire system, you have to pierce the "abstraction veil" and look at the actual properties and behaviours of the components used in it, not the abstractions they represent.
For example, you can't study a system's reliability without understanding its performance properties, and you can't study performance properties by looking only at abstractions, because most abstractions don't describe performance at all. To build an understanding of performance, you have to look at the actual thing behind the abstraction and understand how it's implemented.
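To make this concrete, here's a minimal sketch in Go (all names are hypothetical): two implementations satisfy the same `Store` interface and are indistinguishable through it, yet one answers lookups in roughly constant time and the other in linear time. Nothing in the abstraction tells you which cost you're paying.

```go
package main

import "fmt"

// Store is a hypothetical abstraction: nothing in it mentions performance.
type Store interface {
	Get(key string) (string, bool)
}

// MapStore answers Get in roughly constant time.
type MapStore struct{ m map[string]string }

func (s MapStore) Get(key string) (string, bool) {
	v, ok := s.m[key]
	return v, ok
}

// ListStore satisfies the same abstraction, but scans every entry:
// identical interface, linear time instead of constant time.
type ListStore struct {
	keys, vals []string
}

func (s ListStore) Get(key string) (string, bool) {
	for i, k := range s.keys {
		if k == key {
			return s.vals[i], true
		}
	}
	return "", false
}

func main() {
	// Through the Store abstraction both are equally "correct";
	// the interface gives no hint about which implementation you got.
	stores := []Store{
		MapStore{m: map[string]string{"a": "1"}},
		ListStore{keys: []string{"a"}, vals: []string{"1"}},
	}
	for _, s := range stores {
		fmt.Println(s.Get("a"))
	}
}
```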
So to me it seems that we don't want abstractions when trying to study certain things about a whole system. Instead, we want to view all of its components and then build our understanding from there. Abstractions hide the things that we care about.
The document adds "modularization" as a slight level of indirection, which I read as grouping some things together and providing only an abstract interface for other modules to use. But it's also possible to modularise something without hiding any implementation details from the modules that want to connect to it. In this case, modules are purely a mechanism to logically group some components and make it easier for other humans to work with them or talk about them. With this narrower meaning, I think it's fairly clear that we can group a system into modules and still have the full view that allows us to study its reliability and correctness at the whole-system level, because there's no “abstracting” going on.
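Here's a small sketch of this narrower kind of module, again in Go (the `clock` package is hypothetical): it groups related state and functions under one name for humans, but exports everything, so nothing is abstracted away and the full behaviour stays visible.

```go
// Package clock groups related state and helpers under one name, but hides
// nothing: everything is exported, so other modules can see the actual
// implementation and reason about its behaviour directly.
package clock

// Ticks is the raw counter itself, not an accessor over hidden state.
var Ticks uint64

// Advance is an ordinary exported function; its whole body is visible
// and is effectively its own specification.
func Advance(n uint64) {
	Ticks += n
}
```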
On system evolution
As a system evolves, it's likely that some abstractions will have to change. Imagine a piece of hardware that introduces a feature specific to it, one not covered by any existing abstraction. Users relying on the abstraction won't be able to take full advantage of their hardware.
To correct this, I know of two commonly used approaches:
1. Change the abstraction, adding something to it that allows the new feature to be used.
2. Bypass the abstraction entirely and have users access the feature directly.
With 1, the abstraction tends to become "sparse", in the sense that the new addition exists only to let one specific feature be used. Nothing is being abstracted over multiple implementations; it's just an extra step people have to go through to use the feature. In my opinion, this is a sign of "premature abstraction" (to mirror the idea of premature optimisation). Taken to the extreme, this approach ends in a “bag of APIs” scenario, where the purpose of the abstraction is weakened because there is no cohesive interface through which features can be used.
With 2, the purpose of the abstraction weakens as well, because users have to ignore it in certain cases. The more features that get added without a change to the abstraction, the weaker the abstraction gets.
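Here's a Go sketch of both approaches (all names hypothetical): approach 1 grows the interface with a method only one device meaningfully implements, while approach 2 leaves the interface alone and reaches past it with a type assertion.

```go
package main

import "fmt"

// Display is a hypothetical abstraction, defined before the new feature existed.
type Display interface {
	Draw(x, y int, color uint32)
}

// VendorFramebuffer is the one concrete device with the extra feature.
type VendorFramebuffer struct{}

func (f *VendorFramebuffer) Draw(x, y int, color uint32)      {}
func (f *VendorFramebuffer) SetHardwareCursor(x, y int) error { return nil }

// Approach 1: extend the abstraction with a method only one device needs.
// Every other implementation now has to stub it out or return an error.
type CursorDisplay interface {
	Display
	SetHardwareCursor(x, y int) error // "sparse": exists for a single device
}

// Approach 2: keep the abstraction as-is and bypass it, reaching for the
// concrete type directly whenever the feature is needed.
func moveCursor(d Display, x, y int) {
	if fb, ok := d.(*VendorFramebuffer); ok {
		fb.SetHardwareCursor(x, y) // the abstraction is ignored here
	}
}

func main() {
	var d Display = &VendorFramebuffer{}
	moveCursor(d, 10, 20)
	fmt.Println("cursor moved by bypassing the abstraction")
}
```

Either way, the original `Display` abstraction stops being the single, cohesive way to drive a display.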
In both cases, the tendency is to weaken the abstraction, and if this is allowed to continue, the abstraction will likely stop being used entirely, or become a “legacy” layer that more modern abstractions build on.
Both of these cases assume that there is only one abstraction through which users should access the system, or at least a specific part of it.
Since existing abstractions tend to get weaker as new features are built, why not create new abstractions instead? That is Oberon's approach: any module can add abstractions to the system, and users can then make use of these new abstractions.
The important insight is that existing abstractions shouldn't be "extended" with new functionality, even if the change is backwards compatible. Fixing bugs is fine, but adding new functionality to the same abstraction isn't: just create a new one. This allows the system to evolve with little effort, and allows abstractions to grow naturally.
When a single feature is added, there's no need to introduce an abstraction: a new module can expose the feature directly. As other components start implementing it (e.g. more hardware supporting the same feature), a common interface will naturally emerge, and then a new abstraction can be created to provide the feature through that interface. There's no need to touch existing abstractions at all.
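A sketch of that evolution, in Go with hypothetical names: first the feature is exposed directly by a new module with no abstraction at all, and only once a second implementation appears does a common interface emerge and become a new abstraction.

```go
package main

import "fmt"

// Step 1: a single device gains a feature. A new module exposes it directly,
// with no abstraction involved. (All names here are hypothetical.)
type VendorGPU struct{}

func (g *VendorGPU) EnableVSync() { fmt.Println("vendor vsync on") }

// Step 2: more devices gain the feature, and a common shape emerges.
type OtherGPU struct{}

func (g *OtherGPU) EnableVSync() { fmt.Println("other vsync on") }

// Only now is a new abstraction introduced, named after the interface that
// emerged. No existing abstraction is touched in the process.
type VSyncer interface {
	EnableVSync()
}

func main() {
	// Early on: direct, concrete use of the single implementation.
	(&VendorGPU{}).EnableVSync()

	// Later: users who care about the feature pick up the new abstraction.
	for _, d := range []VSyncer{&VendorGPU{}, &OtherGPU{}} {
		d.EnableVSync()
	}
}
```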
A consequence of this approach seems to be more maintenance overhead: changes will have to take into account the multiple abstractions they can affect. I'll leave a fuller discussion of this for another post, but I think it's solved by adopting the same philosophy of adding new things and leaving existing ones untouched. When existing things do have to change, better tooling would alleviate almost all the problems, but creating good tooling is something our field is surprisingly bad at.
It might also seem that application code would have to change to use the newer abstractions instead of the older ones, but that isn't the case: existing abstractions can still be used normally, because the system still supports them. Just because a new abstraction is added doesn't mean the "old" one ceases to function. That's what makes the Oberon approach so interesting.
Bonus: programming language convention
The Project Oberon document also mentions a convention used in the Oberon language, in which the system's software is built:
Module names in the plural form typically indicate the definition of an abstract data type in the module.
It was only after reading this that it occurred to me that we've been overloading names with functionality for a long time. We have many ways of changing the names in our code to pack extra meaning into them. Hungarian notation is a somewhat popular example. Others include prefixing interface names with "I", dictating the order of verbs/nouns/adjectives when creating names, and changing the case depending on the modifiers or meaning of a name (UPPER_CASE for constants, camelCase for functions).
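Here's a quick illustration of those conventions, written in Go for consistency with the earlier sketches even though most of them come from other ecosystems (and are deliberately non-idiomatic in Go); every identifier carries meaning that has nowhere to live except the name itself:

```go
package main

// Each of these names smuggles extra meaning into the identifier itself.

const MAX_RETRIES = 3 // UPPER_CASE signals "this is a constant"

type IReader interface { // the "I" prefix signals "this is an interface"
	Read(p []byte) (n int, err error)
}

var strUserName string // Hungarian notation: the prefix encodes the type

func parseConfigFile(strPath string) error { // camelCase + verb-first naming
	return nil
}

func main() {}
```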
The more I think about this, the more it seems it comes from us representing code as plain text. It's an interesting observation that might be worth thinking about for a while.