Object-Oriented Programming isn't bad
This post is about a video called Object-Oriented Programming is Bad, by Brian Will. I came across it when I was studying for my degree, and I remember that I wasn’t able to follow the arguments through to the conclusion, although it left me with a general sense of agreement. It popped up again as a suggestion while watching Hammock Driven Development, so I thought I’d revisit it and see if I understand it better and if I agree, now that I have some experience.
The result is mixed. Brian’s view is nuanced, with his preferred paradigm still making use of objects, and it reflects frustrations that software developers often encounter. He gives good advice for writing maintainable code, such as parameterising instead of using global state, ensuring variables have a restricted scope, or favouring a naturally arising architecture instead of one imposed at the early stages of a project. This also seems to be the theme in what I’ve read of his blog. However, his main claim, that encapsulation does not work, is based on arguments that I find debatable. Naturally, everything that follows about Object-Oriented Programming being bad becomes questionable as well.
Encapsulation
Let’s start by briefly summarising some terminology around objects. I match Brian’s definitions here to avoid any confusion around names.
Data is usually stored in variables. The term used here for each unit of data (akin to a variable) is datatype. An object is a combination of data, called its state, and functions, which we call methods.
An object’s state is hidden in the form of private state. State can also be public, but since we’re discussing encapsulation, we’ll be strict and assume all state is private. One can interact with an object by sending it a message. That causes the receiving object to change its state or send messages to other objects.
Messages, strictly speaking
Brian’s argument against encapsulation begins like this:
In practice, sending a message to an object usually means a method call on it, but that’s not necessarily the case. Strictly speaking, messages only contain copies of state, not references to it. This means that when object A sends a message to object B, the message may contain part (or all) of A’s state, but since it’s a copy, B cannot effect change on A.
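As a minimal sketch of what that strict rule would look like (the class names `A` and `B` are just placeholders, not anything from the video):

```java
import java.util.ArrayList;
import java.util.List;

// The "message" carries a copy of part of A's state, so B can read it
// but cannot effect change on A through it.
class A {
    private final List<String> items = new ArrayList<>();

    void tell(B b) {
        b.receive(new ArrayList<>(items)); // send a copy, not a reference
    }
}

class B {
    void receive(List<String> itemsCopy) {
        itemsCopy.add("mutated"); // only the copy changes; A's state is untouched
    }
}
```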
At this point, Brian asserts that:
…this rule is not observed at all, and probably for good reason, but if we take the rule seriously…
And so, considering that his whole argument against encapsulation is based on this restriction on message sending, we could already stop here. The premise is faulty; the conclusion holds no weight.
But as he says, let’s take the rule seriously. If we want A to be able to send messages to B, then A needs to have a reference to B as part of its state. Since there’s no way for A to get a reference to B through messages (a message only contains a copy of state), then A needs to receive a reference to B when A is created, and have B as part of its state.
Brian argues that if B is part of A’s state, then a mutation in B’s state causes a change in A’s. As a result, if we don’t want to expose that A depends on B (which would mean revealing A’s internal workings, and thus break encapsulation), A must be the only one that can send messages to B. There can be no sharing of objects; the only viable structure is a tree, where each object will have a single “parent”.
There are two problems with this argument. Firstly, exposing that A depends on B does not reduce encapsulation, because that fact has already been established beforehand. After all, we are passing in B when creating A.
Secondly, there is an assumption that B cannot partially or fully take responsibility for its own state. Although it’s true that ultimately a message means a state modification, B can control in which way the state is modified. In other words, if objects A and C are both sharing object B, both A and C can mutate B’s state, but only insofar as B allows it. Brian dismisses this control by calling it “trivial”, but it is a key part of maintaining encapsulation.
For example, consider a connection to a shared resource. We don’t want to or can’t open multiple connections, and the protocol is such that there can only be one open transaction in the connection. It is much easier to designate an entity (B, in this case) to manage the connection than to try to sync its use across several different entities (A and C).
This also does not necessarily leak implementation details of A and C to whoever is using them. We could change them to use separate connections (if possible at some point), or switch one or both to use other resources. From an architecture point of view, it becomes a minimal change with a very limited blast radius. The only party that will probably have to violate this encapsulation is the developer when working on the codebase, especially when debugging, but the developer is expected to have an overall understanding of the code they’re debugging (like the fact that it’s using a shared connection to access some resource). This is true in any paradigm, not only OO.
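As a rough sketch of the idea, under my own naming (nothing here comes from the video): `SharedConnection` plays the role of B, and both A and C receive it when they are created, yet the single-transaction rule lives only inside B.

```java
// B controls in what way its state is modified: the single-open-transaction
// invariant cannot be broken from the outside.
class SharedConnection {
    private boolean transactionOpen = false;

    synchronized void runInTransaction(Runnable work) {
        if (transactionOpen) {
            throw new IllegalStateException("only one transaction may be open");
        }
        transactionOpen = true;
        try {
            work.run();
        } finally {
            transactionOpen = false;
        }
    }
}

// A and C share the same SharedConnection instance, received at creation time,
// so the dependency is visible without exposing their internals.
class A {
    private final SharedConnection connection;

    A(SharedConnection connection) {
        this.connection = connection;
    }

    void storeSomething() {
        connection.runInTransaction(() -> { /* write through the connection */ });
    }
}

class C {
    private final SharedConnection connection;

    C(SharedConnection connection) {
        this.connection = connection;
    }

    void readSomething() {
        connection.runInTransaction(() -> { /* read through the connection */ });
    }
}
```

Swapping the shared connection for separate ones would only touch the constructors, which is the limited blast radius mentioned above.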
Proper and improper
Going back a bit, Brian has established that sharing an object breaks encapsulation and so proper Object-Oriented Programming requires a tree-like object graph, where a node “owns” and can only send messages to its direct children. He goes on to describe a scenario where we have cross-cutting concerns which, I have to agree, is wild.
He then introduces an improper OOP where we would define sub-systems that adhere to this tree-like structure, but the relationships between sub-systems are less constrained. This has the same problems regarding cross-cutting concerns and does not reduce the effort required to make a change significantly. Thus corners will be cut and the “improper” architecture guidelines will be violated eventually. Both proper and improper OOP are bad.
The conclusion does follow from the premise, but considering the contents of the previous section, it is clear that the premise is flawed. Firstly, the “messages pass no references” rule does not exist (by his own admission). Secondly, even if it did exist, it does not necessarily violate encapsulation in a way that renders the concept unusable.
So no, encapsulation is not a bad thing.
I just want to write code
We now leave arguments about encapsulation behind to talk about architecture and design. To summarise the main points:
- A premature, rigid architecture is detrimental to the evolution of the codebase, because any slight change will require tearing down said architecture.
- Procedural is better than OO because:
  - There are fewer flows to think about and no self-imposed barriers.
  - OO requires assigning responsibilities and behaviours to actors, which often results in objects that do not reflect a domain concept (i.e. a real-world thing) and forces us to make decisions in ambiguous situations that would otherwise be unnecessary.
Let’s go through them in order.
Architecting too early
The first topic that Brian goes over is preestablished versus emerging architectures. This is of course framed in the context of Object-Oriented Programming, so I read two interpretations:
- Prematurely architecting the relationships between objects can be counterproductive.
- Prematurely encapsulating state as part of an object can be counterproductive.
I fully agree with the former: it is much better to start with a minimal architecture that responds to the current needs, and then modify it as the product grows. In that sense, I understand where Brian is coming from on the second interpretation (if that is the point he’s making), but I think there is an important difference between the overall architecture and an object’s design.
Re-architecting a whole system, or sometimes even a sub-system, is a significant amount of effort. This is true in any paradigm. On the other hand, redesigning an object only affects itself and any objects that use it (as long as the objects were well-designed already, otherwise the consequences branch out).
Early encapsulation also has an advantage: it conveys intent. Even if we avoid premature design, there is some architecture at any point, and an explicit description of each component (in this case, in the form of its public interface) helps us understand the context we are working in.
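For instance (a made-up example, not from the video), even a tiny public interface conveys what a component is for before any implementation exists:

```java
// The interface documents intent; the implementation behind it can change
// as the architecture evolves.
interface EnrolmentRegistry {
    void enrol(String studentId, String courseCode);

    boolean isEnrolled(String studentId, String courseCode);
}
```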
Again, business needs can cause this architecture to change in unexpected ways. Revisiting existing code to make sure that it fits the current business case should be a frequent practice. Luckily, code is malleable.
The building metaphor™
I’ll make a brief pause here and rant a bit about the metaphor used to illustrate the point about premature design.
I, like many others, have heard comparisons between software and other products of engineering many times. After all, if we consider software engineering to be what we developers do, it makes sense to compare it to architecture, or electrical, telecommunications and civil engineering.
Long ago, the process to build software was similar to building a house: gather information (requirements), plan, execute, iron out issues that were discovered during execution, and deliver. We have evolved since then though, and we do things like iterative development, maintenance, and operations on deployed software.
In other words, the “house metaphor” is not valid anymore. And if it was valid in the past, that’s not because software development naturally resembles building a house, but because we tried to use the build-a-house process to make software.
All in all, I think the metaphor hurts whoever uses it more than it helps them, because the listener is bound to hold some view under which software is not like building a house. For example, I wouldn’t be too fond of letting my house’s architecture emerge naturally, and I’m sure Brian wouldn’t be either.
Does OOP force us to think too much? (or why procedural is better than OO)
We now move on to the second point stated at the beginning of this section: the reasons why procedural programming is supposedly better than Object-Oriented.
Cognitive load
One of the reasons procedural code is better than OO, says Brian, is that there are fewer things to think about. No inheritance, composition, or data flows. Just a call graph. That’s true, but I also think it’s less of a big deal than Brian makes it out to be. By his own admission, inheritance is nowadays used sparingly and without an excess of layers, so there isn’t much of a graph to keep in mind. Then there’s also the composition graph, but (if we are doing it properly) the fact that an object uses another to achieve its function should be encapsulated. So there is no need to think about the composition graph, except when working on an object inside that graph. And even in that case, we only need to consider the immediate neighbours of the object in question.
On the other hand, if we have no encapsulation we must consider other things. How and when does our shared state change, and if an illegal state is possible, how do we mitigate or eliminate the risk? Are we expected to repeat validation logic everywhere the state changes? Of course, it would be nice to extract that logic into a function that we reuse throughout the code. We thus end up, informally, with an object, but one that is not encapsulated: there is no guarantee that every state change goes through that function, because there is no way to limit access to the state.
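A minimal sketch of the difference, with a made-up `Account` example: with encapsulation the rule cannot be bypassed, while the unencapsulated version relies on every caller remembering to use the validated function.

```java
// Encapsulated: the field is only reachable through the method that enforces the rule.
class Account {
    private long balanceInCents = 0;

    void deposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("deposits must be positive");
        }
        balanceInCents += amountInCents;
    }

    long balanceInCents() {
        return balanceInCents;
    }
}

// Unencapsulated: the validation is just a convention. Any code that can see the
// shared state is free to skip validatedDeposit() and assign the field directly.
class SharedAccountState {
    static long balanceInCents = 0;

    static void validatedDeposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("deposits must be positive");
        }
        balanceInCents += amountInCents;
    }
}
```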
And so procedural code becomes a tangled mess where we have to keep the whole context in mind, because we cannot make assumptions about the way state is accessed and modified. We can try to look at other places in the codebase for reference, and with a bit of luck they will be consistent. They probably won’t be, though, if more than one developer or team works on the same code, because someone forgot or didn’t know that there is a particular way to modify the state.
And that’s not necessarily their fault either. It may be that the developer is doing what they have always done on that codebase, while someone else recently found a better way to handle the state and made the change. With encapsulation, they would most likely have noticed, because the state they used to access directly would no longer have been reachable. But unfortunately, there’s no encapsulation and no structure. And so we still have to think, and constantly search through references to the state and the functions that mutate it.
Fictitious types
When learning Object-Oriented Programming, we are shown examples that mirror the real world very neatly.
We know that `Cat`s and `Dog`s are `Animal`s; `Student`s have `Enrolment`s to `Course`s.
We are taught to think and model in domain terms, and everything fits.
However, if we go a bit further things stop fitting so neatly. There are behaviours that do not belong in the domain. Instead, they are usually related to how we obtain, use, and make the domain objects available. For example, persisting the objects in databases or files, or choosing a particular candidate from a set of objects, or exposing their data through a JSON API.
Although I have seen file persistence handled in the object itself before, and database persistence can be partly handled in a similar way, there are indeed behaviours that do not belong. A connection or pool of connections to the database has to be maintained, and that is not very pertinent to a domain object. HTTP endpoints have the same problem, even though (de)serialisation could be considered the responsibility of the domain object (as sketched below). So if we take “proper” Object-Oriented Design to be only about the domain, we will most likely have to introduce some “improper” elements.
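As a sketch of that serialisation option (a hypothetical `Student` object, with deliberately naive JSON building and no escaping), the representation can sit on the domain object while connections and endpoints stay outside it:

```java
// The domain object knows how to represent itself; the HTTP endpoint that
// serves this string, and the connections behind it, live elsewhere.
class Student {
    private final String id;
    private final String name;

    Student(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String toJson() {
        return "{\"id\":\"" + id + "\",\"name\":\"" + name + "\"}";
    }
}
```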
This is one of the things Brian could be talking about when he mentions “behaviours that don’t naturally fit any of the obvious datatypes”.
Another is `Utils`: in Java terminology, a non-instantiable class that contains only static methods. Yet another is something that operates on several different objects, or on containers of a specific object: if we want to work with `Cat`s and `Dog`s, then probably neither `Cat` nor `Dog` is a good place for our logic.
As to the first, I would argue that most of the “introduced” concepts are actually part of the domain. If there is a database that we have to connect to, then a `Connection` is not that far removed from reality. The same applies to the object that would serve other objects through HTTP. Could you model these as plain functions instead of objects? Definitely yes, in some cases. Does having them as objects cause problems? Not as long as they are kept lean and all of their logic relates to the very specific function they perform. Quite the contrary: having the logic contained and encapsulated in these classes means we don’t have to worry about it in other parts of the codebase.
It does get out of hand once we introduce things like repositories, components, or services¹. These usually contain logic that might as well be in the object itself. Though as I have mentioned already, there is an argument for logic that uses several different objects; I will address that further below.
Secondly, in my opinion and experience, there is not much use for the `Utils` class. It often contains two types of functions: those that relate to objects we have no control over (and so we cannot add methods to them), like one that formats a `LocalDate`; and those that operate on data that has no associated object. In the first case, we could create a new object that inherits from or composes our target, and add the methods there. In the second, we could separate the utility functions by theme, but then we would probably end up with qualified utility classes instead (e.g. `RandUtils`, `FormattingUtils`).
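A sketch of the first option, with a hypothetical `ReportDate` wrapper: instead of something like a static `FormattingUtils.format(date)` helper, the object composes the `LocalDate` we cannot modify and carries the method itself.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Composition over a type we do not own: the formatting behaviour now has a home.
class ReportDate {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd MMM yyyy");

    private final LocalDate date;

    ReportDate(LocalDate date) {
        this.date = date;
    }

    String formatted() {
        return date.format(FORMAT);
    }
}
```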
Some OO practitioners favour creating an object for every datatype in the domain, even if that datatype can be represented by a primitive². This gives us a bigger pool to put our utility functions in, and so reduces the need for utility classes. I suspect Brian would label this as “unobviously structured, indirect, and with too many entities”. But these entities represent domain concepts, so if there are too many of them, or they are unclear, then it’s the domain that has a problem. Also, there is very little difference between a `String phoneNumber` plus any parsing and manipulation logic associated with it and a `class PhoneNumber`. The latter contains the same logic, and since it probably has some degree of encapsulation, you could even consider it simpler to think about from a higher-level scope.
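A quick sketch of such a `PhoneNumber` class (the validation rules are invented for illustration): the parsing and manipulation logic that would otherwise follow the `String` around now lives in one encapsulated place.

```java
// Wrapping the primitive: the constructor guarantees that no invalid
// PhoneNumber can exist, so callers no longer re-validate the String.
class PhoneNumber {
    private final String digits;

    PhoneNumber(String raw) {
        String cleaned = raw.replaceAll("[\\s-]", "");
        if (!cleaned.matches("\\+?\\d{7,15}")) {
            throw new IllegalArgumentException("not a phone number: " + raw);
        }
        this.digits = cleaned;
    }

    boolean isInternational() {
        return digits.startsWith("+");
    }

    @Override
    public String toString() {
        return digits;
    }
}
```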
Finally, there are the “high-level business logic” objects: the `Doer`s, `Handler`s or `Service`s. They contain code that operates on several domain objects or binds together our object graph, and do not really contain any data pertinent to the domain. The existence of these objects, and particularly their prevalence over domain ones, is one of Brian’s arguments against OOP. I agree with him that having these types of object is a problem. Their boundaries are often blurred and what they do is unclear. I have seen `Handler` classes that contain all the business logic of a program.
This isn’t a problem inherent to Object-Oriented Programming though.
In procedural code, a `TransactionHandler` becomes `handleTransaction()`, which tells us exactly as little, and can be a placeholder for just as much behaviour. That aside, it is true that we have objects that exist only to contain logic. Even if we split the `Handler`s into more concrete objects with better delineated responsibilities, and we move as much of the logic as possible to the domain objects, we will probably still have some classes with only behaviours and no data. If that is something to be avoided, then yes, this is a shortcoming of Object-Oriented Programming.
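As a hedged before/after sketch (an invented `Order` example): much of what a `Handler` would hold can move into the domain object, leaving behind a small class with behaviour only.

```java
// The rules live on the domain object...
class Order {
    private final long totalInCents;
    private boolean paid = false;

    Order(long totalInCents) {
        this.totalInCents = totalInCents;
    }

    void markPaid(long amountInCents) {
        if (paid) {
            throw new IllegalStateException("already paid");
        }
        if (amountInCents != totalInCents) {
            throw new IllegalArgumentException("wrong amount");
        }
        paid = true;
    }
}

// ...while the remaining "behaviour-only" class merely coordinates: it fetches,
// delegates to the domain object, and persists, but holds no business rules.
class PaymentProcessor {
    void process(Order order, long amountInCents) {
        order.markPaid(amountInCents);
    }
}
```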
Assigning responsibilities
I have already mentioned this, but I think it is good to repeat it. When discussing how assigning behaviours to objects sometimes leads to absurd situations, Brian roughly says:
Should a `Message` `send()` itself? Or should we have a `Sender` object that `send()`s `Message`s? Or a `Receiver` object that `receive()`s `Message`s? Or a `Connection` that `transmit()`s?
This is indeed absurd. But there is a solution! Just stop designing early, and stop overdesigning.
The fact that we’re trying to fully model a program before we even start writing code is mind-boggling, and I have never seen it done outside of university. There is no way we know exactly what the customer or user needs at this point, nor what technical challenges we are going to face. Why are we pretending we do? Let’s instead start with a minimal subset of the solution that validates that we understand some of what the customer wants, and keep evolving from there.
And if at some point we have to send a message, there’s no need to think so hard about it! Just take the option that fits best with the available code, or if sending a message is the first thing we’re writing for some reason, we can just pick any! We are not programming on stone tablets, so we have the luxury of deleting and moving and redesigning!
Also, how do you end up with a codebase that has `Message`s, `Sender`s, `Receiver`s and `Connection`s, all at the same time, without already having an idea of how the process of sending a message is going to work? Coming up with objects before thinking about what they are going to do is a very bad way to design anything. It would be just like writing all the functions (but only their headers) in a procedural program before doing anything else³.
Wrapping up
I’m going to skip the part regarding how to write procedural code well because I think most of it is actually good advice. Not just for procedural, but also for OO. There are some things that I don’t agree with, but I consider them for the most part addressed in this post or a matter of opinion.
However, I’d like to note that earlier in the video Brian labels OOP best practices as “bandaids”. Going on to give advice on how to write procedural code well and “mitigate its problems” does not leave a very good impression; it looks like he is applying bandaids to procedural programming as well. Of course, the available corpus on OOP is larger, but had procedural been the go-to paradigm for the last few decades, I’m sure we would have accumulated a comparable body of procedural best practices.
All in all, I don’t think Brian’s view and mine are so different. I would favour relaxing the constraints that theoretical object-oriented design establishes. In practice, we can let the design appear naturally through several iterations (and diligent refactoring). His point about using objects only when needed resonates with me, although I probably consider that I need them more often than he does.
Regardless, I appreciate that Brian has made the effort to promote his view with a coherent and well-thought argument. It is a lot of work to plan and put together such a project, and I have only touched on his first video here (there are two more). Even if I don’t agree, it has made me think about why that is the case and how I can improve in my work. That’s something many of us should be doing more frequently.
1. I’m referring to Spring concepts here. There are many different definitions for these words, which only compounds the problem if we want to use them.
2. The previous point about not architecting too early is worth considering here. It is possible to extract a primitive datatype into an object when the need arises, and that’s better than wrapping all datatypes with an object from the start.
3. Does procedural require that we do that? No. Neither does OO.