The Foundation of most software principles
Cohesion is a measure of how strongly related and focused the various responsibilities of a software unit are. Cohesive software units are easy to comprehend and are more reusable. A module that does only one thing (and does it well) is more likely to provide value in different contexts than a module that aggregates many unrelated behaviors. A software unit can refer to a software module, class, or function. Those principles apply at all levels of abstraction.
In essence, high cohesion means keeping parts of a codebase that are related to each other in a single place. In theory, the guideline looks pretty simple. In practice, however, you need to dive deep enough into the domain model of your software to understand which parts of your codebase are actually related. Perhaps the lack of objectivity in this guideline is the reason why it’s often so hard to follow.
Let’s understand it better with an example:
This is an example of a cohesive class. We have an Interactor responsible for exposing specific business actions that belong to the Employee class. It’s cohesive since its only responsibility is to contain all the business logic regarding a single entity.
The above represents a noncohesive version of the same class. We have added a function that calculates the color of a text field based on an employee’s salary. It breaks the cohesion of the class because the rest of the functions apply business logic to the employee entity, while the mapColorForAnnualSalary function applies presentation logic. Responsibilities are mixed, and this class is not cohesive anymore. If we add all the responsibilities of the employee entity in a single class, soon, this class will become bloated with too much unrelated code.
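As a minimal sketch of the class discussed here, written in TypeScript (an assumption, since the text does not name a language): `EmployeeInteractor` and `mapColorForAnnualSalary` come from the text, while the `Employee` fields, the `EmployeeRepository` interface, the two business methods, and the salary threshold and color names are hypothetical. Without the marked method the class is the cohesive version; with it, presentation logic is mixed in.

```typescript
// Hypothetical entity; the fields are assumptions made for this sketch.
interface Employee {
  id: string;
  name: string;
  annualSalary: number;
}

// Hypothetical data-access boundary used by the Interactor.
interface EmployeeRepository {
  findById(id: string): Employee | undefined;
  save(employee: Employee): void;
}

class EmployeeInteractor {
  constructor(private readonly repository: EmployeeRepository) {}

  // Business logic about the Employee entity: this belongs here.
  raiseSalary(id: string, percent: number): Employee {
    if (percent <= 0) {
      throw new Error("Raise percentage must be positive");
    }
    const employee = this.getOrThrow(id);
    const updated: Employee = {
      ...employee,
      annualSalary: employee.annualSalary * (1 + percent / 100),
    };
    this.repository.save(updated);
    return updated;
  }

  // Business logic about the Employee entity: this belongs here too.
  monthlySalary(id: string): number {
    return this.getOrThrow(id).annualSalary / 12;
  }

  // Presentation logic: this is the method that breaks cohesion.
  // The threshold and the color names are invented for illustration.
  mapColorForAnnualSalary(annualSalary: number): string {
    return annualSalary >= 100000 ? "green" : "red";
  }

  private getOrThrow(id: string): Employee {
    const employee = this.repository.findById(id);
    if (employee === undefined) {
      throw new Error(`Unknown employee: ${id}`);
    }
    return employee;
  }
}
```

One possible remedy, again only as a suggestion, is to move `mapColorForAnnualSalary` into a type that belongs to the presentation layer, so that the Interactor is left with business logic only.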
The above is a cohesion violation, but not the worst kind. A more severe violation would be adding a function that performs HTTP calls. The class would then combine business logic with low-level data logic.
High cohesion helps a lot with readability and reusability.
Coupling represents the degree to which a single software unit is dependent on others. In other words, it is the number of connections between two or more units and how strong those connections are: the fewer and softer the connections, the lower the coupling. Hard boundaries mean soft connections; soft boundaries mean hard connections. High coupling is problematic. When everything is connected:
A change in one portion of the system can break the behavior in another, distant part of the system. Imagine several classes depending on a single function that has to change its signature. All its clients now need to change in order to compile. This is the so-called ripple effect. As Uncle Bob says: “This system exhibits the symptom of fragility.”
High coupling minimizes opportunities for reuse. Importing a module that depends on several other modules and third-party libraries makes you depend on all of those as well. Those are called transitive dependencies. We have all felt the pain of being unable to use two third-party libraries together simply because they depend on different versions of another library. Such version clashes are a symptom of high coupling.
Tightly coupled units are hard to test. Testing a class with many collaborators often requires considerable setup, side effects, or both. If we use mocks, every additional collaborator increases the effort needed to mock them. If we avoid mocks, high coupling increases the number of side effects: a class can be reported as defective while, in reality, the defect resides in one of its dependencies. This case consumes more effort, since we need more time to investigate and correct the defect.
Let’s take a look at an example to understand the above better:
We are re-using the previous EmployeeInteractor class; it also serves as an example of low coupling. The EmployeeInteractor has only one dependency, the Repository, which it uses for data access. This dependency is necessary to comply with the high cohesion principle, that is, to avoid performing unrelated actions itself. Let’s now take a look at an example of high coupling:
The class now has four dependencies: the employee repository, the file, the shared preferences, and the UsersRepository. Those are unnecessary; the file and shared preferences dependencies should be moved to the Repository.
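A highly coupled Interactor of this kind might look like the following TypeScript sketch. Only the four kinds of dependency are taken from the text; the interface names `EmployeeFile` and `EmployeeSharedPrefs`, every method signature, and the `openProfile` operation are hypothetical. The dependencies are declared as interfaces only to keep the sketch self-contained, so it illustrates the number of dependencies rather than how concrete they are.

```typescript
interface Employee {
  id: string;
  name: string;
}

interface User {
  id: string;
  employeeId: string;
}

// Hypothetical contracts for the four collaborators named in the text.
interface EmployeeRepository {
  findById(id: string): Employee | undefined;
}
interface EmployeeFile {
  append(line: string): void;
}
interface EmployeeSharedPrefs {
  putString(key: string, value: string): void;
}
interface UsersRepository {
  findByEmployeeId(employeeId: string): User | undefined;
}

class EmployeeInteractor {
  constructor(
    private readonly employeeRepository: EmployeeRepository,
    private readonly employeeFile: EmployeeFile,
    private readonly employeeSharedPrefs: EmployeeSharedPrefs,
    private readonly usersRepository: UsersRepository,
  ) {}

  // A hypothetical operation that touches every collaborator.
  openProfile(employeeId: string): string {
    const employee = this.employeeRepository.findById(employeeId);
    if (employee === undefined) {
      throw new Error(`Unknown employee: ${employeeId}`);
    }
    // Storage details leak into the business layer; per the text,
    // these two calls belong behind the Repository.
    this.employeeFile.append(`opened:${employee.id}`);
    this.employeeSharedPrefs.putString("lastOpenedEmployee", employee.id);
    // A second business entity enters the class.
    const user = this.usersRepository.findByEmployeeId(employee.id);
    return user === undefined ? employee.name : `${employee.name} (${user.id})`;
  }
}
```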
This example breaks the low coupling principle, the separation of concerns, and the high cohesion principle. The employee Interactor class now has more than one concern and knows about more than one business entity. Therefore, we can conclude that those objectives complement each other and often go side-by-side. A highly coupled software unit will most likely not be cohesive, and it will mix unrelated concerns.
Let’s get back to the connection between coupling and testability. If we used mocks, we would have to mock the behavior of all four collaborators (the Employee repository, Employee file, Employee shared prefs, and Users repository) just to be able to test a single class. This is a lot of work, and it grows quickly with every added dependency because, besides the happy path, we also have to test the error cases.
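To put a rough number on that growth, here is a small TypeScript sketch under a simplifying assumption that is not in the text: each collaborator can either succeed or fail, and a thorough suite covers every combination. The function name `scenariosFor` and the collaborator labels are hypothetical. Under that assumption the number of scenarios doubles with each collaborator.

```typescript
type Outcome = "ok" | "error";
type Scenario = Record<string, Outcome>;

const outcomes: Outcome[] = ["ok", "error"];

// Enumerates every combination of outcomes for the given collaborators.
function scenariosFor(collaborators: string[]): Scenario[] {
  let scenarios: Scenario[] = [{}];
  for (const name of collaborators) {
    const next: Scenario[] = [];
    for (const scenario of scenarios) {
      for (const outcome of outcomes) {
        next.push({ ...scenario, [name]: outcome });
      }
    }
    scenarios = next;
  }
  return scenarios;
}

// One dependency, as in the low-coupling example.
const lowCoupling = scenariosFor(["employeeRepository"]);
// lowCoupling.length is 2

// Four dependencies, as in the high-coupling example.
const highCoupling = scenariosFor([
  "employeeRepository",
  "employeeFile",
  "employeeSharedPrefs",
  "usersRepository",
]);
// highCoupling.length is 16
```

Real suites rarely cover every combination, so treat the count as an upper bound for this simplified model rather than a measurement.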
We also have the option to avoid using mocks, in accordance with the Chicago School of TDD. Then, the Employee Interactor class tests could be reported defective for a bug in any of its dependencies.
Besides the number of dependencies, coupling also depends on how tightly each dependency is bound; in other words, on how soft or hard our boundaries are. Coupling to abstractions (interfaces, protocols) is loose. Coupling to concretions (classes, implementations) is tight. Depending on concrete classes reduces the reusability of the unit, as we cannot use it in different contexts. A class that depends on interfaces or protocols rather than concretions is more flexible. It may have the same number of dependencies, but it’s not tied to concretions; therefore, the coupling is lower.
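The difference can be sketched in TypeScript as follows. The `EmployeeRepository` interface, the `InMemoryEmployeeRepository` class, the `greeting` method, and the stub are all hypothetical names introduced for illustration. The Interactor knows only the interface, so either implementation can be passed in without changing it.

```typescript
interface Employee {
  id: string;
  name: string;
}

// The boundary: the Interactor depends on this contract only.
interface EmployeeRepository {
  findById(id: string): Employee | undefined;
}

// One hypothetical concretion behind the boundary.
class InMemoryEmployeeRepository implements EmployeeRepository {
  constructor(private readonly employees: Employee[]) {}

  findById(id: string): Employee | undefined {
    for (const employee of this.employees) {
      if (employee.id === id) {
        return employee;
      }
    }
    return undefined;
  }
}

class EmployeeInteractor {
  // No concrete class is named here, so the coupling stays loose.
  constructor(private readonly repository: EmployeeRepository) {}

  greeting(id: string): string {
    const employee = this.repository.findById(id);
    return employee === undefined ? "Unknown employee" : `Hello, ${employee.name}`;
  }
}

// The same Interactor works with a real implementation...
const withInMemory = new EmployeeInteractor(
  new InMemoryEmployeeRepository([{ id: "e1", name: "Ada" }]),
);
// ...and with a hand-written stub, for example in a test.
const withStub = new EmployeeInteractor({
  findById: () => ({ id: "stub", name: "Stub" }),
});

// withInMemory.greeting("e1") returns "Hello, Ada"
// withStub.greeting("anything") returns "Hello, Stub"
```

Had the constructor asked for `InMemoryEmployeeRepository` directly, the class would have had the same number of dependencies but could not have been reused with any other data source.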
Levels of Coupling & Cohesion
We have used the term software unit to define cohesion and coupling. In our practical examples, the unit was a class. But as we explained, cohesion and coupling apply to any level of abstraction. They apply to units as small as functions or classes and as big as software modules or even whole software systems.
We identify a tightly coupled module when:
It depends on many other modules.
Many of its classes depend on classes in different modules, that is, there isn’t a single point, preferably an interface, that handles the external communication.
A tightly coupled module cannot be developed independently, and it is hardly reusable.
Moving to cohesion, we identify a non-cohesive module when the tasks it executes are not logically related. Cohesion is more subjective than coupling, but the two are closely related. We intentionally couple things that want to live together into cohesive units (types/classes/modules etc.).
Types of code based on Cohesion & Coupling
The diagram above categorizes the code based on cohesion and coupling. Let's break it down:
High cohesion and low coupling is the ideal, Clean situation.
Low cohesion and low coupling indicate that we have drawn our boundaries in a logically poor way. Thus we have introduced multiple monoliths within the system. The monoliths are not very dependent on each other; that’s why we have low coupling. But the monoliths are also not focused; that’s why we have low cohesion. Each monolith does a lot of unrelated work, making it harder to read, understand, and maintain.
High cohesion and high coupling indicate the need to introduce more abstractions and boundaries into our system. The units are small and cohesive but depend on concrete implementations rather than abstractions. In other words, there aren’t enough boundaries, and the existing boundaries are soft. This makes the units rigid and harder to reuse in different contexts. We should also segregate and redesign the interfaces to minimize the number of dependencies.
Low cohesion and high coupling is the worst situation to be in. A code smell that emerges from this situation is what we call a God object. A God object is an object that references a large number of distinct types and/or has too many unrelated or uncategorized methods. In other words, it’s an object that does too much, knows too much, and is known by too many.
Cohesion, coupling, and separation of concerns are interconnected. Usually, you achieve high cohesion and low coupling if you apply proper separation of concerns. Most technical principles aim to achieve those high-level technical objectives.