A couple of weeks ago, while I was working on one of my side projects, I was browsing through the source code of the Java Collections Framework. I’ve always thought of standard libraries and anything else that’s part of the JDK as magical code, super optimized for all contexts, and therefore probably written in a lower-level language. But suddenly it hit me that no, I’m actually looking at plain Java code that looks surprisingly similar to a lot of other Java code, except that it’s very extensively documented and often contains smart constructs that do an amazing job for performance.

This realization made me start thinking more about the nature of software, and how it looks similar across different levels of abstraction. The source code handling a service request on the edge of an API, orchestrating input validation, high-level business logic and a bit of exception handling, will often look very similar to the source code performing basic input validation on numbers and strings on a lower level of abstraction. But, as I realized, it also looks very similar to the source code inside the JDK, e.g. implementing how an ArrayList or a HashSet should function.
Survival Strategy
This fractal aspect of software, the self-similarity of software across multiple layers of abstraction, didn’t happen by accident. In fact, one could say it’s there by design, or more precisely: it’s the survival strategy for us, software engineers, in a world full of complexity. Breaking a complex problem down into smaller problems that we cope with in single functions, and then tying these relatively simple functions together into a working program, is one of the fundamentals of software engineering.
To make this more tangible, let’s have a look at a trivial example, the validation of dates. At the highest level, a date simply consists of eight digits, but we all know that not all combinations of eight digits produce valid dates. Don’t try this at home (because you should use the functions that are part of the standard libraries), but here’s how you can validate a date:
- A date has eight digits (YYYYMMDD).
- Months (MM) can only go from 01 to 12, and days of the month (DD) from 01 to 31.
- For April, June, September and November, days of the month can only go from 01 to 30, and for February from 01 to 28.
- Every fourth year is a leap year, in which case February goes from 01 to 29.
- However, every 100th year is a common year, but every 400th is a leap year again.
- For dates before 15 October 1582, the Gregorian reform should be taken into account.
- However, the date of adoption of the Gregorian calendar differs from country to country, and sometimes even within a country.
- There’s no year zero.
- And dates far in the past or far in the future probably need some special care too.
Notice that it’s easy to grasp the complexity at each of these levels, with the notable exception of the seventh, while the last level is basically a disclaimer that helps to scope the problem down. But it’s certainly easier to perform, implement and explain the validation of dates this way than to write down one single rule that tries to incorporate everything at once.
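To make the levels a bit more concrete, here’s a minimal sketch of the first five of them in plain Java. The class and method names (DateLevels, hasEightDigits and so on) are made up for this illustration, and levels six to nine are deliberately ignored, so a year like 0000 slips through. It’s a sketch of the idea, not something to use instead of java.time.

```java
// Hypothetical illustration of levels 1-5 from the list above.
// Levels 6-9 (Gregorian reform, year zero, extreme dates) are NOT handled.
final class DateLevels {

    // Level 1: a date has exactly eight digits (YYYYMMDD).
    static boolean hasEightDigits(String s) {
        return s != null && s.matches("[0-9]{8}");
    }

    // Level 2: months go from 01 to 12, days from 01 to 31.
    static boolean inBasicRanges(int month, int day) {
        return month >= 1 && month <= 12 && day >= 1 && day <= 31;
    }

    // Levels 4 and 5: every 4th year is a leap year,
    // except every 100th, except every 400th.
    static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Level 3: month lengths, with February depending on the leap-year rule.
    static int daysInMonth(int year, int month) {
        switch (month) {
            case 4: case 6: case 9: case 11:
                return 30;
            case 2:
                return isLeapYear(year) ? 29 : 28;
            default:
                return 31;
        }
    }

    // Ties the small functions together into one check.
    static boolean isValid(String s) {
        if (!hasEightDigits(s)) {
            return false;
        }
        int year = Integer.parseInt(s.substring(0, 4));
        int month = Integer.parseInt(s.substring(4, 6));
        int day = Integer.parseInt(s.substring(6, 8));
        return inBasicRanges(month, day) && day <= daysInMonth(year, month);
    }
}
```

Each method is only a few lines long and deals with exactly one level, and isValid does little more than call them in order.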
Also notice that even though the validation at level 5 operates at a much smaller scale than the validation at level 2, the logic and the effort needed to implement them would be more or less the same. It’s this self-similarity that lies at the basis of the fractal nature of software. And this has some consequences.
As I already mentioned, the biggest consequence is of course that this is what allows us to write any working software at all. Imagine it weren’t possible to write source code operating on more than two or three levels of abstraction for the entire system. We all know what this would lead to (and some of us have been there): either very simple programs doing some trivial work, or systems with methods that go on for ages and are impossible to test, change, maintain or extend in a safe way.
Cutting Corners or Gold Plating?
But another consequence is that this allows us to reason about how much complexity we need to implement in order to produce a functioning system. Let’s go back to the validation of dates: do we always need to take into account all nine levels of complexity? If you only want to validate the birth dates of your website users, the first four levels would already suffice. Most people will feel more comfortable implementing the fifth level too, but the only century year any living user could have been born in is 2000, which happens to be a leap year. For every year from 1901 to 2099, the simple every-fourth-year rule gives the right answer, so the first four levels are actually enough. And if you’re only designing a graphical user interface, you don’t need to care about anything other than the first level.
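The claim about birth dates is easy to check. Here’s a small sketch that compares the four-level leap-year rule with the five-level one over a range of years and collects the years where they disagree. The names (LeapRuleComparison, fourLevelLeap, fiveLevelLeap, disagreements) are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper comparing the simple and the full leap-year rule.
final class LeapRuleComparison {

    // Level 4 only: every fourth year is a leap year.
    static boolean fourLevelLeap(int year) {
        return year % 4 == 0;
    }

    // Levels 4 and 5: century years are leap years only if divisible by 400.
    static boolean fiveLevelLeap(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Returns every year in [from, to] for which the two rules disagree.
    static List<Integer> disagreements(int from, int to) {
        List<Integer> result = new ArrayList<>();
        for (int year = from; year <= to; year++) {
            if (fourLevelLeap(year) != fiveLevelLeap(year)) {
                result.add(year);
            }
        }
        return result;
    }
}
```

Calling disagreements(1901, 2099) returns an empty list, while disagreements(1900, 2100) returns the two years 1900 and 2100. So whether the fifth level is gold plating depends entirely on the range of dates the system has to handle.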
This is also what makes software engineering so difficult: how do you know at which level you should stop? Or to put it in a different way: when are you cutting corners (too few levels), and when are you gold plating (too many levels)? If you’re working on an application to be used in history class, you may need all levels. But an application to keep track of your future appointments doesn’t need to be able to handle the Gregorian reform.
Lately, I’m getting more and more convinced that the problem with estimating in software engineering doesn’t stem from the creative aspect of it, but from its fractal nature. The difficulty isn’t guessing how long it takes to implement some logic. The difficulty is deciding what the minimal amount of logic is that you need to implement to produce a functioning system. If you underestimate it anywhere, you’ll need to refactor, which is basically another word for adding an extra level of complexity to the already existing code.
But I also think this is why architects often tend towards lower estimates than developers. They are used to operating with dates at a level where leap years are just an annoying detail, and not a bunch of broken unit tests. One level higher up, managers don’t understand why it takes so long to validate something as simple as eight digits. After all, it’s not like it needs to work all the way back to the Big Bang, as long as it can cover dates back to the Roman Empire…
Software as a Fractal was originally published in Compendium on Medium.


