Refactoring might present itself as a dreaded beast with hundreds of raging heads, eager to devour developers. It can also be a fulfilling experience and an opportunity to get to know your application, its architecture, and its idiosyncrasies. As a code base grows, it rots: software accumulates technical debt, dirty fixes, and quick hacks. After some time, there might be no easy or quick way of adding features and changing behavior. This article sheds light on the refactoring process and its concepts.
“The fundamental problem with program maintenance is that fixing a defect has a substantial (20-50 percent) chance of introducing another. So, the whole process is two steps forward and one step back.”
– Frederick P. Brooks, Jr. (The Mythical Man-Month)
What is Refactoring
Refactoring is the process of clarifying and simplifying the design of existing code without changing its behavior. We can categorize refactorings by goal and size: code refactoring and architecture refactoring. Code refactoring aims to increase the quality of the code. It consists of small tasks, which may or may not be chained together. Architecture refactoring is the bigger counterpart. It reorganizes the existing code into new logical layers and represents deeper changes, usually to the software architecture or infrastructure.
Refactoring should make the software easier to change and improve its efficiency and legibility. Code that is never refactored tends to rot. Every time we change code without refactoring it, the rot worsens and spreads. Code rot increases frustration, costs time, and shortens the lifespan of useful systems. This alone can mean the difference between meeting an iteration deadline and missing it.
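To make "without changing its behavior" concrete, here is a minimal sketch of a small code refactoring. The pricing rule, the function names, and the constants are all hypothetical, invented only for illustration.

```python
# Before: correct, but the pricing rule is buried in a single expression.
def price_before(quantity, unit_price):
    return quantity * unit_price - (quantity * unit_price * 0.1 if quantity > 100 else 0)


# After: the same calculation, with each step given a name.
BULK_THRESHOLD = 100       # hypothetical business rule
BULK_DISCOUNT_RATE = 0.1   # hypothetical business rule


def base_price(quantity, unit_price):
    return quantity * unit_price


def bulk_discount(quantity, unit_price):
    if quantity > BULK_THRESHOLD:
        return base_price(quantity, unit_price) * BULK_DISCOUNT_RATE
    return 0


def price_after(quantity, unit_price):
    return base_price(quantity, unit_price) - bulk_discount(quantity, unit_price)
```

Both versions return the same result for every input. Only the structure changed, which is what separates a refactoring from a rewrite or a bug fix.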
A popular metaphor for refactoring is cleaning the kitchen as you cook. In a kitchen that prepares several complex meals per day, you will see continuous cleaning and reorganizing. Someone handles the dishes, the pots, the kitchen itself, the food, and the refrigerator, keeping everything clean and organized from moment to moment. Without this, continuous cooking would soon collapse.
Another good metaphor is that coding is like gardening. Your code base grows and your domain changes as time passes. If left unattended, the plants overgrow and weeds sprout in unexpected places.
Some might say that refactoring is a dangerous activity that risks destabilizing working code. It is, if not done properly. Before trying to make significant changes to a system, ensure that you are not causing any harm. This is almost a sort of “software development Hippocratic oath”. We can use documentation and testing to catch and resolve problems introduced by refactoring. Keeping the description of the actual behavior up to date and having decent coverage from a unit test suite are always good practices.
Others might say it is a waste of resources. In fact, refactoring is how you pay down technical debt. Technical debt is the idea that the complexity of a system builds up over time and will have to be dealt with at some point in the future. There is a time and a place to pay such debts. You would not try to repay a loan until you had the cash to do it. Likewise, it is not a good idea to go around refactoring during a critical stage in development.
Why should I refactor code?
There is no substitute for writing code, and no amount of up-front planning or experience can replace that. Also, hindsight is easier than foresight. Software is one of the most complex things created by humans, so it is not easy to consider everything beforehand. For large projects it can even be impossible for the team to consider everything before actually starting to develop it.
There are many reasons one might want to refactor. To be brief, here is a short list of possible reasons:
- To prevent code rot, keeping the code easy to maintain and extend
- To improve the design/architecture of the software
- To make the software easier to understand
- To find bugs
- To increase the software's efficiency or speed, or to lower its memory requirements
- To provide greater consistency for use
- To reduce code duplication
- To support new requirements
- To correct a poor initial understanding of the requirements
When should I refactor code?
An easy rule is: refactor early, refactor often. Refactoring early means the necessary changes are still fresh in your mind. Refactoring often means the changes tend to be smaller. Delaying refactoring only lets the mess grow, which in turn makes it harder to refactor. Cleaning up as soon as you notice the mess prevents it from building up and becoming a problem later.
If your code base is difficult to understand and to change, it may be slowing you down. That is a sign that you need to do more refactoring to improve your efficiency. For most teams, this means putting more effort into the day-to-day refactoring workflows.
Again, for brevity’s sake, here is a list of good times to refactor:
- Assuming you branch for each task, then on each new branch, before it goes to QA
- If you develop all in the master trunk, then before each commit
- When maintaining old code, do refactoring on major releases that will get extra QA
- When refactoring will speed up the task at hand
- When the refactoring is quick and contained
- When your unrefactored code has hit the three strikes rule (the third time you find yourself duplicating something, refactor it)
- When you are at risk of digging a hole that will take more than a day to fix
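To illustrate the three strikes rule from the list above, here is a hedged sketch. It assumes a hypothetical user registry in which the expression `name.strip().lower()` had been copied into three functions. On the third copy, the duplication is extracted into one helper. All names are made up for this example.

```python
def normalize(name):
    """The expression that had been copy-pasted into three functions."""
    return name.strip().lower()


def find_user(users, name):
    return users.get(normalize(name))


def add_user(users, name, data):
    users[normalize(name)] = data


def remove_user(users, name):
    users.pop(normalize(name), None)
```

If the normalization rule changes later, for example to collapse inner whitespace, there is now a single place to update.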
How to refactor code
I will not cover the technical details of how to refactor each kind of problem. What I will cover is a set of general workflows, as described by the authors below. To better understand the practical steps, you can read books like:
- Refactoring by Martin Fowler, with Kent Beck
- Refactoring to Patterns by Joshua Kerievsky
- Working Effectively with Legacy Code by Michael Feathers
To use refactoring effectively, you need to combine all the workflows.
Refactoring as a TDD step
TDD stands for Test-Driven Development. These are the main steps:
- Get a change or feature request
- Create acceptance and edge-case tests (which will fail at first)
- Make changes and improvements so that all tests pass
- Refactor and ensure the tests still pass
- Commit the code to a version control system
While making the test pass, we can focus on adding the new functionality, without thinking about how it should be best structured. Once things are working, we can concentrate on good design, now working in the safer refactoring mode of small steps on a green test base.
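The cycle above can be sketched with Python's built-in `unittest` module. The `slugify` function and its test cases are hypothetical. The tests are written first, and the function body shown is what might remain after the refactoring step, once a first, clumsier version had already turned the tests green.

```python
import re
import unittest


def slugify(title):
    """Refactored version; an earlier passing version could have been a
    chain of str.replace() calls that was cleaned up once the tests were green."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifyTest(unittest.TestCase):
    # Written before slugify() existed, so these failed at first.
    def test_acceptance_case(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_edge_cases(self):
        self.assertEqual(slugify(""), "")
        self.assertEqual(slugify("  A -- B  "), "a-b")
```

Saved as a module, the tests can be run with `python -m unittest`. As long as they stay green, the implementation can keep changing shape.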
People should always be on the lookout for substandard code. As a team learns, the excellent solution from months ago may come to look like a poor decision, or not good enough for the new requirements.
Fixing things right away is a good move if it is a simple fix, or if the fix will make it easier to add a feature. Use refactoring to clean up the problematic code. If it ends up taking longer than is reasonable, stash the refactoring and come back to it later. If it is too awkward to stash the work-in-progress, make a note of the refactoring and work on it after finishing the feature.
Use the boy-scout rule: always leave the code better than when you found it. It’s important to do that refactoring on a stable code-base, with all its tests on green (passing). To get to a green state you can stash the current work, or otherwise disable any work-in-progress that is causing tests to fail.
Refactoring to understand
Ward Cunningham explained it like this: Whenever you have to figure out what code is doing, you are building some understanding in your head. Once you have built it, you should move that understanding into the code. This way nobody has to build it from scratch in their head again. Code that is easy to understand means it is cheap to use and change. But clear code, like clear writing, is hard to do. Often you can only tell how to make it clear when someone else looks at it, or you come back to it at a later date.
Michael Feathers talks about Scratch Refactoring in his book. It is what Martin Fowler calls “Refactoring to Understand”. It is the practice of taking code that you do not understand (or cannot stand) and cleaning it up, so that you can get a better idea of what is going on before you start to actually work on it, or to help in debugging it. Here are some examples of this exploratory refactoring:
- Renaming variables and methods once you figure out what they mean
- Deleting code that you don’t want to look at (or think is not working)
- Breaking complex conditional statements down
- Breaking long routines into smaller ones
Don’t bother reviewing and testing all these changes. The point is to move fast. This is a quick and dirty prototype to give you a view into the code and how it works. Learn from it and throw it away.
Scratch refactoring also lets you try out different refactoring approaches and learn more about refactoring techniques.
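As an illustration of the exploratory refactorings listed above (renaming, and breaking a complex conditional down), here is a throwaway sketch. The cryptic `chk` function, the dictionary keys, and every new name are hypothetical. The "after" half only records what a reader might have learned while studying the code.

```python
# Before: what is actually being checked here?
def chk(u, d):
    return u["a"] and not u["b"] and (d - u["t"]) < 30


# After scratch refactoring. Learned from reading the (imaginary) callers:
# "a" = active flag, "b" = banned flag, "t" = day number of the last login.
MAX_IDLE_DAYS = 30


def is_active(user):
    return user["a"] and not user["b"]


def logged_in_recently(user, today):
    return (today - user["t"]) < MAX_IDLE_DAYS


def can_receive_newsletter(user, today):
    return is_active(user) and logged_in_recently(user, today)
```

In the spirit of scratch refactoring, the result is a learning aid, not something to commit. The understanding it produced is what you keep.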
When starting a new task, you can refactor the code before making changes. This way you can confirm your understanding of the code and make it easier and safer to put your change in. Add regression tests to safeguard your refactoring work. Then make your fix or change and test again.
After refactoring, ensure that your application still behaves the same way. You might need extra unit and integration tests and a thorough manual test.
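One possible way to check that behavior stayed the same is to run the old and the new code side by side over a grid of inputs before deleting the old version. The sketch below assumes a hypothetical shipping-cost routine. The fee values and function names are invented for the example.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical legacy routine whose behavior must be preserved.
    c = 5
    if weight_kg > 10:
        c = c + (weight_kg - 10) * 2
    if express:
        c = c * 2
    return c


def shipping_cost(weight_kg, express):
    # Refactored candidate with the magic numbers named.
    base_fee, free_weight_kg, fee_per_extra_kg = 5, 10, 2
    extra = max(weight_kg - free_weight_kg, 0) * fee_per_extra_kg
    multiplier = 2 if express else 1
    return (base_fee + extra) * multiplier


def check_same_behavior():
    """Compare old and new over several inputs, including the boundary at 10 kg."""
    for weight in (0, 1, 10, 11, 25):
        for express in (False, True):
            old = legacy_shipping_cost(weight, express)
            new = shipping_cost(weight, express)
            assert old == new, (weight, express, old, new)
```

A check like this only covers the inputs it lists, so it complements the unit, integration, and manual tests. It does not replace them.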
Often you start working on new functionality and realize the existing structures do not play well with what you need to do. In this situation it usually pays off to begin by refactoring the existing code into the shape that you need. Refactoring first and then adding the change is often faster than trying to add the change directly.
A common analogy for this is prep work before painting. Scraping the surface and taping the edges are not painting, but they often make the actual painting go faster and result in a longer-lasting job.
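A minimal sketch of this kind of prep work, assuming a hypothetical report module that could only emit comma-separated output and now needs a tab-separated variant. The function names are made up for illustration.

```python
# Step 1 (preparatory refactoring): turn the hard-coded separator into a
# parameter. The default keeps existing callers of report_csv() unchanged.
def render_rows(rows, separator=","):
    return "\n".join(separator.join(str(cell) for cell in row) for row in rows)


def report_csv(rows):
    return render_rows(rows)


# Step 2 (the actual feature): with the code in the right shape,
# the new format is a one-line addition.
def report_tsv(rows):
    return render_rows(rows, separator="\t")
```

Keeping the two steps in separate commits makes it easier to verify that step 1 changed no behavior before step 2 adds any.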
Many teams schedule refactoring as part of their planned work, using “refactoring stories”. They use these to fix larger areas of problematic code that need dedicated attention.
Planned refactoring is a necessary element, but it is also a sign that the team has not done enough refactoring using the other workflows. If most of your refactoring is planned, you should consider incorporating the other refactoring workflows into your work.
Long term refactoring
Some refactoring requires bigger changes that do not fit in a single episode. This large-scale restructuring can still use refactoring. The team needs to agree on a rough end-state as well as a rough plan to get there. Then, during their regular work, take the opportunity to refactor towards the desired direction.
One common technique for long-term refactoring is Branch by Abstraction. To apply it, introduce an abstraction layer that supports the current and the new implementations at the same time, then migrate callers to the new implementation gradually.
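As a rough sketch of Branch by Abstraction, assume a hypothetical message store that is being migrated from an old implementation to a new one. The class names, the in-memory stand-ins, and the `use_new_store` toggle are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """Abstraction layer that all callers depend on during the migration."""

    @abstractmethod
    def save(self, key, message):
        ...

    @abstractmethod
    def load(self, key):
        ...


class LegacyStore(MessageStore):
    """Stand-in for the current implementation (simulated in memory)."""

    def __init__(self):
        self._files = {}

    def save(self, key, message):
        self._files[key + ".txt"] = message

    def load(self, key):
        return self._files.get(key + ".txt")


class NewStore(MessageStore):
    """Stand-in for the implementation being introduced."""

    def __init__(self):
        self._items = {}

    def save(self, key, message):
        self._items[key] = message

    def load(self, key):
        return self._items.get(key)


def make_store(use_new_store=False):
    # Hypothetical toggle: flip it per caller or per environment while
    # both implementations live side by side on the main line.
    return NewStore() if use_new_store else LegacyStore()
```

Callers only see `MessageStore`, so the switch can happen incrementally. Once nothing uses `LegacyStore`, it and the toggle can be deleted.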
Refactoring to patterns
Refactoring does not only occur at low code levels. Joshua Kerievsky, in his book Refactoring to Patterns, makes the case that we should use refactoring to introduce design patterns into code. He argues that patterns are often overused and introduced too early into systems.
He follows Fowler’s original format and shows specific recipes for getting your code from point A to point B. Kerievsky’s recipes are generally higher level than Fowler’s, and often use Fowler’s refactoring as building blocks. Kerievsky also introduces the concept of refactoring towards a pattern. He describes how design patterns have several implementations, or depths of implementation.
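To give a flavor of refactoring towards a pattern, the sketch below moves a growing conditional towards a simple Strategy-style lookup. It is my own illustration, not one of Kerievsky's recipes, and the discount kinds and function names are hypothetical.

```python
# Before: every new discount kind adds another branch.
def discount_before(kind, amount):
    if kind == "none":
        return 0
    elif kind == "percent":
        return amount * 0.1
    elif kind == "flat":
        return min(5, amount)
    raise ValueError(kind)


# After: each rule is a small strategy function, selected by name.
def no_discount(amount):
    return 0


def percent_discount(amount):
    return amount * 0.1


def flat_discount(amount):
    return min(5, amount)


DISCOUNT_STRATEGIES = {
    "none": no_discount,
    "percent": percent_discount,
    "flat": flat_discount,
}


def discount_after(kind, amount):
    try:
        strategy = DISCOUNT_STRATEGIES[kind]
    except KeyError:
        raise ValueError(kind) from None
    return strategy(amount)
```

This stops at a shallow depth of the pattern: plain functions in a dictionary. Moving to full strategy classes would only be worth it if the rules grew state or more than one operation.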
References
- Presentation slides: https://martinfowler.com/articles/workflowsOfRefactoring/
- Book: Refactoring by Martin Fowler, with Kent Beck
- Book: Refactoring to Patterns by Joshua Kerievsky
- Book: Working Effectively with Legacy Code by Michael Feathers