I know that refactoring is "changing the structure of a program so that the functionality is not changed". I was talking with some of the guys I'm working with on my final year project at university and I was surprised that they have a much more expansive (for want of a better word) view of refactoring. I consider refactoring to be things like extracting methods and renaming classes. They also suggested things like changing data structures (like a Java LinkedList to an ArrayList), changing algorithms (using merge sort instead of bubble sort), and even rewriting large chunks of code as refactoring. I was quite sure that they were wrong, but I wasn't able to give a good reason why because what they were suggesting did change the program (and presumably make it better) without changing its behaviour. Am I right, and more importantly, why?
asked Jun 22, 2009 at 7:15 by David Johnstone
Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which is "too small to be worth doing". However, the cumulative effect of each of these transformations is quite significant. By doing them in small steps you reduce the risk of introducing errors. You also avoid having the system broken while you are carrying out the restructuring - which allows you to gradually refactor a system over an extended period of time.
Refactoring goes hand-in-hand with unit testing. Write tests before you refactor and then you have a confidence level in the refactoring (proportional to the coverage of the tests).
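To illustrate the point about tests, here is a minimal Python sketch of an extract-function refactoring pinned down by tests written first. The names (`invoice_total_before`, `invoice_total_after`, `subtotal`, `tax_on`) and the test cases are hypothetical, invented only for this example. The same checks run unchanged against both versions, which is what gives confidence that behavior was preserved.

```python
def invoice_total_before(prices_cents, tax_percent):
    # Original: summing and tax calculation tangled in one function.
    total = 0
    for price in prices_cents:
        total += price
    tax = total * tax_percent // 100
    return total + tax


# After the refactoring: the same arithmetic, split into named pieces.
def subtotal(prices_cents):
    return sum(prices_cents)


def tax_on(amount_cents, tax_percent):
    return amount_cents * tax_percent // 100


def invoice_total_after(prices_cents, tax_percent):
    amount = subtotal(prices_cents)
    return amount + tax_on(amount, tax_percent)


# Tests written BEFORE refactoring; they are not edited afterwards.
CASES = [
    (([1000, 250], 20), 1500),
    (([], 20), 0),
    (([999], 7), 1068),
]


def check(fn):
    for args, expected in CASES:
        assert fn(*args) == expected


check(invoice_total_before)  # passes before the change
check(invoice_total_after)   # still passes after it
```

The confidence this gives is only as good as the cases listed: an input the tests never exercise could still behave differently.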
answered Jun 22, 2009 at 7:23 by Mitch Wheat
The Fowler quote is of course relevant, and yes, it goes hand-in-hand with unit tests. But does this really answer the questions asked: are the examples mentioned refactoring or just modifying code? Who's right, the OP or his colleagues, and why?
Commented Jun 25, 2009 at 9:11
Fowler draws a clean line between changes to code that do, and those that do not, affect its behavior. He calls those that do not "refactoring". This is an important distinction, because if we divide our work into refactoring and non-refactoring code modification activities (Fowler calls it "wearing different hats"), we can apply different, goal-appropriate techniques.
If we are making a refactoring, or behavior-preserving code modification:
If we are making a behavior-changing code modification:
If we lose sight of this distinction, then our expectations for any given code modification task are muddled and complex, or at any rate more muddled and more complex than if we are mindful of it. That is why the word and its meaning are important.
answered Jun 22, 2009 at 16:32 by Carl Manaster
+1, exactly. Especially the rationale you give: the muddled expectations. When writing my own answer, I had this in mind, even if I failed to write it down as neatly :)
Commented Jun 23, 2009 at 18:23
To give my view:
Small, incremental changes that leave the code in a better state than it was found
Definitely Yes: "Cosmetic" changes that are not directly related to features (i.e. it's not billable as a change request).
Definitely No: Rewriting large chunks clearly violates the "small, incremental" part. Refactoring is often used as the opposite of a rewrite: instead of doing it again, improve the existing code.
Definitely Maybe: Replacing data structures and algorithms is somewhat of a border case. The deciding difference here IMO is the small steps: be ready to deliver, be ready to work on another case.
Example: Imagine you have a Report Randomizer Module that is slowed down by its use of a vector. You've profiled that vector insertions are the bottleneck, but unfortunately the module relies on contiguous memory in many places, so that when using a list, things would break silently.
Rewriting would mean throwing the Module away and building a better and faster one from scratch, just picking some pieces from the old one. Or writing a new core, then fitting it into the existing dialog.
Refactoring would mean taking small steps to remove the pointer arithmetic, so that the switch to a list becomes safe. Maybe you even create a utility function wrapping the pointer arithmetic, replace direct pointer manipulation with calls to that function, then switch to an iterator so that the compiler complains about places where pointer arithmetic is still used, then switch to a list, and then remove the utility function.
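As a rough analogue of those steps, here is a Python sketch rather than the C++ vector/list case described above. Everything in it is an assumption made for illustration: the `ReportQueue` classes and the `first_n` helper are hypothetical, direct slicing stands in for pointer arithmetic, and `collections.deque` stands in for the list (a `deque` does not support slicing, so an unprepared switch would break callers). Each step keeps the same checks passing.

```python
from collections import deque
from itertools import islice


# Step 0: original code. Front insertion on a list is O(n),
# and the container is sliced directly.
class ReportQueueV0:
    def __init__(self):
        self.items = []

    def push_front(self, report):
        self.items.insert(0, report)

    def preview(self, n):
        return self.items[:n]  # would raise TypeError on a deque


# Step 1: route the slicing through one helper (hypothetical name).
# The container is still a list, so nothing else changes.
def first_n(items, n):
    return list(islice(items, n))


class ReportQueueV1(ReportQueueV0):
    def preview(self, n):
        return first_n(self.items, n)


# Step 2: swap the container. Only construction and insertion change,
# because no caller slices the container directly any more.
class ReportQueueV2:
    def __init__(self):
        self.items = deque()

    def push_front(self, report):
        self.items.appendleft(report)  # O(1) instead of O(n)

    def preview(self, n):
        return first_n(self.items, n)


def check(queue_cls):
    q = queue_cls()
    for report in ["a", "b", "c"]:
        q.push_front(report)
    assert q.preview(2) == ["c", "b"]
    assert q.preview(10) == ["c", "b", "a"]


# The same checks pass after every step, so each step is shippable.
for cls in (ReportQueueV0, ReportQueueV1, ReportQueueV2):
    check(cls)
```

Unlike the example in the text, this sketch keeps the helper at the end instead of removing it; whether to inline it again is a separate small step.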
The idea behind this is that code gets worse on its own. When fixing bugs and adding features, quality decays in small steps - the meaning of a variable subtly changes, a function gets an additional parameter that breaks isolation, a loop gets a bit too complex, etc. None of these is a real bug, and you can't name a line count that makes the loop too complex, but you hurt readability and maintainability.
Similarly, changing a variable name or extracting a function isn't a tangible improvement on its own. But altogether, they fight the slow erosion.
Like a wall of pebbles where every day one falls to the ground. And every day, a passerby picks it up and puts it back.