Monday, June 29, 2009

Forgotten Refactorings

Shooting pool at City Billiards in 2002, my friend Hans offhandedly says, "You should read Martin Fowler's Refactoring." That moment changed my life. Well, technically, the hours spent reading the book over the following week changed my life, but let's not quibble.

Seven years later, I still believe this is the book to read when starting out on the path to being an engineer. It's neither as high-level and inaccessible as Design Patterns (though that's still a foundational book), nor as low-level and implementation-specific as Clean Code or Implementation Patterns (also worthwhile reads).

At the time I read Refactoring, I was neck deep in PocketPC and PalmOS development. C and C++ code without unit tests. Difficult to write and difficult to change. The better developers knew the operating system API by memory, and the best developers could explain how the API interacted with the hardware. I remember one job interview where I was asked what "Hello" + 4 yields.* My answer of "I would never write code like that" was met with, "Of course you wouldn't, but you'll need to debug it." How true.

Refactoring laid out a vision of good, clean code that went beyond simply working correctly. This wasn't exactly new information. My peers and I all valued short methods, small classes, high cohesion, and low coupling even if we had a lot to learn about how to achieve those properties. But we also valued shipping code on the delivery date and not working the weekend too frequently. So when timelines got tight, we cranked out sort-of working code and moved on to the next project, which was usually the maintenance phase of the same project.

The beauty of Refactoring was that it laid out specific instructions on how to make crummy code better. And those instructions were almost mechanical in their execution... first do this, then do this, then finish. In fact, the instructions for each refactoring are in a section called "Mechanics". The promise was: if you follow the instructions then your code will improve and nothing can go wrong. I loved this. Finally, I had a guide and body of knowledge around what to do about all that old code I had to sort through.

So why did it all go so wrong?

Why does refactoring so often introduce bugs into your system?

Why is refactoring a word you can barely mention to QA and Operations without horrified reactions?

Most importantly, what happened to the "and nothing can go wrong" part of the refactoring promise?

I'll tell you what went wrong: our tool vendors screwed us.**

I've been interviewing Java candidates all week, and I always ask about refactoring. Everyone knows what refactoring is. Many people know the keyboard shortcuts. It couldn't be easier to recall the various refactorings available in the IDE... there is an entire freakin' menu of them. And not a single one of those menu items is implemented correctly in terms of Fowler's Mechanics:

Chapter 1, Page 7: The First Step in Refactoring: Whenever I do refactoring, the first step is always the same. I need to build a solid set of tests for that section of code.

Page 7! This isn't buried in some appendix, it is front and center. And in case you're just skimming the book and missed it, it's called out in a sidebar with a lightbulb on the very next page:

Chapter 1, Page 8: Before you start refactoring, check that you have a solid suite of tests. These tests must be self-checking.
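"Self-checking" means the test decides pass or fail on its own, rather than printing output for a human to eyeball. A minimal sketch in plain Java, no framework required (the class, method, and discount rule are mine, not from the book):

```java
// A self-checking test: it verifies its own result and fails loudly,
// instead of printing values for someone to inspect by hand.
public class PriceTest {
    // Hypothetical method under test: order total with a 10% discount
    // on orders of more than 100 units.
    static int total(int unitPrice, int quantity) {
        int raw = unitPrice * quantity;
        return quantity > 100 ? raw * 90 / 100 : raw;
    }

    public static void main(String[] args) {
        check(total(5, 10) == 50, "no discount at or under 100 units");
        check(total(1, 200) == 180, "10% discount over 100 units");
        System.out.println("OK"); // reached only if every check passed
    }

    static void check(boolean condition, String label) {
        if (!condition) throw new AssertionError("FAIL: " + label);
    }
}
```

In 2009 you would more likely write this as a JUnit test, but the property Fowler is after is the same: run it, and it tells you whether the code still works without you reading anything.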

I don't know how much more emphasized step 1 of refactoring could be: don't touch anything that doesn't have coverage. Otherwise, you're not refactoring; you're just changing shit. And not a single IDE enforces, or even encourages, test coverage before you go mucking about with the refactorings.

I propose we make a change to our behavior. If QA or Operations asks us what introduced a defect, and our answer is a refactoring without coverage, then we reply, "I was just changing some shit." I think this would work to eliminate reckless refactorings. Hmmm. You know what might work better? How about we just try a little harder to bring our code under test. Reading "Working Effectively with Legacy Code" is a good place to start, as is using functional testing frameworks like DBUnit or maybe FIT. The choice is yours, I guess. The second option seems more useful in the long term; still, I kinda like the idea of developers telling QA and their managers that software is late because they decided it was a good time to "change some shit".
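Bringing legacy code under test usually starts with a characterization test in the "Working Effectively with Legacy Code" sense: a test that pins down what the code does *now*, warts and all, so a refactoring that changes behavior gets caught. A sketch, with a hypothetical legacy method of my own invention:

```java
// Characterization test: record the current behavior of crusty code
// before refactoring it, even when that behavior looks questionable.
public class LegacyFormatterTest {
    // Imagine this is the untested legacy method we want to refactor.
    static String formatName(String first, String last) {
        if (last == null) return first == null ? "" : first;
        return last.toUpperCase() + ", " + (first == null ? "?" : first);
    }

    public static void main(String[] args) {
        // Pin current behavior -- including the odd "?" placeholder --
        // and only then start moving code around.
        check(formatName("Ada", "Lovelace").equals("LOVELACE, Ada"));
        check(formatName(null, "Hopper").equals("HOPPER, ?"));
        check(formatName("Grace", null).equals("Grace"));
        check(formatName(null, null).equals(""));
        System.out.println("behavior characterized");
    }

    static void check(boolean ok) {
        if (!ok) throw new AssertionError("behavior changed");
    }
}
```

The point is not that the "?" is good design; it's that after the refactoring these assertions still pass, which is what separates refactoring from changing shit.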

But please, whatever you do, stop saying refactoring when you're making changes to untested code. You're ruining it for the rest of us who want to do real refactoring.

* "Hello" + 4 yields 'o' as a character. Strings are just character arrays lined up in contiguous memory, so H is offset 0, e offset 1, and o offset 4. I thank God this information is no longer useful to me.

** Yeah, it's more our fault than the tool vendors', isn't it? Still, it's a lot easier to blame a nameless entity than the guy sitting across from you.


Unknown said...

Umm, "hello" + 4 yields a pointer to the string "o", not 'o' as a character. In C anyway.

Hamlet D'Arcy said...

Ha! I got the answer wrong in the interview and I got the answer wrong 8 years later in a blog post. Thankfully, no one lets me near their C code any more.

ben said...

Great post. Changing shit is so fun, but so dangerous.

Anonymous said...

Sorry Milivoj, C strings ARE interpreted as char pointers. There's really no such thing as a string pointer; strings are just char arrays.

Anonymous said...

Refactoring w/o tests = Refucktoring.

Artem Russakovskii said...

"The better developers new the oper...". "new"? Seriously?

Unknown said...

In C++, setting the value to "Hello" + 4 results in undefined behavior. If your environment allows it, though, it can lead to some interesting results:
"Hello" + 4 = '!';
If you don't segfault, you've successfully changed the literal in the data segment. To make things more interesting, some compilers use string folding to optimize space consumed by string literals. In other words, all instances of "Hello" in your application point to the same memory location. So, if somewhere completely different in your application you have the following line:
char* myString = "Hello";
myString may contain "Helo!"

That's a fun problem to debug - let me tell you!

Keith Thompson said...

cryptyk: Not quite.

"Hello" + 4 yields a value (not an lvalue) of pointer type; you can't assign to it.

You can write *("Hello" + 4) = 'x', but since C++ string literals are const, that's illegal (the compiler doesn't have to reject it, but it does have to issue a diagnostic). (C string literals aren't const, so it's legal in C, but the behavior is undefined, so Don't Do That.)

Note that indexing is defined in terms of pointer arithmetic, so *("Hello" + 4) can be more simply written as "Hello"[4].

Keith Thompson said...

A C string is, by definition, "a contiguous sequence of characters terminated by and including the first null character". It's not a pointer. A string literal is implicitly converted to a pointer to its first element in most, but not all, contexts. One of the exceptions is sizeof; sizeof "hello world" yields the size of the string (12 bytes including the terminating null), not the size of a pointer.

And a "pointer to a string" is by definition a pointer to its first character.

Anonymous said...

I can't help but notice that most of the comments say nothing about the point of the post...

Anyway, I agree on the point about IDEs and refactoring.

Hamlet D'Arcy said...

@Geoff Wozniak
i'm glad someone pointed it out

Anonymous said...

@Keith Fair enough, I guess a string literal isn't always quite the same thing as a char pointer. I never knew about that facet of sizeof behavior. Thanks.

Mike Woodhouse said...

I had a similar life-changing experience when I read the book. I realised that there was a "proper" name for what I'd been doing for years. Later (too much later, but I'm a slow learner) I realised that what I had been doing wasn't refactoring, because it isn't really refactoring in the Fowler sense unless there are tests. Now I refactor when possible, but sometimes (other people's legacy code) all I can do is tidy up.

David M. Longworth said...

Clean Code is the Robert Martin book, right?

And Implementation Patterns is, of course, Kent Beck's book.

Now on to your point. When you 'blame' the IDE vendor (knowing the blame lies with each of us in the end) in what ways could an IDE make the work of refactoring better, easier?

Hamlet D'Arcy said...

@David M. Longworth
Yes, Martin and Beck respectively.

I'm not sure there is anything an IDE maker could feasibly do. You could give a warning when a refactoring touches code without coverage but that would be kinda lame. My IDE already gives me too many refactoring warnings that I mostly ignore.

Maybe a better spot to work on the issue is the continuous integration server. A post-checkin report that shows # lines of code changed without test coverage, per developer per week? That would be an interesting report. Although I've found that developers not motivated enough to write tests are frequently the same developers who aren't motivated by public shaming. So you'd need to /do/ something with the report other than post it on the wall. Maybe include it in the release notes to QA so they can bang the drum about receiving untested code?

Tetsuo said...

The automatic refactoring tools work pretty well, and most are completely safe, even without tests (omg, I'll burn in Hell!).

Things go really wrong when the technology you use relies too heavily on string identifiers (for example, XML configurations and JSF EL bindings).

Or when it's a dynamically typed language, which makes safe automatic refactoring pretty much impossible without very good test coverage.
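The string-identifier hazard is easy to demonstrate even in plain Java, with reflection standing in for XML configs and EL bindings (the class and names below are invented for illustration):

```java
// Why string identifiers defeat automated rename refactorings: the IDE
// renames the class and every *typed* reference, but a class name hidden
// inside a string (reflection, XML config, EL expressions) is invisible
// to it and keeps pointing at the old name.
public class RenameHazard {
    public static class Widget {}  // try an IDE rename to "Gadget"...

    public static void main(String[] args) {
        try {
            // ...this lookup still says "Widget" and now fails at runtime,
            // long after the "safe" refactoring compiled cleanly.
            Class<?> c = Class.forName("RenameHazard$Widget");
            System.out.println("found " + c.getSimpleName());
        } catch (ClassNotFoundException e) {
            System.out.println("broken by rename: " + e.getMessage());
        }
    }
}
```

Some IDEs do offer to search string literals during a rename, but that is a heuristic text search, not the guaranteed-safe transformation the Mechanics promise.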

Unknown said...

The reality is that the number of projects not using unit tests is still greater than the number that do, and they all live on. I would say unit testing is a must only for agile development, because agile doesn't have a disciplined process (at least not as disciplined as the dev processes used in many safety- and security-critical projects; that's one reason agile can't be used on such projects, or has to be combined with more disciplined dev processes like CMMI-based ones, usually CMMI level 3 or above). Code in agile projects changes often; without unit testing, the defect rate would be extremely high. Under more disciplined processes, refactoring is not that frequent, and they have other practices to ensure refactoring succeeds (just as unit testing does in agile dev). Every technique has both pros and cons. Unit testing is no exception; it's not a silver bullet. Of course, more disciplined processes like CMMI aren't a silver bullet either, but neither is agile. Every technique has its own place. The point is to "use the right technique, and use it right". No dev process can solve all the problems in software development, so don't be so absolute about one technique.

Unknown said...

Is unit testing really that great? It depends. Usually, unit testing tries to isolate the dependencies (such as the network, database, configuration files, remote servers, or third-party components) of the method being tested by using mock objects. To test the method's behavior under error conditions, you generate whatever exceptions you expect in the mock object and check that they are caught and handled. The problem is: defects are usually caused by the interactions between the method under exercise and its dependencies, which unit testing tries hard to eliminate (i.e., not test, because unit testing doesn't do well in those areas). Another problem: defects are often caused by unexpected exceptions and conditions, which unit testing can't deal with at all. Exceptions you can expect, you'll probably handle correctly even without unit testing.

Will unit testing result in good design and high-quality code? Again, it depends. If you define good design and high-quality code as "easy to unit test", the answer is yes. Viewed from other directions, the answer is usually no. Making code unit-testable usually means creating lots of tiny methods, dependency injections that wouldn't otherwise be necessary, and lots of interface definitions (for creating mock objects) that wouldn't otherwise be necessary and that lead to code duplication (a class usually gets a corresponding interface; some interfaces contain only the one or two methods that need mocking). Is that good design? Many books on design and coding will tell you the answer is no. Strictly speaking, "good" here only means "good for unit testing", and "high quality" means "high quality for unit testing".

Will unit testing let you write code more efficiently? Everyone who has done a lot of unit testing knows the answer.

A neutral report (not written by TDD fans) revealed that the defect rate of projects using TDD and agile was only marginally higher (if not equal or lower) than that of projects using other dev approaches/processes. It's still hard to say which agile practice(s) contributed to that marginal difference in defect rate: pair programming, early testing (many defects are never formally reported because the testers sit on the same team and communicate with developers face-to-face), customer involvement, simple and evolutionary design, unit testing, or automated regression testing.

Remember that unit testing gained popularity only in the last 10 years; it's still new. Was there no refactoring before the advent of unit testing? Not really. Again, unit testing is required by agile methods. It works in that environment because of the nature of agile, but don't say it's wrong if it isn't used everywhere.

Hamlet D'Arcy said...

@Karl thanks for the comments.

Program correctness is far from only 10 years old. Tony Hoare has been writing and working on program proofs since the 1960s. And unit testing is just one means of correctness proofing.

From the way you describe tests, I too conclude that unit testing isn't worth much. But I would take issue with the specific tests the team you were on wrote more than testing as a means to correctness proofing in general. Not that I have a complete picture, but reliance on mock objects is a smell that side effects are not isolated to small portions of the program. And an explosion of interfaces is a problem with your testing tools which sounds a lot like Java and a mock object framework.

Your description of "good design" is very vague. The statements don't strike me as incredibly meaningful. My design heuristics often focus on low coupling and high cohesion. When I focus on bringing these principles to a design, along with an obsession for limiting side effects, then I find that the resulting code is also quite easy to test. Some other heuristics that influence design are Bob Martin's patterns in Agile Software Development: SRP, DIP, etc.

As for the benefits of TDD. The original post didn't mention that, right? TDD is different than testing, and I'm an advocate for one and not the other.

Thanks for taking the time for such a long comment!

denny abraham said...

@karl You make good points on the potential pitfalls of unit testing in general. However, this does not strictly hold for TDD. TDD requires defining a series of first-class clients of the code you are about to write. If you only cover failure or simple success cases, you can only guarantee other clients will work in those situations.

I am interested in your mention of design books as well as the study on TDD. Do you have links?
