Wednesday, March 04, 2009

Code reuse is overrated

Good coders code, great ones reuse. How can anyone disagree with that?

I don't disagree. The principle of not reinventing something after you've invented it is so old and so obvious (and so obviously useful) that no one would seriously dispute it.

What's very much worth disputing, though, is the value of "code-reuse percentage" as a metric, and the degree to which code reusability actually brings about any economies in the software business. I would argue that the economies are largely nonexistent, because of the generally high cost of achieving reusability in the first place -- and a failure to depreciate investments in reusability over time. If you're reusing code in your product that was written in 1997 (to support a 1997 file format, say) and that code is only still in the product for legacy reasons, should that really count as reuse? Shouldn't "reuse" be weighted according to whether the code actually gets executed or not?
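To make the weighting question concrete, here is a minimal sketch of the two ways of counting. Everything in it is an assumption made for illustration: the `CodeUnit` and `ReuseMetric` names, the idea of recording an "executed share" per unit, and the sample figures, none of which come from a real product or tool.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical record of one chunk of a product's code base (illustrative only). */
final class CodeUnit {
    final String name;
    final int lines;            // size in lines of code
    final boolean reused;       // true if carried over from earlier work
    final double executedShare; // assumed fraction of the unit that actually runs, 0.0 to 1.0

    CodeUnit(String name, int lines, boolean reused, double executedShare) {
        this.name = name;
        this.lines = lines;
        this.reused = reused;
        this.executedShare = executedShare;
    }
}

final class ReuseMetric {
    /** Naive metric: reused lines as a percentage of all lines. */
    static double rawPercent(List<CodeUnit> units) {
        int total = 0;
        int reused = 0;
        for (CodeUnit u : units) {
            total += u.lines;
            if (u.reused) {
                reused += u.lines;
            }
        }
        return total == 0 ? 0.0 : 100.0 * reused / total;
    }

    /** Weighted metric: reused lines count only in proportion to how much of them executes. */
    static double executedPercent(List<CodeUnit> units) {
        int total = 0;
        double reusedAndRun = 0.0;
        for (CodeUnit u : units) {
            total += u.lines;
            if (u.reused) {
                reusedAndRun += u.lines * u.executedShare;
            }
        }
        return total == 0 ? 0.0 : 100.0 * reusedAndRun / total;
    }

    /** Made-up sample: a dead legacy reader, a live shared utility, and new code. */
    static List<CodeUnit> sample() {
        return Arrays.asList(
            new CodeUnit("legacy-1997-format-reader", 3000, true, 0.0),
            new CodeUnit("shared-string-utils", 1000, true, 1.0),
            new CodeUnit("new-feature-code", 6000, false, 0.9));
    }
}

class ReuseMetricDemo {
    public static void main(String[] args) {
        List<CodeUnit> units = ReuseMetric.sample();
        System.out.println("raw reuse %:      " + ReuseMetric.rawPercent(units));
        System.out.println("executed reuse %: " + ReuseMetric.executedPercent(units));
    }
}
```

With these invented numbers the naive figure credits all 4,000 carried-over lines, while the weighted figure credits only the 1,000 that still run. The hard part in practice would be obtaining an honest executed share (from coverage or profiling data, say), which this sketch simply takes as given.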

Should you build out on bad code (flabby code; spaghetti code; stuff that may contain unreachable or deprecated methods, etc.) and count it as reusability? Or go back and do it right? If you build out on bad code, you've achieved reusability. If you go back and clean up the code, you've killed your reuse metrics but you may well score a long-term ROI win.

There are so many problems with "reuse percentage" as a metric that I won't litigate the case fully here but instead ask you to refer to the paper by Lim, the 2005 blog by Dennis Forbes, and the 2007 blog by Carl Lewis (for starters).

Will Tracz (an early advocate of reuse, ironically) pointed out in "Software Reuse Myths Revisited" that reusable code costs around 60% more to develop than code not designed with reuse in mind. That estimate (derived in 1994) is probably off by a factor of three or four (maybe ten, with Java). But it's moot, in any case, given that the cost of producing code is, in reality, a comparatively small part of the overall cost of producing and marketing commercial software. And that's what I'm really saying here: the potential for cost savings is not a proper motivation for reuse. There are no significant cost savings. It costs more to develop reusable code, and the payoffs are offset by long-term maintenance costs associated with a larger code base.

Someone will inevitably argue that although you may end up with more classes and interfaces if you design for reusability, the code will ultimately be more readable. I dispute that. The code becomes more complex generally and it's not necessarily true that it becomes more readable. Does JMenu really need to have 433 methods? Why? It got that way because someone (lazily) decided to use inheritance as a code-reuse mechanism, instead of designing JMenu to have just what it needs. You could argue, "Well, so what? The ancestor classes are already written, they never have to be written again, why not reuse them?" There are so many fallacies with that argument, it's hard to know where to begin. JMenu is at the bottom of a 7-classes-deep inheritance chain. The odds that nothing in that chain will ever be rewritten in the future are small. Touching the code in that chain entails risk (a breakage risk for subclasses); this is the kind of thing that keeps half of Bangalore in business, doing regression tests. At runtime, you're carrying around the baggage of 400-odd methods you don't need. The footprint of your software (on disk and in memory) is bigger, performance is affected, garbage collection is affected.
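The difference between inheriting an API and building only what is needed can be sketched in a few lines. This is a toy, not Swing code: `Widget`, `InheritedMenu`, `LeanMenu` and `ApiSurfaceDemo` are hypothetical names invented for the example, and `Widget` merely stands in for a long ancestor chain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical general-purpose base class, standing in for a deep ancestor chain.
class Widget {
    private int x, y, width, height;
    private boolean visible = true;
    private String tooltip = "";

    public void setBounds(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }
    public int getX() { return x; }
    public int getY() { return y; }
    public int getWidth() { return width; }
    public int getHeight() { return height; }
    public void setVisible(boolean visible) { this.visible = visible; }
    public boolean isVisible() { return visible; }
    public void setTooltip(String tooltip) { this.tooltip = tooltip; }
    public String getTooltip() { return tooltip; }
}

// Reuse by inheritance: the menu's public API now includes every Widget method,
// whether or not a menu has any use for it.
class InheritedMenu extends Widget {
    private final List<String> items = new ArrayList<>();
    public void addItem(String label) { items.add(label); }
    public int itemCount() { return items.size(); }
}

// Purpose-built: the menu exposes only menu operations and keeps the Widget
// as a private detail, delegating to it where that behavior is wanted.
class LeanMenu {
    private final List<String> items = new ArrayList<>();
    private final Widget surface = new Widget();
    public void addItem(String label) { items.add(label); }
    public int itemCount() { return items.size(); }
    public void show() { surface.setVisible(true); }
}

class ApiSurfaceDemo {
    // Counts public methods, including those inherited from ancestors and Object.
    static int publicMethodCount(Class<?> type) {
        return type.getMethods().length;
    }

    public static void main(String[] args) {
        System.out.println("InheritedMenu public methods: " + publicMethodCount(InheritedMenu.class));
        System.out.println("LeanMenu public methods:      " + publicMethodCount(LeanMenu.class));
    }
}
```

Both menus reuse `Widget`, but only the inherited one turns every ancestor method into part of its own contract, so a change to `Widget`'s public methods is a potential breaking change for every caller of `InheritedMenu`. `LeanMenu` can absorb the same change behind its three methods.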

What I'm suggesting is not that you should rewrite JMenu. What I'm saying is that if you're Sun, and you're going to write something like Swing (from a clean sheet of paper), do it with common sense in mind rather than taking an "inherit-the-world" approach to reusability.

Rest assured, when I write code, I try (out of sheer laziness) to make as many lines, classes, and methods reusable as makes sense (and no more). And I guess that's the point. Sometimes it doesn't make sense to go out of your way to write highly reusable code. Sometimes it's more important to have something small and streamlined that works now, that's purpose-built and does what it does well. If you can do that, fine. If you can't, for some reason, that's fine too, but do what's appropriate to the situation.

That does not mean you abandon good programming practices. It doesn't mean you write poorly structured code. It means you write only as much code as you need, and resist the temptation to overfactor. Unfortunately, the latter can be quite hard, especially if you're steeped in the Java arts.

There's a place in this world for silverware, and there's a place for plastic spoons. And yes, you can recycle plastic spoons, but for gosh sakes, silverware is expensive. Let's not accumulate it needlessly.

4 comments:

  1. I think that's the beauty of successful opensource projects. You can produce a bunch of crap in terms of the classical assessment of 'good code' and put it out there. It's more a case of 'getting skin in the game' if you will - an itch that needed to be scratched, and scratched immediately!

    If it's software that actually ends up doing something useful that a critical mass of people really want, then it will be embraced, critiqued and improved upon by the community, and good-quality code will begin to infect the initial code base virally over time. The software will then iterate regularly, maintaining customer interest, whilst improving its architectural underpinnings and adding something of value to the users each time (an agile tenet!).

    And the other gem with F/OSS? If nobody embraces the software, it simply dies and disappears - very organic Darwinian principles.

    Of course, open source is just one approach that may work due to its 'network effect' in a globally diverse community. Tom Gilb-style software inspection may similarly add value, but at greater expense at the beginning of the software development cycle.

  2. A good article, but certainly from my perspective, code reuse shouldn't be about specific lines but about specific modules and functions. The benefits of reuse will be minor if applied to reusing a single access call to a system, but if it is reusing a business function then they will be significant - this is what reuse efforts should focus on - code reuse in the large.

    If you are reusing functions to ask a customer to enter their details, then doesn't it make sense that all the departments in the company who may use this should do it in the same way? How a department might then handle that data may differ - and thus reuse of code may reduce - but there will be far greater benefits, both to the business and to coders, if core functions are similar for those common, repeatable tasks. The plus side of reusing large business functions is that the benefits become more quantifiable...thus helping to justify the approach or alternatively identifying where it is not worthwhile.

  3. Great points, Kas.

    Man, I am living this nightmare at the moment: Just imagine being hired into a project in which a junior JavaScript developer attempted to use multiple inheritance to reuse JavaScript code.

    I guess besides ranting about my current situation, I want to corroborate your point that it's important to design a system that concentrates on the current requirements.

    From experience, I've learned that anticipating future requirements can be a job saver. However, spending the 60% extra effort to build a "future-proof" component can also be a job killer. After drifting from one camp to the other a few times, I finally learned that writing code that anticipates* future requirements, but doesn't fully implement them, gets me the best results.

    *By anticipation, I mean using my experience to identify potentially useful future requirements. Adding configurable parameters and breaking up methods into more atomic functionality helps a lot here. Avoid building anything at the API level (public scope), though. This is where you'll get mired in the fallacy of foresight and the multiplicity of end-user feature fulfillment.

  4. I agree code reuse is overrated. One reason I have felt this way (which is pointed out subtly in earlier comments to this post) is that there is good value in rewriting the implementation of certain paradigms/patterns from time to time, to improve upon the general code you think you've created. In particular, what often happens is that you decide at some time T1 that it's good to generalize what you've written. So you document, clean up, generalize (all of which takes time). Then comes a golden moment at T2, when you decide to actually re-use. And maybe another time at T3. Then, at T4, you try to re-use what you have but you run into a bug. However, as you consider fixing that bug, you worry about how it might affect the T2 and T3 implementations (an investigation which also takes time). You get the idea. Unless you're rock solid that every generalization you make is executed extremely well, your re-use will be partial and complex.

    Instead, what I find works best is a "less is more" approach. Re-use can certainly work well if you execute well, and it's a boon when the thing you're re-using is complex to write. Put another way -- spend your precious re-use time working on that code which is truly difficult to write again. Don't spend your re-use time generalizing lots of relatively simple concepts under the pretense that you might be able to make tremendous re-use of prior code.

