SRY: Sometimes Repeat Yourself

Any programmer worth their salt will have heard of the DRY principle: Don’t Repeat Yourself. The idea is that repetition is bad: it makes for more code to read through, and it makes code harder and more error-prone to maintain because you have to make the same change in multiple places.

These are valid points, but like with any guideline or rule of thumb, it’s oversimplified, and the real answer is always “it depends”. So today I’d like to offer some counterarguments, in the form of situations where a little repetition can be a good thing. I’m calling it SRY: Sometimes Repeat Yourself.

For divergence

Sometimes repetition is by accident, rather than by necessity. A Person class and a Country class may both have a name property, with corresponding getters and setters, but that doesn’t mean you should extract a NamedEntity superclass (or mixin, depending on the language). Imagine you did, and you later wanted to split the person’s name into a firstName and lastName (not recommended, by the way)?

In general, if it is conceivable and valid for the repeated pieces of code to diverge and change independently of each other, they do not have to be the same, they just happen to be the same right now, and it is probably a good idea to leave the repetition in.

For readability

Some people go to great lengths to avoid repeating even a single line of code. This isn’t always a good idea, because it comes at the cost of more abstraction layers. But wait, you say, I thought abstraction was a good thing? Again, it depends. Abstractions can make it easier to reason about a piece of code, but they can also make code harder to read. Consider the canonical for loop in C-like languages:

for (int i = start; i < end; i++) {
  doSomething();
}

You’re likely to see this all over the place, and indeed it is a bit error-prone, so you might think: let’s create an abstraction, so I have to write it only once!

for_numbers_between(start, end, doSomething);

This might be perfectly clear to you at the time you write it, but it does add a layer of opacity to the code. The for loop was completely transparent to anyone familiar with this idiom, but when stumbling upon your helper function, they wonder: what exactly does this function do? Is end included in the range? Does the function pass the loop counter to doSomething? Can I return a boolean to break out of the loop? In JavaScript: what will the value of this be in the callback? The only way to be sure is to look up the definition of for_numbers_between, which will most likely be buried far away in some utils package.

So when adding abstractions like these, besides considering the gain in reduced repetition, also be sure to consider the loss in transparency, and weigh them against each other.

For searchability

Somewhat related to the previous point, some people want to avoid repetition in strings like file paths and naming prefixes. For example, suppose your style guide says that all CSS classes should be prefixed with dirname-filename- to avoid clashes between packages:

.profile-description-edit-bio {
  ...
}

.profile-description-edit-interests {
  ...
}

When referring to such a CSS class in code (or even SASS or another CSS preprocessing language), it’s tempting to extract the prefix into a string constant and put it at the top of the file somewhere:

const CSS_PREFIX = "profile-description-edit-";

...

element.addClass(CSS_PREFIX + "bio");

Not only does this suffer from the same readability issue as in the previous section, but it also lacks another useful property: searchability. If a reader sees the CSS rule for .profile-description-edit-bio in the style sheet, they might wonder where it is used. Since this is the über-dynamic stringly typed world of the web, the only viable way of finding out is to search for the name of the CSS class throughout the project. But thanks to the string pasting, that particular usage is not going to show up. In the best case, the reader tries to search for some prefixes or suffixes of the class name, like bio, and might find it that way among a pile of false positives. In the worst case, they might conclude that the CSS rule is unused, and remove it! Conversely, any use of such string pasting in the codebase makes removal of actually unused CSS rules difficult to do.

So here too, it’s better to have the string constant spelled out in its entirety every time it’s used. Just pretend it’s an identifier, like the name of a variable or a function; you don’t chop those up either. It’s a bit more verbose, but much easier to maintain with in the long term.