My last post in Aaron's series drew a mixed review from some readers, and I'm sure this one will do the same. But that's part of the fun!
One of the biggest threats to maintainability is code that's not properly formatted. When I'm called in by a customer to debug some legacy code, often the first thing I'll have to do is re-format the code so that I can actually figure out what's going on. Many of Aaron's posts have covered readability, but one of the key, key, key things that he hasn't hit yet is indentation. The secret is simple: Whether you do it the way I like, the way Aaron likes, or some other way completely, just make certain that you do it consistently. Choose a style and stick to it, not just throughout a query but throughout every query you right. This way not only will you be able to read your code, but once they get used to your style so will other people.
The goal of an indentation scheme is to help the reader by grouping similar constructs along the same line. Code is hierarchical; if you consider a procedure, that CREATE PROCEDURE statement is the root node. Every statement in the body of the procedure is a descendant of the CREATE PROCEDURE statement, and each statement has its own descendants--for example, the predicates in the JOINs or the WHERE clause, or the elements in the column list. Improperly indented CASE expressions are a big annoyance to me, and these are similarly hierarchical, with the CASE keyword at the highest level, followed by the WHEN and ELSE conditions, their predicates, and the results of each case. Keeping all of this in mind when writing code makes it easy to see and understand what's going on even if you've never seen a given piece of code before.
Unfortunately, sometimes people get lazy. Or some simply don't care. I hope that no one reading this post falls into the latter category. As for the former, I hope that this is one area that you won't be lazy about. There are a number of automated formatters on the market these days--some even free--so there is really no excuse. Still, I see code all the time that is confusingly indented.
Here's an example of some code that needs work:
SELECT
c.AddressLine1,
c.FirstName, c.LastName,
o.Subtotal, o.ShippingCost,
CASE WHEN o.TotalQty > 10 THEN 'big'
WHEN o.TotalQty BETWEEN 5 AND 9 THEN 'medium'
ELSE 'small' END AS BoxSize
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON
o.CustomerId = c.CustomerId
WHERE o.OrderDate > CONVERT(DATE, GETDATE()-1)
AND c.State = 'MA'
If you think I've overdone this example, think again. I see code easily this carelessly formatted on a regular basis; I often wonder if the person writing the code was actually trying to confuse the next person who comes along. Where does the CASE expression start and end? How quickly do you notice that there are some predicates used as part of the JOIN condition, and others in a WHERE clause? Did it take you a moment or two to find the end of the SELECT list?
In this case, in my opinion, even no indentation at all would have been preferable to the mess above. In the following example I've removed all indentation, while adding a few line breaks to put each element on its own line--something I do when writing my own code.
SELECT
c.AddressLine1,
c.FirstName,
c.LastName,
o.Subtotal,
o.ShippingCost,
CASE
WHEN o.TotalQty > 10 THEN 'big'
WHEN o.TotalQty BETWEEN 5 AND 9 THEN 'medium'
ELSE 'small'
END AS BoxSize
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON
o.CustomerId = c.CustomerId
WHERE
o.OrderDate > CONVERT(DATE, GETDATE()-1)
AND c.State = 'MA'
The important thing is that you pick an style and stick with it. Don't make a mess, and don't be like a developer I recently worked with who asked me to help him debug an extremely complex query. I walked over to his desk and found that it was indented much like the code above, so the first thing I did was to reformat it so that we could tell what was actually going on. I asked the developer how he usually formats his code and his response was less than ideal: "I just write it on one line until I feel like pressing Enter. It depends on my mood that day." True story, I'm sorry to say.
My personal rules are simple: Each item in the SELECT list gets its
own line. Or more than one line, in the case of CASE expressions and
other complex expressions. Parenthesis that surround expressions sit on
their own lines, and the contents are indented, similar to code between
braces in C-style languages. And each predicate in a JOIN condition or
a WHERE clause gets its own line. A few things I don't do: I do not indent JOINs beyond the level of the FROM clause, as I feel that they are siblings in the hierarchy. I do not put the ON on the line as the first predicate, as many people do, because I feel that it applies to the entire set of predicates in the JOIN. And I don't use the commas at the start of the line method because I think it just looks bad.
This style was developed in order to minimize horizontal
scrolling, at the expense of some additional vertical scrolling. Some people don't like it, but it works extremely well for me and
people I've worked with have become converts. But I'm not telling you (in this post at least) to use my style. I'm just telling you to pick a style, some style, any style, and stick with it religiously.
Below is the same code as above reformatted once more. I think you'll agree that even if you don't like exactly how I do things, it's a lot easier to read and would be easier to maintain.
SELECT
c.AddressLine1,
c.FirstName,
c.LastName,
o.Subtotal,
o.ShippingCost,
CASE
WHEN o.TotalQty > 10 THEN 'big'
WHEN o.TotalQty BETWEEN 5 AND 9 THEN 'medium'
ELSE 'small'
END AS BoxSize
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON
o.CustomerId = c.CustomerId
WHERE
o.OrderDate > CONVERT(DATE, GETDATE()-1)
AND c.State = 'MA'
Enjoy!