I recently found a rather old post from
Frans Bouma's blog,
"Stored procedures are bad, m'key?".
Since comments are closed (he posted it almost a year ago), I have to reply here.
I'll state my bottom line at the top: Stored procedures are not only
not bad, they're necessary for maintaining loosely coupled, application
agnostic, flexible databases. And they're also necessary for
maintaining loosely coupled, database agnostic, flexible database-based
applications.
Mr. Bouma's argument centers around three claims: Ad-hoc
queries are easier to construct in application code, stored procedures
don't provide greater security than the application already does, and
in certain cases stored procedure execution plan caching can be
detrimental to performance. And all three of these arguments are true!
But none of them has any bearing whatsoever on the real benefits of
stored procedures.
Let's have a quick refresher on Computer Science 101, as it applies to object-oriented design: Loose coupling and high cohesion.
What does that mean in plain English? Each module, or object, should
have one very specific job, which it can perform for a VARIETY of other
modules or objects on request (high cohesion). Furthermore, no module
or object (or as few as possible) should depend upon the inner workings
of another module or object. This latter problem is known as tight
coupling, and its consequences are dire. Change one piece of code, end
up changing every other piece of code that uses it.
News flash for Mr. Bouma: This is the reason object-oriented
programming was invented. We try to separate components from one
another so that their functions can be re-used and re-applied to other
modules, other objects, and if we've done our job really well, even
other applications entirely.
So how does this apply to the relationship between a database and an application?
The database, undoubtedly, should be completely application
unaware. The database is serving up data. It has no idea what
application is requesting the data, or whether an application is
requesting the data. It needs to have no such idea. All it needs to do
is keep serving and all is happy. Furthermore, the database doesn't
care if multiple applications request data, or if those applications
request the same data. The database is set up to provide a single point
of access for all data requests in its domain of knowledge (i.e. the
data in the database being queried).
And now to the other side, the application itself. I am an
advocate of the application being as database agnostic as possible. I
don't believe this is entirely possible, but it is certainly a goal to
which we can aspire. Applications should request data from the database
using standardized, documented interfaces, after which that data should
be composed as quickly as possible (at the lowest level) into native
objects. This allows for changes in the data interface to have as
little ripple effect as possible in the application.
Mr. Bouma makes the claim that, "changes to a relational model
will have always an impact on the application that targets that model".
Again, he is correct. Which is why applications should NOT target a
model. Applications should have absolutely no knowledge
of database schema, including table names, column names, data types, or
any other information. This is the role of stored procedures. We can,
using stored procedures, completely encapsulate this metadata and provide standardized interfaces into the data.
Need to change a datatype in the database? Perhaps you won't need to
change the output values that the application receives from the stored
procedures. Perhaps you won't need to change the parameter input
values. No change is needed in application code. Need to change a
column name? Same thing. Need to re-architect the entire schema? Again, just change your stored procedures. The application will keep running as if nothing changed.
Contrast this to Mr. Bouma's suggestion that we drop stored procedures
altogether and instead build all queries within application code.
Suddenly, any small change in the database needs to be completely regression tested throughout not only one app, but every application
that uses the database. You've created an extremely tight, perhaps
unbreakable coupling between the application and the database.
Large-scale changes to the schema will most likely never be possible.
Will there ever really be time to re-write all of that application
code?
Now, back to the three central arguments: Yes, ad-hoc queries
are easier to construct in application code; but it's also argued by
many database experts
that ad-hoc queries are a sign of either poor database design, poor
application design, or both. Either way, dynamic SQL isn't too hard to
work with, and I've seen plenty of extremely ad-hoc applications in
which its use is hardly a stumbling block. Next, security. It's true
that if a user isn't authenticated in the application, he or she won't
be able to use that application to access the data. So in that
case, the stored procedure does not provide greater access control. But
the same is certainly not true for every other application that uses
the same database. Unless, of course, code is duplicated across every
application. The database should be the final word on its data. This
includes data security and data integrity. Application code simply
cannot do this if the database need ever be shared. Finally, Bouma's
assertions about cache plans are simply not worth touching. The WITH
RECOMPILE option has been around long enough that DBAs and developers
know it's there and know how to use it.
Now that we're at the bottom of this post, I'll re-state my
conclusion: Stored procedures are necessary for proper object-oriented
database-based application development. Not using them will force code
duplication, tight coupling between applications and databases, and
greatly increased man hours for every change made to either application
or database.