The World’s Leading Microsoft .NET Magazine
   
 
timstall

Donate Today!

Search Box

 

Calendar

««Mar 2010»»
SMTWTFS
  12
3
456
7
8
9
10
111213
1415161718
19
20
21222324252627
28293031

My RSS Feeds








Mailing List

Most Popular Tags

                                                           

Why would someone put business logic in a stored procedure?

posted Thursday, 4 June 2009

I have a strong bias against injecting business logic into the stored procedures. It is not scalable, is hard to reuse, is a pain to test, has limited access to your class libraries, etc... However, I think there are some legit cases to put advanced T-SQL in a SP, but you're playing with fire.

Here are some reasons - basically either performance, functionality, or legacy constraints force you to.

  1. Aggregations, Joins, and Filters - If you need to sum up 1000 child rows, it's not feasible to return all that data to the app server and sum it in C#. Obviously using the SQL sum() function would be the better choice. Likewise, if your logic is part of a complex filter (say for a search page), it performs much faster to filter at the database and only pull over the data you need.
  2. Batch calls for performance - SQL Server is optimized for batch calls. A single update with a complex where clause will likely run much faster than 1000 individual updates with simple where clauses. For performance-critical reasons, this may force logic into the stored procedure. However, in this case, perhaps you can code-generate the SQL scripts from some business-rules input file, so you're not writing tons of brittle SQL logic.
  3. Database integration validation - Say you need to ensure that code is unique across all rows in the table (or for a given filter criteria). This, by definition, must be done on the database.
  4. Make a change to legacy systems where you're forced into using the SP - Much of software engineering is working with legacy code. Sometimes this forces you into no-win situations, like fixing some giant black box stored procedure. You don't have time to rewrite it, the proc requires a 1 line change to work how the client wants it, and making that change in the stored proc is the least of the evils.
  5. The application has no business tier - Perhaps this procedure is for a not for an N-tier app. For example, maybe it's for a custom report, and the reporting framework can only call stored procs directly, without any middle-tier manipulation.
  6. Performance critical code - Perhaps "special" code must be optimized for performance, as opposed to maintainability or development schedule. For example, you may have some rules engine that must perform, and being closer to the core data allows that. Of course, sometimes there may be ways to avoid this, such as caching the results, scaling out the database, refactoring the rules engine, or splitting it into CRUD methods that could be batched with an ORM mapping layer.
  7. Easy transactions - It can be far easier for a simple architecture to rollback a transaction in SQL than in managed code. This may press developers into dumping more logic into their procs.

Note that for any of these reasons - consider at least still testing your database logic, and refactoring your SQL scripts.

These "reasons" are probably a bad idea:

  1. Easy deployment - You can update a SQL script in production real quick! Just hope that the update doesn't itself have an error which accidentally screws all your production data. Also consider, why does the procedure need to be updated out-of-cycle in the first place? Was it something that would ideally have been abstracted out to a proper config file (whose whole point is to provide easy changes post-deployment)? Was it an error in the original logic, which could have been more easily prevented if it had been coded in a testable-friendly language like C#? Also, keep in mind that you can re-deploy a .Net assembly if you match its credentials (strong name, version, etc...), which is very doable given an automated build process.
  2. It's so much quicker to develop this way! Initially it may be faster to type the keystrokes, but the lack of testability and reusability will make the schedule get clobbered during maintenance.

In my personal experience, I see more data in the SP for the bad reasons ("it's real quick!")- the majority of the time it can be refactored out, to the benefit of the application and its development schedule.

tags:  

links: digg this    technorati    




1. Alvaro Santamaria left...
Saturday, 6 June 2009 10:20 am

I agree partially with your comments but I would clarify my position:

- I would use T-SQL to apply "data logic", never "business logic". Data logic is logic that is used to guarantee the data integrity and to decouple a complex schema from the business tier that uses it. This data logic can become a bit complex in some cases but it still should be applied transparently to the tiers that use the database.

- The easy deployment bad reason seems like a "red herring" because what you are saying is that it is risky and not easy taking into account security, etc.

- What I don't see is the lack of testability or reusability. It can be slightly harder to unit test than object oriented programming languages but achievable. For reusability, you can call SPs or UDFs from T-SQL.

Conclusion: In the database, apply just "data logic" and not "business logic".


2. Michael Harmon left...
Sunday, 7 June 2009 6:12 pm

I would expect someone posting on a .Net blog to have this viewpoint. I've worked in IT for over 20 years. Everyone wants their chosen platform to be considered the best choice for scalability, testing, etc, etc.

Business logic has been programmed into the database since Sybase invented stored procedures. These capabilities have been greatly expanded with MS SQL Server over the years.

As far as scalability is concerned, I would have to disagree with you. I have been hired several times to rewrite programs or logic from a .Net environment using T-SQL style stored procedures. The same functionality using stored procedures took one third of the time.

I am not saying that this will always be the case, but, there are benefits to not having to pass the network traffic back and forth. If you are using a middle tier to filter data, and not the database, SQL server could be passing a huge amount of data of the wire. To me, this would be nonscalable and nonsensicle.

Also, SQL server can create a query plan for stored procedures in advance that can help it to choose the best path for processing. If the .Net or other application is passing queries to the database, not as stored procedures, this capability is not available. If 1000 users are passing the same query to the database, each instance would have to be analyzed separately by SQL server and would not benefit from precompilation.

That being said, T-SQL has historically not been the most flexible language. It's strengths are data manipulation and validation. It's weaknesses are that it is not designed for complex calculations. Debugging, also, can be cumbersome.

These are some of the reasons that Microsoft, since SQL Server 2005, allows writing stored procedures using .Net languages. This promises the best of both worlds. The ability to incorporate complex business logic within the database itself. Any .Net language can be used to create .Net CLR stored procedures.

That being said, I have seen it recommended many times that, if you are going to write a data intensive procedure, use the T-SQL language for the stored procedure. If you are doing some complex calculation, or need better error handling, then write the procedure in .Net.


3. Bruce W Cassidy left...
Monday, 8 June 2009 5:58 pm

I don't believe there is a "one approach fits all" solution to this.

We take for granted the n-tier nature of application development now. It wasn't that long ago that 2 tier was the big thing, and at that point it became common place to "inject" business logic into the database. Now it doesn't make as much sense, but we're often spending time maintaining or enhancing existing applications. So it still happens.

Also, in some cases it makes sense to have logic that works with bulk data in the database. For example, if you have code to close a period, it would make sense to have this live on the database where it can work with the database in a bulk fashion.

Finally, it also makes sense for the database to present an "API" to the application in the form of a well structured layer of stored procedures or views (or combination of the two) -- that way we can prevent intimate coupling between the database and the application. This layer also introduces another level where data validation can be done ("defense in depth" via layers is valid for more than just security.) Some of that data validation often encompasses some business rules about the data. So it makes sense to have that sort of rule on the database itself to ensure the quality of the data.

I believe that software architecture is as much an art as a science, and working out where to put things like "business logic" isn't as black and white as you make it out to be.


4. Patrick Zurflueh left...
Saturday, 5 December 2009 6:05 pm

My favourite approach: http://www.dulcian.com/Articles/Thick%20Database%20Revisited_final.htm