Database Abstraction Layers Must Live!

July 15th, 2009 § 9 comments

I come preaching true hope, against the fallacies.

I’ve heard the arguments for and against database abstraction layers (DALs) time and time again. I must say first, I agree with them all, both sides, equally. Interestingly, I can put the vocal proponents of each side of the argument in one of two boxes: a programmer guy box, or a database guy box. For some unknown reason though, they never seem to see eye to eye.

Honestly though, I like to put myself in the middle of that argument. I see both sides. I think fine tuning an application’s core business with vendor specific features is tremendously important, after all, that is why there are so many competing database vendors. Generally speaking of database driven projects, I feel like planning to use a specific vendor up front, knowing its pro’s and con’s, and tailoring an application to the chosen database’s strengths can only help in the long run. Also, I feel that building a database model first before any code, offers many performance and scalability advantages than does code first development.

That said, I also see value in using a database as a simple data-store when the actual database is not a key component of the overall application. That’s right, it is completely valid to say that the data-storage & database component of an application sometimes is not the key component; a database guy probably will never agree with you there. Just as there are programmers who swear by this code first, database later mantra, there are database developers that will swear by the database first, code later mantra.

The fact is, each project is unique. It’s this uniqueness of projects and their execution that ultimately shapes the perspectives of developers as well as the tools they write and consume. To say that one mantra is clearly a better choice over another is simply being ignorant.

The Use Case of Abstraction Layers

To be honest, I don’t really buy the “I might switch database vendors at some point” argument either, as Jeremy Zawodny points out. For larger projects (on the scale of the facebooks, the twitters, etc), switching the database underneath after a project has been in production is a monumental task- regardless if you have an abstraction layer or not. Chances are, you used some of the database specific features, not to mention, you now have a large set of mission critical data that also has to be ported. Long story short, its never as easy as swapping the abstraction layers database adapter out.

What I will buy though, is there are some problems that fall in thicker end of the Pareto Principle that can be solved with a database abstraction layer. For the uninitiated, the Pareto Principle is effectively the 80/20 rule. In software use cases, when applying this term- the 80% use case is the majority of use cases. These use cases are generally not that interesting in terms of database interaction. To give it a label, we can call these the CRUD, BREAD, or <<insert your favorite terminology here>> operations. That is not to say that these operations are not important, but they are not special. In fact, they are so un-special, that we can just about apply a standard query syntax (SQL 92) to them, and expect that the query is both portable between databases and common across applications that wish to use them.

This is where database abstraction fits in. As a developer, you’ll come across this problem time and time again. A large portion of an application are CRUD screens and the smaller more interesting part of your application is your reporting screens. With an abstraction layer, we are able to code against both a unified API as well as have a layer that will produce consistent and vendor compatible queries. This allows us to build more specialized data access layers (patterns) for multiple database vendors with great ease. You want Table Gateway- done, you want Row Gateway- done, you want Active Record- done. Each can be implemented to tackle the 80% part of the 80/20 rule when applied to the database centric business code of an application.

The Slow Path & The Fast Path

When I talk about this 80/20 rule in terms of the applications we write, I like to further refine the terminology so that it easier to visualize. The most prominent terms that helps developers visualize the 80/20 rule in their application is the slow path of your application, and the fast path of your application. Each of these terms has a set of characteristics that set each apart from one another:

Slow Path:

  • Performance is not of primary importance
  • Has an interactive nature
  • Validation and verification of data are of high priority
  • Application to data-store interactions are fairly trivial
  • Does not comprise applications core business logic

Fast Path:

  • Performance is of importance
  • Limited interactive nature, information flow is fairly static (non-interactive)
  • Flow of information consist of already verified and validated data (originates from the databsae)
  • Application to data-store interaction can become complex (JOINs, SUB-SELECTS, VIEWS)
  • Is the core business of the application

To get a better understanding of how the terms are applied, lets look at a typical web application. Generally speaking, there are a few web based forms that users interact with. These forms are the entry point of a code path that does not get a lot of throughput. This is generally because forms are submitted by people, and people can only type and submit forms so fast. In addition to this being a less traveled code path, it also has a few checks along the way- validation of data, and verification of data. Typically, the problems of verification and validation of data are not too unique to the application being executed. In fact, the web forms, validation and verification problems have been solved over and over again by various libraries.

On the other side of the equation, there is the aggregation and merging of the stored data (which inevitably came from the aforementioned web forms.) Since the unique aggregation and processing of this data is the core aspect of business of said application, it stands to reason that this code path will be more well traveled by users. This, is the fast path. The problems solved in this code path are generally unique and since they are unique, it’s hard to find an off the shelf solution to these problems.

Since this is where the money is to be made, it also stands to reason that developers should concentrate their efforts in the fast path of their application. This means they should solve the slow path problems of their application with existing tried and tested solutions- this includes generic forms solutions, validation and verification libraries and yes, database abstraction layers.

Getting Cozy With Zend_Db, a Database Abstraction Layer

Not that we’ve made a use case for DAL’s, what would one look like? Well, I’ll use Zend Frameworks Zend_Db as my use case.

The connection code:

You’ll note that since this factory takes a standardized array, it makes it trivial to swap out various connection information for different adapters.

Simple queries:

Here you’ll notice the generic and abstracted nature of this API. Since there are several tasks in database interaction that are consistent across the board, those such as INSERT, UPDATE and DELETE, it makes sense that we can create a generic API for handling such interactions. These interactions (INSERT, UPDATE and DELETE) represent the mutation methods of a database and as such, represent the most predominant way of getting data into a system.

For all intents and purposes though, simple SELECTs are fairly standardized too. They are standardized enough as to compliment the INSERT, UPDATE, and DELETE abstractions so that we can find actual rows to do these mutation operations.

Now that we have a simple and consistent API for doing simple SELECTs, INSERTs, UPDATEs, and DELETEs; we can implement something a little more interesting: the table & row gateway:

Immediately, you should see the inherent value in the above example. Rudimentary and common tasks can now be handled with a consistent and simple API. But what happens when you’ve started using this DAL, and you want to use a vendor specific feature? Well..

As you can see, the query method of our database adapter will allow us to pass custom SQL into the database thus taking advantage of vendor specific features.

What if you want to combine both paradigms for ultimate flexibility?

While that above example is “a bit out there”, it does show that even with a DAL, if it’s flexible enough, you can code as close to or as far away from the database as you like. Ultimately the mantra here is: lets get the job done in the most effective, efficient and sound way possible.


Simply put, a database abstraction layer is just another tool in the toolbox. You don’t have to completely change your paradigm of programming, nor do you have to apply an all-or-none approach to using a DAL. When applied correctly, you can build out the slow path of your application in little to no time, while leaving extra time for developing and fine-tuning the fast path of your application. And to keep code from becoming unruly, simply apply some best-practices code organization to your project.

Tagged , , , , ,

  • wilbur

    I have been wondering about this sort of thing for years!

  • Evan Byrne

    I really enjoyed reading this article, it was very informative and the attention to detail was great. The world needs more PHP articles like this!

  • Paul

    Thanks for writing a well balanced and honest article in this world of rants for the purpose of generating traffic. Design decisions are rarely black and white. I also like Zend_Db’s API.

  • Steven Tyler

    I’m sure you’ll forgive me for being brutal, but you kind of lost me at Line One, by criticising a five-year-old Jeremy Zawodny post. Firstly the man knows more about databases than you and me combined, and secondly it’s five years old. I didn’t read any further – I’m sure that your post was lovely though.

  • Ralph Schindler

    Steven Tyler,

    I totally agree, I have his book on High Performance MySQL. But when I saw the post making its rounds last week, 4 years to the day, I figured it was time to address it- this time with a spin on best practices, Zend_Db, and patterns in mind. You should read all the way through, then render your opinion.

    This is a post I’ve been meaning to write for a while, but Jeremy’s post gave me an opportunity to provide an alternate argument to the usefulness of Abstraction Layers, as well as give insight into the myriad of way one could use Zend_Db, and the paradigms of programming it fits within.

    Many developers tend to acquire and all-or-none mindset, and I am simply trying to convey that its ok to both “drink the ZF kool-aid” as well as use the lower level extension to do work. Heck, use elements of CakePHP, Symfony, or sprinkle some Django (python), Rails (ruby), or Ant (Java) into a project.

    If its a tool that’s been validated, tested, and generally considered a “best-practice” kinda tool, use it- and Be Effective.

  • Gabriel PREDA

    I’m with Jeremy on this one…

    PHP programmers must know how SQL even those simple “INSERT INTO…”…
    Real database abstraction layers hide a lot of SQL.

    What I preach for is “data-access abstraction layer”:

    PDO provides a data-access abstraction layer, which means that, regardless of which database you’re using, you use the same functions to issue queries and fetch data. PDO does not provide a database abstraction; it doesn’t rewrite SQL or emulate missing features. You should use a full-blown abstraction layer if you need that facility.

    On database abstraction layer they say in the manual “if you need that facility”…

    Have you ever thought if you really-really-really need it ?

    On small projects – it doesn’t add value… they’re small… you will finish them in time anyway.
    On big projects – it doesn’t add value… an abstraction layer brings overhead…

    Have fun !

  • Dave

    I think people have somewhat missed the point. Yes for a single bespoke project your db is probably prescribed so the db abstraction layer doesnt buy you much. But as a product developer writing for an abstraction layer (of which SQL is one) allows the customer to choose what platform they want to hold their data on rather then pre-reqing what you developers are most familiar with.

  • Grant

    Nice article.

    I certainly like having a standard interface.

    I like Zend_db and when I can’t do something easily – I use a vanilla SQL query as in the article.

    My issue with any abstraction is that you have to learn another language on top of what you already know; when sometimes it might be just as easy to cobble something together with the basic tools available in the standard library +- modules yourself. Same goes for glorious CMS’s that just spit out a bit of HTML – wouldn’t plain HTML suffice?

    I’m staring at a popular php framework right now – and to me it feels like it’s getting in the way of me coding, rather than helping me.

    For me it’s a case of how quick/intuitive/useful the tool is to develop with and learn, will it save me time in the long run, and can it be easily added to my project /tool chain with little fuss.

  • Pingback: Building layers on data access (Part 1)

What's this?

You are currently reading Database Abstraction Layers Must Live! at Ralph Schindler.