What’s wrong with persistence ignorance

TL;DR: ignorance is wrong.

It sounds good: write object-oriented code that focuses on domain problems and let somebody else (or future you) worry about persistence. After all, those are two unrelated problems in different layers.

False presumptions

Layering, separation of responsibilities, modularity and so on have costs and benefits. In programming we accept some costs if they give us adequate benefits. Modular programming, as opposed to just hacking, takes more time to learn, plan and implement at first. But as soon as we need to change some functionality or debug a procedure, we start to reap the benefits, because those tasks take less time. Repeat them often enough and the benefits will surpass the costs.

Persistence ignorance costs you in:

  • performance (layering, mapping, not using optimal SQL queries)
  • coding time (coding against an interface is more time consuming, mappers and interfaces are extra code)
  • code size (mappers, interfaces)
  • duplicate code (marginally different structures for every layer).
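
To make the mapper and interface costs concrete, here is a minimal Python sketch of what a persistence-ignorant design tends to add around one small entity. Every name in it (Customer, CustomerRow, CustomerRepository, to_row, to_domain, InMemoryCustomerRepository) is hypothetical and invented for illustration; real projects will differ in detail.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Customer:
    """Domain object: knows nothing about storage."""
    id: int
    name: str
    email: str


@dataclass
class CustomerRow:
    """Persistence-layer shape: the same data, marginally different."""
    customer_id: int
    full_name: str
    email_address: str


class CustomerRepository(Protocol):
    """Interface the domain codes against instead of a concrete database."""

    def get(self, customer_id: int) -> Optional[Customer]: ...
    def save(self, customer: Customer) -> None: ...


def to_row(customer: Customer) -> CustomerRow:
    """Mapper: domain object to storage shape."""
    return CustomerRow(customer.id, customer.name, customer.email)


def to_domain(row: CustomerRow) -> Customer:
    """Mapper: storage shape back to domain object."""
    return Customer(row.customer_id, row.full_name, row.email_address)


class InMemoryCustomerRepository:
    """Stand-in for a database-backed implementation (an assumption)."""

    def __init__(self):
        self._rows = {}  # customer_id -> CustomerRow

    def get(self, customer_id: int) -> Optional[Customer]:
        row = self._rows.get(customer_id)
        return to_domain(row) if row is not None else None

    def save(self, customer: Customer) -> None:
        self._rows[customer.id] = to_row(customer)
```

Three fields of real data ended up with two structures, two mappers, one interface and one implementation, and none of that code does anything for the domain itself.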

When do you get the benefits? If you change the database provider (MySQL to Oracle) or the database technology (Oracle to XML). And when will that happen? Probably never. You have optimized for the most improbable situation. It’s more likely you’ll change the programming language or framework than the database.

Leaky abstractions

Joel Spolsky wrote in The Law of Leaky Abstractions that you can ignore the underlying structure or technology in most situations, but not in all. Take the .NET string as an example. Instead of manipulating arrays of chars, you can pretend that a string is an object that you can join, split, append to and call many other convenient methods on. But underneath, a string is still an array of chars, and every time you change it a new array is created. You can ignore that in most situations, but put

completeString += appendedString;

in a loop and ignorance will bite you.
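
Why the loop hurts can be shown with a little arithmetic. The Python sketch below does not measure a real runtime; it is a simple cost model that counts how many characters get written if every append creates a whole new array, compared with appending into a growable buffer (the job StringBuilder does in .NET). The function names are made up for this example, and buffer regrowth is deliberately ignored.

```python
def chars_copied_naive(pieces):
    """Model `s += piece` on an immutable string: each append allocates a
    new array and writes the old contents plus the new piece into it."""
    copied = 0
    length = 0
    for piece in pieces:
        length += len(piece)
        copied += length  # the whole new string is written again
    return copied


def chars_copied_buffered(pieces):
    """Model appending into a growable buffer: each character is written
    once. Buffer regrowth is ignored (a simplifying assumption)."""
    return sum(len(piece) for piece in pieces)


pieces = ["0123456789"] * 1000  # 1000 appends of 10 characters each
print(chars_copied_naive(pieces))     # 5005000
print(chars_copied_buffered(pieces))  # 10000
```

Under this model, a thousand small appends write about five million characters instead of ten thousand, and the gap grows quadratically with the number of iterations.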

Translated to the OO/ORM/SQL world, ignoring the persistence technology will mostly work. If the database is small enough, it will churn through whatever you throw at it. With bigger databases, and every business application worth mentioning has one, only CRUD operations on a single object are a safe bet. If database calls end up in a loop, especially when the database sits behind a not-so-fast connection, seconds turn into minutes.
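
A small sketch of database calls in a loop, using Python's built-in sqlite3 module with an in-memory database. The schema (customers, orders), the query helper and the round_trips counter are all assumptions made up for this example; with a real server, each counted call would also pay network latency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER, total_cents INTEGER);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"customer {i}") for i in range(1, 101)])
conn.executemany("INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
                 [(i, 100 * i) for i in range(1, 101)])

round_trips = 0  # how many statements we sent to the database


def query(sql, params=()):
    global round_trips
    round_trips += 1
    return conn.execute(sql, params).fetchall()


def totals_in_loop():
    """What persistence-ignorant code tends to do: one query per object."""
    result = {}
    for customer_id, name in query("SELECT id, name FROM customers"):
        rows = query("SELECT total_cents FROM orders WHERE customer_id = ?",
                     (customer_id,))
        result[name] = sum(total for (total,) in rows)
    return result


def totals_in_one_query():
    """The same answer, computed by the database in a single statement."""
    return dict(query("""
        SELECT c.name, COALESCE(SUM(o.total_cents), 0)
        FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """))


round_trips = 0
loop_result = totals_in_loop()
loop_trips = round_trips

round_trips = 0
single_result = totals_in_one_query()
single_trips = round_trips

print(loop_trips, single_trips)      # 101 1
print(loop_result == single_result)  # True
```

With 100 customers the loop sends 101 statements where one would do. At a few milliseconds per round trip nobody notices; at 100,000 customers over a slow link, somebody will.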

A database is more than storage, and SQL is a language and a tool in its own right. Using it only to save and query objects is not optimal use. So always be aware of the underlying system. You can automate some parts and ignore others, but you can’t ignore it completely.
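
As one illustration of the database doing more than storing, the sketch below (Python with the built-in sqlite3 module) compares an object-style price change, which loads every row, modifies it and saves it back, with a single set-based UPDATE. The products table, its CHECK constraint and the function names are hypothetical, chosen only for this example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    # The CHECK constraint lets the database itself reject invalid prices.
    conn.execute("""
        CREATE TABLE products (
            id INTEGER PRIMARY KEY,
            category TEXT NOT NULL,
            price_cents INTEGER NOT NULL CHECK (price_cents >= 0)
        )
    """)
    conn.executemany(
        "INSERT INTO products (category, price_cents) VALUES (?, ?)",
        [("books", 1000), ("books", 2500), ("games", 6000)],
    )
    return conn


def raise_prices_object_style(conn, category, delta):
    """Load each row, change it in application code, save it back."""
    rows = conn.execute(
        "SELECT id, price_cents FROM products WHERE category = ?",
        (category,),
    ).fetchall()
    for product_id, price in rows:
        conn.execute("UPDATE products SET price_cents = ? WHERE id = ?",
                     (price + delta, product_id))
    return len(rows)


def raise_prices_set_based(conn, category, delta):
    """Let SQL do the work: one statement, however many rows match."""
    cursor = conn.execute(
        "UPDATE products SET price_cents = price_cents + ? WHERE category = ?",
        (delta, category),
    )
    return cursor.rowcount


def prices(conn):
    return [p for (p,) in
            conn.execute("SELECT price_cents FROM products ORDER BY id")]


a, b = make_db(), make_db()
raise_prices_object_style(a, "books", 100)
updated = raise_prices_set_based(b, "books", 100)
print(prices(a) == prices(b), updated)  # True 2
```

Both versions end with the same data, but the set-based one stays a single statement whether the category holds two products or two million, and the constraint protects the data no matter which code path writes to it.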