August 20, 2009

Alternative Data Storage Status Quo

There is a lot of innovation happening right now in the alternative data storage space. People working on these projects have already started a NOSQL community, and those who are not yet part of it are trying to come up with schema-less approaches on top of relational databases.

There are a few things I am concerned about, though. It looks like each of these solutions is inventing its own API and using its own protocol (be it memcached(-like), protobuffers, thrift, completely custom, etc.). I am not sure what the adoption status of these solutions is right now, but I believe that over time these inconsistencies will become extremely expensive. It is probably still early, but I really wish the NOSQL guys would start talking sooner rather than later about common APIs and protocols. (N.B. I am aware that it is almost impossible to expose the whole functionality of these systems through a common API, but I’m pretty sure it will be possible to find the common points.)
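To make the idea of "common points" a bit more concrete, here is a minimal sketch of what a lowest-common-denominator interface could look like. Everything in it is an assumption made for illustration: the names `KeyValueStore`, `InMemoryStore` and `copy_key` are hypothetical and do not come from any existing NOSQL project, and a real common API would have to settle much harder questions (consistency, versioning, wire protocol) than this toy does.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class KeyValueStore(ABC):
    """Hypothetical common interface; not the API of any real project."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store value under key, overwriting any previous value."""

    @abstractmethod
    def delete(self, key: str) -> bool:
        """Remove key; return True if it existed."""


class InMemoryStore(KeyValueStore):
    """Stand-in backend (an assumption) so the sketch runs without a server."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: str) -> bool:
        return self._data.pop(key, None) is not None


def copy_key(src: KeyValueStore, dst: KeyValueStore, key: str) -> bool:
    """Copy one key between any two backends that share the interface."""
    value = src.get(key)
    if value is None:
        return False
    dst.put(key, value)
    return True
```

The point of the sketch is `copy_key`: code written against the shared interface does not care which system sits behind it, which is exactly what gets expensive when every project ships its own API.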

I also think that anyone looking into this field will have quite a hard time figuring out what their best option is. I know that the NOSQL people are doing their best to add documentation and provide valuable help on their user groups, but there seems to be an almost complete lack of information on recommended usage scenarios. And there might also be misconceptions about what commodity hardware means to different people.

I usually don’t trust (micro or not) benchmarks, but I have to agree that VPork is an interesting and possibly very useful initiative:

With the wide range of distributed, non relational databases out there it is hard to know which one to choose. One part of the puzzle is of course performance. Personally I’m interested in low response times.

Here is my short TODO list on how to make things better:

  • accessible and immediate availability: make sure that there is a very easy way for people to try out the solution. That might mean creating (already optimized) Amazon EC2, VMware, or other images. As the old internet saying goes: “one click away”
  • document scenarios: even if they are atypical, others will be able to figure out the common points
  • publish any available benchmarks: even if they are not perfect, they will give others an idea of what to look for
  • common API and protocols

Is there anything else you’d add to this list?
