The post begins with:
The great online shift is creating massive amounts of data - whether it is videos on YouTube or social networking profiles on MySpace. And that data is stored in databases, making them the key component of the new web infrastructure. But managing that information isn’t easy.

I think he nails the problem statement. The Web world is changing fast. And relational databases are having trouble keeping up.

The good news is that database management will be vastly different in the future. In fact, change has already begun; it just isn’t (cliché alert!) “evenly distributed” yet.

He then goes on to describe some leading examples of companies or problems that are pushing the relational database envelope:
- Yahoo's creation of its own user management software based on BerkeleyDB
- Google's MapReduce
- Amazon's S3 (Simple Storage Service) and SQS (Simple Queue Service), which externalize operations normally done by a database.
- The general use of Lucene, Nutch, and Solr to do indexing of unstructured content, "something an old relational database cannot do well."
- The graph-structured data problem (also known as the parts explosion problem), which is inherent in social networking and remains an Achilles' heel for relational databases
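
To make the parts explosion problem concrete, here is a minimal Python sketch. The bill of materials, the part names, and the `explode` helper are all hypothetical, invented only for illustration. It walks a nested parts structure recursively; in plain SQL without recursive queries, each extra level of nesting typically means another self-join, which is roughly why this shape of data is awkward for a relational database.

```python
from collections import defaultdict

# Hypothetical bill of materials: part -> list of (component, quantity per part).
BOM = {
    "bicycle": [("wheel", 2), ("frame", 1)],
    "wheel": [("spoke", 32), ("rim", 1)],
    "frame": [],
    "spoke": [],
    "rim": [],
}


def explode(part, quantity=1, totals=None):
    """Total up every component needed to build `quantity` units of `part`."""
    if totals is None:
        totals = defaultdict(int)
    for component, per_part in BOM.get(part, []):
        needed = quantity * per_part
        totals[component] += needed
        # Recurse: the depth of the structure is not known in advance.
        explode(component, needed, totals)
    return totals


parts = dict(explode("bicycle"))
print(parts)
```

A social graph has the same shape: "friends of friends of friends" is the same unbounded walk, with people in place of parts.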
My question is: what about everybody else? What are they supposed to do?
My short answer is -- perhaps not shockingly -- MarkLogic. At MarkLogic, we call Data 2.0 "content."
- We manage XML natively
- We manage graph-structured data easily
- We manage, search, store, and index text and XML natively
See this post on top-to-bottom XML for more.
