Wednesday, May 31, 2006

XML Content Servers vs. Content Management Systems

A lot of people ask me to compare MarkLogic Server, which is an XML content server, to a content management system (CMS), like Documentum. People wonder if they're the same thing. They wonder if they need both. They wonder if one is a substitute for the other.

First, some definitions. By CMS, I mean either a content management system or a document management system. I do not mean an enterprise content management (ECM) system, which is a suite (typically built through acquisition and not terribly well integrated) that handles content/document management, web content management, imaging, records management, workflow, and collaboration. (Not to mention some other components.)

Now, my answer. In short, it's about creation vs. delivery.

CMSs are very good at automating the content/document creation process. With features like library services (check-in, check-out), versioning, and workflow (e.g., who needs to approve what), CMSs are great for controlling the creation of content.

For example, if you're an airline publishing flight manuals, you want to control versions very tightly, you want to track metadata about who worked on which chapters, you want to ensure that two people aren't working on the same chapter at the same time, and you want to ensure that new content goes through a very precise approval process before publication.

However, once you're done and you have a new version of the flight manual, CMSs typically do not do a good job of helping you deliver that information to users. In this sense, "deliver" might mean:
  • To multiple output formats (e.g., PDF, XHTML)
  • To recombine/repurpose the information into multiple information products (e.g., to produce a quick reference handbook, a flight manual, and training materials)
  • To provide fine-grained search of the information (e.g., get me the paragraph dealing with birdstrikes on takeoff, right now!)
This is where XML content servers like MarkLogic come in. They specialize in content delivery by providing fine-grained (i.e., element-level) queries against potentially large contentbases with very high performance.
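
To make "element-level" concrete, here is a minimal sketch in Python using only the standard library. It is not MarkLogic's API and says nothing about how MarkLogic indexes content; the sample manual, its element names (manual, chapter, para), and the find_paragraphs helper are all invented for illustration. The point is simply that the query returns the matching paragraph, not a link to the whole manual.

```python
import xml.etree.ElementTree as ET

# Invented sample data: a tiny flight-manual fragment.
# The element and attribute names are assumptions, not a real schema.
MANUAL = """
<manual>
  <chapter title="Takeoff">
    <para id="p1">Set flaps according to the takeoff checklist.</para>
    <para id="p2">Birdstrike on takeoff: follow the procedure in this chapter.</para>
  </chapter>
  <chapter title="Landing">
    <para id="p3">Confirm the landing gear is down and locked.</para>
  </chapter>
</manual>
"""


def find_paragraphs(xml_text, term):
    """Return (chapter title, paragraph id, text) for each para mentioning term."""
    root = ET.fromstring(xml_text.strip())
    needle = term.lower()
    hits = []
    for chapter in root.iter("chapter"):
        for para in chapter.iter("para"):
            # itertext() also collects text nested in any child elements.
            text = "".join(para.itertext())
            if needle in text.lower():
                hits.append((chapter.get("title"), para.get("id"), text))
    return hits


hits = find_paragraphs(MANUAL, "birdstrike")
for chapter_title, para_id, text in hits:
    print(chapter_title, para_id, text)
```

A linear scan like this is fine for one small document. Doing the same thing quickly across a large contentbase is the hard part, and that is the job of the content server.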

Many MarkLogic customers use both MarkLogic and a CMS, such as Documentum. If you asked them (as I often do), they'd say pretty much what I did: "we love Documentum for content creation, and we love MarkLogic for content delivery."

Or, to answer the original questions:
  • XML content servers are not competitive with content management systems; they are complementary to them
  • They are not substitutes for each other; many people happily use both
  • The fundamental difference between them comes down to content creation vs. content delivery
So why is there confusion?

Because, as I like to say, MarkLogic lives inside the CMS marketing shadow but outside the CMS reality footprint. Simply put, CMS marketing materials talk about delivery, repurposing, and single-source XML publishing. But the reality is that most CMSs do not do these things well, if at all.

A lot of the problem comes down to granularity. CMSs are designed to chunk content into version-worthy bits, such as chapters. So CMSs are good at handling relatively few, relatively large chunks. This is probably exactly what you want when you're controlling content creation. (Who'd ever want to version a caption, or a citation, in an article?)

But CMSs fall down when it comes to content delivery because you need much finer access granularity.
  • You want to repurpose not just chapters, but paragraphs
  • You want to provide search that doesn't simply return links to entire documents, but instead returns the relevant pieces of them (e.g., the abstract, the captions and figures, the bibliography)
(Or, when it comes to content analytics, you might want to write queries that return pieces of documents and then count them, such as determining the most-cited authors in a specific area of study.)
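
The most-cited-authors example can be sketched the same way: pull out the citation elements, then count them. This is again plain Python rather than anything MarkLogic-specific, and the sample articles, the subject attribute, and the most_cited helper are hypothetical names made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample data: articles with a subject and a bibliography.
# The structure is an assumption for illustration only.
ARTICLES = """
<articles>
  <article subject="aviation">
    <bibliography>
      <citation><author>Smith</author></citation>
      <citation><author>Jones</author></citation>
    </bibliography>
  </article>
  <article subject="aviation">
    <bibliography>
      <citation><author>Smith</author></citation>
      <citation><author>Jones</author></citation>
      <citation><author>Lee</author></citation>
    </bibliography>
  </article>
  <article subject="aviation">
    <bibliography>
      <citation><author>Smith</author></citation>
    </bibliography>
  </article>
  <article subject="medicine">
    <bibliography>
      <citation><author>Lee</author></citation>
      <citation><author>Lee</author></citation>
    </bibliography>
  </article>
</articles>
"""


def most_cited(xml_text, subject, n):
    """Return the n most-cited authors among articles with the given subject."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for article in root.iter("article"):
        if article.get("subject") != subject:
            continue
        # Count one citation per author element found inside a citation.
        for author in article.iterfind("./bibliography/citation/author"):
            counts[author.text] += 1
    return counts.most_common(n)


top_authors = most_cited(ARTICLES, "aviation", 2)
print(top_authors)
```

Note that the query never treats an article as an opaque chunk. It reaches inside each one for the individual citations, which is exactly the granularity that chapter-sized chunks in a CMS make awkward.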

In most CMSs there is an inverse relationship between chunk granularity and performance. If you want fine-grained chunks, you can set them up, but the system will crawl.
  • Could you set up very fine-grained chunks in your favorite CMS? Yes.
  • Would you? No.
If you're like most folks, you'll use the CMS for what it's designed to do (and does well), which is controlling the content creation process. And you'll either defer your goals related to content delivery or find a system like MarkLogic that excels at it.

(There are numerous ways to integrate MarkLogic with a CMS like Documentum. Lee Fife at Flatirons Solutions recently gave a talk on this topic at the Mark Logic User Conference.)
