Thursday, August 30, 2007

Google Research from Outsell and Stephen Arnold

Information industry market researcher Outsell, working together with search guru Stephen Arnold, has been producing some excellent research on Google.

I wanted to highlight one piece, entitled Google as Publisher: Is Google Poised for a New Push into the Information Industry, which looks at how Google's infrastructure and technology could be leveraged for a major push into the publishing business. In addition to pages of excellent analysis, the report concludes with four action items for publishers in managing this threat/opportunity:
  • Come to grips with the impact. I think many publishers have their heads stuck in the sand when it comes to their "partnerships" with Google. Yes, the relationship is inherently co-opetition and it's "cutting off your nose to spite your face" to not leverage Google's spider and search volume. But make no mistake, you're dancing with a partner who could decide to become the devil in an instant.
  • Adopt agile publishing processes, quickly. To me, this includes (1) "agile content" in an XML repository like MarkLogic (cleaned up using lazy XML enrichment) and (2) agile software development processes. I'm amazed by how many publishers still do waterfall-based product development.
  • Use Google technologies in your own products. It's best to both know and leverage your (potential) enemy.
  • Mobilize and modularize content. XML is also a great means for doing this, because of its presentation independence.

XQuery Tops Gartner's 2007 Data Management Hype Cycle

I've been traveling this week so I had some time to catch up on my analyst reading. I was somewhat amazed to discover that XQuery is the hottest (i.e., highest visibility) technology plotted on Gartner's recently released data management hype cycle.

Frankly, I was pleasantly surprised by this because I hadn't thought that XQuery was getting anywhere near the attention it deserves. As frequent readers know, I think that our kids will do database programming in XQuery, not SQL, and that they will think of SQL the way that we think of COBOL. (Remember sections?)

Better yet, Gartner said that XQuery is only 2-5 years from mainstream adoption. This is all good news for Mark Logic because MarkLogic Server "speaks" XQuery natively much as Oracle natively speaks SQL.

The Gartner note, entitled Hype Cycle for Data Management, 2007 (subscription required) was published on July 2, 2007 (ID G00148704).

Monday, August 27, 2007

Fast Search Salacious Wikipedia Edits

See this post on Search Engine Land entitled Searcharazzi: Salacious Wiki Edits to Fast Search & Transfer. Excerpt:
“Would the vandal or PR person removing information from these pages please stop. It is a matter of record that FAST had a share price crash of 28% when they made a statement that they had needed to change their accounting approaches. The press release is on the company site. This is a material piece of information about the company.”
Here is the disputed Fast Search & Transfer Wikipedia entry. Here is the talk page.

Plus ça change, plus c’est la même chose.

Go Check Out Facebook, Now!

The other day I was updating my LinkedIn contacts and I ended up sending a wide broadcast email asking friends and associates to join my LinkedIn network. I like LinkedIn and I use it as a way to keep in touch with a broad network of people with whom I've worked in the past. We also use it for recruiting and sometimes sales.

I'd set up a Facebook profile several months ago in response to an invitation, but I'd left it blank and never spent much time on the site. I'd noticed a steady up-tick in my rate of Facebook invitations during the past few months, so I'd been thinking about taking a serious look. In addition, I'd read about Facebook's strategy to become a "platform" (whose meaning was not immediately clear to me in the context of a social networking site) so investigating it was rising on my to-do list.

But it was only after receiving multiple responses to my broadcast email of "dude, LinkedIn is Facebook for dinosaurs" that I decided that I needed to do something. I'm pleased to report that I now have a complete Facebook profile, about 50 friends (compared to 450 on LinkedIn), and I must say I really like the site. Why?
  • It combines the best aspects of LinkedIn (e.g., biography, contacts, contact network, friend finding) with those of MySpace (messaging, updates, photos)
  • Unlike MySpace, it's not loaded with spam sites and flashing lights.
  • It has groups and networks that you can (easily) join and leave
  • It has a certain hominess that blurs personal and work lines
  • It has both Facebook-provided apps (e.g., calendar, photos) and user-provided ones (this is the platform part)
  • It has a clean, simple user interface
In fact, my only reservation about Facebook relates to one of the things I currently like about it -- the work/personal life blurring. Amongst my current friends I have former co-workers, Mark Logic customers, high school friends, a board member, some industry analysts, current employees, and even my high-school-aged son. That's cool.

While that's cute and homey, it's already created some awkwardness. After my son started using a user-provided "compare people" app on me, I decided to use it on my friends and was quickly asked questions like "who has a better body?" comparing a current customer with a former employee. Not good. Mercifully, there was a "skip" button of which I made prodigious use.

So one nice thing about LinkedIn is that it's purely professional, at least as I've set it up. Going forward I think Facebook will need to provide a "role separation" solution and hopefully they will do a better job at it than Amazon, which still gives me recommendations for children's books and golf balls based on my buying them -- for others -- in the past.

In playing with Facebook, I realized something else: I really like their focused marketing strategy. Instead of a general, broad attack, they started out with one segment (university students -- actually barring others from joining for years), established dominance in that segment, and then expanded from there.

So, call me a fan. Given the potential to become a serious platform, replace email communications, and hide lots of content from Internet spiders in so doing, I think everyone should check it out.

In addition, I'd recommend this post, which provides an excellent introduction and overview -- Web Strategy: What the Web Strategist Should Know About Facebook.

Danny Sullivan Interview

I recently found this lengthy and comprehensive interview with search engine guru Danny Sullivan, formerly of Search Engine Watch, but who split off about a year ago to start Search Engine Land.

The interview covers a number of topics:
  • The challenges in starting Search Engine Land
  • Sphinns and how they came to be pronounced sfins as well as spins (sphinns are their equivalent of diggs).
  • Universal search, and the relevancy challenges associated therewith
  • Vertical search
  • Personalized search, and the privacy concerns associated therewith
  • Whether anyone (e.g., Microsoft) is capable of giving Google a run for its money in Internet search
  • Facebook, and social networking sites' role in search (including the email replacement aspect of such sites and its impact on search)
I'd definitely recommend reading the interview.

I've added Stone Temple Consulting's blog, Ramblings about SEO, to my blogroll and also recommend looking at their site itself (there is lots of good content including interviews, podcasts, and primers).

Friday, August 24, 2007

Sun: The 0 in Web 2.0 (Thoughts on the New JAVA Ticker Symbol)

I must admit that I've never been a fan of Sun's marketing. I never liked the "we're the dot in dot com" campaign. I never liked the way they marketed Java. The only things I did like were McNealy's infamous barbs, but they wore out over time, in a slow and steady transition from sharp and clever to weary and bitter.

So I have to say that I sighed for Sun when I saw their latest move: to change their ticker symbol from SUNW to JAVA. It's more sad than anything else, really.

First, see this excerpt from Jonathan Schwartz's blog where he discusses the change (bolding mine):
As I said, the number of people who know Java swamps the number of people who know Sun. Or SUNW, the symbol under which Sun Microsystems, Inc. equity is traded on the NASDAQ stock exchange. [...] SUNW certainly has some nostalgic value - it stands for "Stanford University Network Workstation." [...] But SUNW represents the past, and it's not without a nostalgic nod that we've decided to look ahead. [I feel a tear welling up.] To be very clear, this isn't about changing the company name or focus - we are Sun, we are a systems company, and we will always be a derivative of the students that created us, Stanford University Network is here to stay.
Or, in other words, "everyone knows Java so we're going to change our ticker symbol to JAVA, but we're still Sun, the Stanford University Networking Workstation guys." That's clear as mud.

This basically says the whole thing is an investor relations ploy to put some sex into the stock. It's not about changing the name or focus; it's about changing the ticker symbol.

That alone is sad. If the best thing you can do for your stock is change the ticker symbol, you're in trouble. What's sadder is what they changed it to: JAVA.

Here are some reactions from the (currently 277) comments on his post, most of which tar-and-feather him for the ridiculous change.
Sun once again fails to grasp the big picture. While 'Java' may be better known by the public than 'Sun Microsystems', the perception of Java by the public isn't good, so why associate the entire company with it? Synonyms for Java are 'Big', 'Fat', 'Bloated', and 'Slow'. Are those the terms you want to characterize your company?
Exactly. But worst of all, when was Java really hot? Back in 1999, when Ricky Martin (Livin' La Vida Loca), Cher (Believe), and Smash Mouth (All Star) were top of the charts. That's a long time ago.

To do something so transparent, and to do it so poorly ("hey, did you hear the new Cher song?"), is more pathetic than irritating. Put differently: Sun, if you want to pull a transparent ploy to put some sex into your stock, then you should change your ticker symbol to one of the following:
  • AJAX (that's got Java in it at least)
  • WEB2 (though I think digits aren't allowed)
  • GOGL (maybe you'll get some lift from people thinking they're buying GOOG)
  • RAILS (if you can sneak in a 5th letter)
  • FACE (try to steal some of Facebook's hype)
  • XQRY (hey, that will be ours one day (not))

Thursday, August 23, 2007

SEO Mistakes Most Bloggers Make

If you're like me, you're using your blog as your own little soapbox on the Internet, and while you might be tracking stats (I recommend sitemeter and feedburner) you probably haven't given much thought to doing search engine optimization (SEO) for your blog.

While I'm often impressed, and sometimes amazed, with the organic search results I get on Google, I've never really given much thought to systematically improving them. (Though I should tell you about the post I did on XML support in relational databases entitled "Pimp My Ride" which generated a lot of misdirected traffic on the search "Dave Pimp My Ride" where I still end up on the first page of results.)

So I was happy to find this post, entitled Twelve SEO Mistakes Most Bloggers Make on the Search Engine Land blog.

Here are the twelve mistakes:
  • Auto-generated title tags
  • Allowing the indexing of pages that shouldn't be
  • Multiple homes for one blog
  • Not using optional excerpts to minimize duplicate content
  • Not using rel=nofollow where indicated
  • Over-use of date-based archives
  • Instability in keyword focus on category pages
  • Poor URLs, often auto-generated
  • Only one RSS feed that's unoptimized
  • Unoptimized podcast tagging
  • Putting your URL or your feed on a domain you don't own
  • Suboptimal anchor text on internal linking
I think I'm breaking rules 1, 4, 5, 6, 8, maybe 9, 11, and definitely 12.

Wednesday, August 22, 2007

Lazy XML Enrichment

One of my big gripes with most content-oriented software is that it requires a big bang approach (see The First Step's a Doozy). The basic premise behind most content software is roughly:

1. If you do all this hard work to perfectly standardize the schema of your content, perfectly tag it, and possibly perfectly shred it, then

2. You can do cool stuff like content repurposing, content integration, multi-channel content delivery, and custom publishing.

The problem is, of course, that the first step is lethal. Many content software projects blow up on the launchpad because they can't get beyond step 1. Our first customer had been stuck on step 1 for 18 months with Oracle before they found Mark Logic. (We loaded their content in a week.) At a recent Federal tradeshow, we had dinner with some folks from Booz Allen who'd been trying to load some semi-structured message traffic data into a relational database for months. We told them to swing by our booth the next day. Our sales engineer then loaded their content over a cup of coffee while eating a muffin and built a basic application in an hour. They couldn't believe it.

In most companies -- even publishers -- content is a mess. It's in 100 different places in 15 different formats, and each defined format is usually more of an aspiration than a standard. Once, at a multi-billion dollar publisher one of our technical guys actually found this sentence in some internal documentation: "it is believed that this tag is used to ..." Only folklore describes the schema.

So when it comes to the general problem of making XML more rich -- i.e., having more tags that indicate more meaning -- many people take the same big-bang approach. "Well, step 1 would be to put all the content into a single schema (which alone could kill you) and run it through a dozen different entity, fact, sentiment, concept, and summarization "extractors" that can mark up the content and fragments of it with lots of new and powerful tags (which alone could cost millions)."

Again, step 1 becomes lethal.

At Mark Logic we advocate that people consider the opposite approach. Instead of:
  • Step 1: make the content perfect so you can enable any application you want to build
  • Step 2: build an application
We say:
  • Step 1: figure out the application you want to build
  • Step 2: figure out which portions of your markup need to be improved to build that application
  • Step 3: improve only that markup, sometimes manually, sometimes with extraction software, and sometimes with heuristics (i.e., rules of thumb) coded in XQuery
  • Step 4: build your application and get some business value from it
  • Step 5: repeat the process, driven by subsequent application requirements
I call this lazy XML enrichment. You could call it application-driven, as opposed to infrastructure-driven, content cleanup. I think it's an infinitely better approach because it delivers business results faster and eliminates the risk of either never finishing the first step because it's impossible, or having funding yanked by the business because it runs out of patience with an IT project that's showing no ostensible progress.

At this point, I'd like to direct those of technical heart to Matt Turner's Discovering XQuery blog where he provides a detailed post (code included) that shows an example of lazy, heuristic-based XML enrichment, here.
  • Matt's example shows lazy enrichment because the only markup he needs for his desired application is related to weapons, so that's all he adds.
  • Matt's example is heuristic-based because he devises a way to find weapons in XQuery, and then uses XQuery to tag them as such.
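To give a rough feel for what heuristic-based enrichment looks like, here is a minimal sketch. It is written in Python rather than XQuery, and it is not Matt's code: the `para` and `weapon` element names, the three-word vocabulary, and the `tag_weapons` and `enrich` helpers are all assumptions invented for illustration. The heuristic is deliberately crude (a word list), and it adds only the one kind of markup the imagined application needs.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical heuristic: a tiny hand-picked vocabulary of weapon terms.
WEAPON_TERMS = ["rifle", "pistol", "grenade"]
WEAPON_RE = re.compile(r"\b(" + "|".join(WEAPON_TERMS) + r")s?\b", re.IGNORECASE)


def tag_weapons(para):
    """Wrap weapon terms in the leading text of `para` in <weapon> elements.

    Only para.text is examined; text that follows existing child elements
    is left alone to keep the sketch short. Returns the number of tags added.
    """
    text = para.text or ""
    matches = list(WEAPON_RE.finditer(text))
    if not matches:
        return 0
    # Text before the first match stays as the paragraph's leading text.
    para.text = text[:matches[0].start()]
    for i, m in enumerate(matches):
        elem = ET.Element("weapon")
        elem.text = m.group(0)
        # The tail holds the plain text up to the next match (or the end).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        elem.tail = text[m.end():end]
        para.insert(i, elem)
    return len(matches)


def enrich(xml_string):
    """Add <weapon> markup to every <para>; leave everything else untouched."""
    root = ET.fromstring(xml_string)
    count = sum(tag_weapons(p) for p in list(root.iter("para")))
    return ET.tostring(root, encoding="unicode"), count
```

Calling `enrich` on a document with a paragraph like "He carried a rifle and two grenades." wraps the two terms in `weapon` tags and changes nothing else, which is the lazy part: no schema standardization, no other extractors, just the markup the application asked for.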

Tuesday, August 21, 2007

Rethinking Crossing the Chasm

Read/WriteWeb had an interesting post a few weeks back entitled "Rethinking Crossing the Chasm." Those of you who know me know that I'm a fairly devout practitioner of chasm-crossing ideas and that I'm a big believer that they work, particularly in markets where there is no mainstream demand, for either a specific time period or in general.

I'd argue that some markets, however, are born lucky and never have a chasm-crossing phase. For example, we once hired The Chasm Group when I was at Business Objects to help us with a strategic planning session and we spent most of the time arguing with the consultant about whether the model even applied to us. My personal conclusion from the discussion was that business intelligence (BI) never had an in-the-chasm phase.

Why? Perhaps my training in seismology causes me to remember the elastic rebound theory part of the book better than most, but remember that Crossing The Chasm (CTC) was primarily about disruptive infrastructure technology. The idea was that a crack in the bell curve developed because disruptive technologies were, well, disruptive. That therefore caused more normal, mainstream individuals and companies to fear adopting them until they were "safe" (i.e., generally perceived by the herd to be mainstream, safe, and reliable).

That in turn drove the creation of the chasm phase -- you'd already sold everyone who wanted whizzy-ness, so after what appeared to be a smooth take-off for a company, you'd find yourself flying into the chasm because all the whizzy-oriented people had already bought precisely one copy of your product and no one in the mainstream would yet dare touch it. CTC theory says the solution to this problem is total focus on a single entry point (one industry, one problem) in order to build a complete solution, attractive to mainstream buyers.

In the sequel, Inside the Tornado, Moore then argued that the single entry point should be treated as the "head pin" (switching metaphors) in a bowling alley and that companies should treat further market development as an exercise in leveraging success off that head pin, by targeting and knocking down adjacent pins (i.e., markets).

But, back to my first point, everyone seems to forget a few basics of CTC theory:
  • The chasm is caused by reluctance of the mainstream to buy
  • The tornado is a release of pent-up demand caused by the chasm (i.e., the elastic rebound theory part)

Simply put, in my opinion:

  • If there is nothing disruptive about the technology
  • Then there is no chasm
  • And there ensues no tornado
My theory explains BI evolution very well. The market experienced neither a stalled, chasm phase, nor a hypergrowth tornado phase. Nor did the BI market naturally coalesce around a single leader early on, instead having several strong leaders even to this day (e.g., Business Objects, Cognos). Because the technology was not disruptive there was less risk, so no chasm.

Because BI technology was more tool than infrastructure, there was no market need to identify one clear leader. No one died if they picked Cognos instead of Business Objects, whereas with a real infrastructure technology, getting the wrong guess on platform (e.g., ASK's choosing Ingres instead of Oracle) could be fatal. In ASK's case, it cost them their birthright to transition their leadership in MRP into leadership in ERP. Instead, by betting on the wrong platform, ASK handed that multi-billion dollar prize to SAP, and faded off to obscurity within CA.

While I'm a big fan of Crossing the Chasm theory, I don't think it applies to every situation. I think it was written to apply to disruptive infrastructure technologies. I think it can generalize to any new technology that involves significant risk to adopt. I think people over-generalize CTC theory to consumer applications and products where no such risks exist.

My other quibble is that the Read/WriteWeb post suggests that early adopters are a homogeneous pool of gadget freaks who are burning out because too much whizzy technology is being thrown at them too quickly. While I do pity the gadget guy in the midst of today's consumer tech frenzy, I think the assertion misses the bigger point that "early adopters" isn't a demographic -- i.e., it's not primarily about people, but rather about companies and the situations they find themselves in. Put differently, CTC is largely about industrial (or B2B), not consumer, marketing.

For example, publishers are one of the early adopters of Mark Logic's XML content server technology. Is that because publishers, as a rule, are generally early technology adopters? No. Is that because the gadget guys in publishing bought MarkLogic to play with it and discovered early uses? No.

It's because of the situation that publishers find themselves in. It's not about personalities and gadget orientations. Nor is it about industry norms on the use of technology for competitive advantage.

It is, instead, all about the death of print, the transition to online, Google, free information, and the urgent need to find ways to add value in the fast-changing world around them. Trying circumstances can turn the most staid publisher into an aggressive early adopter of a technology that just might save their business.

Quibbles aside, I'd say the Read/WriteWeb post is definitely worth reading. My single favorite piece of it was this graphic by Tara Hunt. With one picture she nails one of the biggest mistakes that most startups make -- so read it carefully. Enjoy!

Bubble 2.0? 5,000 Web Apps in 333 Seconds

Here's a great piece of PR from SimpleSpark, a Web 2.0 applications directory site which they call "the place to find and share a new world of web applications." As of this instant, SimpleSpark is tracking 5,108 apps. They made a video that shows in 333 seconds the logos of the first 5,000 applications they are tracking.

(In fact, in the Captain Anal department, I think it's 5,001 if you include SimpleSpark itself.)

Think of all the venture capital that went into funding all these companies. Happily, most aren't capital intensive, but in any case -- it's a lot of companies. Browse the catalog a bit if you're not convinced. Bubbly anyone?

Friday, August 17, 2007

Fun High-Tech Cliches

Here's a fun editorial on CNET entitled "Tech cliches to live by." Here's the opener:
Todd, a friend of mine, once gave me an invaluable piece of advice: if you fall asleep in a meeting and wake up not knowing what's going on, just say "so where's the value add?" at the first pregnant pause.
Here's a summarized list from the article, along with a little of my commentary:
  • They need a Lou Gerstner type. (I remember when Scott McNealy barbed that IBM now stood for International Biscuit Corporation when Lou was appointed. Boy was he wrong.)
  • Anything to do with Moore's law. (And to make one really ill, invoke Metcalfe's Law.)
  • Be like Google. (That's like saying "be like the guy who won the lottery.")
  • It's an inflection point. (Agree that term is abused but as a mathematician I remember that it still has meaning: a sign change in the second derivative.)
  • Let's tear everything up. (I don't hear this one, much.)
  • Just think what Apple could do with that. (Or Frog Design, for that matter.)
  • It's a different business model. (And often one that loses money. If you look at the Valley's most fashionable business models at present -- SaaS and appliances -- both are quite adept at losing money.)
  • Follow the money. (I like the idea of this one, but agree it's hackneyed. Back at Business Objects, after we bought Crystal Decisions, the Bain consulting guys we hired to help with integration said this all the time. Or should I say "Bane" consulting?)
  • Patents are stifling innovation. (A fashionable, if uninformed and arguably illogical viewpoint, in my opinion.)
  • History is written by the victors. (Sad but true in my mind. See my post on Ingres to see relational database history from the other perspective.)
I think he forgot "software wants to be free," "let's run it up the flagpole," "let's socialize that," "we're getting traction," and "it's orthogonal," but otherwise it's a pretty good list.

Thursday, August 16, 2007

Fast Search: The Blood-Letting Begins

See this Forbes story for the latest news on the situation at Fast Search (& Loose Accounting) & Transfer:
  • Layoff of 148 employees
  • Reduction in operating costs of approximately $12M/quarter
  • Up to a $55M one-time restructuring charge
  • Of which $25M will be cash (e.g., severance)
  • And remaining up to $30M will be non-cash
The non-cash write-offs are the most interesting ones. In the slide presentation from today's conference call they say the $30M in non-cash charges will be for:
  • Internally developed software. This means some amount of previously capitalized R&D has now been decided to be worthless. (This is why conservative software companies don't capitalize R&D.)
  • Acquired technologies and customers. I've never heard of carrying acquired customers on your balance sheet before, but saying you're writing off acquired technologies means that some products or work-in-process R&D you had previously acquired and put on the balance sheet as assets have since been decided to be worthless.
  • Specific accounts receivable provisions. (What's the PR rule? Always put the thing you want the least focus on 3rd in a list of 4?) I don't know what "specific provisions" are, but I do know that writing off accounts receivable (AR) means that customers aren't paying for deals that you previously reported as revenue, either because your agreements weren't actually binding (and the deals should never have been reported as revenue in the first place) or because customers aren't happy and are simply refusing to pay. One does wonder how much in additional AR write-offs is buried in this otherwise opaque $30M pool.
  • Property and equipment. I'm not sure what this is, to be frank. It's hard to imagine walking into a building one day and deciding it's become worthless. Perhaps it's more about computers or about their planned real estate consolidation.
In addition, Fast provided 2007 guidance of ~$160M, which is slightly down from the reported 2006 revenues of $162.6M (see page 25).

Somewhat amazingly, for a company that on May 31 thought it was going to do "$53.5 to $57M" in the quarter ended 30 days later and did $34.1M instead, Fast gave guidance for revenue growth for "succeeding years" (i.e., beyond 2007) of "30%+".

Here I was thinking it was bold to provide 2007 guidance under the current circumstances, and they're giving guidance for 2008 and beyond.

See the FAQ for disclaimers. See these posts (Fast Warns, Who's Accountable) for more on the story.

How The Web Disrupts the RDBMS World

I found an interesting post on The Future of Software minisite run by the GigaOM network, best known for Om Malik and his GigaOM blog. The post is entitled "Data 2.0: How the Web disrupts our relational database world" and is written by Nitin Borwankar.

The post begins with:
The great online shift is creating massive amounts of data - whether it is videos on YouTube or social networking profiles on MySpace. And that data is stored in databases, making them the key component of the new web infrastructure. But managing that information isn’t easy
I think he nails the problem statement. The Web world is changing fast. And relational databases are having trouble keeping up.
The good news is that database management will be vastly different in the future. In fact, change has already begun; it just isn’t (cliché alert!) “evenly distributed” yet.
He then goes on to describe some leading examples of companies or problems that are pushing the relational database envelope.
  1. Yahoo's creation of its own user management software based on BerkeleyDB
  2. Google's MapReduce
  3. Amazon's S3 (simple storage service) and SQS (simple queue service) which externalize operations normally done by a database.
  4. The general use of Lucene, Nutch, and Solr to do indexing of unstructured content, "something an old relational database cannot do well."
  5. The graph-structured data problem (also known as the parts explosion problem) inherent in social networking and which remains an Achilles' heel for relational databases
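For readers who haven't met the parts explosion problem named in item 5, here is a minimal Python sketch of it. The `BOM` data and the `explode` function are made up for illustration, and the sketch assumes the parts graph has no cycles. The point is that the answer requires walking the graph to an unknown depth, which is awkward to express as a fixed number of joins.

```python
from collections import defaultdict

# Hypothetical bill of materials: part -> list of (component, quantity per part).
BOM = {
    "bike": [("wheel", 2), ("frame", 1)],
    "wheel": [("spoke", 32), ("rim", 1)],
}


def explode(part, qty=1, totals=None):
    """Return total quantities of every component needed to build `qty` of `part`.

    Recurses through the graph to whatever depth it happens to have.
    Assumes the graph is acyclic; a cycle would recurse forever.
    """
    if totals is None:
        totals = defaultdict(int)
    for component, per_unit in BOM.get(part, []):
        totals[component] += qty * per_unit
        explode(component, qty * per_unit, totals)
    return totals
```

A friends-of-friends query in a social network has the same shape: start at one node and keep following edges, with the depth decided by the data rather than by the query.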
So while I generally agree with his thesis, the examples cited are basically all technology companies who are able to write their own system-level software to bypass and/or accommodate the limitations of relational databases.

My question is: what about everybody else? What are they supposed to do?

My short answer is -- perhaps not shockingly -- MarkLogic. At MarkLogic, we call Data 2.0 "content."
  • We manage XML natively
  • We manage graph-structured data easily
  • We manage, search, store, and index text and XML natively
Some companies will always be able to write their own stuff to get around problems. But the reason MarkLogic exists is to provide a commercial DBMS that "the rest of us" can use when managing content and building web applications with it.

See this post on top-to-bottom XML for more.

The Blogs That I Read

I often get asked what blogs I read. While I do have a blogroll in the right hand column, I tend to use that to highlight a few blogs of interest, and don't use it as an exhaustive, up-to-date list of what I'm reading.

In fact, at a technical level, I have a problem because:
  • I read way more blogs than those on the right
  • The blogs I read are constantly changing (e.g., finding new ones, zapping old ones)
  • Changing the blogroll on blogger requires me to do manual HTML editing, so it's hard work
  • I've found no way to automatically link or synchronize my bloglines (reader) blogroll with my blogger blog (try saying that ten times fast)
Ergo, in this post I'll simply give you the URL of my bloglines blogroll, which is: http://www.bloglines.com/public/davekellogg.

Speaking to the power of human aggregation in the Internet era, I would say that if I could only read one feed related to new media and publishing, then it would be:
Jill is the director of planning and communications at NFAIS and she does a great job scanning a large number of blogs and pulling only the best posts into her feed.

Tuesday, August 14, 2007

MarkLogic: The Oracle of XML Databases

I just found this post on The State of Native XML Databases that I thought was interesting. It calls MarkLogic "the Oracle of XML Databases" and I thought I'd first comment a bit on that.

If being "the Oracle of XML databases" means that:
  • MarkLogic can handle size and scale: thank you and we agree.
  • MarkLogic is leading the enterprise market: merci encore.
  • MarkLogic is priced similarly to Oracle: well, we are. Our pricing strategy is to try and price the system like any major enterprise DBMS. We are not trying to differentiate on price on either the high or low side. Price-wise, we want to be a regular old enterprise-class DBMS.
  • MarkLogic is expensive: I'm not sure. Is Oracle expensive? Are search engines expensive?
I suppose, as a marketing guy, that my answer to the expensive question is always: relative to what? Judgments like "cheap" or "expensive" should always be made relative to alternatives.

Relative to a movie ticket MarkLogic is indeed expensive. Relative to the cost of not catching the next terrorist scheme (for the Intelligence Community) or missing the move to online contextual applications (for a publisher), or downtime due to bungled maintenance procedures (for an airline), then I'd say it's a bargain.

At a product level, relative to the combination of an RDBMS and enterprise search engine that MarkLogic replaces, I’d argue that it's pretty much a two-for-one deal.

I thought my Chrysler Sebring convertible was inexpensive until the transmission blew at 28K miles. I thought my BMW X5 was expensive, but it's at 80K miles with almost zero maintenance costs and no surprise downtime. The BMW's winning on a cost/mile basis.

Is MarkLogic basically a high-end product? Yes. Do we work with some very large companies and government agencies? Yes. But we also work with many small and mid-sized businesses. (Hint: we work a lot in publishing and there aren't that many $1B publishers. There aren't that many $300M ones, either.)

In short, if you're a prospective customer and you're reading this, please do two things: (1) talk to us before making a judgment and (2) in so doing look at the total, big-picture costs and benefits in making a decision.

The next issue is whether MarkLogic is indeed an XML database. We position the product as an XML content server, and we continue to believe that positioning better distills the essence of what the system does well.

Though, were I an analyst, I would put MarkLogic into the broader category of XML databases, and let the company successfully argue that XML content server is a subcategory of XML database. That is, an XML content server is a native XML database designed specifically for semi-structured content (e.g., documents) full of text and all the vagaries that come with that.

For more thoughts on this positioning issue, see this post (Half Man, Half Machine, All Cop) or this one (MarkLogic: DBMS or search engine).

Finally, let me emphatically agree with the post's conclusion:
The simple truth is that while much data and many applications fit very neatly into tables, even more data doesn’t. Books, encyclopedias, web pages, legal briefs, poetry, and more is not practically normalizable. SQL will continue to rule supreme for accounting, human resources, taxes, inventory management, banking, and other traditional systems where it’s done well for the last twenty years.

However, many other applications in fields like publishing have not even had a database backend. It’s not that they didn’t need one. It’s just that the databases of the day couldn’t handle their needs, so content was simply stored in Word files in a file system. It is these applications that are going to be revolutionized by XQuery and XML.

If you’re working in publishing, including web publishing, you owe it to yourself to take a serious look at the available XML databases. If they already meet your needs, use them. If not, check back again in a year or two when there’ll be more and better choices.

The relational revolution didn’t happen overnight, and the XQuery revolution isn’t going to happen overnight either. However it will happen because for many applications the benefits are just too compelling to ignore.

Thursday, August 09, 2007

The Price of Inxight: $76M

It's funny how things that companies leave undisclosed in press releases can often be found buried in SEC filings. Seth Grimes and I had gone back and forth a bit debating what Business Objects had paid for Inxight.

Seth had estimated $125M to $175M. Knowing that Business Objects was a value shopper, that Inxight had been on the block for some time, and hearing that the deal took quite a while to come to fruition led me to a lower estimate of $50M to $75M.

I'm happy to report we were both wrong (but boy was I close). In this filing you get the answer:

Acquisition of Inxight
On July 3, 2007, the Company’s wholly owned subsidiary, Business Objects Americas, acquired privately held Inxight Software, Inc. (“Inxight”) in accordance with a purchase agreement dated May 21, 2007. Business Objects Americas acquired all of the outstanding capital stock of Inxight, a leading provider of software solutions for unstructured information discovery, including text analytics, federated search, and data visualization.
The acquisition was an all cash transaction of approximately $76 million. Of the total purchase price, $6.5 million was placed in an escrow account to satisfy potential claims under warranties and indemnities in the agreement. Provided there are no such claims, the amount will be paid on October 3, 2008.

Highlights from the 2Q07 Software Equity Report

The Software Equity Group has released the 2Q07 Software Equity Report. It's a great piece of research on the market. Here are a few highlights:
  • Overall software industry growth from $211B to $305B between 2005 and 2010, reflecting a 7.6% CAGR (per IDC)
  • There were 6 software IPOs in 1H07 raising an aggregate $739M
  • The median software IPO raised $80M, with median enterprise value $298M, and median next fiscal year estimated growth of 43.8%
  • There are 12 companies in the software IPO pipeline with median annual revenues of $75.8M, median net income of $0 (I guess profitability is not a constraint), and a median proposed offering size of $86.3M
  • Software M&A was down slightly in 2Q07 from $26.3B to $25.4B, with a median valuation of 2.3x TTM revenues
SandHill.com blogs about the recently released report, here.
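
As a quick sanity check on the IDC growth figure in the first highlight, compound annual growth rate can be computed from the two endpoints alone. This is a minimal sketch of my own, not something from the report: the function name `cagr` is hypothetical, and it assumes 2005 to 2010 is treated as five annual compounding periods.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the highlights: $211B in 2005 growing to $305B in 2010.
# Assumption: 2005 -> 2010 counts as five annual compounding periods.
growth = cagr(211.0, 305.0, 2010 - 2005)
print(f"Implied CAGR: {growth:.1%}")
```

Under that assumption the result lands at roughly 7.6%, consistent with the figure quoted above.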

The Fast Search Train Wreck: Who's Accountable?

Fast Search & Transfer reported its 2Q07 financial results yesterday. Here is the summary:
  • Revenues of $34.1M, down 31% from 1Q07, and down 11% from 2Q06
  • Operating loss of $37.8M, reflecting an operating margin of -111%
  • Cash burn of $25.6M in 1Q07 and $74.8M over the past year
  • Explosion in days sales outstanding (DSO) to 265 days.
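
For anyone who wants to check the arithmetic behind those bullets, here is a minimal sketch. The function names are hypothetical, and the prior-period revenues it prints are back-computed from the rounded percentage declines above, so treat them as my approximations rather than reported figures.

```python
def operating_margin(operating_income: float, revenue: float) -> float:
    """Operating income divided by revenue; negative when there is a loss."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return operating_income / revenue


def implied_prior_revenue(current_revenue: float, decline: float) -> float:
    """Back out an earlier period's revenue from a fractional decline."""
    if not 0 <= decline < 1:
        raise ValueError("decline must be in the range [0, 1)")
    return current_revenue / (1.0 - decline)


revenue_2q07 = 34.1           # $M, from the summary above
operating_income_2q07 = -37.8  # $M, negative because it is an operating loss

margin = operating_margin(operating_income_2q07, revenue_2q07)
# Assumption: the 31% and 11% declines are rounded, so these are estimates.
implied_1q07 = implied_prior_revenue(revenue_2q07, 0.31)
implied_2q06 = implied_prior_revenue(revenue_2q07, 0.11)

print(f"Operating margin: {margin:.0%}")
print(f"Implied 1Q07 revenue: ${implied_1q07:.1f}M")
print(f"Implied 2Q06 revenue: ${implied_2q06:.1f}M")
```

The point of the exercise: an operating loss larger than total revenue is what pushes the margin below -100%.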

On the plus side, Fast took their lumps. On the downside, while they admit to serious problems there seems to be no accountability for those who let them happen.

Quotes from the investor presentation, along with my commentary:

  • "We are disappointed about our Q2 results." I sure as hell hope so, given that they'd provided guidance of $53.5M to $57M with one month left in the quarter, and they had positioned the company as the high-growth market share gainer in enterprise search.

  • "Change of sales procedures has cut short-term revenues significantly: tightening of financial control, including non-use of [memoranda of understanding] and removing longer payment terms." My translation: Fast will stop taking revenue when they don't have signed software license agreements and they'll stop accepting payment terms that look more like those of a discount mattress store (buy now and make no payments till next year) than those of an enterprise software company. My question: if these practices are not acceptable, then who is accountable for having allowed them in the past?

  • "Thorough review of accounts receivable has led to $13.5M in new provision for bad debt." My translation: $13.5M worth of deals that Fast had booked and reported as revenue in the past actually, uh, weren't revenue, because the customers won't pay -- probably either because they're not happy with the software or because the agreements used (e.g., MOUs) weren't actually binding. And that's not $13.5M in total "fake" revenue; that's $13.5M more than they'd previously estimated. This raises the same accountability question, and also suggests that a restatement of past results might be in order.

  • "No excuses: issues are internal operational and fixable." For the most part, I'd guess that's true but (accountability aside) this impacts how I think about the enterprise search category. Simply put: Fast and Endeca were the bright spots in an otherwise fairly bleak category. Now, there's only Endeca and the bottom-eating Google Appliance.

  • "We are in a unique position in a very attractive market." Well, I'll give you the unique position part. See the prior point for my thoughts on the market.
Here's some free advice for Fast:
  • Restore some credibility by holding someone accountable for this situation. When in doubt, the CEO is a good place to start.
  • Stop reporting under different financial rules (IFRS) than the mainstream software industry: report under GAAP like just about everyone else.
  • Dual list on the NASDAQ, subjecting the company to SEC rules and regulations
In short, take a lesson from the Barry Bonds situation: if you want people to care about -- let alone celebrate -- your results, then you should play by the same rules as everyone else.

(Recall that I was an executive officer of a France-based, dual-listed enterprise software company for 9 years so I have personal experience in dealing with these international issues.)

See the FAQ for disclaimers.

Sunday, August 05, 2007

Plight of the Single-Digit Millionaire

The cover story of today's (Sunday) New York Times is an article about millionaires in Silicon Valley who don't feel rich. Overall, I think the piece does an excellent job of capturing the local culture and feelings about money.

However, I think it overlooks a few possibilities -- other than keeping up with the Joneses -- in analyzing Silicon Valley's single-digit, working-class millionaires. Do they keep working because:
  • They are simply chasing those above them, as the article generally suggests?
  • They truly enjoy what they're doing and want to keep doing it?
  • They lack the creativity or boldness to step out of their workaday life and try something completely different?
  • They are locked in to a workaday life due to other constraints (e.g., kids whom they don't want to raise in Bend)?
  • They are competitive, type-A people who view work as competition and money as the score?

While I think the story does a great job at portraying the outcome (seemingly at the expense of those who volunteered to be interviewed), I think it does less well in assessing the reasons behind it.

Some favorite quotes follow. When speaking about the (very real) relative humility of many Silicon Valley millionaires:

“They recognize that if they happened to walk into a different office,” said Marilyn Holland, a Menlo Park psychologist who has been counseling the Valley’s elite for 25 years, “things would have turned out very differently.”

Then, when speaking on status in Silicon Valley, the founder of Match.com weighs in with:

“You’re nobody here at $10 million,” Mr. Kremen said earnestly over a glass of pinot noir at an upscale wine bar.

Friday, August 03, 2007

Perhaps It *Is* About The Technology

I had the pleasure of meeting Harvard Business School professor Andrew McAfee several years ago when he was writing a case study involving Business Objects. Prior to that meeting, I'd never heard of him; since then, I hear about him more and more. He seems to have emerged as a thought leader in how IT impacts business, an area that I think has been long neglected by the business school academic community.

So it was with considerable delight that I discovered his recent post about the old Silicon Valley saw: "it's not about the technology" (which he abbreviates as INATT). Excerpts:

[INATT] is dangerous because it essentially denies two important facts: [1] that technologies can differ from each other in salient ways, and [2] that they can change over time. A lot of my work, for example these articles, is an attempt to articulate the managerially relevant differences across technologies. [...]

[INATT] encourages listeners not to keep such differences in mind, and I think that's the wrong idea. [ ... It] also encourages the view that there's nothing new under the sun -- that one generation of technology aimed at addressing a business problem is the same as all other generations. So (for example) we need to collaborate and share knowledge better, but it's not about the technology. [...] This sense of INATT is pessimistic and self-defeating, even if it's not intended to be. It denies that there can be improvements, incremental or radical, in the ability of technologies to accomplish important goals. I disagree categorically with this.


So do I, Andrew. His full post is here.

As someone running a Silicon Valley company focused on disrupting the multi-billion dollar DBMS oligopoly, and as someone who has spent over 20 years working in enterprise software, I can say categorically that sometimes it's not about the technology (e.g., Ingres' defeat at the hands of Oracle), but sometimes it is (e.g., the ascendency of the RDBMS itself).

While it may be fashionable for Silicon Valley to dismiss its roots, the simple fact is that:
  • Technology does matter
  • It's not the only thing that matters
  • Technology does indeed change over time, sometimes radically
  • The noise about meaningless incremental differences in technology tends to deafen us to the signal about radical ones

Crushing Butterflies: Answers Whacked by Google Algorithm Change

I've always loved the tale of the time traveler who goes back in time, steps on a butterfly, and returns to a radically different present than the one he or she left. I think about it every time I hear of a business that's been nearly wiped out because Google tinkered with its search algorithm.

The latest butterfly? Answers.com. See this story by Om Malik. Excerpt:

Answers Corporation (ANSW) announced today that, due to a search engine algorithmic adjustment by Google, Answers.com has seen a drop in search engine traffic starting last week. As a result, overall traffic is currently down approximately 28% from levels immediately prior to the change.

While the crushees will often assume deliberate intent, the sad reality is that it's usually as accidental as stepping on a butterfly.

Thursday, August 02, 2007

SharePoint Sales Top $800M

See this Bloomberg story that says sales of Microsoft's SharePoint (portal, collaboration, search, workflow, and content management technologies) exceeded $800M in the year ended 6/30/07.

Sales of SharePoint, introduced in 2001, have grown faster than any other piece of software in company history, Business Division President Jeff Raikes said. Microsoft says it has sold more than 85 million licenses to 17,000 customers.

While we continue to hear that SharePoint's search facilities are lacking, it does seem to be making quite an impact in the portal, collaboration, and content management space.

The Flatirons Dynamic Content Delivery Solution

Flatirons Solutions has been getting quite a bit of attention for their DITA-based Dynamic Content Delivery solution. (DITA is the Darwin Information Typing Architecture, more here.)

I just noticed they've put a data sheet up on their site, describing the solution, available here. The solution includes a MarkLogic Connector to Documentum; information on that is available here.

And I've already blogged about their excellent white paper on the topic of DITA and dynamic content delivery. If you missed it, that paper is here.

Addition (8/3/07)
In the coincidence department, Flatirons CTO Eric Severson just emailed, informing me that he's an official guest blogger on the Gilbane Publishing Practice Blog and that he has just blogged about dynamic content delivery on that blog, here.

Wednesday, August 01, 2007

Keynoting the RSuite CMS User Conference

Just a quick post to announce that I'll be giving the keynote address at Really Strategies' 2007 RSuite CMS User Conference, which is being held October 8th - 9th at the Cira Centre in Philadelphia.

The Really Strategies press release on the event is here.