Monday, November 26, 2007

The Death of E-Mail: Greatly Exaggerated

This Slate story, entitled The Death of E-Mail, has been getting a lot of attention in the blogosphere of late. I felt compelled to comment because the story was published so close in time to our launch of the MarkMail service for getting intelligence from e-mail archives.

If e-mail were doomed, one thinks, perhaps MarkMail was a mistake. Then again, perhaps not, for two reasons:
  • The article is largely based on a 2005 Pew study that says teenagers prefer chatting over instant messaging (IM) to chatting over email. Heck, I prefer to chat over IM, too -- but I don't use e-mail primarily for chatting.
  • MarkMail is actually as applicable to IM as it is to email. So even if email were to fall behind IM in popularity, MarkMail could easily be recast as "MarkMessage" and aimed at IM archives instead of (or in addition to) email ones.
The Slate story talks about Facebook status updates and Twitter tweets as email replacements, but I think it's off-base. At 100-something character limits, neither Twitter nor Facebook status updates replace email. However, Facebook messaging/mail is a valid replacement for email and while it's primitive today it wouldn't surprise me if the future looked like:
  • Everyone has a corporate email for work use
  • Everyone has a Facebook email for private use
And Tweets are replaced by Facebook status updates (which desperately need a better name) and your Yahoo and Gmail personal accounts get replaced by Facebook. But, from a MarkMail perspective, that future's fine, too.

For more cogent comment on the Death of E-Mail meme, I'd direct you to Scott Karp's post on his Publishing 2.0 blog, entitled simply E-Mail is Not Dead.

A-Space: Social Networking for Intelligence Analysts

I am always encouraged when I see the US Government take interesting ideas from the commercial and Web 2.0 spaces and apply them to improve intelligence and national security.

Towards that end, I have always been a fan of In-Q-Tel, the "venture arm" of the CIA, which I consider a highly innovative approach to ensuring that the intelligence community is plugged into Silicon Valley and has access to cutting edge technology.

In a similar vein, I like the idea of A-Space, a collaboration system presumably modeled on MySpace-style social networking and profiled in this Information Week story.

Excerpt:
A-Space will begin life as a portal that includes a Web-based word processing tool akin to Google (NSDQ: GOOG) Docs, a wiki-based intelligence community encyclopedia known as Intellipedia and access to three "huge, terabyte databases" of current raw intel for analysts to sift through. It'll be scaled for 10,000 users at day one. By the end of 2008, the DNI hopes to bring in other resources like intelligence blogs, social networking capabilities akin to a Facebook for spooks, secure Web-base e-mail, better search functionality, and much more.
More interestingly, Lewis Shepherd, recently departed (and headed to Microsoft) chief of the innovation directorate of the Defense Intelligence Agency (DIA), blogs about A-Space here, with a very quotable quote at the end of the second paragraph:
Our team at DIA got assigned by the Director of National Intelligence (DNI) to design and build A-Space, a brand new social-networking environment for the full intelligence community - “the MySpace for spies.” We’re talking a very high-walled Walled Garden.

I had to devote (not to say divert) some of our most talented people leading the all-important Alien program to this new effort, which really only began in September. Phase I of A-Space must go live by the end of the year; Phase II (with more advanced Web 2.0 capabilities) just a few months later. We expect no delay to Alien - the larger and in many ways more fundamental effort - but the experience has been akin to having the NASA Apollo XI team also asked to “figure a way to stop by Mars first.”

Forget BI: Go With Your Gut

Newsweek recently published an article that should be sending shock waves through the business intelligence (BI) market, populated with vendors like Business Objects, Cognos, and MicroStrategy.

BI tools help business people get access to information in corporate databases and data warehouses, so they can make better business decisions. In fact, if you looked at the tag lines for these vendors over the years, they consistently played off the theme of knowing more and therefore making better decisions:
  • Better decisions every day
  • Now you know
  • The power to know
  • Business intelligence: if you have it you know
  • [Mumble, mumble] something about insight [with lots of black] (poking fun at BOBJ's "margeketing")
My interest in this implicit premise led me to research how people made decisions, enjoying books like Decision Traps by J. Edward Russo, its newer sequel Winning Decisions, and Smart Choices by John Hammond. After all, in the BI world, if we were in the business of providing better information for making better decisions, maybe we should learn something about -- and perhaps try to help improve -- the next step down the line.

But what if the premise were flawed? What if more information didn't help improve decision quality? I remember asking J. Edward Russo (who is both a psychology and business professor) what people would most likely do with increased access to information. His answer: selectively filter the information to justify already made decisions. Hum.

In the end I concluded two things:
  • Selling "better decisions" wouldn't work because most people -- particularly executives -- don't think they have a problem. "I'm a great decision maker; look how far I've gotten in my career."
  • If that weren't enough, given my reading, I felt that the first thing companies could do to improve organizational decision making would be to systematically record votes on major decisions and periodically review the decisions and who voted which way. When I proposed we do precisely that at Business Objects, executives scattered faster than cockroaches with the lights turned on.
Clearly, while there was a big market for "more information," demand for "better decisions" seemed lacking.

So what does Newsweek have to say? Almost in the Blink school of thought, there's a new book out called Gut Feelings that argues our subconscious can do a pretty good job filtering and processing information.

Excerpts:
Hunches, gut feelings, intuition—these are all colloquial English for what Gigerenzer and his colleagues call "heuristics," fast and efficient cognitive shortcuts that (according to the emerging theory) can help us negotiate life, if we let them.

[...]
Gigerenzer calls such decision making "satisficing," as in "satisfying" enough to "suffice." Satisficers don't feel the need to know everything, in contrast to "maximizers," who do want to weigh every detail imaginable in making even minor life decisions. Interestingly, studies have found that satisficers are more optimistic about life, have higher self-esteem, and are generally happier than maximizers.
The whole story reminds me of a humorous moment in my marketing career. We were running the BI Summit, a top-end executive event in the UK. We had Michael Heseltine, a member of parliament, secretary, prominent UK politician and businessman as our keynote speaker. We were doing Q&A in an interview format and the interviewer -- on ear-bud prompt by our UK marketing director -- kept asking increasingly leading questions about the power of information in making decisions.

And then he pressed once too far. It was many years ago, but as I recall it went something like this:

Interviewer [building in hyperbole]: Well, then, would you say that some of the best decisions you ever made in your life were based on data and analysis?

Heseltine: Well, in fact, no. No, I wouldn't. I remember when we decided to start [magazine X] just having a flash of intuitive brilliance in looking at a newsstand and realizing there was no publication in the [X] space. In fact, well, I think I'd say that some of the best decisions I've ever made have been based on pure instinct and intuition. No data at all, really.

There's a lesson on decision-making in there. And one on over-reaching as well.

A Bottom-Up Education in Venture Capital

Everybody has some understanding of what venture capitalists do, right?
  • They sell money
  • They invest in [technology] startups hoping to get 10x returns
  • They invest in people, not technology or companies
  • They invest in market segments, not companies or technologies
  • They take risk hoping to yield superior returns
  • They eliminate risk by systematically isolating it
  • Yes, they invest money, but the real value they provide is in support
Well maybe it's not so clear. :-)

By the way, personally, I'd argue there is some truth in all of the above statements. But that's not the purpose of this post. While the popular adages appeal to the big picture intuitive side of us, they leave one feeling rather empty when you want to understand VC at a more mechanical level. How do the deals work? What's binding when? What shape do liquidation preferences and/or dividends take? How about anti-dilution? Just to mention a few of the items one runs into on a "term sheet."

To help people better understand the rubber-meets-the-road mechanics of venture capital, I'd recommend carefully reading these template documents conveniently posted on the website of the National Venture Capital Association.

For example, you can find a sample term sheet, a sample stock purchase agreement, and a sample investor rights agreement, among others. This stuff isn't for the faint of heart. But if you want to get a concrete sense for how these deals are structured and how some of the variously Draconian terms (e.g., full ratchet anti-dilution) work, then take a look at these documents.
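To make "full ratchet" a little more concrete, here is a minimal arithmetic sketch. It assumes the usual textbook reading of the term: if a later round is priced below the earlier investor's conversion price, that conversion price drops all the way to the new price. The function names and the dollar figures are made up for illustration; they are not taken from the NVCA templates, which spell out the actual mechanics in legal language.

```python
def full_ratchet_conversion_price(current_conversion_price, new_issue_price):
    """Full ratchet: the conversion price drops to the new, lower issue price.

    If the new round is priced at or above the current conversion price,
    nothing changes.
    """
    return min(current_conversion_price, new_issue_price)


def common_shares_on_conversion(preferred_shares, original_issue_price, conversion_price):
    """Common shares received when the preferred converts.

    Each preferred share converts into (original price / conversion price)
    common shares, so a lower conversion price means more common shares.
    """
    return preferred_shares * original_issue_price / conversion_price


# Hypothetical deal: 1,000,000 preferred shares bought at $2.00,
# followed by a down round priced at $1.00.
ORIGINAL_PRICE = 2.00
DOWN_ROUND_PRICE = 1.00
PREFERRED_SHARES = 1_000_000

new_price = full_ratchet_conversion_price(ORIGINAL_PRICE, DOWN_ROUND_PRICE)
shares_after = common_shares_on_conversion(PREFERRED_SHARES, ORIGINAL_PRICE, new_price)
```

In this made-up case the investor's preferred stock goes from converting into 1,000,000 common shares to converting into 2,000,000, no matter how few shares were sold in the down round. That is why the term gets called Draconian.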

Wednesday, November 21, 2007

Mark Logic Named to EContent 100

For the third consecutive year Mark Logic has been named to the EContent 100, the list of companies that matter most in the digital content industry. Says Michelle Manafy, in the December 2007 issue of EContent:

Indeed, today our industry cannot be focused on control, but rather on fueling possibility. As we once feared the cannibalization of print revenue by digital distribution, we now face the rise of the empowered user as content creator. Will we suffer inertia while questioning our value? Or will we evolve, continuing to demonstrate that content professionals deliver high-value information—be it created by professional writers, end users, customers, or CEOs?

The EContent 100 list represents the best and the brightest digital content companies as selected by a dozen judges who follow different aspects of our vast and varied industry—from vantage points all over the map (literally and figuratively). We offer this list not just to recognize companies that lead our industry, but to inspire organizations of all kinds to join in the content conversation online.

Our judges spent more than a month reconsidering last year’s 100 and vetting new contenders, collaborating in a Socialtext wiki. The process is always a challenging one, but the rewards are great. Each year we are reinvigorated about our industry, from the continued prowess of some of the industry’s inveterate leaders to the renewed vigor of others to the startling innovation of the newcomers.

I'm quite pleased that we have again been selected for this list (our listing is here, on page 3, by the way.) It's a Who's Who of content companies, and the list I'd recommend for someone to use as an orientation in learning about the digital content industry.

The illustrious panel of judges is here.

XBRL: Mandatory Soon --> The Web as Database

Check out this eWeek story, entitled SEC Readies XBRL Tagging Rules for Financial Filings, which discusses the progress of extensible business reporting language (XBRL) and the fact that tagging financial filings in XBRL will go from voluntary to mandatory in the not-too-distant future.

John White, head of the SEC's division of corporate finance, and Conrad Hewitt, the SEC's chief accountant, told the Financial Executives International Conference in New York that the SEC is in the process of shaping an XBRL (Extensible Business Reporting Language) proposal to make it mandatory in required filings.

This is great news because once these filings are marked up in a standard XML format, a corpus of them transforms from a mess of documents (with numbers strewn throughout) to a queryable database, thanks to XQuery.
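To illustrate the idea of a document corpus becoming queryable once the numbers are tagged, here is a small sketch. It is written in Python with the standard library's ElementTree rather than XQuery, and the element names (filing, Revenue, NetIncome) and all figures are invented and heavily simplified. Real XBRL uses namespaces, contexts, and units, so treat this as a cartoon of the concept, not as XBRL.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified "filings". These tags are NOT real XBRL;
# they only stand in for numbers that have been marked up consistently.
FILINGS = [
    """<filing company="Acme" year="2007">
         <Revenue>120</Revenue><NetIncome>15</NetIncome>
       </filing>""",
    """<filing company="Globex" year="2007">
         <Revenue>96</Revenue><NetIncome>19</NetIncome>
       </filing>""",
    """<filing company="Initech" year="2007">
         <Revenue>200</Revenue><NetIncome>10</NetIncome>
       </filing>""",
]


def load(corpus):
    """Parse each XML string into an element tree root."""
    return [ET.fromstring(doc) for doc in corpus]


def companies_with_margin_above(filings, threshold):
    """Return company names whose net margin exceeds the threshold."""
    names = []
    for filing in filings:
        revenue = float(filing.findtext("Revenue"))
        net_income = float(filing.findtext("NetIncome"))
        if revenue > 0 and net_income / revenue > threshold:
            names.append(filing.get("company"))
    return sorted(names)


high_margin = companies_with_margin_above(load(FILINGS), 0.10)
```

Once the figures sit in predictable tags, "which companies had a net margin above 10%?" becomes a few lines of query logic instead of a reading assignment.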

This is part of a new meme for my blog: the web as a database. To many people that means "semantic web" but I think there is too much baggage and too many assumed technologies (e.g., OWL, RDF) associated with that term. While OWL, RDF, and inferencing are all interesting, you don't need them to turn the web into a queryable database.

In fact, with MarkLogic you can scrape portions of the web with a third-party spider, load the scraped content into a MarkLogic database and thus turn the web into a queryable database today. Using our partners' enrichment technology (e.g., Inxight) you can even add lots of semantic mark-up (e.g., automatically detecting and marking people, places, companies, travel, sentiment) and then include those tags in queries to make them more powerful.

I believe the web is in the midst of a transformation from (1) a giant brochure and shopping catalog circa 1999 to (2) a queryable database platform. It will take some time to get there. But piece by piece, with XML-based standards like XBRL, we are accelerating the progress.

Go here to read more about XBRL benefits. Go here to see a sample piece of XBRL code.

Tuesday, November 20, 2007

Wall Street Journal Validates Rise of Special-Purpose Databases

The Wall Street Journal recently ran an article entitled Start-Ups Mine Database Field: Nimble Software Helps Make Sense of Information Tide (subscription required for full text) that validates the imminent mainstream-ing of the top meme I write about on this blog: the rise of special-purpose database management systems.

The article begins with:

Most databases are based on technology that originated 30 years ago. But change is in the air.

A mob of start-ups have been developing variants of the software, which provides the equivalent of filing cabinets for corporate information. Customers say the offerings are generating faster answers to questions that require sifting through huge volumes of business information.

Established suppliers aren't conceding much to the newcomers, but industry executives agree the pace of progress is accelerating.

"The database market is going to be an exciting place to be in the next decade," said Michael Stonebraker, an adjunct professor at the Massachusetts Institute of Technology ...

My favorite excerpt includes this quote from chief database guru at Gartner, Donald Feinberg (whose actual title is vice president and distinguished analyst).

Some predict specialized products will find a niche. "One kind of database is not going to suit all of the different applications we are coming up with," said Donald Feinberg, an analyst at market researcher Gartner Inc.
I couldn't say it any better myself. And -- whack -- think about this. You have the head database analyst at Gartner saying that one kind of database isn't going to meet all needs. That's big. And he's right.

New Conglomerates: After the Deluge

I found an interesting post on SandHill.com by Ken Bender of the Software Equity Group, entitled After the M&A Frenzy: What's Next?

One of my recent memes is that the mega-players in enterprise software are becoming new conglomerates. They seem more driven by size for size's sake than by driving the synergies of integration. When you talk to people in different divisions of Oracle they sound like they work in different companies. The same is true for SAP, despite its much more organic growth. It's undoubtedly true for people who work with Inxight within Business Objects within SAP.

While everyone seems to think Oracle is on track to become General Motors, maybe they're actually on track to become ITT.

What happened to the conglomerates? Well, realizing that there were no real synergies to be had by combining such a broad range of businesses, many companies were spun out. The bizno-fashion pendulum swung from size to focus.

I think the same thing is likely to happen in enterprise software, and the recent SandHill blog comes to a similar conclusion. Excerpts from the discussion of large software vendors (bolding mine):
What will be the likely impact of a mild recession on the software industry? Enterprise customers will markedly reduce their IT capital spending, as they have in prior downturns. Consequently, software company growth will slow, and investors will increasingly turn their attention to profitability and net income. It’s almost a law of nature.

Larger software companies, in response, will turn their attention to cost-cutting, re-examining spending priorities, paring headcount, and enhancing the productivity of those who remain ... Particular attention will be paid to products acquired during the M&A frenzy of the past few years.

After conducting these product line, operational and financial reviews, we fully expect a good number of public software companies will shed non-performing and incongruent product lines and business units in an effort to cut development, support and marketing costs.
They go on to discuss the impact on private equity (PE) owned software conglomerates as well:
... Private equity-owned platform companies now own a host of acquired assets they’re attempting to understand, manage, integrate and leverage. In good times, when IT budgets are healthy and growing, there’s little impetus to cut costs, especially after the first year following an acquisition.

But when growth slows, private equity firms will be very disciplined in assessing their acquired assets. They’ll really have little choice. The debt leverage on these acquired companies assumes continued economic expansion and continued growth of recurring revenue and operating income ...

After taking a very hard look at their portfolio companies, ... many PE investors will opt to shed non-core business units that are not providing the strategic leverage, accelerated growth or incremental revenue anticipated at the time of acquisition.
Seems like someone should create a company to buy all these units at fire-sale prices as they're spun out of the new public mega-player and private equity conglomerates. All I ask is a board seat in return.

Monday, November 19, 2007

LinkedIn Growing Faster than Facebook

I found this post today on Valleywag, which shows that Facebook grew from 8.7M to 19.5M users in the year ended 10/07 while LinkedIn grew from 1.7M to 4.9M. So while Facebook is nearly 4x LinkedIn's size (and MySpace nearly 3x Facebook's), when it comes to the future (i.e., growth rates), the picture looks like:

  • MySpace: 19%
  • Facebook: 125%
  • LinkedIn: 189%
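For anyone who wants to check the arithmetic, here is a quick sketch using only the rounded user counts quoted above. The names in the code are just for illustration. Note that the rounded inputs give roughly 124% and 188%, a point below the figures in the list, which were presumably computed from unrounded counts (an assumption on my part).

```python
def growth_rate(start, end):
    """Fractional year-over-year growth, e.g. 1.25 means 125%."""
    return (end - start) / start


# User counts in millions, as quoted in the post (rounded).
USERS_MILLIONS = {
    "Facebook": (8.7, 19.5),
    "LinkedIn": (1.7, 4.9),
}

GROWTH = {name: growth_rate(start, end) for name, (start, end) in USERS_MILLIONS.items()}

# Relative size at the end of the period: Facebook vs. LinkedIn.
SIZE_RATIO = USERS_MILLIONS["Facebook"][1] / USERS_MILLIONS["LinkedIn"][1]

for name, rate in GROWTH.items():
    print(f"{name}: {rate:.0%}")
```

The size ratio works out to just under 4, which matches the "nearly 4x" claim.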
While I've heard LinkedIn called "Facebook for dinosaurs," I believe its focus on the professional marketplace makes it both a superior venue for advertisers and for professional networkers (in the sense of professional people networking, not people who network for a living who are called "bankers"). As I pointed out here, responding to "who's got a better body?" when looking at a picture of a board member and a customer is not a great thing.

I like Facebook, don't get me wrong (it's 100 times better than MySpace) and I do believe there is real power in leveraging Facebook as a platform. But I also believe in focus. While Facebook started with college students, owned that market, and is now one-hop expanding into the broader "everyone" market, LinkedIn started with professionals and stayed there.

While I do wonder if Facebook is expanding too quickly (e.g., why not get high schools, then some segment of businesses, building out systematically), I do believe there is a potential opportunity for some company to "own the graph" and that's clearly what Facebook is pursuing -- but at the cost of serving each of the segments in an appropriate way. That said, time is on their side because once you hook the audience in high school or college, they inevitably age into young professionals. Basically, you own the audience until you irritate them or until they find a better tool for the task. For example, will current high schoolers think of Facebook as something so personal/friend-y that it's not appropriate for work networking? It's possible.

By the way, I think LinkedIn has a 0% chance of owning the graph when it comes to high schoolers and college students, and they are at something of a time disadvantage when it comes to audience life cycle.

But I really like their focus on the professional segment. In fact, I like their focus more than their offering -- i.e., I don't actually derive much everyday value from LinkedIn, but I still like the fact that it's work contacts -- and only work contacts -- to whom I'm linked and I'm not answering questions like "which [employee] I'd want to be stuck on a desert island with?"

So I'd say strategically the odds are in LinkedIn's favor if they aggressively evolve their offering to best serve the needs of the professional segment. How might they do that?
  • Continue serving the recruiting market, where I think they get most of their business
  • Enable LinkedIn apps so the community can create apps of practical business value. I don't think I've heard much from them here.
  • Work to avoid network dilution -- if everybody says yes to every network request then the graph loses value. Help people understand who they should and shouldn't link to. Help them ignore requests. Help them prune and clean the graph.
  • Enable a clean transition / migration from Facebook and MySpace for young professionals as they grow up.
Basically, carve out a niche in social networking for professionals. Until Facebook understands roles and puts a real focus on serving professionals, I think LinkedIn has a great chance to be the leader in the segment. But more and better execution is needed.

Thursday, November 15, 2007

Hecklers in People-Powered Search

Cha Cha just raised a B round of $10M which has led to some new attention for the "people-powered search" company and some controversy about its ties to Indiana University as well.

But this post is pure fun. TechCrunch's Forums has a conversation thread where people post their best / funniest Cha Cha transcripts.

Here's my favorite:

search: my wife
Status: Looking for a guide ...
Status: Connected to guide: Pamela C
Pamela C: Welcome to ChaCha!
You: hello pamela!
Pamela C: Hi
You: I'm looking for my wife, can you help me?
Pamela C: Nope
You: I thought you guys were experts in searching?
You: she left an hour ago but her cell phone is off.
Pamela C: searchable searches online
You: you mean to tell me with all the world's information at my fingertips (and yours) we can't put our heads together and find my wife?
Pamela C: nope
You: why not?
Pamela C: I dunno
You: can't you at least try? perhaps she advertises herself.
Pamela C: Nope
You: chacha sucks!
Pamela C: I am sorry
You: will you marry me Pamela?
Pamela C: You are married
You: yeah, but she's gone. I live in Utah.
Pamela C: Okay
You: awesome.
Pamela C: Awesome
Pamela C: What are you doin
You: crying
Pamela C: Why
You: I miss my other wife.
Pamela C: LOL
Pamela C: What is her name
You: Pamela B.
Pamela C: LOL
Pamela C: whatever
You: you are the third pamela, but I promise to love you as much as the others.
Pamela C: lol
You: what are your hobbies, pamela?
Pamela C: going to school
Pamela C: trying to pass precalculus
Pamela C: i have one minute left
You: oh dear, you're of age right? I don't want dateline NBC coming to my house!
Pamela C: LOL
Pamela C: 25
You: what happens in one minute?
You: phew.
Pamela C: i gotta go
Pamela C: bye
Pamela C: bye
Pamela C: come back
Pamela C: soon
You: I will always love you pamela!
Pamela C: Hey
Pamela C: Rate me great
Pamela C: bye
Pamela C: Please RATE ME. Thanks for using ChaCha.
Status: Session ended.

Tuesday, November 13, 2007

Membership Drive and Feedburner Request

I have three requests for frequent readers:

1. If you like the blog, please tell a friend. November is my newly created annual membership drive month and I'd like to see if I could double the number of subscribers to the blog this month. It adds up to a lot of work and I'd like to ensure I'm getting as much leverage as possible.

2. Please, please, please use the feedburner feed: http://feeds.feedburner.com/marklogic. When you do so, it enables me to track activity better, all in one place.

3. If you're also a blogger -- please link to some stories. The inbound links will help my authority greatly and I'll happily return the favor. If you've got a blog that I should be reading and writing about let me know by emailing ceo-at-marklogic.com. (I put -at- instead of the at symbol to avoid having the address crawled by spammers.)

Thank you for your support. I do most of the blogging in my CST (copious spare time) and I'm very much motivated by watching the blogstats move high and to the right.

Cheers
Dave

Monday, November 12, 2007

Andy Feit to Speak at STM Innovations Seminar: What's New in Search?

Mark Logic's own Andy Feit will be speaking on December 12, 2007 at the STM Innovations Seminar in London at the Hilton London Kensington.

The one-day event is being keynoted by Patricia Seybold, whose talk is entitled Using Web 2.0 for OUTSIDE INnovation: How Customers are Co-Creating Value & Knowledge.

Andy is speaking at 11:30 AM in a part of the program entitled What's New in Search with Andy's speech entitled Beyond Search: Creating a Modern Content Platform.

Howard Ratner's Slides From The Agile Publishing Imperative

We recently hosted an event in NYC called The Agile Publishing Imperative and were lucky enough to have Nature Publishing Group CTO and Executive VP Howard Ratner present. I asked Howard if I could post a copy of his slides to this blog and happily he said yes.

Here they are:


As Predicted, IBM Buys Cognos (for $4.9B)

As predicted in my post about the SAP acquisition of Business Objects, IBM today announced that it has offered $4.9B in a friendly bid to acquire #2 independent business intelligence tools maker, Cognos. Per this New York Times story, the bid amounts to a 9.5% premium over Friday's closing price, but the story also asserts that seemingly low premium is the result of a run-up in the stock resulting from speculation that such an acquisition was imminent.

With Hyperion acquired by Oracle, Business Objects acquired by SAP, and now Cognos acquired by IBM, all the major players except Microsoft have their BI dance cards filled. Does MicroStrategy end up a tony bachelorette or a wallflower going forward?

Sure, MicroStrategy could theoretically get acquired by Microsoft or SAS, but I don't see either as particularly likely. By the way, MicroStrategy has done a great job of quietly rebuilding the company after it was wracked by scandal in the post-bubble. For example, in their most recent quarter, they had a healthy net income of $19M on sales of $96M and, on a quick glance, all of their ratios look pretty good to me.

So A+ in execution for MicroStrategy, but the question is: does any of it matter? Does the market need a $400M independent BI company? The prevailing wisdom is no. That wisdom not only drove the recent "wave 2" BI consolidations; it was also the force behind the wave 1 consolidations (e.g., BOBJ/Crystal, COGN/Adaytum, Hyperion/Brio).

Opinion-wise I don't have much to add other than:

  • The deal was fairly obvious, speculation was rampant in the blogosphere, and the stock had risen from $37 to $50 in the past 3 months

  • It's part of a broader trend of BI consolidation and beyond that enterprise software consolidation.

  • The truly interesting question -- and one I started to tee up already here -- is whether software industry consolidation will work. Is it really about sales, R&D, and G&A synergies, or is it simply about ego and size? Put differently, is Oracle on track to become General Motors as most people seem to think, or is Oracle on track to become ITT?

See this post for the start of that thread. And see this article on the history of conglomerates for more background. And thanks again to Alex Moissis for getting this whole meme into my head.

Friday, November 09, 2007

Code with the XQuery Alpha Geeks: Jason Hunter and Ryan Grimm

(Post revised: the event is in California, not the UK.)

Mark Logic is hosting an unusual and fun event in San Carlos, California on November 29, 2007. The event is called Code with the XQuery Experts.

This event is not for the technically faint of heart: you're required to bring a laptop with MarkLogic Server installed on it (trial version available) to get in. You will be coding -- it's sort of a code jam -- and there's even a prize for the best application.

You couldn't spend the day with two more skilled -- or more fun -- XQuery coders.
  • Jason Hunter is Principal Technologist with Mark Logic, specializing in large-scale XML content manipulation using XQuery. He's probably best known as the author of Java Servlet Programming. He's also an Apache Member and as Apache's representative on the Java Community Process Executive Committee he established a landmark agreement allowing open source Java. He's an original contributor to Apache Tomcat, the creator of the JDOM open source project, a member of the expert groups responsible for Servlet, JSP, JAXP, and XQJ API development, and was recently appointed Sun Java Champion.
  • Ryan Grimm is a consultant for Mark Logic Corporation and has been an XQuery enthusiast for three years. Prior to Mark Logic, he was a software engineer for O’Reilly Media where he developed the XQuery back-end for SafariU, a custom publishing application built on MarkLogic Server. While at O’Reilly, Ryan also created O’Reilly Labs. Ryan has coded XQuery in many interesting venues, the most unlikely of which was Fort Leavenworth. (Ask him about it!)

SQL/XQuery Franglais Frankenqueries

One of our consultants is doing some testing of MarkLogic vs. XML-extended relational databases, and he sent me an example of the kinds of queries you need to write when you're mixing SQL and XQuery/XPath. Here is an example:

SELECT XMLQUERY( '$p/Citation/Index/ConceptCodeList/ConceptCode' PASSING P.XMLDATA AS "p")
FROM AllCitations AS p
WHERE contains (XMLDATA,'(SECTION( "/Citation/Index/ChemicalData/ChemicalList/ChemicalName") "leucovorin")&(SECTION( "/Citation/Index/ConceptCodeList/ConceptCode") "Pharmacology")') = 1;

A few things spring to mind when I see queries like this:
  • This is why people made XQuery -- so you wouldn't have to write stuff like this.
  • Why in the world do you need to mix XPath and SQL in this way? In a theoretically bi-lingual SQL/XQuery database, can I just write document-oriented queries purely in XQuery and not mess around with selecting columns that are themselves XMLQUERYs? Answer: in DB2's ironically named pureXML, you need to use SQL as the outer framework if you want to use full-text indexing; so yes, you must do this.
  • Are there more than 10 people in the world who will understand what the answer to this query is supposed to be? SQL and XQuery each have their own semantics, and few people deeply understand them. How many people understand not only both SQL and XQuery semantics, but also how they interact? (It reminds me of trying to find a tax guy in France who could do both the US and French systems at the same time.) I watched two world-class experts debate what the correct answer was to such a query for 20 minutes. Does Joe Programmer even have a chance?
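For readers trying to decode the query above, here is one plausible reading of its intent, sketched in Python with the standard library's ElementTree rather than in SQL or XQuery. The sample citations are invented (only the element paths come from the query), a case-insensitive substring match stands in for the full-text index, and, as noted above, even experts argue about the exact semantics. Treat it as an approximation of the intent, not a definition of the answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical citation documents; only the element paths are taken
# from the query in the post.
CITATIONS = [
    """<Citation id="c1"><Index>
         <ChemicalData><ChemicalList><ChemicalName>leucovorin</ChemicalName></ChemicalList></ChemicalData>
         <ConceptCodeList><ConceptCode>Pharmacology</ConceptCode><ConceptCode>Oncology</ConceptCode></ConceptCodeList>
       </Index></Citation>""",
    """<Citation id="c2"><Index>
         <ChemicalData><ChemicalList><ChemicalName>leucovorin</ChemicalName></ChemicalList></ChemicalData>
         <ConceptCodeList><ConceptCode>Toxicology</ConceptCode></ConceptCodeList>
       </Index></Citation>""",
    """<Citation id="c3"><Index>
         <ChemicalData><ChemicalList><ChemicalName>aspirin</ChemicalName></ChemicalList></ChemicalData>
         <ConceptCodeList><ConceptCode>Pharmacology</ConceptCode></ConceptCodeList>
       </Index></Citation>""",
]


def contains_word(elements, word):
    """Crude stand-in for full-text search: case-insensitive substring match."""
    return any(word.lower() in (e.text or "").lower() for e in elements)


def concept_codes_for_matches(citations):
    """Return ConceptCode values from citations matching both conditions."""
    results = []
    for doc in citations:
        citation = ET.fromstring(doc)  # root element is <Citation>
        chemicals = citation.findall("Index/ChemicalData/ChemicalList/ChemicalName")
        codes = citation.findall("Index/ConceptCodeList/ConceptCode")
        if contains_word(chemicals, "leucovorin") and contains_word(codes, "Pharmacology"):
            results.extend(code.text for code in codes)
    return results


matched_codes = concept_codes_for_matches(CITATIONS)
```

Under this reading, only the first sample citation qualifies, and the query returns all of its concept codes, not just the one that matched "Pharmacology". Whether the real query behaves that way is exactly the kind of thing the experts were debating.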

XQuery Your Office Documents

Just a quick post to highlight this recent article in Dr. Dobbs Journal, entitled XQuery Your Office Documents.

Excerpt:
A key benefit of ODF and OOXML for developers is the reuse of existing standards -- in essence, your office documents are XML documents, which makes available a complete palette of tools for manipulating these documents and the information they contain. Using tools and technologies available today, you can transform office documents to HTML or PDF, store them in an XML database, shred their information and store it in a relation databases, embed SOAP messages, enrich them with external information, and so on.
The article shows some fairly basic manipulations you can perform on Office documents using XQuery and shows an example of how XQuery can be used to integrate both XML and relational data.

Why Thomas Keller Loves In-N-Out Burger

If I told you fifteen years ago that a family-owned California hamburger chain founded in 1948 was going to beat McDonald's in hamburgers, what would you have said?

"Beat McDonald's in hamburgers? Are you crazy?" And you might have added: "Burger King's the poster child for why that will never work."

The problem is that Burger King is a classic, #1-focused competitor who, by virtue of its focus, condemns itself to be a poor imitation of #1, offering effectively the same product and same value proposition.

In-N-Out Burger, on the other hand, is a different animal. They do one thing -- burgers -- and they do it well (i.e., quality product). No salads. No chicken sandwiches. No Croissanwiches or McGriddles. No fish sandwiches.

I remember first observing this about 10 years ago when I was eating a Double Double in the parking lot of the In-N-Out on Rengstorff in Mountain View. While the In-N-Out drive-thru was 20 cars deep, the McDonald's across the street was empty. I thought to myself: "wow, the power of focus and quality." So I know that I've always liked In-N-Out.

What surprised me was to find out that the top chef in Northern California, Thomas Keller, proprietor of Northern California's only Michelin three-star, The French Laundry, is a fan as well. Here's a blurb from the latest issue of Via Magazine (you need to scroll down quite a bit to see it). Bolding mine:

WHY THOMAS KELLER LOVES IN-N-OUT BURGER
I really respect a company that holds its ground when there is so much pressure to follow the "what’s next, what’s new" trend. In-N-Out’s quality lies in the simplicity of what it promises and delivers. To be able to do something over and over with integrity and excellence, even if it is fast food, is something to be truly admired.

Thursday, November 08, 2007

SaaS Not Working in Enterprise Search

Just a quick post to highlight an interesting article I found in Network World, entitled Hosted Enterprise Search Vendors Lag Rivals.

Excerpts:

Web-hosted enterprise search has failed to catch on because of its limited scope and the rise of search appliances from vendors such as Google, analyst firm CMS Watch says in a new report.

“Software-as-a-service delivery models are hot all across the software landscape, but search has become an exception,” CMS Watch says in a press release. “Early hosted players Blossom and Pico have remained relatively obscure. The grand-daddy of hosted search, Atomz, has been acquired twice and now belongs to Web analytics vendor Omniture.”

Google’s product and others, such as the Thunderstone Search Appliance, has cut into demand for the type of simple, Web-oriented enterprise search that’s typically offered by software-as-a-service vendors.

Wednesday, November 07, 2007

Fast Forgets Headline and Body Copy Need Linkage

One reason I give Fast Search & Transfer a hard time is that I don't like their communications strategy. For example, during their recent and ongoing financial travails there has been way too much "happy talk" in their communications for my taste.

They irked me again today with this press release. When I read the headline -- silly me -- I thought they'd launched a new knowledge management solution, or maybe a whole line of them.

Here's the headline:
Knowledge Management As We Know It Is Over: FAST Delivers Next Generation Search-Powered Information Discovery Solutions For Enterprise 2.0
Yes, it's wordy (20 words) and even more buzz-wordy (e.g., next-generation, search-powered, enterprise 2.0). But at least, when stripped of the hyperbole, it seemed clear. Fast was announcing a new series of knowledge management solutions.

Or were they? Let's read the lead paragraph of the body copy to find out:
Fast Search & Transfer, the leading provider of search technologies, today announced that one of the largest European IT services providers, TietoEnator, has gone live with their intranet solution powered by the award-winning FAST Enterprise Search Platform.
Huh? So a company I've never heard of, that's got more vowels than a Greek wedding program, has deployed Fast's core product. But what's that have to do with redefining knowledge management and delivering enterprise 2.0 solutions that change knowledge management, forever?

Not much, it turns out.

We later learn that TietoEnator is not purely an end-user customer. They're an implementation partner who decided to use the technology internally. This isn't bad, mind you, but it's not as credible as an end-user customer. An implementation partner, after all, is incented to help you get more implementations.

And if credibility comes from clarity, this not-so-pithy CTO quote doesn't do much to help:
"During the next five years, we will see more and more organizations shift their investments away from legacy knowledge management and towards Enterprise 2.0, enterprise search, information discovery, and other tools, technologies, practices, and processes that allow for emergent work patterns to form in a vibrant 'learning organization.'”
Finally, since we now know it's a customer press release -- and not a solutions announcement -- it's always good practice to factor in the health of the customer that you're promoting. After all, you wouldn't want to promote Nike's use of your supply chain software right after they missed a quarter due to a major inventory problem.

So what's TietoEnator up to of late? Let's look at this three-week-old release: TietoEnator revises full-year profitability guidance, renews strategy and changes CEO.

Death to Death By PowerPoint

Here's a great presentation on this topic that I picked up courtesy of The Content Wrangler.

Software Consolidation: Modern Conglomerates?

I had a long overdue lunch the other day with former Business Objects co-worker, and one of the smartest people I know, Alex Moissis. While most people you talk to in Silicon Valley compare software industry evolution today to evolution of the automobile industry in the early 20th century, Alex had a different view.

He thinks you should compare Oracle not to General Motors, but to ITT -- i.e., to the conglomerates of the 1970s.
  • Conglomerates were built through acquisition, at sometimes pricey multiples.
  • They did so largely for size's sake.
  • Their leaders were incented to keep getting bigger.
  • While there were arguably scale economies to be had, they were generally not realized, nor did they prove compelling compared to the disadvantages of the conglomerate model.
  • In the end, they were largely broken up.
It's an interesting viewpoint and well grounded in reality. When I talk to my friends at the new behemoths, I don't see any signs of any real product integration and/or discontinuation coming anytime soon (e.g., next 5 years) nor do I see any obvious scale economies. In fact, when I talk to friends in two different divisions of Oracle, it's more like talking to people at different companies than anything else.

So are we witnessing a consolidation a la the early automobile industry or the growth of conglomerates a la ITT?

My take is that, while history never exactly repeats itself, a lot of products / companies will get spun back out of the behemoths before the movie ends. And you can't forget that the behemoths themselves are being disrupted at three levels:

  • Technologically by startups. The cost of being a behemoth is that you are so buried in integration road maps that innovation gets stalled.
  • Price-wise by open source. MySQL, SugarCRM, Lucene, even Ingres all seem to be chipping away and moving up against their enterprise counterparts.
  • Business-model-wise by SaaS and Google. Companies don't employ electricians today; will they employ IT staffs to do basic operational systems (e.g., HR, CRM, ERP) tomorrow? Or will they just configure multi-tenant SaaS apps and focus their technology investments in R&D -- as Geoffrey Moore would say, invest IT resource in core, not context?
For more on the history of conglomerates, Alex directed me here.

Girouard Claims 90% Win Rate in Enterprise Search

See this CBR (a UK publication, nee Computer Business Review) article entitled Google Dismisses Noise and FUD in Enterprise Search for some great quotes from Google's Dave Girouard, head of Google Enterprise and Applications.

"There are three things that IT departments and users want in the enterprise. They want the results fast; they want relevance, in that the results they are looking for are in the first few hits they get back; and they want security so it searches what they are allowed to and nothing else."

"If we can get those right," said Girouard, "and we believe we have, then we know that everything else the competition can sling at us is just noise and FUD."

And the topper:
Girouard said that in competitive tender situations, the customer chooses the Google Search Appliance 90% of the time.

Wow. That's a big claim. And if it's true it validates everything I've been saying about enterprise search being caught between a rock and a hard place. Reinforcing this, the article ends with the statement that Google has 10,000 customers for the enterprise search appliance.

Aside for Marketers and Spokespeople
As a general rule, I always love reading the UK trade press because I think they are often able to "get the story" more effectively than the US trade press. Why?

  • Because I think UK trade press journalists see themselves more as reporters, in the classic sense of the word
  • The UK trade press therefore use more hard-ball techniques in extracting the story from spokespeople (e.g., buying the question, fake end of interview)
  • US spokespeople inexperienced in dealing with the UK press therefore often make major mistakes because they are unaware of these cultural differences.
For example, since Girouard isn't double quoted in the story above, that means that the words "the customer chooses the Google Search Appliance 90% of the time" most likely didn't actually come out of his mouth. I'd bet you $10 that the actual interview went down something like this:

DRAMATIZATION FOR EDUCATIONAL PURPOSES

CBR: "Dave, how often do you beat your competitors in customer evaluations?"

DG: "Well, I'd rather talk about us than them. And we think three things matter in enterprise search -- " [Trying to stick to party line.]

CBR: "But -- let me dig into this for a second -- would you say you beat them most of the time?" [Setting the trap.]

DG: "Well, yes, but I'd like to talk about what --" [Regretting having opened the door, but still trying to hold line.]

CBR: "Well, does 'most' mean 51% or, say, 90%?" [Tightening the noose.]

DG: "Well, of course it's hard to put a number on it, but I'd say closer to the latter than the former." [Thinking he's saying it without saying it.]

CBR: "OK, so you'd say that in evaluations customers choose the Google Search Appliance 90% of the time." [Going in for the kill.]

DG: "I suppose so. Yes." [Buying the question.]

Tuesday, November 06, 2007

Ads on the Mark Logic CEO Blog?

I have several goals for this blog:
  • To get my own soapbox, my own editorial column on the Internet if you will.
  • To promote Mark Logic, talking about issues relevant to the company, why I joined, and how I think information management will change in the coming decade.
  • To experience the changes in the publishing and content industry -- personally -- because these issues are affecting our customers.
  • To understand the technology and tools, first hand, related to changes in the publishing industry.
To elaborate on the last point, I can't tell you how much I have learned about blogs, blogging, blog search, Internet search, SEO / organic results, and site monitoring and measurement as a result of running this blog.

What haven't I learned much about? Monetization. So I've decided to add Google AdSense advertisements to the blog. Right now, I'm running a small unit at the top of the page. And I've already learned two things:
  • Until they verify your account they place only public service ads. Cool idea.
  • AdSense seems to overweight the title and blog description in choosing content. Most of the ads thus far have been about CEO jobs and venture capital, not, say, search engines or XML databases.

Last Call for our NYC Event on Thursday

Just a quick post to say it's not too late to register for our event this Thursday at the Four Seasons Hotel in New York City, entitled: The Agile Publishing Imperative -- Accelerating the Creation of Information Products.

We have two phenomenal guest speakers:
  • Nature Publishing Group CTO Howard Ratner
  • Chief Research Fellow at Outsell Inc., David Worlock
Matt Turner blogs about the event here. Registration is here.

Monday, November 05, 2007

Mark Logic Redefines E-Mail Search: Introducing MarkMail

This weekend Mark Logic launched a new Internet service, called MarkMail (tm), that lets users search 4,000,000+ emails from over 500 Apache mailing lists, in order to analyze trends, locate experts, and get fast, precise answers to technical questions.

Put one way, MarkMail redefines what search means in the context of e-mail (think: what Technorati did for blogs.) Put another, MarkMail is a demonstration of the power of MarkLogic Server when aimed at e-mail content. (Think -- and I know the analogy is risky -- AltaVista, the Internet search engine launched to demonstrate the power of DEC's Alpha chip.)

MarkLogic Server is an XML content server, a special-purpose database management system (DBMS) designed and optimized for managing XML documents. In plain English, MarkLogic Server is the world's best place to put XML documents.

But who has XML documents that they need to put somewhere? Today, that's largely constrained to the information industry (i.e., publishers) and the Federal government, particularly the three-letter agencies. (You can also find XML content in certain enterprise functions like technical publications.)

One reason we're so excited at Mark Logic is we know the world is moving our direction. While relatively few people use XML as their document markup format today, virtually everyone is moving to XML, whether they know it or not, as Microsoft takes the standard Microsoft Office document format to XML in Office 2007. But adopting Office 2007 will take a while. So what content can we leverage in the meantime to show off the power of our server?

E-mail. Why?
  • It's semi-structured, and we love working against semi- and un-structured information. E-mail has some clear metadata (e.g., author, subject, send-date) and plenty of free text, both in the body copy and in the metadata fields (e.g., thread topic) themselves.

  • It's easily converted to XML.

  • It's ubiquitous. Everybody uses it.

  • There are lots of free, public mailing lists that contain lots of valuable information -- on topics from wine to Tomcat and everything in between.

  • Most important, e-mail is -- as Mike Moritz of Sequoia Capital once said -- the new corporate knowledgebase.

To expand on the last point: if I told you that you could go to one place -- and only one place -- to learn about a company, where would you go? To their corporate data warehouse? To their knowledgebase? To their financial systems? To their sales and CRM systems?

Personally, I'd go to their e-mail. Despite years of attempts to systemize it, knowledge has eluded capture and evaded knowledge management systems. Knowledge, it seems, instead resides in e-mail and collaboration systems. Through e-mail I can find lots of important quantitative information (mailed around as spreadsheet attachments) but more importantly, the color and commentary that goes along with it. As Mark Logic's Jason Hunter once put it: "I can see the movie (the data), and the subtitles that go along with it."

E-mail is the one-stop shop for information inside most organizations. So why not demonstrate our power on e-mail, we thought? So we did.

The other nice thing about e-mail is that it has additional idiosyncrasies that let us show off more of our power.

  • Included text and conversation threads. MarkMail does a great job of eliminating duplicate inclusions and re-building a conversation from a series of emails.

  • Attachments. We love documents and people email them all the time. MarkMail has some very nice -- and sexy -- ways of handling e-mail attachments.
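To give a feel for why e-mail converts so readily to XML and why threads can be rebuilt, here is a minimal Python sketch using only the standard library. It is not MarkMail's implementation. The sample messages, the element and attribute names, and the two heuristics (lines starting with ">" are included text, and the In-Reply-To header identifies the parent message) are all assumptions chosen for illustration.

```python
from email import message_from_string
import xml.etree.ElementTree as ET

# Hypothetical raw messages, as they might appear in a mailing-list archive.
RAW = [
    "Message-ID: <1@example.org>\nFrom: alice@example.org\n"
    "Subject: Indexing XML\n\nHow do I index XML?\n",
    "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n"
    "From: bob@example.org\nSubject: Re: Indexing XML\n\n"
    "> How do I index XML?\nTry a native XML index.\n",
]


def to_xml(raw):
    """Turn one raw message into a <message> element, dropping quoted lines."""
    msg = message_from_string(raw)
    el = ET.Element("message", id=msg["Message-ID"])
    if msg["In-Reply-To"]:
        el.set("in-reply-to", msg["In-Reply-To"])
    ET.SubElement(el, "from").text = msg["From"]
    ET.SubElement(el, "subject").text = msg["Subject"]
    # Assumed heuristic: a line beginning with ">" repeats an earlier message.
    fresh = [ln for ln in msg.get_payload().splitlines() if not ln.startswith(">")]
    ET.SubElement(el, "body").text = "\n".join(fresh).strip()
    return el


def build_threads(raws):
    """Nest each reply under its parent; return the thread-starting messages."""
    messages = [to_xml(r) for r in raws]
    by_id = {m.get("id"): m for m in messages}
    roots = []
    for m in messages:
        parent = by_id.get(m.get("in-reply-to"))
        if parent is not None:
            parent.append(m)  # the reply becomes a child element of its parent
        else:
            roots.append(m)
    return roots


for root in build_threads(RAW):
    print(ET.tostring(root, encoding="unicode"))
```

Real archives are messier (missing headers, top-posting, HTML bodies, attachments), so a production system needs far more than this, but the metadata-plus-free-text shape of the result is the point.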
Go try MarkMail, now! If you're not an "open source person" and don't know what to search for, you can start with one of my favorite searches: XML indexing in the Lucene project. (Hint: if you're looking to index XML with Lucene, it's a good indicator that you should be perhaps looking at MarkLogic, which indexes XML natively.)

Once you've tried MarkMail, please do two things:

  • Tell a friend. Particularly any open source types (the target users of the current incarnation of the service) you know.

Thursday, November 01, 2007

Fast Search Announces $100M Net Loss in Q3 07

Fast Search and Transfer announced their Q3 2007 results on 10/30. Here are some highlights from the announcement, some of which (the net loss, for example) aren't actually in the company's press release.
  • Revenues of $35.6M, down 16% compared to Q3 2006 revenues of $42.5M
  • Operating expenses of $121.2M, up 146% over Q3 2006 operating expenses of $49.2M (i.e., roughly 2.5 times as large)
  • Net loss of $100.4M, more than 22 times the net loss of $4.5M in Q3 2006.
  • Cash burn of $57.9M
  • They increased guidance for 4Q 07 from $43M to $47M
I've not had time to read everything in detail yet, but I'm sure there are lots of one-time restructuring charges in the $121M of operating expense. Fast goes to quite some length to explain why all this is good news. But to me, the numbers are numbers.
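For anyone who wants to check the year-over-year comparisons, here is a small Python sketch that computes both the percent change and the simple ratio from the dollar figures listed above. The two are easy to conflate. The function name and dictionary layout are just illustrative choices.

```python
def pct_change(new, old):
    """Percent change from old to new: (new - old) / old, as a percentage."""
    return (new - old) / old * 100.0


# (Q3 2007, Q3 2006) in millions of dollars, taken from the list above.
figures = {
    "revenue": (35.6, 42.5),
    "operating expenses": (121.2, 49.2),
    "net loss": (100.4, 4.5),
}

for name, (q3_07, q3_06) in figures.items():
    change = pct_change(q3_07, q3_06)
    ratio = q3_07 / q3_06
    print(f"{name}: {change:+.0f}% change, {ratio:.2f}x the Q3 2006 figure")
```

A number that is 2.46 times last year's value is up 146%, not 246%. The percent change always sits one whole (100 points) below the ratio.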

As a bit of commentary, I find it a little odd when a company's earnings press release doesn't include the financial statements. But a lot of people do it. However, I find it quite odd when you press the link to the financial statements (which wasn't easy to locate) and find something other than, well, the financial statements. In this case, you find what I'd consider a veritable Q3 07 "brochure" with a few well chosen and well framed financial metrics on the first page, several pages of good news, (carefully) selected metrics and commentary, and a few high-and-to-the-right arrows, boasting 4%, 5%, and even 23% growth rates.

In fact, there's so much pre-material in the financial statements, that you might get weary wading through it before you get to page 6 and finally find the income statement.

Hey, perhaps that's the point.