Thursday, March 29, 2007

Watch This Space


Is there anyone else who did not recognise Pam Beesly on the front cover of Wired this month?

Sunday, March 25, 2007

CRM is Back

There were several interesting things that came out of Jacob Taylor's "CRM is Back" presentation to the SDForum Business Intelligence SIG last week. CRM stands for Customer Relationship Management, an umbrella term for the software and systems for managing sales, marketing and service, the customer-facing aspects of a business. Jacob is CTO and co-founder of SugarCRM, which is the leading Open Source CRM system. During the talk, he touched on the incredible success of SugarCRM, its relationship with Open Source and the use of the PHP language. This post is about the incredible success of SugarCRM. I will discuss SugarCRM's use of Open Source and PHP in future posts.

SugarCRM was formed in April 2004 and by July had come out with its first product, SugarCRM Version 1, a sales CRM module. The company also received Venture Capital funding, the first Open Source software application vendor to do so. From the start SugarCRM generated a huge amount of interest with, as Jacob described, many enthusiastic users who offered input, feedback and code to improve the product. By that October, SugarCRM had risen to become Project of the Month on SourceForge, the leading repository of Open Source projects.

I recall from the time that SugarCRM had generated so much interest that there were the rumblings of a backlash amongst experienced CRM professionals. The old guard did not understand how this new product, made available in a new and low-cost way, could generate such excitement. However, the excitement is there. Since Sugar launched, there have been more than 3 million downloads of SugarCRM and related projects.

Jacob attributed their rapid success to a number of factors. The first advantage is in the three founders themselves. They have complementary skills in sales, service and engineering and they work well together. Secondly, the founders had a lot of experience with building CRM systems. Before forming SugarCRM they had all been at E.piphany, which had been a major CRM system vendor in its day, and prior to that they all had experience with other CRM systems. Given their experience, they all knew exactly what was needed and this allowed them to put together an extremely capable CRM system in a very short period of time.

A third advantage is Open Source and the community that SugarCRM has built around their project. From the earliest days they had an enthusiastic user community who both used and contributed to the project. Sugar soon set up its own SugarForge to accommodate what are now more than 350 Open Source extensions to the project. The Sugar Forums with more than 38,000 members allow users to help each other use SugarCRM. The recently created SugarExchange allows people to sell and exchange products and services based on SugarCRM.

As Jacob put it, Open Source is "Passionware". People want to be involved in the tools they use every day and Open Source offers them new ways in which they can be involved. SugarCRM has been a leader at involving their users and building a community. The reward is that they have created a large number of passionate advocates of the product.

Sunday, March 04, 2007

The DRM Battle

The other day, someone on Slashdot commented that there had been a lot of posts on Digital Rights Management (DRM) recently. Well, there have been, and there will continue to be DRM posts, because Intellectual Property Rights have become a huge issue as a result of technology advances in computing and the Internet.

I have written about DRM frequently in the past and will continue to do so, as there is a lot to say. In the meantime, here is a great quote that precisely expresses some of my feelings. It comes from Life-Line, the first published work by science fiction writer Robert Heinlein. It was written in 1939, some time before the RIAA even existed:
There has grown up in the minds of certain groups in this country the notion that because a man or a corporation has made a profit out of the public for a number of years, the government and the courts are charged with the duty of guaranteeing such profit in the future, even in the face of changing circumstances and contrary public interest. This strange doctrine is not supported by statute nor common law. Neither individuals nor corporations have any right to come into court and ask that the clock of history be stopped, or turned back.

Sunday, February 25, 2007

Two Disruptive Trends

Barry Klawans gave an excellent talk to the SDForum Business Intelligence SIG when he spoke on "Two Disruptive Trends, Open Source and SaaS Meet Business Intelligence" at the February meeting. Open Source is something that I have written about before. SaaS stands for Software as a Service, the idea that information technology can be delivered as a service over the internet. The best example of a successful SaaS business is Salesforce.com. Barry knows the territory well as he is CTO of JasperSoft, an Open Source BI reporting company.

Barry started with The Innovator's Dilemma, a book from the 90s that describes how established technologies and product markets can be overturned by innovators who use new disruptive technology or business models. Existing BI software vendors tend to target the high end of the market and they are vulnerable to disruption from new vendors that start by targeting the underserved lower end. Barry believes that Open Source and SaaS are the forces that will overthrow the old guard of established BI vendors.

Next, Barry took us through the BI stack and the Open Source projects that address it. Successful Open Source projects concentrate on doing one thing and doing it well. One of the problems with using Open Source is that you have to integrate several Open Source packages to build a system. Integration is made more difficult because active Open Source projects tend to have a very short release cycle. The most active projects have a new release every 6 weeks or so. (This certainly struck a chord with me.)

This brought us to the second part of Barry's talk, on Software as a Service (SaaS). The point of a SaaS system is to relieve the user of the responsibility of building and maintaining an IT system. The job of building a SaaS system is integrating a lot of software packages to provide the service. SaaS also has to deal with transparently upgrading the service for its users as it implements new features and fixes bugs. As such it complements Open Source and its rapid development cycle.

SaaS is a newer market and there are only a few emerging SaaS BI services available now. Barry touched on three, LucidEra, SeaTab and Oco, all early stage start-ups. There are some real architectural challenges to providing BI as a service. For example, one issue is security. For a number of reasons, many Open Source projects put security on the back burner. On the other hand, a SaaS customer needs to have solid assurances that their data is secure and safe from other customers of the SaaS service.

We will have to see how these trends play out. Barry quoted a research report that suggested that software innovation goes in a roughly 15 year cycle, and that in Business Intelligence we are just entering a new cycle that can be expected to go on to 2020 or so.

Sunday, February 18, 2007

Like OMG

I overhear my daughter speaking on her new cellphone in the next room, and I get to thinking: OMG, does it mean "Object Management Group" or "Oh My God"? Well, there is only one way to settle it: Google Fight. The results, when they come in, lean in the expected direction, but they are not as overwhelming as anticipated.

So like any good data analyst I do a little exploration to check the results. I do not think that there is any problem with the Object Management Group side of the equation. On the other hand there are several potential spellings of Oh My God that could make a difference. A quick test of "Oh Mi God" shows that it is not a problem and "Oh My God" is much more popular than the "O My God" spelling.

The next thing to test is whether the quotes are necessary. Removing the quotes quickly shows wobbly results, particularly a big difference between O My God and Oh My God. Curiously, though, the first page of O My God results from Google all show Oh My God as the search term. Without quotes, Google Fight shows Object Management Group winning over Oh my God but losing to O My God. My conclusion is that the quotes are necessary even though they make for a less interesting Google Fight.

The other notable thing is that the numbers returned by Google Fight are not the same as the numbers returned by Google. Why? Well, it is probably because they use different settings of the advanced search options. You can spend hours over this stuff, as my little example has shown. In the end the first Google Fight results seem to stand up, so perhaps the conclusion is that the world is more serious than I first thought!

Wednesday, February 07, 2007

The Vista DRM Morass

After listening to the series of Security Now podcasts on Vista DRM (episodes 73, 74, 75 and 77), I got to thinking about the difference in position between Microsoft and Apple.

Microsoft is a software company whose software runs on what up to now has been a very open hardware platform provided by a vast array of vendors. Thus Microsoft have chosen and perhaps been forced to implement security features in their Vista operating system to allow protected High Definition content to be displayed safely on HD displays attached to the PC. When I say safely, it is not about protecting you from the content, it is about protecting the content from you and making sure that you do nothing unauthorized with content that you have paid good money for.

All this security comes at a price. It sucks up processing power. Complexity and the requirement for vigilance cut into software reliability. The problems are evident. Vista had only just been released when the first Service Pack was announced. Moreover, while the Vista software itself is available, the truth is that the video drivers are not up to scratch and it will take some time before they work properly.

Apple, on the other hand, is a hardware and systems company. Their solution to protected video content is the Apple TV box. This is a little piece of hardware that connects to your TV and handles decoding and display of the protected High Definition content. An Apple computer does not need the complexity of the Vista protected video path because it is all wrapped up in a little box. The Apple TV box is not perfect. It is not quite here yet and from the specs it seems to lack codecs; however, it does seem to be a better systems solution than Microsoft Vista.

Thursday, January 25, 2007

Visualization for All

If you are into playing with data, these are good times. A number of web sites have sprung up recently that allow you to explore data visually. Data360 launched in October. Swivel got a mention from the influential TechCrunch blog. Many Eyes comes from IBM Research, although you may not think so from looking at the site.

Each of these web sites allows you to upload data sets and play with how they are presented, looking for insights into the data. Of the three, Many Eyes is the most approachable. Without having to register, you can play with data sets that others have uploaded. Many Eyes has a great collection of visualization tools including scatterplots, stacked graphs and treemaps as well as the more mundane bar and pie charts.

For example, when I first visited Many Eyes, someone had uploaded a data set of restaurant reviews from the San Francisco Chronicle that scored the restaurants by food, atmosphere, service, price and noise, and also gave an overall score. I looked at scatterplots and determined that there was no significant correlation between atmosphere and noise and that the only factor that seemed to show some correlation with the overall score was the score for food. I also used stacked graphs to explore US government spending over the last 45 years. The takeaway is that spending on health, particularly Medicare and drugs, accounts for the largest increase in spending.
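If you would rather put a number on what a scatterplot suggests, the usual tool is the Pearson correlation coefficient. Here is a minimal Python sketch of that calculation. The scores below are made up for illustration and are not the Chronicle data, and the function name pearson is my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical review scores, invented for this example only.
food = [4, 3, 2, 4, 1, 3]
noise = [2, 4, 3, 3, 2, 4]
overall = [4, 3, 2, 3.5, 1.5, 3]

print("food vs overall: ", round(pearson(food, overall), 2))
print("noise vs overall:", round(pearson(noise, overall), 2))
```

A value near 1 or -1 means the points on the scatterplot fall close to a straight line, and a value near 0 means there is no linear relationship to speak of. With a handful of points the number is only suggestive, which is one reason to look at the plot as well.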

The only problem with Many Eyes is that, as befits an open research site, anyone can upload their data set, so by the time you read this the restaurant review data may have scrolled off to be replaced by other compelling data. Go and play with whatever data is there anyway. It will be a learning experience.

Thanks to Stephen Few and his great Visual Business Intelligence blog at PerceptualEdge for pointing out these sites.

Tuesday, January 23, 2007

DRM Wishes

The old saying goes "be careful what you wish for, lest it come true". The music industry wished for DRM to protect their content. They found their "white knight" in Steve Jobs who built the iTunes music store to deliver their content safely to iPod users everywhere. The problem is that the music industry now finds itself completely beholden to Apple as their only viable channel for digital music sales.

Apple controls the channel and dictates the terms for music sales, particularly the $.99 price which record executives want to vary. Also, the DRM is now seen to do more good for Apple than the music industry because it locks the music purchaser into Apple products. The more music bought, the more locked in the purchaser becomes. No wonder the music industry is now talking about selling music without DRM. Funnily enough, Apple is against selling music on iTunes without DRM!

The only cloud on the horizon is that several European countries are trying to force Apple to open up their DRM for others to use. If these countries succeed, they take away the pressure on the music industry to sell music unencumbered. I view these countries' efforts as totally misguided and I wish that they would just stop meddling.

On another front, the jury is still out on whether the Microsoft Vista operating system is going to be so wrapped up in DRM that it is unusable. (I posted on this a couple of years ago.) There is a great discussion of Vista DRM on the Security Now podcast (episodes 73, 74 and 75).

Many people are surprised that Microsoft has yielded without a whimper to the content industry. If Microsoft had been willing to take a stand they could have negotiated a much better position for themselves and their products. It seems like Ballmer has been too willing to BOGU for the content providers. We will just have to stand back and see if he gets shafted.

Sunday, January 21, 2007

Complex Event Processing

Complex Event Processing (CEP) was the topic for the SDForum Business Intelligence SIG January meeting. Mark Tsimelzon, President, CTO and Founder of Coral8 spoke on "Drinking from a Fire Hose: the Why's and How's of Complex Event Processing".

Mark started out by showing us a long list of applications, such as RFID, financial securities, e-commerce, telecom and computer network security, that share the same characteristics. Each of these applications can generate hundreds of thousands of events per second that need to be processed and filtered, with critical events identified and responded to in a millisecond or second timeframe.

The first response to building a system for one of these complex event processing applications is to load the data into a database and continuously run queries against the data. Unfortunately this introduces a number of delays that interfere with response time. Firstly there is the delay in loading the data into the database, as efficient database loading works best in batches. Next there is a delay in waiting for the query to be run as it is run periodically. Finally there is a delay caused by interference between the load process that is writing data and the query process that is trying to read the same data.

Given the problem of using a database, the next response to building a CEP system is to write a custom program in Java or C to do the job. This can be coded to meet the response time and data rate requirements, however it is inflexible. Any change to the requirements or data streams requires recoding and testing which take time and money. Coral8 and other vendors in the CEP space provide a system like a database that is programmable in a high level SQL-like language and that can process event streams at a rate similar to the hand coded system.

In a conventional database system, the data is at rest in the database and the queries act on the data. In a CEP system, the queries are static and the event data streams past the queries. When an event triggers a query, the query typically generates new event data. This structure allows event data processing to be parallelized by having several event processors that run different queries in parallel on the same data stream. Processing can be pipelined by having the output streams of one event processor feed into the inputs of another event processor.
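To make the idea of standing queries and pipelining concrete, here is a small Python sketch. It is not Coral8's query language, and it ignores the throughput engineering that a real CEP engine needs. The event fields, the thresholds and the function names high_value_trades and burst_detector are all assumptions made for the example.

```python
from collections import deque


def high_value_trades(events, threshold):
    """Standing filter query: emit a new event for each trade above threshold."""
    for event in events:
        if event["type"] == "trade" and event["amount"] > threshold:
            yield {
                "type": "big_trade",
                "symbol": event["symbol"],
                "amount": event["amount"],
                "ts": event["ts"],
            }


def burst_detector(events, window_seconds, min_count):
    """Standing window query: alert when one symbol has at least min_count
    events whose timestamps fall within window_seconds of the newest one."""
    recent = {}  # symbol -> deque of timestamps still inside the window
    for event in events:
        timestamps = recent.setdefault(event["symbol"], deque())
        timestamps.append(event["ts"])
        # Old events pass out of the window and are forgotten.
        while event["ts"] - timestamps[0] > window_seconds:
            timestamps.popleft()
        if len(timestamps) >= min_count:
            yield {
                "type": "burst",
                "symbol": event["symbol"],
                "count": len(timestamps),
                "ts": event["ts"],
            }


# A tiny, invented event stream; a real one would arrive from a network feed.
stream = [
    {"type": "trade", "symbol": "ABC", "amount": 500, "ts": 0.0},
    {"type": "trade", "symbol": "ABC", "amount": 12000, "ts": 0.2},
    {"type": "trade", "symbol": "XYZ", "amount": 15000, "ts": 0.3},
    {"type": "trade", "symbol": "ABC", "amount": 20000, "ts": 0.5},
    {"type": "trade", "symbol": "ABC", "amount": 30000, "ts": 5.0},
]

# Pipeline: the output stream of the first query feeds the second query.
big_trades = high_value_trades(iter(stream), threshold=10000)
alerts = list(burst_detector(big_trades, window_seconds=1.0, min_count=2))

for alert in alerts:
    print(alert)
```

The queries are set up once and the events flow through them one at a time, so nothing waits for a batch load or a periodic query run. The only state kept is the short window of recent timestamps, which is discarded as events age out.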

It is important to understand that the purpose of a CEP system is not to store data. While events can linger, they eventually pass out of the system and are gone. A database complements a CEP system. For example, Coral8 can read data from database systems and even caches the data for improved efficiency. Also, output streams from Coral8 can be, and usually are, fed into database systems.

If you want to try out CEP, visit the Coral8 web-site. There you can download documentation and a trial version of the software.

Sunday, January 14, 2007

Tableau Software

Business intelligence is about taking business data and turning it into actionable information, and there is a visualization problem at the heart of this process. Business data can be complicated and the user needs help in presenting the information in the best possible way. Unfortunately, many leading Business Intelligence tools seem to be deliberately designed to lead the user into making the worst possible presentation choices.

At previous meetings of the SDForum Business Intelligence SIG we have had great fun looking at bad visualizations such as garishly colored 3-D pie charts and 3-D bar graphs that do more to obscure the information than to show it off. At the November meeting of the SIG we heard from a company that is doing something positive about data visualization when Kelly Wright, Director of Sales for Tableau Software, and a Bay Area local, presented "Visual Analysis Using Tableau Software".

Tableau Software (www.tableausoftware.com) is a startup that emerged from a research project at Stanford University. There, under the leadership of Dr. Pat Hanrahan, a team of researchers worked on the difficult problem of enabling people to easily see and understand the information in their databases. As Kelly explained, Tableau was formed in 2000 and took 5 years to develop their product, coming out with their first version in 2005. They are now on version 2.1.

Kelly gave us a whirlwind tour of Tableau's capabilities. Firstly Tableau is designed to understand the data that it is presenting, at least to the extent that it can make sensible choices about how to present the data in a useful way, for example, by giving line graphs of continuous data against time. While it is always possible to override the default, Tableau seems to do a good job with its choices. The next issue is being able to present large amounts of data and compare different aspects of the data against one another, and again the Tableau drag and drop interface seems intuitive and easy to use. When you can see all the data the next requirement is to drill down into the interesting data and remove the noise, and again Tableau has a set of tools for selecting the most interesting data points and looking into them further.

In retrospect it seems obvious to take the knowledge that has developed around how to present information, and package it into a data visualization product. However this is not as simple as it seems and the fact that Tableau took 5 years to develop their product shows the amount of work involved in doing this properly. Also, theirs is a lonely path. The other BI vendors prefer to provide flash and features over carefully integrated substance.

Tableau's product is not expensive for a data-head, and if you ask, you can get a 10-day free trial to find out exactly what it can do. Go ahead and try it!