Sunday, March 30, 2008

Building Better Products Through Experimentation

Experimentation is the theme of the SDForum Business Intelligence SIG so far this year. The March meeting featured Deepak Nadig, a Principal Architect at eBay, talking about "Building Better Products Through Experimentation". Experimentation is an important technique for Business Intelligence, although its earliest uses were in medicine. In 1747, James Lind, a British naval surgeon, performed a controlled experiment to find a cure for scurvy. In his book "Supercrunchers", Ian Ayres describes how the Food and Drug Administration has used experimentation since the 1940s to determine whether a medical treatment is efficacious.

While eBay has always used experimentation to test and fine-tune its web pages, in recent years the process has been formalized. While anyone can propose an experiment, product managers are the most likely to do so. Deepak took us through the eBay process and discussed issues with using experimentation. Because the infrastructure is in place, simple experiments can be set up within a matter of days. eBay usually runs an experiment for at least a week so that it is exposed to a full cycle of user behavior. Simple experiments to test a small feature typically run for a week or so; larger experiments may run for a month or two; and some critical tests run continuously.

For example, eBay is interested in whether it is a good idea to place advertising on its pages. On the one hand, it brings in extra revenue in the short term; on the other hand, it might cannibalize revenue in the long term. Experimentation has shown that advertising is beneficial in some situations; however, its use is being monitored by some long-term experiments to ensure that it remains so.

Deepak took us through some of the issues that arise with experimentation. One issue is concurrency: how many experiments can be carried out at the same time. As eBay has a high-traffic web site, it can get good results with experiments on a small proportion of users, at most a few percent. Because each experiment uses only a small percentage of users, several experiments can be run in parallel. Another issue is establishing a signal-to-noise ratio to ensure that experiments are working and giving valid results. eBay has run some A/B experiments where A and B are exactly the same to establish whether its experimental technique has any biases.
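The talk did not go into implementation details, but deterministic bucketing is a common way to get both properties described above: independent parallel experiments and repeatable A/A checks. Here is a minimal Python sketch; the hashing scheme and percentages are my illustration, not eBay's actual method:

```python
import hashlib

def in_treatment(user_id: str, experiment: str, pct: float) -> bool:
    """Deterministically assign a user to an experiment's treatment group.

    Hashing user_id together with the experiment name makes each
    experiment's assignment independent of every other experiment's,
    which is what allows several small experiments to run in parallel.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10000           # bucket in 0..9999
    return bucket < pct * 100                  # pct is a percentage, e.g. 5.0

# An A/A check: both arms see identical pages, so any measured
# difference between them estimates the noise floor of the methodology.
treated = sum(in_treatment(f"user{i}", "aa-test", 5.0) for i in range(100_000))
print(treated / 100_000)   # should come out close to 0.05
```

Because the assignment is a pure function of the user and experiment identifiers, a user sees a consistent version of the page for the full run of the experiment.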

Wednesday, March 26, 2008

The Cogwheel Brain

The Cogwheel Brain by Doron Swade is the story of Charles Babbage and his quest to build the first computer. The book also details how Doron Swade built a Babbage Difference Engine for the 200th anniversary of Babbage's birth in 1991.

Charles Babbage designed three machines. He started with the Difference Engine, which would use the method of finite differences to generate tables such as logarithm and navigation tables. The computing section of his first design was built, although it did not have a printer. Next he conceived and designed the Analytical Engine, a fully functioning computer that was programmed by the same kind of punched cards used to run a Jacquard weaving loom. In the course of designing the Analytical Engine he realized that he could improve the design of the Difference Engine to make it faster and use fewer parts. This resulted in the design of Difference Engine 2. Only small demonstration parts of the Analytical Engine were built, and Difference Engine 2 existed only as a set of plans.
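The method of finite differences that the Difference Engine mechanized reduces the tabulation of a polynomial to nothing but repeated addition, which is exactly what a train of cogwheels can do. A short Python sketch of the idea, tabulating squares from their initial differences:

```python
def difference_table(initial: list[int], count: int) -> list[int]:
    """Tabulate a polynomial by the method of finite differences.

    `initial` holds the first value followed by its successive
    differences. Each new table entry needs only additions, which
    is what made the method suitable for a mechanical engine.
    """
    diffs = list(initial)
    values = []
    for _ in range(count):
        values.append(diffs[0])
        # Propagate the additions up the difference columns.
        for i in range(len(diffs) - 1):
            diffs[i] += diffs[i + 1]
    return values

# For n^2 starting at n=0: value 0, first difference 1, second difference 2.
print(difference_table([0, 1, 2], 6))   # [0, 1, 4, 9, 16, 25]
```

A degree-k polynomial has a constant k-th difference, so the same loop tabulates any polynomial given k+1 starting numbers; Babbage's engines simply carried more difference columns and more digits per column.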

I expected the story to be similar to several other computing projects that I have seen and worked with. You know the projects: the ones where the architect keeps jumping to a new idea while the overall project goals get lost and the project overruns for years before it is abandoned. Building the Difference Engine was a lot more disciplined. The core of the first Difference Engine was built and worked, even though it used orders of magnitude more machined parts than any other machine built up to that time. While it did take a long time, that was understandable given the engineering practices of the day: all the parts had to be made by a single craftsman in a single workshop.

One thing from the book that surprised me is that other engineers built difference engines of their own during the 19th century. Although those machines were completed, they were never successfully used for any purpose. I think this goes to show that the 19th century was not ready for mechanical computing. The book is easy to read and highly recommended.

Thursday, March 13, 2008

Customer Relationship Intelligence

There is a curious thing about the organization of a typical company. While there is one Vice President in charge of Finance and one Vice President in charge of Operations there can be up to three Vice Presidents facing the customer: a Marketing Vice President, a Sales Vice President, and a Service Vice President. On the one hand, the multiplicity of Vice Presidents and their attendant organizations is a testament to the importance of the customer. On the other hand, multiple organizations mean that no one is in charge of the customer relationship and thus no one takes responsibility for it.

We see this in the metrics that are normally used to measure and reward customer-facing employees. Marketing measure themselves on how well they find leads, regardless of whether Sales uses the leads. Sales measure themselves on the efficiency of the sales people in making sales, regardless of whether the customer is satisfied. Service, left to pick up the pieces of an overpromised sale, measure themselves on how quickly they answer the phone. Everyone is measuring their own actions and no one is measuring the customer.

Linda Sharp addresses this conundrum head on in her new book Customer Relationship Intelligence. As Linda explains, a customer relationship is built upon a series of interactions between a business and its customer. For example, the interactions start with acquiring a lead, perhaps through a response to an email or mass mailing, or a clickthrough on a web site. Next, more interactions qualify the lead as a potential customer. Making the sale requires further interactions leading up to the closing. After the sale there are yet more interactions to deliver and install the product, and service interactions to keep it working. Linda's thesis is that each interaction builds the relationship, and that by recording all the interactions and giving each both a value and a cost, the business builds a quantified measure of the value of its customer relationships and of how much it has spent to build them.
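The book is about management rather than software, but the bookkeeping behind the thesis is simple enough to sketch in a few lines of Python. The field names and numbers here are my own illustration, not Linda Sharp's:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    kind: str      # e.g. "clickthrough", "qualifying call", "install"
    value: float   # how much the interaction advanced the relationship
    cost: float    # what it cost the business to perform it

def relationship_value(interactions: list[Interaction]) -> tuple[float, float]:
    """Return (total value built, total cost) for one customer relationship."""
    return (sum(i.value for i in interactions),
            sum(i.cost for i in interactions))

# A toy interaction history for a single customer.
history = [
    Interaction("clickthrough", 1.0, 1.0),
    Interaction("qualifying call", 5.0, 20.0),
    Interaction("closing", 50.0, 100.0),
]
print(relationship_value(history))   # (56.0, 121.0)
```

The point of recording every interaction, rather than just sales outcomes, is that both numbers accumulate across marketing, sales, and service, giving one shared measure of the relationship.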

Having a value for a customer relationship completely changes the perspective of that relationship. It gives marketing, sales and service an incentive to work together to build the value in the relationship rather than working at cross purposes to build their own empires. Moreover, knowing the cost of having built the relationship suggests the value in continuing the relationship after the sale is made. In the book, Linda takes the whole of the second chapter to discuss customer retention and why that is where the real profit is.

The rest of the book is logically laid out. Chapter Three “A Comprehensive, Consistent Framework” creates a unified model of a customer relationship throughout its entire lifecycle from the first contact by marketing through sales and service to partnership. This lays a firm bedrock for Chapter Four, “The Missing Metric: Relationship Value” which explains the customer relationship metric, the idea that by measuring the interactions that make the relationship we can give a value to the relationship.

The next two chapters discuss how the metric can be used to drive customer relationship strategy and tactics. The discussion of tactics lays the foundation for Chapter Seven, which shows how the metric is used in the execution of customer relationships. Chapters Six and Seven contain enough concrete examples of how the data can be collected and used to give us a feeling for the metric’s practicality. Chapter Eight compares the customer relationship metric with other metrics and explores the many ways in which it can be used. Finally, Chapter Nine summarizes the value of the Customer Relationship Intelligence approach.

Linda backs up her argument with some wonderful metaphors. One example is the contrast between data mining and the data farming approach that she proposes with her Relationship Value metric. For data mining, we gather a large pile of data and then use advanced mathematical algorithms to determine which parts of the pile may contain some useful nuggets of information. This is like the hunter-gatherer stage of information management. When we advance into the data farming stage, we know what customer relationship metric is important and collect that data directly.

As the metaphor suggests, we are still in the early days of understanding and developing customer relationship metrics. Until now, these metrics have concentrated on measuring our own performance to see how well we are doing. Linda Sharp’s Relationship Value metric turns this on its head with a new metric that measures our whole relationship with customers. Read the book to discover a new and unified way of thinking about and measuring your customers.

Tuesday, March 04, 2008

Developing on a Cloud

The cloud computer is here and you can have your corner of it for as little as 10 cents an hour. This was the message that author and consultant Chris Richardson offered to the SDForum SAM SIG when he spoke on "Developing on a Cloud: Amazon's revolutionary EC2" at the SIG's February meeting.

As Chris tells it, you go to the Amazon site, sign up with your credit card, go to another screen where you describe how many cloud servers you need, and a couple of minutes later you can SSH to the set of systems and start using them. In practice it is slightly more complicated than this. Firstly, you need to create an operating system configuration with all the software packages that you need installed. Amazon provides standard Linux setups; you can extend them with your requirements and store the whole thing in the associated Amazon S3 storage service. There goes another 10 cents a month.

Next you need to consider how your cloud servers are going to be used. For example, you could configure a classic 3-tier redundant web server system with two cloud servers running web servers, another two running Tomcat application servers, one running the database, and yet another on database standby. Chris has created a framework for defining such a network called EC2Deploy (geddit?). He has also implemented a Maven plug-in on top of EC2Deploy that creates a configuration and starts applications on each server. Needless to say, the configuration is defined declaratively through the Maven pom.xml files.
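Chris did not show EC2Deploy's actual configuration format in his talk, so purely as an illustration of the declarative idea, a topology like the 3-tier system above can be written down as data that a deployment tool then walks. Every name here is hypothetical, not EC2Deploy's real schema:

```python
# Hypothetical declarative description of the 3-tier topology above.
# These tier and image names are made up for illustration; they are
# not EC2Deploy's real configuration schema.
topology = {
    "web":      {"count": 2, "image": "ami-web"},
    "app":      {"count": 2, "image": "ami-tomcat"},
    "database": {"count": 1, "image": "ami-db"},
    "standby":  {"count": 1, "image": "ami-db"},
}

def total_servers(topology: dict) -> int:
    """A deployment tool would loop over tiers like this, launching
    `count` instances of each tier's machine image."""
    return sum(tier["count"] for tier in topology.values())

print(total_servers(topology))   # 6
```

The attraction of the declarative style is that the same description can stand up the whole network for a test run and tear it down again when the hour is up.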

So what would you want to use EC2 for? Chris suggested a couple of applications that are particularly interesting for impoverished startups. Firstly, EC2 can be used to do big system tests before a new version of the software is deployed. The startup does not need to buy all the hardware to replicate its production systems in order to do a full-scale system test; the big system tests are done on EC2, saving considerable resources. Another use is as a backup solution for scaling, should the startup take off in an unexpected manner. Given the unreliability of ISPs these days, having a quickly deployable backup system sounds like a good idea, and the best thing is that it does not cost you anything when you are not using it.