Tuesday, August 31, 2010

Software Update Business Models

These days software updates are a fact of life. If we do not keep our software up to date, we risk all sorts of horrendous infections and debilitating attacks. Unfortunately, the providers of our software know this and are starting to use software updates to make money, or at least to remind us that they exist. I have done several software updates recently and noticed this in action.

Adobe just wants to remind me of their presence, so they insist on putting a shortcut to the Adobe Reader on my desktop every time they update. This is relatively benign, as it takes a few seconds at most to confirm that it is a shortcut and delete it. Apple is more pushy. I expect to get a new version of iTunes any day now, and I will need to carefully uncheck boxes to ensure that I do not get several more applications than I want. Most insidious is Java, now owned by Oracle. On one system they offered me the Yahoo toolbar. On another system, which already had the Yahoo toolbar, they offered me some other software, so they obviously look to see what is installed to guide the offer. Judging by the fact that these offers were for third-party software, I am sure that they get some sort of compensation for it.

Soon we will see advertisements and offers in the installer, and new ways to confuse us. The tactic that always gets me is to require some input that I forget to fill in. Then, when I go back to fill in this information, all the boxes I so carefully unchecked have mysteriously been checked again. In a hurry, I just click "Install", not noticing that I am now getting all the extras that I had carefully tried to avoid. It is coming to a computer near you soon.

Saturday, August 28, 2010

Mad Skills for Big Data

Big Data is a big deal these days, so it was with great interest that we welcomed Brian Dolan to the SDForum Business Intelligence SIG August meeting to speak on "MAD Skills: New Analysis Practices for Big Data". MAD is an acronym for Magnetic, Agile, Deep, and as Brian explained, these skills are all important in handling big data. Brian is a mathematician who came to Fox Interactive Media as Lead Analyst. There he had to help the marketing group decide how to price and serve advertisements to users. As they had tens of millions of users that they often knew quite a lot about, and served billions of advertisements per day, this was a big data problem. They used a 40-node Greenplum parallel database system and also had access to a 105-node MapReduce cluster.

The presentation started with the three skills. Magnetic means drawing the analyst in by giving them free rein over their data and the freedom to use their own methods. At Fox, Brian grappled with a buttoned-down DBA to establish his own private sandbox where he could access and manipulate his own data. There he could bring in his own data sets, both internal and external. Over time the analysts group established a set of mathematical operations that could be run in parallel over the data in the database system, speeding up their analyses by orders of magnitude.

Agile means analytics that adjust, react and learn from your business. Brian talked about the virtuous cycle of analytics, where the analyst first acquires new data to be analyzed, then runs analytics to improve performance, and finally the analytics cause business practices to change to suit. He talked through the issues at each step in the cycle and led us through a case study of audience forecasting at Fox, which illustrated problems with sampling and scaling results.

Deep analytics is about producing more than reports. In fact, Brian pointed out that even data mining can concentrate on finding a single answer to a single problem, whereas big analytics needs to solve millions of problems at the same time. For example, he suggested that statistical density methods may be better at dealing with big analytics than other more focused techniques. Another problem with deep analysis of big data is that, given the volume of data, it is possible to find data that supports almost any conclusion. Brian used the parable of the Zen Tea Cup to illustrate the issue. The analyst needs to be able to approach their analysis without preconceived notions or they will just find exactly what they are looking for.

Of all the topics that came up during the presentation, the one that caused the most frisson in the audience was dirty data. Brian's experience has been that cleaning data can lose valuable information and that a good analyst can easily handle dirty data as a part of their analysis. When pressed by an audience member he said "well 'clean' only means that it fits your expectation". As an analyst is looking for the nuggets that do not meet obvious expectations, sanitizing data can lose those very nuggets. The recent trend to load data and then do the cleaning transformations in the database means that the original data is in the database as well as the cleaned data. If that original data is saved, the analyst can do their analysis with either version of the data as they please.
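To make the load-then-clean idea concrete, here is a minimal sketch in Python. It is not from Brian's talk: it uses an in-memory SQLite database as a stand-in for a parallel database, and the table name raw_events, the view name clean_events and the cleaning rule (ages between 1 and 120) are all made up for illustration. The point is only that the raw rows stay in the database, so the analyst can look at exactly what the cleaning step would have thrown away.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Load raw rows first, then define the cleaning step inside the database."""
    conn = sqlite3.connect(":memory:")

    # Hypothetical raw table: everything is loaded as text, dirt and all.
    conn.execute("CREATE TABLE raw_events (user_id TEXT, age TEXT)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?, ?)",
        [("u1", "34"), ("u2", "-1"), ("u3", "unknown"), ("u4", "27")],
    )

    # Hypothetical cleaning rule, expressed as a view over the raw table:
    # keep rows whose age starts with a digit and falls between 1 and 120.
    conn.execute(
        """
        CREATE VIEW clean_events AS
        SELECT user_id, CAST(age AS INTEGER) AS age
        FROM raw_events
        WHERE age GLOB '[0-9]*'
          AND CAST(age AS INTEGER) BETWEEN 1 AND 120
        """
    )
    return conn


def dropped_by_cleaning(conn: sqlite3.Connection) -> list:
    """Return the raw rows that the cleaning rule filters out."""
    return conn.execute(
        """
        SELECT user_id, age FROM raw_events
        WHERE user_id NOT IN (SELECT user_id FROM clean_events)
        ORDER BY user_id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(conn.execute("SELECT * FROM clean_events ORDER BY user_id").fetchall())
    print(dropped_by_cleaning(conn))
```

Because the cleaned data is only a view, the analyst can query either raw_events or clean_events, and the rows that do not "fit your expectation" remain available for inspection.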

Mad Skills also refers to the ability to do amazing and unexpected things, especially in motocross motorbike riding. Brian's personal sensibilities were forged more in punk rock, so you could say that he showed us the "kick out the jams" approach to analytics. You can get the presentation from the BI SIG web site. The original MAD Skills paper was presented at the 2009 VLDB conference and a version of it is available online.

Monday, August 23, 2010

End of Moore's Law

The recent announcement that Intel is buying McAfee, the security software company, has the analysts and pundits talking. The ostensible reason for the deal is that Intel wants the security company to help them add security to their chips. Now, while security is important, I do not believe that is the reason Intel bought McAfee. In my opinion, this purchase signals that Intel sees the coming end of Moore's Law.

In 2005, the Computer History Museum celebrated 40 years of Moore's Law, the technology trend that every 2 years the number of transistors on a silicon chip, and thus its capabilities, doubles. On the stage Gordon Moore told us that throughout the 40 years, "they have always been able to see out about 3 generations of manufacturing technology", where each generation is about 2 years. So Intel can see its technology path for about the next 6 years. At that time Moore told us that they could still see how they were going to carry on Moore's Law for the next three generations.
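For a sense of scale, the arithmetic behind those numbers can be sketched in a few lines of Python. This takes the figures above at face value (a doubling every 2 years, 2 years per generation); the function names are just made up for the illustration.

```python
# Figures taken from the text above; treated as exact for illustration only.
DOUBLING_PERIOD_YEARS = 2
YEARS_PER_GENERATION = 2


def growth_factor(years: float) -> float:
    """How many times the transistor count multiplies over `years`."""
    return 2 ** (years / DOUBLING_PERIOD_YEARS)


def lookahead_years(generations: int) -> int:
    """How far ahead a given number of manufacturing generations reaches."""
    return generations * YEARS_PER_GENERATION


if __name__ == "__main__":
    print(growth_factor(40))    # 40 years is 20 doublings
    print(lookahead_years(3))   # 3 generations of visibility
    print(growth_factor(6))     # growth over that lookahead window
```

Forty years of doubling every 2 years is 20 doublings, a factor of about a million. Three generations is a 6 year window, over which the trend promises another factor of 8.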

Now what would happen if Intel looked 6 years into the future and saw that the path was no longer there? Suppose they could see the end of Moore's Law, and that meant that they would no longer have the ability to create new and more powerful chips to keep their revenue growing. I believe that they would start looking to buy up other profitable companies in related lines of business to diversify their revenue.

McAfee is a large security software company whose main business is selling security solutions to large enterprises. If Intel had wanted to buy security technology, they could have gone out and bought a security start-up with better technology than McAfee for a few hundred million dollars. Instead they are spending a hefty 8 billion dollars on an enterprise security software company. This deal does not make sense for the reasons given; however, it does make sense if Intel wants to start buying its way into other lines of business.

Now there are many reasons why Intel might want to diversify their business. Perhaps they see the profitable sales of processor chips disappearing as chips gain so many transistors that they do not know what to do with them. However, the most likely reason is that they can see the end of Moore's Law and that it is now time to move on and add some other lines of business.

Saturday, August 14, 2010

Analytics at Work

Analytics has become a major driving force for competitive advantage in business. The new book "Analytics at Work: Smarter Decisions, Better Results" by Thomas H. Davenport, Jeanne G. Harris and Robert Morison discusses what analytics can do for a business, how to manage analytics and how to make a business more analytical.

Analytics at Work has a useful introductory chapter and then divides into two parts. The first part discusses five major aspects of analytics in a business environment. The second part looks at the lifecycle of managing analytics in a business. The organization is good and there is no overlap between the topics in each part; however, the order in which the information is presented seems designed to put the reader off.

The first part starts with a plodding chapter on what needs to be done to get the data organized and related topics, followed by a diffuse chapter called Enterprise. The interesting chapters in this part are the last two chapters. The Targets chapter discusses the important topic of picking targets for analytics. The Analysts chapter discusses how to effectively employ and organize analysts in a large enterprise. Similarly the second part of the book starts with a plodding chapter on how to Embed Analytics in Business Processes, followed by much more inspiring chapters on building an analytical culture, and the need to continually review a business comprehensively as part of an analytics push. If you find yourself stuck reading the book, try skipping to one of the interesting chapters that I have indicated.

Scattered throughout the book are many useful tools. In the introductory chapter there are the six key questions that an analyst asks. We come back to these questions from several places in the book. Running throughout the book is a five step capability maturity model for judging how analytical an organization is and showing the path to making the organization more analytical. Each chapter in the first part ends with a discussion on how to take that aspect of the organization through the five steps.

It is important to understand the target audience. The book is aimed at senior management and executives, particularly in large enterprises. While the book contains many brief case studies as inspiration and touches on all the important management issues that need to be considered, it does not go into great depth about what analytics is or the specific analytical techniques and how they can be used. This is not a book for analysts, unless they have ambitions to grow their career beyond analytics. I recommend this book to anyone in the target audience who wants to grow their organization's analytics capabilities.