Build and Break: November 2005

Saturday, November 26, 2005

The Future of Music

Some time ago I wrote the following in a piece on intellectual property in the digital age:

"... record company that on one hand pays radio stations to BROADCAST its hit song, while at the same time complaining that it is losing money because people are sharing the song on their computers"

Today I was looking at internet radio software that receives a broadcast stream from an internet radio station and converts each song into an MP3 complete with ID3 tags. The point is that to build an MP3 collection, you do not need to illegally download music; all you need to do is capture the broadcast stream.

In practice this turns the economics of music on its head. In the past, performers made most of their money from recordings and used live performances to promote the recordings. The emerging business model is that performers give away their music to build an audience and then make their money from live performances.

This model has a lot to recommend it. As the audience grows older, they tend to become financially better off and willing to pay more for live performances. Properly managed, a performer can build a lifelong career that becomes more and more rewarding. Not surprisingly, this business model is well proven, and it emerged in Silicon Valley.

Sunday, November 20, 2005

Consumer Content Management

Could there really be a huge and important consumer software category that is not being properly addressed? I got to thinking this when I happened across a piece that proposed the emergence of a new type of software, the media file "Type Manager". The concept is that each type of media like pictures and music needs its own file manager that knows about that type of media.

I could write a long harangue about how awful these programs are and how far I will go to avoid them. For example, one major peeve is that they deliberately hide the location of the file, so that when I want to edit it or use it in another application, I have to resort to some trick to find out where it is located before I can open it in that application.

However, a long harangue would just obscure a much more important point. There already exists a category of software that addresses all the issues raised by the media file type manager and does a lot more. It is called Content Management.

As a software category, Content Management has gone through the usual evolution. The original applications were high-end, one-off applications created for demanding users like CNN and the BBC, which have enormous content management problems. At the same time a mid-range thread emerged from web media organizations like Salon, who developed their own software to manage their production (and which has now gone Open Source). Now Content Management is expanding into general use as all organizations discover that they have digital content to manage. The final step is for Content Management to become a consumer product.

Content Management has three major components:

Version management. Every time you edit a media file, for example, changing the level of an MP3 or cropping a digital image, you create a new version of the file. Version management keeps track of all the versions of a piece of media and how they are related.
Metadata management. All types of media files contain metadata, sometimes known as tags. However, there is always the need for more metadata, some of which, like version information, does not really belong in the file. Better to extract all the metadata and put it in a database where it can be searched, collated and aggregated.
Workflow. This is the major part of professional Content Management and in some sense its reason for being. In a consumer application, a single person does all the tasks, so there is less need for workflow. However, it can still be useful for automating repetitive tasks.
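
To make the first two components concrete, here is a minimal sketch of how a consumer content manager might keep version history and metadata outside the media files themselves. Every name in it (MediaItem, MediaVersion, add_version, lineage, search) is hypothetical and invented for illustration; it is not taken from any existing product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MediaVersion:
    """One stored version of a media item (hypothetical structure)."""
    version_id: int
    path: str                  # where this version of the file lives
    parent_id: Optional[int]   # the version it was derived from, if any
    note: str = ""             # e.g. "cropped" or "level raised"


@dataclass
class MediaItem:
    """A piece of media, its version history and its external metadata."""
    title: str
    media_type: str            # "image", "audio", "video", ...
    metadata: Dict[str, str] = field(default_factory=dict)
    versions: List[MediaVersion] = field(default_factory=list)

    def add_version(self, path: str, parent_id: Optional[int] = None,
                    note: str = "") -> MediaVersion:
        """Record a new version; ids are assigned in order, starting at 1."""
        version = MediaVersion(len(self.versions) + 1, path, parent_id, note)
        self.versions.append(version)
        return version

    def lineage(self, version_id: int) -> List[int]:
        """Return version ids from the given version back to the original."""
        by_id = {v.version_id: v for v in self.versions}
        chain: List[int] = []
        current = by_id.get(version_id)
        while current is not None:
            chain.append(current.version_id)
            if current.parent_id is None:
                break
            current = by_id.get(current.parent_id)
        return chain


def search(items: List[MediaItem], key: str, value: str) -> List[MediaItem]:
    """Find items of any media type whose metadata has key == value."""
    return [item for item in items if item.metadata.get(key) == value]


photo = MediaItem("Beach", "image", {"year": "2005"})
original = photo.add_version("/photos/beach.jpg")
cropped = photo.add_version("/photos/beach_crop.jpg",
                            parent_id=original.version_id, note="cropped")
print(photo.lineage(cropped.version_id))
```

Because the metadata lives in one place rather than inside each file, the same search works across pictures, music and documents, and the file location is never hidden from the user.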

The most important feature of Content Management is that it handles all media types. I do not want a separate application for each media type, each with its own user interface and set of annoying little quirks. Also, it is useful to relate media, such as keeping documents of lyrics with songs, keeping track of the music used in video and slide shows, connecting pictures, text and perhaps music in web posts.

There is a lot more to say about metadata, version management and the structure of a universal content manager. We will have to get to them at another time. In the meantime, are you ready for Consumer Content Management? I know that I am.

Has Sony-BMG been Caught Stealing Software?

The recording companies lecture us on how we should not steal Intellectual Property. Through their industry association, they prosecute our children. Now one of them is being accused of stealing software content. If they want us to respect their property rights, they need to remember that they are not above the law themselves.

Wednesday, November 16, 2005

Predictive Analytics Redux

Anyone who did not come to the SDForum Business Intelligence SIG meeting on Tuesday missed a great talk. Eric Zankman led us through a case study of a customer analytics engagement with a large telecom company that addressed a specific business issue and eventually provided a measurable multi-million dollar return.

During the engagement he built a data mart to collect the data, built a set of predictive models, segmented the customers, developed a set of strategies for handling the business problem, ran a series of tests with different strategies to understand the costs and benefits of each strategy, and finally set things up so that the customer could continue to monitor and refine their strategy.

Eric described each stage with enough clarity that I feel that I could reproduce his work if I were asked to. Of course I would not do it as well as Eric did, but that is not the point. I have read books, been to classes and heard presentations on customer analytics, and I have never seen such a simple yet comprehensive walk-through of what to do and how to do it.

So if you did not come, you missed a great meeting. Sign up with our mailing list/group so that you do not miss another meeting.

Saturday, November 12, 2005

Master Data Management

There is a new term in enterprise software: Master Data Management. While the term is new, the concept is not quite so new. I view it as the end point of a change that has been going on for some time.

The concept of an Enterprise Data Warehouse emerged during the 1990s. One of the compelling reasons for creating a Data Warehouse is to create a single version of the truth. An enterprise has many different IT systems, each with its own database. The problem is that each database has its own version of the truth.

So for example, consider a typical enterprise that has many customers. It will have marketing databases and several sales databases, each associated with the IT system for that sales channel. There are also service systems with their associated databases. The same customer may appear in several of these databases that support the business operations, and in each one the customer information is different. There are variations in the name, different addresses and phone numbers, or in some cases no contact information at all. While this example is about customers, all other enterprise information also exists in many databases, and anywhere that information is multiplied, there are bound to be contradictions, failures and gaps.

Rather than try to resolve this mess, the idea of the Enterprise Data Warehouse is to create a new database that contains the clean and corrected version of the data from all the operational databases. So the Enterprise Data Warehouse is the one data repository for a single version of the truth about the enterprise.
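
As a rough illustration of what a single version of the truth means for the customer example, the sketch below folds records from several operational sources into one consolidated record per customer. The matching rule (a normalized name) and the survivorship rule (first non-empty value wins) are simplifying assumptions made for this example, and the names normalize_name and consolidate are hypothetical; real customer matching is far more involved.

```python
from typing import Dict, Iterable, List


def normalize_name(name: str) -> str:
    """Crude match key: lower-case, drop punctuation, collapse spaces."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def consolidate(records: Iterable[Dict[str, str]],
                fields: List[str]) -> Dict[str, Dict[str, str]]:
    """Merge records that share a match key into one record per customer.

    Assumption: records arrive in order of source priority, so the first
    non-empty value seen for a field is the one that is kept.
    """
    golden: Dict[str, Dict[str, str]] = {}
    for record in records:
        key = normalize_name(record["name"])
        merged = golden.setdefault(key, {"name": record["name"]})
        for field_name in fields:
            value = record.get(field_name, "").strip()
            if value and field_name not in merged:
                merged[field_name] = value
    return golden


# The same customer as seen by three operational systems.
records = [
    {"name": "J. Smith", "phone": ""},                              # marketing
    {"name": "j smith", "phone": "555-0100", "address": "1 Main St"},  # sales
    {"name": "J Smith", "phone": "555-0199"},                       # service
]
golden = consolidate(records, ["phone", "address"])
print(golden)
```

The three source records disagree and each has gaps, but the consolidated record has one name, one phone number and one address.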

In the early days, the Data Warehouse was conceived as a place for business analysts to do their work. Business analysts like to ask difficult questions and another reason for creating a separate database is to give them a place where they can run complicated queries to answer these questions without disturbing the operational systems.

In practice, building the Enterprise Data Warehouse is difficult and expensive, and the result is an immensely valuable resource. Far too valuable to be left to the business analysts. So from the earliest days of data warehouses, they were connected to operational systems and used to help run the business.

The problem with this is timing. The original data warehouse was conceived as something that you could load at night with the day's data and then the business analysts would query it during the day. However, if you are running a call center off the information in a data warehouse because it has the best version of the data, the data warehouse has to contain the latest information.

For example, when a customer calls to complain that the product they bought that day does not work, the data warehouse needs to have been updated with the purchase so that the call center can verify it. This has led to the notion of a real time data warehouse. I explained this when I gave a presentation on "Real Time Business Intelligence" to the SDForum BI SIG in 2003.

So what does all this have to do with Master Data Management? Well, I view Master Data Management as the legitimate name for this trend that I have described as real-time data warehousing. After all, real-time data warehousing is not a very good name for the concept. It is a description of how we are getting there, while Master Data Management is a statement about the end point.

Of course the real story is not as clean as I have described. Enterprise Information Integration (EII) is about building a virtual data warehouse without bringing all the data together in a single physical database. Master Data Management does not necessarily imply that all the data is integrated into a single database; all it means is that there is a single version of the truth about all the enterprise data and that this is available to all enterprise applications.

It is also worth noting that as with any new term there is still a lot of discussion and debate with different points of view about what Master Data Management really means.

Tuesday, November 08, 2005

Give it Away

The other day, I found myself writing "I agree with you that we should give away content to build an audience and membership rather than thinking about making people pay for it. In the internet age, the really successful business models like Yahoo and Google have been about giving away information to build an audience and then figuring out how to capitalize on it."

It's true. A recent article in Wired discussed how bands are using MySpace to build a following by providing content, including giving some of their music away. It is exactly the kind of thing that Lawrence Lessig talked about when he spoke of the Comedy of the Commons last year to the SDForum Distinguished Speaker Series.

For the last 500 years, ever since the invention of printing, publishers, and then record companies and movie studios, have been controlling the market for content by controlling the means of production. In the Information Age, reproduction and distribution of content is free, and the old content publishing empires that grew fat and happy by exercising their control are going down screaming. Oh to live in such interesting times.

Sunday, November 06, 2005

Good API Design

Joshua Bloch gave a great talk on "How To Design a Good API and Why it Matters" at the SDForum Java SIG last Tuesday. This kind of system design topic can be difficult because it can come across as apple pie unless it comes from someone who really knows what they are talking about. Joshua knows what he is talking about.

I found myself nodding along with Joshua's dictums. Here is one that particularly stuck in my mind. "An API is like a little language. Names should be self explanatory and be consistent, using the right part of speech. Aim for symmetry." I have written in the past about the connection between language design and APIs. The only problem is that while good API design is hard, good language design is even harder.

This dictum struck home because I recently made a mistake in a little API where I used the wrong word for a function. Unfortunately the problem was compounded because that word should have been used for another function that was missing, one that would have completed the symmetry of the API. Once we renamed the errant function, we could create the missing function and complete the API. Sorting out this little problem took a surprising amount of time.
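
To show what symmetry looks like in practice, here is a small, entirely hypothetical API (it is not the one I was working on). Each operation has an obvious opposite with a matching name, commands are verbs, and the query reads as a question.

```python
class Subscriptions:
    """A tiny, made-up API used only to illustrate naming symmetry."""

    def __init__(self) -> None:
        self._topics = set()

    def subscribe(self, topic: str) -> None:
        """Verb for a command; its opposite is unsubscribe."""
        self._topics.add(topic)

    def unsubscribe(self, topic: str) -> None:
        """Mirror image of subscribe, so users can guess that it exists."""
        self._topics.discard(topic)

    def is_subscribed(self, topic: str) -> bool:
        """A query reads as a question and changes nothing."""
        return topic in self._topics


subs = Subscriptions()
subs.subscribe("news")
print(subs.is_subscribed("news"))
subs.unsubscribe("news")
print(subs.is_subscribed("news"))
```

If the first method had been called add while its opposite was called unsubscribe, every user would have to look up the second name instead of guessing it.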

My touchstone on this topic has been Butler Lampson's paper on "Hints for Computer System Design". A good part of that paper is concerned with interface design, and I have used Lampson's advice on interfaces with great success in the past. In fact, looking at this paper and my notes, Joshua echoes a surprising amount of Lampson's advice.

The Hints paper is more than 20 years old and since it was written we have moved from building computer systems to living in a networked world where APIs are our point of contact with the universe. Currently Joshua's words on API design are only available as a presentation. I hope that he writes it up as a paper or book so that everyone can have the full story.
