Monday, April 30, 2007

Duh Typing

The meeting of the SDForum SAM SIG last week was interesting, but the discussion afterwards over a beer was just as interesting. I will write about the SIG meeting another time; here I want to capture an idea from the after-session.

There is a lot going on in the world of programming languages: dynamic versus static languages, strong versus weak typing. Proponents of dynamic languages like Python and Perl claim that they are superior because you do not need to continually specify the type of an object, particularly when the type of the object is obvious.

Over a beer someone proposed that this continual restating of the type of an object should be called Duh Typing (a play on Duck Typing). I ran into an example of Duh Typing today when I found myself writing the following Java statement:
ManagedTasks[] managedTasks =
    (ManagedTasks[]) tasks.toArray(new ManagedTasks[tasks.size()]);
The dynamic language guys would laugh at the fact that I have to specify the type three times in a single statement. Well, it is not quite as easy as all that, as I will discuss in future posts. In the meantime, here is a neat name for the concept.
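For anyone who wants to try the statement above, here is a minimal runnable sketch. It assumes that tasks was an untyped (raw) List, which is what makes the cast necessary, and it uses a stand-in ManagedTasks class with a made-up name field, since the real class is not shown here. The second method shows how much of the repetition disappears when the list is declared as List<ManagedTasks>; the class and method names are mine, for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class DuhTyping {
    // Hypothetical stand-in for the real ManagedTasks class.
    static class ManagedTasks {
        final String name;
        ManagedTasks(String name) { this.name = name; }
    }

    // Raw list: toArray is seen as returning Object[], so the cast is required
    // and the type appears three times in one statement.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static ManagedTasks[] fromRawList(List tasks) {
        ManagedTasks[] managedTasks =
            (ManagedTasks[]) tasks.toArray(new ManagedTasks[tasks.size()]);
        return managedTasks;
    }

    // Typed list: the generic toArray(T[]) returns ManagedTasks[] directly,
    // so the cast goes away.
    static ManagedTasks[] fromTypedList(List<ManagedTasks> tasks) {
        return tasks.toArray(new ManagedTasks[0]);
    }

    public static void main(String[] args) {
        List<ManagedTasks> tasks = new ArrayList<>();
        tasks.add(new ManagedTasks("backup"));
        tasks.add(new ManagedTasks("reindex"));

        System.out.println(fromRawList(tasks).length);
        System.out.println(fromTypedList(tasks).length);
    }
}
```

Even in the typed version the type still shows up twice, once in the declaration and once in the array constructor, so generics reduce the Duh Typing rather than remove it.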

Friday, April 27, 2007

The Times, They Are a-Changing

"Apple’s Steve Jobs, perhaps the most important person in the music industry today, ..."
Well, it is a quote from Michael Arrington, who is after all a Valley booster; however, it does show how far the music business has come on its strange journey. Gone are the simple days of finding and ripping off some gullible kids; now we are at "the medium is the message".

Sunday, April 22, 2007

The 60 Hour Data Warehouse Implementation

There was a lot of interesting stuff in the presentation by Stephen Bay to the SDForum Business Intelligence SIG on "Large Scale Detection of Irregularities in Accounting Data". However, the one thing that really struck me was their 60 hour data warehouse implementation.

Stephen and his colleagues at the PricewaterhouseCoopers Center for Advanced Research have built a system called Sherlock for detecting fraud in accounting data by applying several analytic techniques. Sherlock works by looking at the general ledger of the business. A general ledger is typically several gigabytes of data and may be fed by sub-ledgers that can run into the hundreds of gigabytes. Before Sherlock can do its analytics, they have to get the accounting data into a standard form in a data warehouse. Sherlock is used during an accounting audit which typically lasts a month, so there is great pressure to get the data warehouse implemented in as short a time as possible.

So how do they do it? Firstly, the schema of the data warehouse is fixed. The PricewaterhouseCoopers team have developed a standard data warehouse design for a general ledger that is applicable to all non-financial businesses. The data warehouse design is open source and is available from IPHIX. Secondly, the general ledger data usually comes from an SAP, Oracle or PeopleSoft ERP system, so some of the connections can be prebuilt. The problem with ERP systems is that they are heavily customized for each user, so the Sherlock team have implemented a GUI tool for building a mapping between ERP content and the data warehouse. The tool is designed for business people and accountants to use, so that the data warehouse can be built by people with domain knowledge but no technical knowledge.
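To make the mapping idea concrete, here is a small sketch of what such a tool might produce under the covers: a table that renames the columns of a customized ERP extract to the columns of the fixed warehouse schema. This is my own illustration, not Sherlock's design. Every column name below is invented, and a real mapping would certainly need more than renaming (type conversion, currency handling and so on).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LedgerMapping {
    // Hypothetical mapping: source ERP column name -> warehouse column name.
    // All names are made up for illustration.
    static Map<String, String> exampleMapping() {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("SRC_ACCT_NO", "account_id");
        mapping.put("SRC_POST_DT", "posting_date");
        mapping.put("SRC_AMT", "amount");
        return mapping;
    }

    // Renames the columns of one source row to the warehouse names.
    // Source columns with no entry in the mapping are dropped.
    static Map<String, String> toWarehouseRow(Map<String, String> sourceRow,
                                              Map<String, String> mapping) {
        Map<String, String> warehouseRow = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            if (sourceRow.containsKey(entry.getKey())) {
                warehouseRow.put(entry.getValue(), sourceRow.get(entry.getKey()));
            }
        }
        return warehouseRow;
    }

    public static void main(String[] args) {
        Map<String, String> sourceRow = new LinkedHashMap<>();
        sourceRow.put("SRC_ACCT_NO", "1000");
        sourceRow.put("SRC_AMT", "12.50");
        sourceRow.put("CUSTOM_FLAG", "Y"); // site-specific column, not mapped

        System.out.println(toWarehouseRow(sourceRow, exampleMapping()));
    }
}
```

The point of keeping the mapping as data rather than code is that someone with domain knowledge can edit it through a GUI, while the target schema never changes.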

With all this they claim that they can build a data warehouse in 60 hours for a new implementation and 20 hours for a repeat implementation. Contrast 60 hours with a typical data warehouse project that takes many months. A large data warehouse project can easily take a year or two to implement. So a 60 hour implementation is an astonishing achievement.

Sunday, April 08, 2007

Music and the Limits to Consumption

Time and money are limits on the consumption of media. If I am playing a video game, or watching a video, I will not be listening to music. If I listen to less music, I will want to buy less music. Anyway, my budget is not unlimited, so if I spend money on a movie, TV show or video game, I will not spend that money on buying music. I got to thinking about this after reading a piece in the New York Times where a pair of record store owners complain that it is the actions of the recording industry and the RIAA that have led to the recent decline in the sales of music.

There was a time when music was the medium of choice to purchase, but that was when the main alternatives were books or magazines. Now we live in a much richer environment where there are many more media choices to entertain us. Music has lost its monopoly on our attention, yet the music industry still behaves as if it has that monopoly. Rather than market music to us, the RIAA goes out and sues the most ardent music collectors.

There are plenty of things that the recording industry could do to improve its position. For example, it should be marketing music to us as the more entertaining and lasting of the media alternatives. After all, we listen to the same music over and over again, while we usually watch a movie or TV show only once. I have never once heard a compelling argument for buying music being made by the music industry, whereas I have heard interesting arguments being made on forums like SlashDot.

Thursday, April 05, 2007

Cut and Paste

It is the little details in a User Interface that make the difference between something that is straightforward and something that is frustrating. For example, let's take the simplest editing function, cut and paste. If text appears on my screen, I want to be able to select the text and paste it into a document, message or whatever. However, there is a lot of text that appears on my screen that I cannot select. There are title bars, menu items and, worst of all, all the interesting text in dialog boxes.

Imagine this scenario: a modal dialog box appears on my screen with an important error message. What do I do? Well, I cannot select and copy the text, and with a truly modal dialog box I cannot do anything else until I have dismissed the box. So if I want to preserve the message in the box, I have to find a piece of PAPER to write down what it says, and later transcribe my written notes into the email sent to support. I have been on the receiving end of software error reports, and so I know that the first question is "What EXACTLY does the message say?". So this is why you need to keep a piece of paper and a pencil handy when you use a computer!

I was reminded of this problem today by another frustrating problem with cut and paste. A colleague IMed me to ask for an email address of a third party. As I had to send the third party an email as well, I brought up Outlook (ugh), typed the first three letters of the person's name and auto-completed. Feeling happy that everything could be done in a few keystrokes, I selected the email address in the To: part of the message, copied it and pasted it into the IM window. Imagine my surprise when the name of the person appeared in the IM window but not their email address. I had copied the text "name <email>" and when I pasted, all that appeared was "name". After three goes, I gave up and had to type the email address that appeared in one window into another. Who invented this feature, and what were they thinking?

On reflection, I suppose that I could have gone through the options and disabled all the ones that call themselves intelligent or smart. I did not and I will not. As I have said before in this blog, software should do the right thing out of the box. Anyway, life is too short to customize all the software tools that I use, especially as I am expected to upgrade to new and more complicated versions every few years.