Saturday, October 03, 2009

Search User Experience Innovations

Innovations in the Search User Experience was the topic at the September meeting of the SDForum Search SIG. The distinguished panel from Microsoft, Google and Yahoo was chaired by Safa Rashtchy, a longtime analyst and commentator on the Search scene.

First, Sean Suchter, General Manager of Microsoft's Search Technology Center Silicon Valley, told us about the latest innovations in Bing. Sean started out with some numbers, showing that the Internet is still growing at a fast pace and that search is growing faster than the Internet in general. They measure their users' experience and see that about a quarter of searches are failures, resulting in an immediate click back. On the other hand, getting on for half of the search queries are further refined, meaning that the user is engaged in a search session. Microsoft will recognize these sessions and use them to improve the user experience.

To simplify the user experience, when they are confident about what a user is searching for, Bing will show one subject on the first page with a number of related links. Sean showed us two examples. The first was the search term "target", where they assume the person is looking for the Target chain of stores. They show a complete set of links to Target and shopping-related pages, with a single link to get other search results that are not related to Target stores. The second example was "ups", where they only show links related to United Parcel Service and sending parcels on the first page.

Next up was Johanna Wright, Director of Web Search Product Management at Google. Johanna started off by telling us that 20% of searches have never been seen before, and that Google is dedicated to serving the long tail of web searches as well as more popular ones. To show us how far the search experience has come in the last few years, she applied the search term "how to tie a tie" to an index that they had saved from 2001, and compared it with what you get today. In 2001 you got a miscellaneous collection of links to sites like "The Indus Entrepreneur" with none about tying ties. Now you get relevant links along with image and video links, a tremendous improvement.

Johanna talked about how speed is essential to a good user experience. A couple of years ago, they added related links to popular search terms like "target" to reduce the number of steps a user needs to make to get to the page they want. Google continues to work on helping users with query formulation. She showed us the options panel that you access by clicking the "search options" link on a search results page and how it can be used to refine a search.

Finally, Dr. Larry Cornett, vice president of the Yahoo! Search Consumer Products division, spoke. He started by reassuring us that Yahoo! is still in the search business and that if and when the planned combination of Yahoo! Search with Microsoft goes through, they will still provide their own front end and control their users' experience. Yahoo!'s goal has always been to personalize and structure the web. We saw the new layout for Yahoo! search results in the typical Yahoo! busy style.

After the demos, the floor was thrown open to audience questions. Someone asked about natural language support for queries. Sean told the story, as he has been in the search business for a long time. In the early days of search, natural language queries were considered an important research area. Then the issue went away as providing relevant answers to queries became the dominant problem. Now that giving good answers is under control, natural language queries are making a comeback. Recently Microsoft bought Powerset to help them in this area.

There were several questions about the sizes of market segments, and growth rates, particularly in the mobile space, to which the panel would not give answers. The audience did manage to uncover the fact that while adult searches are more prevalent than mobile searches, mobile searches have been growing fast since the introduction of the iPhone and other smartphones.

Another set of questions related to real time search. All three search engines have been working on improving the speed with which they update their indexes so that they are current. There is still an open question about whether the major search engines will embrace real time search or make it a separate option.

Monday, September 07, 2009

Ikea Culture

We live in an Ikea world. I like to find excuses to visit the nearest Ikea in Palo Alto to lunch in their cafeteria, eating either a smoked salmon plate or Swedish meatballs with lingonberry jam. The cafeteria has a great view over the South Bay and the East Bay hills. However, the reason for this post is to note that Ikea has been popping up in the conversation all over the world.

In China, the Ikea stores have become a great success, for the people, if not for Ikea. This LA Times story reports that Chinese people are flocking to the local Ikea store, to test the bedding, hang out and eat in the cafeteria, maybe even buy some plates, just not to buy anything big.

Meanwhile in LA itself, several young aspiring producers have noticed that an Ikea store is just like a movie studio, with lots of little well-lit sets showing off bedrooms, living rooms and kitchens. Just the place to make a short episode on the cheap. The actors mike up with wireless mikes outside, rush in and take a few shots and then rush out before any employees notice. Here is Ikea Heights, a soap opera, and here is a send-up of The Real World.

Finally, as reported in the New York Times, there has been outrage over the decision by Ikea to change the font in their latest catalog from Futura to Verdana. Futura is a well respected modern sans-serif font that suits the Ikea style. Verdana is the generic Microsoft version of a sans-serif font that comes on every computer with Windows. I am not sure why this is so important. Are these people really complaining that Ikea has lowered its standards to encompass the lowest common denominator font?

Wednesday, September 02, 2009

Project Voldemort

There were three interesting trends exposed in the talk about Project Voldemort at the August meeting of the SDForum SAM SIG. Firstly, Voldemort is another tuple store as opposed to a relational database, the trend that interested me the most. The second trend is implementation of systems described in academic papers. The final trend is to use Open Source as a support mechanism for a large software project. Let's break down each of these trends one at a time. By the way, the presentation was given by Bhupesh Bansal and Jay Kreps of LinkedIn.

Relational databases have been the reliable store for serious computing for the last 20 years, but recently tuple stores and tuple processing like Map-Reduce have appeared and are starting to challenge the relational database hegemony. In the simplest terms, a tuple store is just a very degenerate relational database. Relations are based on the n-tuple, that is, each row in a table contains a number of data items, whereas a plain tuple is two data items, a key and a value.
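To make the distinction concrete, here is a minimal sketch in Python. The `Member` schema, the sample rows and the `TupleStore` class are all made up for illustration and have nothing to do with Voldemort's actual interface.

```python
from collections import namedtuple

# A relational row is an n-tuple: several named data items per row.
# (Hypothetical schema, for illustration only.)
Member = namedtuple("Member", ["member_id", "name", "company"])
members_table = [
    Member(1, "Ada", "Acme"),
    Member(2, "Bob", "Initech"),
]

def select_by_company(table, company):
    """Relational-style query: filter rows on any column."""
    return [row for row in table if row.company == company]

class TupleStore:
    """A plain tuple store: just (key, value) pairs, looked up by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

# The same data in the tuple store: the value is opaque to the store.
store = TupleStore()
for row in members_table:
    store.put(row.member_id, {"name": row.name, "company": row.company})
```

The relational table can be queried on any column, while the tuple store can only answer "what is the value for this key?", which is what makes it simple to build and to distribute.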

As Jay Kreps explained, to get a web service application to scale, you need to distribute it over a cluster of computer systems, and to make this work with a relational database, you need to denormalize your database. The end point of database denormalization is the plain flat tuple store. Jay Kreps also complained that relational databases are not very good at handling data structures like the graphs of connections found in social networking applications, and semi-structured data like text.
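As a rough illustration of why denormalized data distributes easily, the sketch below keeps everything about a member under one key and picks the node that holds it from a hash of that key. The node names, the key format and the simple modulo placement are my assumptions, not a description of how LinkedIn does it.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members

def node_for(key):
    """Pick a node from a stable hash of the key (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# Denormalized record: everything needed to show a profile lives under one
# key, so a read touches exactly one node and needs no join.
profile = {
    "name": "Ada",
    "positions": [{"company": "Acme", "title": "Engineer"}],
    "connections": ["member:2", "member:7"],
}

# Each node is modelled as a plain dictionary of key -> value.
cluster = {node: {} for node in NODES}
cluster[node_for("member:1")]["member:1"] = profile

def get(key):
    """Route the read to the one node responsible for the key."""
    return cluster[node_for(key)].get(key)
```

A normalized design would spread the same profile over several tables, and with those tables on different machines every page view would turn into a distributed join.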

In my opinion, tuple stores are no better or worse than relational databases at dealing with graphs between tuples. Tuple stores are more flexible for handling semi-structured data, but again this depends on the application (for more, read my comparison of Map-Reduce with relational databases). Tuple stores are certainly simpler, easier to use, more stable under load and cheaper than a relational database. I will write more about tuple stores at another time.

The second notable trend is for groups to pick up on systems described in academic papers and just implement them. Voldemort is an implementation of the Amazon Dynamo system as described in their paper at the ACM Symposium on Operating Systems Principles. We have seen several other examples of this recently. Google released a set of papers about their data processing systems, including Map-Reduce, which has led to a number of projects that emulate their functionality. I have written about Hadoop and Hypertable, two examples, and there are others. These are systems for doing very large scale analytic data processing, while Amazon Dynamo and Voldemort are systems for supporting rapid access to large volumes of data such as is needed to support large and complex web sites.
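One of the techniques the Dynamo paper describes is partitioning keys across nodes with consistent hashing. The sketch below is a generic, minimal version of that idea, not code from Dynamo or Voldemort. The `HashRing` class, the number of virtual nodes and the node names are assumptions for illustration.

```python
import bisect
import hashlib

def _hash(value):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring with a few virtual nodes per server."""

    def __init__(self, nodes, vnodes=8):
        # Each node is placed at several positions on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key):
        # The owner is the first ring position after the key's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

keys = [f"member:{i}" for i in range(200)]
small = HashRing(["a", "b", "c"])
big = HashRing(["a", "b", "c", "d"])  # the same cluster after adding node "d"
moved = [k for k in keys if small.node_for(k) != big.node_for(k)]
```

The point of the ring is that when node "d" joins, the only keys that change owner are the ones that move to "d". With plain modulo placement, most keys would be reshuffled between all the nodes.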

The final trend is Open Source as a support model. Voldemort was developed by LinkedIn, a company whose main business is providing a social and business network on the web, not writing and supporting a lot of complicated software. LinkedIn decided that they needed a tuple store like Amazon Dynamo and, as they could not buy it, they built it. However, they decided they wanted help with support, so they released the software as an Open Source project. Now, Voldemort is being used by several organizations and at least half the people working on the code are from outside LinkedIn. When Sandeep Giri started the OpenI project, I asked him why he was releasing it as an Open Source project and he gave the same reason.

Sunday, August 30, 2009

Augmented Reality

Earlier this month there was an explosion of posts and comments on the TechCrunch blog about Apple rejecting the Google Voice application for the iPhone. Michael Arrington wrote a post about how Apple's reasons for the rejection were misleading and untrue, and it got over 400 responses. At the time I did not understand the reasons for the intensity of the comments and responses, particularly in a quiet news month like August. Last week I went to the SDForum Virtual Worlds SIG to hear a talk about Augmented Reality, and started to appreciate what is going on.

The Augmented Reality presentation was given by Kari Pulli and Radek Grzeszczuk, researchers at the Nokia Research Center in Palo Alto. What they mean by Augmented Reality is that you point the camera in your smart phone at something and the phone displays more information about what you are looking at. For example, you point the phone at a building and it tells you which building you are looking at with perhaps a link to a map or information about the building. Alternatively, you could point the phone at a book cover and the phone will identify the book and give you links to reviews and a web site where you can buy the book.

Someone in the audience asked the interesting question "Where do you get your data?" There are many different places to get data. For the demos, the book cover data had been scraped from Amazon.com. But when it comes to data, the elephant in the room is Google, the company that promises to organize the world's information. To organize the world's information, they first have to collect it and then they have to have the computer systems and technology to organize it. Google has been busy doing that for many years now.

The history of mobile appears to be going like this. First came the cellular network companies. They proved themselves incapable of providing anything more than voice and data services, so they are doomed to continue providing nothing but these basic services. As time goes on these services become less differentiated and eventually mere commodities.

Next come the smart phone providers like Blackberry and Apple. They have opened up the cell phone business model to provide services that their users really want. But they still rely on others for data to run these services. When the data providers get a little too close to core functionality they back off their openness, as Apple has with the Google Voice application.

The final step is for the data providers to take over, as Google is doing with the Google Android cell phone operating system. This is a play to reduce the devices to mere commodities and put the interesting business where it really belongs, with software and data.

Thursday, August 20, 2009

Media Convergence

The digital age has brought an extraordinary convergence of media that I have not seen remarked on anywhere. In the old world, each type of media was manufactured and delivered in its own different way. Movies were printed onto film and shown in movie theaters. Newspapers were printed on newspaper printing presses and delivered through a content delivery network that ends up with the product being thrown onto driveways in the early morning hours. Books were printed on book printing presses, bound and delivered through wholesalers to bookstores around the country. Records were printed in record presses, delivered to music wholesalers and then to record stores. Radio and TV were produced in studios, sometimes recorded and sent around the country to be broadcast on local transmitters.

That has all changed. In the new digital world, each media type has the same underlying form. Spoken words, music, written words, pictures, moving pictures are all buckets of digital bits. While we can still get each type of media in its old form we can also get them all delivered to our computer, cell phone or media player through the internet or the cellular phone network.

Even the devices we use to consume media are converged. Most of them can handle everything, so let's take an extreme example, the Amazon Kindle book reader. While the primary purpose of the Kindle is to be a book reader, it also has text to speech and handles audio files so that you can listen to music while reading. It will also display black and white pictures in the 3 common formats. So when you come to list the types of media that a Kindle can handle, it is quicker to say what it cannot do, that is color and moving pictures, than to list all the things that it can do.

This change to digital media is just upon us, so it is going to take some time for all the consequences to shake out. At the moment, there is great wailing and gnashing of teeth from the newspaper industry. Newspapers rely on advertising which always does badly in a recession, but this time they also have to deal with the air being sucked out of their lungs by internet advertising and free listings. For some time, movie producers have been worried that they may be MP3ed like the music industry. More recently, book publishers have become aware that their business model is targeted and they are starting to behave like deer in the headlights as well.

These are just media industries and their travails are just the price of doing business in a time of technological change. The interesting question is how it will affect culture. If all types of media are fundamentally equivalent, will our preferences, being unfettered, change? One change is that there is a move towards shorter forms. For example, online journalism is certainly shorter and more punchy than the printed equivalent. This is just one example of one direction that change could go in. I am sure that there will be more consequential changes, so let me know what you think.

Thursday, August 13, 2009

Too Cheap to Meter

Last month Malcolm Gladwell wrote a snarky review of Chris Anderson's book Free: The Future of a Radical Price for the New Yorker. You will remember that Anderson's last book The Long Tail produced a wide range of reactions and "Free" will be no different. I did not like the Gladwell review. He picks up on a lot of little things while missing the big picture. On the other hand the book is somewhat carelessly written so that it is easy to find little things to criticize.

An example is the discussion of the phrase "too cheap to meter". In the 1950s, Lewis Strauss, then head of the Atomic Energy Commission, predicted that atomic energy would make electricity so cheap to produce that there would be no need for electricity meters. Unfortunately, too many people see that phrase and take it to mean that electricity would be free, which is not what Strauss was claiming, as I will explain.

The book "Free" has a chapter called "Too Cheap to Matter" that starts with Strauss's claim and goes on to Moore's Law and other laws of shrinking prices. Anderson seems to imply that electricity could be free, and in a long and rambling footnote still does not get to the point. Gladwell in his review of the book picks up on the implication and castigates Anderson for thinking that electricity could ever be free, using his own words against him.

To understand Strauss, you need to look at a utility bill, where you will see that the charges come in two parts, a fixed component for providing the service and a variable component which is your actual metered use of the utility. Strauss was claiming that for electricity there would be no need for the variable part, all that would be needed was a fixed part to cover the cost of fixed generator plant, transmission and billing. Sorry Gladwell and Anderson, Lewis Strauss was not trying to say that electricity would ever be so abundant that it would be free.
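The arithmetic of the two kinds of bill can be sketched as follows. All the prices are made-up numbers in cents, chosen only to show the shape of the calculation.

```python
def metered_bill(fixed_cents, rate_cents_per_kwh, kwh_used):
    """Two-part bill: fixed service charge plus metered usage."""
    return fixed_cents + rate_cents_per_kwh * kwh_used

def unmetered_bill(fixed_cents):
    """Flat bill: the fixed charge alone, whatever the usage."""
    return fixed_cents

# Hypothetical tariffs. The unmetered fixed charge is higher because it
# has to cover plant, transmission and billing on its own.
light_user = metered_bill(1000, 12, 200)   # 1000 + 12 * 200
heavy_user = metered_bill(1000, 12, 500)   # 1000 + 12 * 500
flat_rate = unmetered_bill(2500)
```

On the metered tariff the bill grows with use, while on the unmetered tariff it stays the same. In neither case is it zero, which is the point that gets lost when "too cheap to meter" is read as "free".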

It is not unusual for utilities to be unmetered. Here are three examples of unmetered utilities from my personal experience. Firstly, I pay a fixed price for the broadband pipe of my internet service. Secondly, where I grew up, domestic water is plentiful enough that it is not metered; householders pay a fixed price for a 1/2 inch water main connection or somewhat more for a 3/4 inch water main connection. Thirdly, you could say that my garbage is too difficult to meter, so I just pay a fixed price for the weekly emptying of a 32 gallon garbage cart.

Apart from some writing that seems to imply more than is actually there, I found Chris Anderson's new book to be forward looking and full of familiar arguments. Well recommended. I will write more on the subject.

Sunday, July 26, 2009

Twittering Foodies

Given these difficult economic times, the latest trend in San Francisco dining is the unrestaurant, according to San Francisco Magazine. That is a posh way of describing eating from a food cart or truck. For example: Spenser-On-The-Go serves Caper Braised Skate Cheeks or Frogs Legs and Curry from a converted taco truck; Boccalone serves exquisite pulled pork sandwiches from a bicycle; the Creme Brulee Cart and Magic Curry Kart are just street carts.

As the vendors come and go and many of them are not properly licensed, the only way to find out where they are going to be serving is to follow them on Twitter. At last, a purpose for Twitter, if you are a committed foodie. As I have not quite gotten to the Escargot Puffs level yet, I have not yet joined Twitter, although I can see a glimmer of hope. On the other hand, David Letterman is still firmly in the camp that Twitter is a colossal waste of time, as this hilarious segment with Kevin Spacey shows.

LinkedIn's Data

LinkedIn has an extraordinary data resource. They have more than 40 million members and a complete job history of each member, in some cases going back 30 or 40 years. DJ Patil, Chief Scientist and Senior Director of Product Analytics at LinkedIn showed us some examples of their data when he spoke to a packed meeting of the SDForum Business Intelligence SIG on "The Analytics Behind LinkedIn" last week. Paul O'Rorke has written an excellent account of the meeting and here I am just adding my impression to that record.

DJ believes in the growing importance of the "data analyst" as a profession. He backed up that belief with some hard data when he showed us the growing importance of the job title over the last 35 years. Up to the mid-90s the appearance of that job title as a percentage of all job titles was flat, but since then it has been growing at a steady pace. As an aside, DJ told us that they use the Amazon Mechanical Turk service to do data cleansing of things like job titles. This is the first time I have heard of the service being used for this purpose.

We were shown other interesting examples of LinkedIn analytics including the change in the top five job titles over the Dot Com bust and an excellent display of the volume of cross country links between LinkedIn members. The big problem with this data is that we cannot have access to it because it is private to LinkedIn and they will keep it private to protect the privacy of their members.

Saturday, July 11, 2009

Graphs That Suck

Many years ago in the early days of the web, I learned about web site design by reading "Web Pages That Suck: Learn Good Design by Looking at Bad Design". It is a delightfully easy beginner level crawl through web site design, filled with examples ranging from excellent to awful with a capital 'A'. I would recommend the book today except that the examples that make up the bulk of the book are way out of date.

For Business Intelligence the equivalent would be a book called something like "Graphs that Suck", and Stephen Few's Perceptual Edge blog is a good place to find examples of this genre. Recently they posted a spectacularly bad example, a pie chart put out by Business Objects to promote a user conference. I will not repeat the critique, however I will say that if this is an example of what Business Objects thinks their software should be used for, I would be leery of using it!

Friday, July 03, 2009

Musician Uses Twitter to Her Advantage, Shock Horror Probe

Technology is turning the music business upside down, like any other media business. Some people embrace the change and some people decry it. When I read a post like this one about using Twitter to make money, I always read the comments. Whether the post is at the Berklee School of Music or TechCrunch, the range of responses is wide and consistent. Some commenters accept the new world and cheer it on, while others complain bitterly. Typical complaints range from: "I cannot do that because I do not have any fans" through "people should respect copyright and give me the money I am due" to "the record company put you there so you should give it all back to them".

The most ridiculous response is the complaint that a musician who spends time developing their fan base is wasting time that could be better spent on creative activities. The point of the Amanda Palmer post is that if you are properly organized, it does not take a lot of time or effort to keep in contact with your fans, particularly when using new instant communication tools like Twitter.

Technology changes. Music is no longer distributed as sheets of paper or by stamping it on 5, 7 or 12 inch pieces of plastic. The business model must change with the times.
The moving finger [of technology change] writes; and having writ,
Moves on: nor all your piety nor wit
Shall lure it back to cancel half a line,
Nor all your tears wash out a word of it.
HT to Roger for the Berklee post.