Configuring the weights of a result ranking algorithm - you can't turn a VW Beetle into a Porsche Boxster
One of the requirements I often come across in RFTs which talk about search concerns the ability of the customer to "configure search engine weightings and other settings".
This comes up frequently enough that I know people believe it to be important. However, I have always felt this to be blatantly misguided.
Some years ago, I worked in a research group which, among other things, organised the Web track at TREC. Highly trained information retrieval researchers spend years designing, testing, and evaluating search engine ranking algorithms. Conducting tests to see whether the changes you make to an algorithm actually improve its results is difficult, and relies on standardised test collections, including queries and the known relevant answers to these queries.
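To give a feel for what such a test involves, here is a minimal sketch in Python. It assumes a tiny, made-up test collection (the query IDs, document IDs, and both rankings are hypothetical) and uses mean reciprocal rank, which is one common measure among several and chosen here only for illustration.

```python
def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant document, or 0.0 if none appears."""
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(run, judgments):
    """Average the reciprocal rank over every query in the test collection."""
    scores = [reciprocal_rank(run.get(query, []), relevant)
              for query, relevant in judgments.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical test collection: each query maps to its known relevant answers.
judgments = {"q1": {"d3"}, "q2": {"d7", "d9"}}

# Hypothetical rankings returned before and after changing a weight.
baseline = {"q1": ["d1", "d3", "d5"], "q2": ["d2", "d4", "d7"]}
tweaked = {"q1": ["d3", "d1", "d5"], "q2": ["d2", "d4", "d6"]}

baseline_score = mean_reciprocal_rank(baseline, judgments)
tweaked_score = mean_reciprocal_rank(tweaked, judgments)
```

In this made-up example the tweak lifts the average while pushing the only relevant answer for q2 out of the results altogether. With two queries you can spot that by eye; with a realistic collection you need the judgments, the measure, and a fair amount of care to know whether a change helped or hurt.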
I suspect the requirement arises from fundamental dissatisfaction with the performance (in particular, the result ranking) of poor quality search engines.
People appear to believe that by being able to "fiddle with the knobs" they will somehow be able to solve the problem. The sad reality is that this is not the case - a poor quality ranking algorithm doesn't improve by such fiddling; it requires replacement by a proven higher quality ranking algorithm, in a high quality search engine. It's analogous to thinking that if only I could have fiddled with the engine settings on my old Volkswagen Beetle, I could have made it perform like a brand new Porsche Boxster.
Sometimes of course, the problem also arises because of poor quality data. If you don't have good content, with the right information in it, no search engine is going to be able to find it.
Another problem is that vanishingly few people have the expertise to meddle with ranking algorithms: doing so usually requires a deep understanding of the algorithms being used, and of the effects of modifying any such settings. A not insignificant percentage of these people now work at Google Labs, which assiduously went around vacuuming up many good information retrieval researchers from 1999 onwards. Others work at Microsoft Research, and for Yahoo, among others. Universities and research institutes retain still more.
That said, I suspect the real business requirement is that there are particular searches (e.g. "Panoptic search engine") for which there is a known answer (e.g. http://www.panopticsearch.com), and customers would like to be able to instruct the search engine to return at least this answer as a highly visible result, regardless of what else is returned. If that's the case, then that should be what people ask for. After all, we don't usually buy a car so that we may practise becoming motor mechanics.
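A requirement of that kind is easy to state precisely, and it can be met without touching the ranking algorithm at all. The sketch below is only an illustration under my own assumptions: the function name, the `pinned` table, and the example.com URLs are hypothetical and do not describe how any particular search engine implements this.

```python
def pin_known_answers(query, ranked_urls, pinned):
    """Put the known answer for a query first, leaving the rest of the ranking alone.

    `pinned` maps a normalised query string to the URL that must be shown first.
    """
    # Normalise case and whitespace so trivial variations of the query still match.
    key = " ".join(query.lower().split())
    best = pinned.get(key)
    if best is None:
        # No known answer for this query: return the engine's ranking unchanged.
        return list(ranked_urls)
    # Show the known answer first and drop any duplicate of it further down.
    return [best] + [url for url in ranked_urls if url != best]


# Hypothetical table of known answers, using the example from the text.
pinned = {"panoptic search engine": "http://www.panopticsearch.com"}

results = pin_known_answers(
    "Panoptic  Search Engine",
    ["http://example.com/a", "http://www.panopticsearch.com", "http://example.com/b"],
    pinned,
)
```

The known answer ends up at the top whatever the ranking algorithm made of it, and every other query passes through untouched. That is a far smaller and safer thing to ask for than access to the weights.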