First Batch from RustySearch

Rusty just posted the initial results from RustySearch – an attempt to measure search engine relevancy by allowing users to rate listings.

Interestingly, so far it looks like all the search engines are returning equally relevant results.

Sidenote: Orion (Dr. Garcia), author of the Fractals paper, commented on Rusty’s site about problems with the project.

The problem with this type of test is that you are measuring users’ perception of relevancy, not what a search engine scores as relevancy. This mismatch has nothing to do with sample size. See Nacho’s post at this site.

Each SE has its own scoring functions, so it is like comparing apples with oranges. I wish you luck with the experiment.

Orion’s comment seems totally backwards to me: if a search engine’s idea of relevancy is not the same as users’ idea of relevancy, then what good is the search engine? As a tool for information retrieval, a search engine needs to return data that users determine to be useful; ultimately, a search engine’s goal should be to return the results that users perceive to be relevant. Maybe I’m not getting the point of Orion’s comment.

One thought on “First Batch from RustySearch”

  1. “If a search engine’s idea of relevancy is not the same as users’ idea of relevancy then what good is the search engine? As a tool for information retrieval, a search engine needs to return data that users determine to be useful;”

    1. This is an ideal scenario. Many SEOs still don’t understand that SEs return results that have been sorted by similarity scores. These scores are supposed to be a measure of relevancy, which is often computed in terms of term vector weights (local, global, and normalized) and link weights. The process by which this is carried out has nothing to do with what humans want or perceive as relevant.

    2. Term vector models do not account for polysemy and synonymy; they assume terms are independent of each other. This affects both precision and recall. LSI tries to solve this by considering topic concepts rather than mere terms. Unfortunately, it is too computationally expensive to be used by large, commercial repositories. So SEOs, as users, are stuck with what is available. Item-to-item personalization machines seem to be a positive approach, but the technique is still evolving.

    3. Human perception of relevancy, based on the information a user sees, is mostly a non-linear process. The processing of information as “seen” by a search engine is based on the linear discovery of tags present in source code. The positioning of information affects users’ perception of relevancy, but it has no effect on the linear way SEs find tags in documents. This renders many metrics that could measure relevancy from the user side completely irrelevant, and it explains why linearization strategies are such an important part of any SEO strategy. All this is well explained at http://www.miislita.com/fractals/keyword-density-optimization.html

    Hope this helps

    Orion
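The term-vector scoring and the synonymy problem Orion describes in points 1 and 2 can be illustrated with a toy sketch. (The documents, the query, and the particular tf-idf weighting below are my own illustrative assumptions, not code from either post.)

```python
# Toy illustration of term-vector scoring: documents and queries become
# weight vectors, and the engine's "relevancy" is the cosine of the angle
# between them -- no human perception involved.
import math
from collections import Counter

docs = {
    "d1": "buy a used car cheap car deals",
    "d2": "used automobile for sale",       # "automobile" = synonym of "car"
    "d3": "fresh fruit and vegetables",
}

def tf_idf_vectors(docs):
    """Local weight (term frequency) times global weight (inverse document frequency)."""
    tokenized = {d: text.split() for d, text in docs.items()}
    vocab = {t for toks in tokenized.values() for t in toks}
    n = len(docs)
    # document frequency: in how many documents each term appears
    df = {t: sum(t in toks for toks in tokenized.values()) for t in vocab}
    vectors = {}
    for d, toks in tokenized.items():
        tf = Counter(toks)
        vectors[d] = {t: tf[t] * math.log(n / df[t]) for t in tf}
    return vectors

def cosine(u, v):
    """Normalized dot product -- the similarity score used to sort results."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vectors = tf_idf_vectors(docs)
query = {"car": 1.0, "used": 1.0}
scores = {d: cosine(query, v) for d, v in vectors.items()}
```

Here d2 ("used automobile for sale") gets credit only for the shared term "used"; the word "automobile" contributes nothing to the score, even though a human searching for a used car would likely rate that page relevant. That gap between the engine's similarity score and the user's perception is exactly the mismatch being argued about, and it is what LSI-style approaches try to close.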
