Thursday, October 30, 2008

Knol - now with Custom Search

Cedric Dupont, Product Manager
Knol is a project aimed at helping people share their knowledge. Knols (units of knowledge) are authoritative articles written by people about a specific subject, ranging from tooth pain to solar energy to buttermilk pancakes. With all of these knols to browse, our readers have been begging us for a more powerful search tool. Well, today we have good news - we've added the power of Custom Search to Knol.

Custom Search gives us all of the speed and relevance of Google's search technology, but requires none of the hard labor that went into making Google search what it is today. With Custom Search, Knol visitors will have a fast search experience that features all of the bells and whistles Google searchers have come to love (including our advanced spell checker). When you search within Knol, the search results look a little different from Web Search results (see this search for knee surgery, for instance). We were able to maintain that distinct feel for Knol search by taking advantage of the Custom Search XML results format (via Google Site Search) and creating the search result display that we wanted.

We want to hear from you if you have feedback on Knol search. Please leave your comment on our help page.

Friday, October 3, 2008

Synonyms for your Custom Search Engine

Victor Wang and Bartlomiej Niechwiej, Software Engineers

With our launch of Google Site Search, we added a new feature to the Custom Search platform: custom synonyms. Here's how this feature can be used to improve the quality of your Custom Search Engine (CSE).

How can custom synonyms help?
Synonyms help by finding documents that contain relevant related terms and ranking them higher. They can alleviate the mismatch that often occurs between the queries that users type and the actual words, phrases, and concepts used in the documents being searched. Custom Search, of course, automatically takes advantage of synonyms used in Google Web search. In addition, Custom Search goes one step further: we allow you to explicitly define custom synonyms that are specific to your web site, community, or topic of interest.

To illustrate situations where CSE custom synonyms can help, we created two CSEs that both search content from the Palo Alto Medical Foundation (PAMF). The first CSE does not have custom synonyms enabled, while the second CSE has a few custom synonyms enabled:

Terminology: The queries people use sometimes don't match up with the words and phrases used in the content being searched. During allergy season, for example, many people look for information on "hayfever", but the results without synonyms aren't that great since the web pages we're searching across don't necessarily use this specific term. However, if the technical phrase "allergic rhinitis" is added as a synonym for "hayfever", the results with the synonym are far better.

Acronyms: Acronyms often stand for different terms in different contexts. This is especially true within organizations, where acronyms are used frequently. In such cases, it may be possible to improve retrieval via the use of synonyms. For example, in this CSE "PAMF" refers to "Palo Alto Medical Foundation", and adding this synonym improves search results: searching for "PAMF" without synonyms gives only a few relevant results, while the CSE with the synonym returns more relevant results.

Community: Within specific target user communities, words or phrases have different usage and significance. When a patient is looking for "hearing doctor", adding the synonym "audiologist" to the CSE provides much better results while the results in the CSE without synonyms are not optimal. Conversely, if a doctor is searching for "somnambulism", she finds no results at all in the CSE without synonyms, but much better results via addition of the synonym "sleepwalking" to the CSE. Synonyms can therefore be used to improve the experience of specific classes of users of a web site.
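The three cases above share one idea: the query is widened with the terms you define before it is matched against your pages. The sketch below is a toy illustration of that idea only. It is not how Custom Search matches or ranks documents, and the function names (`expand_query`, `matches`) and the sample document text are invented for this example.

```python
# Toy illustration only: this is NOT how Custom Search matches or ranks results.
# The dictionary mirrors the PAMF examples discussed above.
SYNONYMS = {
    "hayfever": ["allergic rhinitis"],
    "hearing doctor": ["audiologist"],
    "somnambulism": ["sleepwalking"],
    "pamf": ["palo alto medical foundation"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the lowercased query followed by any synonyms defined for it."""
    key = query.strip().lower()
    return [key] + list(synonyms.get(key, []))


def matches(document, query, synonyms=SYNONYMS):
    """True if the document text contains the query or one of its synonyms."""
    text = document.lower()
    return any(term in text for term in expand_query(query, synonyms))


# Invented sample text standing in for a page that uses the technical term.
doc = "Our clinic treats allergic rhinitis and other seasonal allergies."

print(matches(doc, "hayfever", synonyms={}))  # False: the page never says "hayfever"
print(matches(doc, "hayfever"))               # True: matched via "allergic rhinitis"
```

The lookup runs in one direction only: a query for "allergic rhinitis" would not be expanded to "hayfever" unless that entry were added separately.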

How can I add synonyms to my Custom Search Engine?
The CSE administrator can control the set of synonyms used by uploading a synonym dictionary that is specific to the domain and website. The synonym dictionary can include alternate words or phrases for common search queries. The following steps show how to add the synonyms for the Palo Alto Medical Foundation CSE.

  1. Download the existing CSE context file through "Control panel" -> "Advanced" -> "Download context"

  2. Add custom synonyms to your search engine. The synonym dictionary is uploaded as part of the context XML file. Here are the synonyms we added for the above examples:

    <customsearchengine>
      <title>...</title>
      <description>...</description>
      <context>
        <backgroundlabels>...</backgroundlabels>

        <synonyms>
          <synonymentry word="hearing doctor">
            <synonym>audiologist</synonym>
          </synonymentry>
          <synonymentry word="hayfever">
            <synonym>allergic rhinitis</synonym>
          </synonymentry>
          <synonymentry word="somnambulism">
            <synonym>sleepwalking</synonym>
          </synonymentry>
          <synonymentry word="pamf">
            <synonym>Palo Alto Medical Foundation</synonym>
          </synonymentry>
        </synonyms>

      </context>
    </customsearchengine>

  3. Upload the context file through "Control panel" -> "Advanced" -> "Upload context"
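If you maintain many synonyms, editing the context file by hand between the download and upload steps can become tedious. Here is a minimal sketch of how step 2 might be scripted with Python's standard `xml.etree.ElementTree` module. It assumes the element names appear exactly as in the listing in step 2; if your downloaded file uses different capitalization, adjust the names to match. The helper name `add_synonyms` and the sample context string are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_synonyms(context_xml, synonyms):
    """Return the context XML text with a <synonyms> block built from a dict.

    `synonyms` maps each word to a list of alternates. Element names are
    assumed to match the listing in step 2.
    """
    root = ET.fromstring(context_xml)
    context = root.find("context")
    if context is None:
        raise ValueError("no <context> element found")
    # Design choice for this sketch: replace any existing <synonyms> block
    # instead of merging with it, so the dict is the single source of truth.
    for old in context.findall("synonyms"):
        context.remove(old)
    block = ET.SubElement(context, "synonyms")
    for word, alternates in synonyms.items():
        entry = ET.SubElement(block, "synonymentry", {"word": word})
        for alternate in alternates:
            ET.SubElement(entry, "synonym").text = alternate
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for a downloaded context file.
sample = (
    "<customsearchengine><title>Demo</title>"
    "<context><backgroundlabels/></context></customsearchengine>"
)
updated = add_synonyms(sample, {"hayfever": ["allergic rhinitis"]})
print(updated)
```

Because the helper overwrites an existing synonym block, pass it the complete dictionary each time rather than only the new entries.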


A few notes:

  • CSE synonyms are unidirectional, not bidirectional. Thus, a context file with
    <synonymentry word="migraine"><synonym>headache</synonym></synonymentry>
    defines "headache" to be a synonym for "migraine". However, if you also want "migraine" to be considered as a synonym for "headache", you need to add a separate SynonymEntry to the XML, as follows:

    <synonymentry word="migraine"><synonym>headache</synonym></synonymentry>
    <synonymentry word="headache"><synonym>migraine</synonym></synonymentry>

  • In the current version, the synonym dictionary can only be uploaded/downloaded as a part of the context XML file. We hope to make this much easier in the future.

  • We allow up to 500 individual synonyms for a given CSE.

  • Each word can have no more than 10 synonyms. If there are multiple synonyms, the query will be expanded to include all synonyms uniformly.
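The notes above lend themselves to a quick check before uploading. The following sketch mirrors every entry so that synonyms work in both directions, then tests a dictionary against the two limits. It assumes the 500 limit counts every individual synonym value across all entries, which is one reading of the note; the function names are hypothetical.

```python
MAX_TOTAL_SYNONYMS = 500  # from the notes above; assumed to count every <synonym> value
MAX_PER_WORD = 10         # from the notes above


def make_bidirectional(synonyms):
    """Return a new dict in which every word -> synonym pair also has its mirror."""
    result = {word: list(alternates) for word, alternates in synonyms.items()}
    for word, alternates in synonyms.items():
        for alternate in alternates:
            mirrored = result.setdefault(alternate, [])
            if word not in mirrored:
                mirrored.append(word)
    return result


def check_limits(synonyms):
    """Return a list of human-readable problems; an empty list means no limit is exceeded."""
    problems = []
    for word, alternates in synonyms.items():
        if len(alternates) > MAX_PER_WORD:
            problems.append(
                f"{word!r} has {len(alternates)} synonyms (max {MAX_PER_WORD})"
            )
    total = sum(len(alternates) for alternates in synonyms.values())
    if total > MAX_TOTAL_SYNONYMS:
        problems.append(f"{total} synonyms in total (max {MAX_TOTAL_SYNONYMS})")
    return problems


pairs = make_bidirectional({"migraine": ["headache"]})
print(pairs)                # {'migraine': ['headache'], 'headache': ['migraine']}
print(check_limits(pairs))  # []
```

Note that mirroring doubles the number of synonym values, so a dictionary that fits within the limits in one direction may exceed them once it is made bidirectional.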


If you are using custom synonyms in your CSE, we hope to get feedback from you about what improvements we can make.