Thursday, February 21, 2008

Specifying patterns for your Custom Search Engine



Creating a basic Custom Search Engine (CSE) is very easy. You enter a list of sites, select a few basic preferences, and you are done, right? But in fact there's more to Custom Search -- consider it a very powerful way of building your own search engine on top of Google search. You can exclude sites, add labels for drill-down, and even change the ranking of results for your search engine. In this blog post, we look at the basic element of Custom Search: URL patterns.

URL patterns specify the part of the web you want to search or exclude from your search. Custom Search is based on approximation algorithms that use these patterns to give you your customized results.

Consider the "I Love Veggies" search engine that we created. Here's how the "I Love Veggies" search engine made use of patterns effectively:

  • Be very specific. Use the longest possible pattern for specifying a site. For example, in the "I Love Veggies" search engine, we wanted to search all of www.goveg.com, so we added "www.goveg.com/*" as a pattern. But we wanted to search only the vegetarian part of the "allrecipes.com" site. So instead of adding all of "allrecipes.com/*" we added the more specific "allrecipes.com/Recipes/Everyday-Cooking/Vegetarian/*".
  • Specify multiple pages in a site with a "*" at the end of the pattern. If you specify just "www.goveg.com", Custom Search will search just the single page http://www.goveg.com. You need to remember this only if you are writing your XML file of annotations directly. If you are using the Control Panel, it automatically adds the "/*" at the end for you, unless you indicate otherwise.
  • Sometimes, you might have a few hosts on a domain with the same path that you want to search. In our example, we wanted to search "mideastfood.about.com/od/vegetarianrecipes/*" and "indianfood.about.com/od/vegetarianrecipes/*". In such a case it is better to specify these patterns individually instead of using a very general "*.about.com/od/vegetarianrecipes/*", because the more specific the patterns, the better the approximation.
  • You can only use the * in the hostname at the beginning of the pattern and it can only represent a full token. For example, "*.about.com/*" is a valid pattern and so is "*.food.about.com/*". However, "*ood.about.com/*" is not valid, nor is "food.*.about.com/*".
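
The wildcard rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this post, not how Custom Search itself checks patterns. The function names `is_valid_host_wildcard` and `add_trailing_wildcard` are hypothetical, and the sketch assumes patterns are written without a scheme, as in the examples above.

```python
def is_valid_host_wildcard(pattern: str) -> bool:
    """Illustrative check of the hostname rule: a "*" in the hostname
    must be the entire first token (hypothetical helper)."""
    # The hostname is everything before the first "/".
    host = pattern.split("/", 1)[0]
    tokens = host.split(".")
    for index, token in enumerate(tokens):
        # Reject a "*" that is not the whole first token.
        if "*" in token and not (index == 0 and token == "*"):
            return False
    return True


def add_trailing_wildcard(site: str) -> str:
    """Mimic the described Control Panel default of appending "/*"
    so the pattern covers every page on the site (hypothetical helper)."""
    if site.endswith("/*"):
        return site
    if site.endswith("/"):
        return site + "*"
    return site + "/*"


if __name__ == "__main__":
    for candidate in ["*.about.com/*", "*.food.about.com/*",
                      "*ood.about.com/*", "food.*.about.com/*"]:
        print(candidate, is_valid_host_wildcard(candidate))
    print(add_trailing_wildcard("www.goveg.com"))
```

Run against the four examples from the last bullet, the check accepts the first two patterns and rejects the last two. It looks only at the hostname, so it says nothing about how a pattern's path is treated.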

Keep reading this blog for more tips and tricks as we develop our "I Love Veggies" search engine. If you have specific questions or feature requests you can visit our Help Center or ask a question on the Discussion group.

Friday, February 8, 2008

Promoting useful information and web pages



Webmasters often want the ability to promote specific information or web pages when users search for specific things via their Custom Search Engine (CSE).
Here are some examples where this is useful:

* You have a travel site, and want to draw attention to a spring promotion above the search results for keywords "hawaii", "maui", "kona". You want a nice image of the beach and the three most popular packages listed right on top.
* You've created a soccer search engine, and you want soccer fans to quickly get to the results of the Cup of Nations tournament in Ghana for the queries [africa], [ghana], and [cup of nations].
* Your company just launched a brand new product, and you'd like people to know about it via a headline link to your product blog post when they search for older products in the same category.

We've always had the ability to do this in Custom Search (via a feature called Subscribed Links) but now, we've tried to make it a little easier for you.

When you go to the control panel for your CSE, you'll see a new option in the "Preferences" section towards the bottom of the "Basics" tab. Selecting this option will enable your Subscribed Link to be triggered in your CSE for the keywords you've specified. If you don't already have a Subscribed Link defined, you can create one in minutes. Just specify the trigger keywords, the summary text you'd like displayed, and the URL of the target web page that opens when the link is clicked.


When visitors search on your CSE using the special trigger keywords, your special link will show up right on top of the results, where they won't miss it.

You can manage your Subscribed Link via the Subscribed Link console. Check the developer documentation for advanced options. Remember that subscribed links display differently on your CSE (above the results) versus on Google.com (inline).