search

why your search engine (probably) isn’t rubbish

Now all search engines struggle, to varying degrees, with the knotty mess that is natural language. But they don’t generally get called rubbish for not succeeding with the meaty search challenges.

Rubbish search engines are the ones that can’t seem to answer the most basic requests in a sensible manner. These are the ones that get mocked as “random link generators”, the gibbering wrecks of their breed.

Go to  Homebase and search for “rabbit hutch” (we need another one as two of our girls are about to produce heaps of bunnies at the same time).

The first result is “Small plastic pet carrier”. There are a number of other carriers and cages. Then there’s a “Beech Finish Small Corner Desk with Hutch”. Finally there’s a Pentland Rabbit Hutch at result #8. This is a rubbish set of results. I asked for “rabbit hutch” and they’ve got a rabbit hutch to sell me but they’re showing me pet carriers and beech finish corner desks.

This is a rubbish set of results. But it doesn’t mean the search engine is rubbish.

Somebody made a rubbish decision. They’ve set it up shonky.

So before you reach for the million pound enterprise search project, try having a quick look under the bonnet with a spanner.

Is it AND or OR?

This is reasonably easy to test, if you can’t ask someone who knows.

Pick a word that will be rare on your site and another word that doesn’t appear with the rare one, e.g. “Topaz form” for my intranet. A rare word is one that should only appear one or two times in the entire dataset, so you can check that the other word doesn’t appear with it. You may need to be a bit imaginative but unique things like product codes can be helpful here. If the query returns no results you’ve probably got an AND search. More than a couple of results (and ones that don’t mention Topaz) and you’ve probably got OR.

(This can get messed up if there is query expansion going on, but hopefully the rare word isn’t one that the query expansion rules will act on.)

AND is more likely to be problematic as a setting. You’ll get lots of “no results”. You’ll need your users to be super precise with their terminology and spell every word right.  If they are looking for “holiday form” and the form is called “annual leave form” they’ll get no results.

OR will generate lots of results. This is ok if the sort order is sensible. Very few people care that Google returned 2,009,990 results for their query. They just care that the first result is spot-on.

So most of the time you probably want an OR set-up.

(preferably combined with support for phrase searching so the users can choose to put their searches in nice speech marks to run an AND search if they want to and know how to).
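To make the difference concrete, here is a minimal sketch in Python of the two matching modes. The three intranet pages and the `search` function are invented for illustration; no real product works exactly like this.

```python
# Toy illustration of AND vs OR matching. The pages are invented.
PAGES = {
    "annual-leave": "annual leave form",
    "expenses": "expenses claim form",
    "topaz": "topaz system guide",
}


def search(query, mode="OR"):
    """Return ids of pages matching the query words under AND or OR."""
    terms = query.lower().split()
    hits = []
    for page_id, text in PAGES.items():
        words = set(text.lower().split())
        matched = [term in words for term in terms]
        if (mode == "AND" and all(matched)) or (mode == "OR" and any(matched)):
            hits.append(page_id)
    return hits


print(search("holiday form", mode="AND"))  # no page contains both words
print(search("holiday form", mode="OR"))   # every page that mentions "form"
print(search("topaz form", mode="AND"))    # rare word plus an unrelated word
```

The last query is the rare-word test from above: under AND it comes back empty, under OR it returns everything that mentions either word.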

Is there crazy stemming/query expansion going on?

Query expansion is search systems trying to be clever, often getting it wrong and not telling you what they’ve done so you can unpick it. Basically the search system is taking the words you gave it and giving you results for those words, plus some others that it thinks are relevant or related.

Typical types of expansion are stemming (expand a search for fish to include fishes and fishing), misspellings and synonyms (expand a search for cockerel to include rooster).

This is probably what is happening if you are getting results that don’t seem to include the words you searched for anywhere on the page (although metadata is another option).

Now this stuff can be really, really helpful. If it is any good.

Have you got smart sophisticated query expansion like Google?  Or does it do silly (from a day-to-day not a Latin perspective) stemming like equating animation with animals? If it is the silly version then definitely switch it off (or tweak it if you can).

Even if you’ve got smart expansion options available, it’s generally best practice to either give the user the option of running the expanded (or alternative) query, or at the very least of undoing it if you’ve got it wrong. They won’t always spot the options (Google puts lots of effort into coming up with the right way of doing this) but it’s bad search engine etiquette to force your query on a user.
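As a rough sketch of that etiquette, the Python below expands a query from a small synonym table but keeps the original terms separate, so the results page can offer a way back. The `SYNONYMS` table, the function names and the message wording are all my own assumptions.

```python
# Hypothetical synonym table; a real system would load its own thesaurus.
SYNONYMS = {"cockerel": ["rooster"], "fish": ["fishes", "fishing"]}


def expand_query(query):
    """Return the original terms plus any expansions, kept separately."""
    original = query.lower().split()
    added = []
    for term in original:
        for extra in SYNONYMS.get(term, []):
            if extra not in original and extra not in added:
                added.append(extra)
    return {"original": original, "added": added}


def describe(expansion):
    """Build the message a results page could show above the results."""
    if not expansion["added"]:
        return "Showing results for: " + " ".join(expansion["original"])
    return (
        "Showing results for: "
        + " ".join(expansion["original"] + expansion["added"])
        + ". Search instead for: "
        + " ".join(expansion["original"])
    )


print(describe(expand_query("cockerel feed")))
```

The point is only that the system records what it added, which is what makes the "search instead for" link possible.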

Is the sort order sensible?

That Homebase example. The main problem here is sorting by price low-high. That’d be fine (actually very considerate of Homebase) if I’d navigated to a category full of rabbit hutches. But I didn’t. I searched for rabbit hutches and got a mixed bag of results that included plenty of things that a small child could tell you aren’t rabbit hutches.

The solution? Sort by relevancy.

I’ve seen quite a lot of bad search set-ups recently where the sort order was set to alphabetical. Why? Unless, as Martin said when I bemoaned this on Twitter, your main use case is “to enable people to find stuff about aardvarks”.

News sites sometimes go with most recent as the sort order. Kinda makes sense but you need to be sure the top results are still relevant not just recent.

Interestingly sort order doesn’t matter so much if you’ve gone for AND searches and you haven’t got any query expansion going on. If you’re pretty sure that everything in the result set is relevant, then you’ve got more freedom over sort order.  If not,  stick with relevancy.

(I don’t need to tell you that you want relevancy sorted high-low, do I?)
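Here is a small Python sketch of why the sort order matters for a mixed bag of results. The product titles come from the example above but the prices are made up, and the `score` function is a deliberately crude stand-in for real relevancy ranking.

```python
# Titles from the example above; the prices are invented.
PRODUCTS = [
    {"title": "Small plastic pet carrier", "price": 9.99},
    {"title": "Beech Finish Small Corner Desk with Hutch", "price": 49.99},
    {"title": "Pentland Rabbit Hutch", "price": 79.99},
]


def score(title, query):
    """Crude relevancy: how many query words appear in the title."""
    words = set(title.lower().split())
    return sum(1 for term in query.lower().split() if term in words)


def by_price(products):
    """Cheapest first, ignoring the query entirely."""
    return sorted(products, key=lambda p: p["price"])


def by_relevancy(products, query):
    """Highest score first; price only breaks ties."""
    return sorted(products, key=lambda p: (-score(p["title"], query), p["price"]))


query = "rabbit hutch"
print([p["title"] for p in by_price(PRODUCTS)])
print([p["title"] for p in by_relevancy(PRODUCTS, query)])
```

Same three results, same engine, but only the second ordering puts the actual rabbit hutch at the top.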

So people stop giving me grief over navigation.  Let’s talk about that rubbish search engine you’ve got.  I could probably fix that for you.

search


Search Solutions 2009

Last week I went to the Search Solutions event, held by BCS in their lovely office in Southampton Street. There were maybe 50 people, 6 or 7 women and seemingly even fewer laptops (which rather made it stand out from the more web-focused events I usually attend – because of the lack of laptops not the male-female ratio).

I didn’t make masses of notes but I did capture a few points and reminders:

Vivian Lin Dufour from Yahoo talked about Search Pad, an attempt to make search more “stateful”.

Richard Russell from Google explained how the auctions for Google Ads work. Always interesting to hear more about the money side of things.

Dave Mountain, a geographer (another example of Nominative Determinism?) talked about geographical aspects of searching. He explained that if the task is “finding the nearest cafe”, then the ‘near’ isn’t a simple statement. There are types of near: as the crow flies, in travel time, in the direction I’m already going. After all you may not be interested in a cafe that’s already 5 miles behind you on the motorway. He had some good slides covering this, so hopefully they’ll be made available.

Tony Russell-Rose discussed Endeca’s impending pattern library. Should be interesting – public version to be available in the new year.

David White of Web Optimiser talked amongst other things about the importance of cross-media optimisation. He asked why don’t more companies, especially b2b ones, have phone numbers in title/description of search results? He also touched on the growth of twitter as a substantial source of referrals (in response to a question about whether Bing was increasing referrals and thus changing optimisation tactics).

Richard Boulton, as well as discussing his efforts with open source search, introduced us to the marvellous concept of dev/fort/.

“Imagine a place of no distractions, no IM, no Twitter — in fact, no internet. Within, a group of a dozen or more developers, designers, thinkers and doers. And a lot of a food.

Now imagine that place is a fort.”

Well marvellous to me but I wanted to get married in a Napoleonic fort so perhaps I’m not typical. He also mentioned searchevent.org, a day dedicated to open source search systems, which will hopefully happen again sometime.

Andrew Maisey talked about a school of thought that search will increasingly become less important on the site. Dynamic user journeys will encourage more browsing.

(Food was pretty good as usual for the venue.  I’m hoping that we’re going back to BCS for our team away-day later in the year and then I can have more of the strawberry tarts.)

events
search


SharePoint search: more insights

Surprisingly, this white paper on building multilingual solutions in SharePoint provides a good overview of how the search works, regardless of whether you are interested in the multilingual aspect.

White paper: Plan for building multilingual solutions.

Read page 15, titled “overview of the language features in search”, for a description of content crawling and search query extraction. Then pages 16-18 provide a good overview of individual features and what they are doing.

Word breakers A word breaker is a component used by the query and index engines to break compound words and phrases into individual words or tokens. If there is no word breaker for a specific language, the neutral word breaker is used, in which case word breaking occurs where there are white spaces between the words and phrases. At indexing time, if there is any locale information associated with the document (for example, a Word document contains locale information for each text chunk), the index engine will try to use the word breaker for that locale. If the document does not contain any locale information, the user locale of the computer the indexer is installed on is used instead. At query time, the locale (HTTP_ACCEPT_LANGUAGE) of the browser from which the query was sent is used to perform word breaking on the query. Additional information about the language availability of the word breaker component is available in Appendix B: Search Language Considerations.

Stemming Stemming is a feature of the word breaker component used only by the query engine to determine where the word boundaries are in the stream of characters in the query. A stemmer extracts the root form of a given word. For example, “running,” “ran,” and “runner” are all variants of the verb “to run.” In some languages, a stemmer expands the root form of a word to alternate forms. Stemming is turned off by default. Stemmers are available only for languages that have morphological expansion; this means that, for languages where stemmers are not available, turning on this feature in the Search Result Page (CoreResult Web Part) will not have any effect. Additional information about language availability for the Stemmer feature is available in Appendix B: Search Language Considerations.

Noise words dictionary Noise words are words that do not add value to a query, such as “and,” “the,” and “a.” The indexing engine filters them to save index space and to increase performance. Noise word files are customizable, language-specific text files. These files are a simple list of words, one per line. If a noise word file is changed, you must perform a full update of the index to incorporate the changes. Additional information about the noise words dictionary and how to customize it is available at www.microsoft.com.

Custom dictionary The custom dictionary file contains values that the search server must include at index and query times. Custom dictionary lists are customizable, language-specific text files. These files are used by Search in both the index and query processes to identify exceptions to the noise word dictionaries. A word such as “AT&T,” for example, will never be indexed by default because the word breaker breaks it into single noise words. To avoid this, the user can add “AT&T” to the custom dictionary file; as a result, this word will be treated as an exception by the word breaker and will be indexed and queried. These files contain a simple list of words, one per line. If the custom dictionary file is changed, you must perform a full update of the index to incorporate the changes. By default, no custom dictionary file is installed during Office SharePoint Server 2007 Setup. Additional information about the custom dictionary file and how to customize it is available at www.microsoft.com.
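To picture how word breaking, noise words and the custom dictionary interact, here is a toy Python sketch of the “AT&T” case. It is not SharePoint’s implementation: the noise word list, the splitting rule and the function names are all assumptions made for illustration.

```python
import re

# Assumed noise words and custom dictionary, purely for illustration.
NOISE_WORDS = {"and", "the", "a", "at", "t"}
CUSTOM_DICTIONARY = {"at&t"}


def break_words(text, custom=CUSTOM_DICTIONARY):
    """Split text into lowercase tokens, keeping custom dictionary entries whole."""
    tokens = []
    for chunk in text.lower().split():
        chunk = chunk.strip(".,;:!?\"'()")
        if chunk in custom:
            tokens.append(chunk)
        else:
            # Break on anything that isn't a letter or digit.
            tokens.extend(t for t in re.split(r"[^a-z0-9]+", chunk) if t)
    return tokens


def index_terms(text, custom=CUSTOM_DICTIONARY):
    """Drop noise words, except tokens protected by the custom dictionary."""
    return [
        token
        for token in break_words(text, custom)
        if token in custom or token not in NOISE_WORDS
    ]


print(index_terms("The AT&T contract and a renewal"))
print(index_terms("The AT&T contract and a renewal", custom=set()))
```

With the custom dictionary in place “at&t” survives as a single term; with an empty one it is broken into “at” and “t”, both of which are thrown away as noise.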

Thesaurus There is a configurable thesaurus file for each language that Search supports. Using the thesaurus, you can specify synonyms for words and also automatically replace words in a query with other words that you specify. The thesaurus used will always be in the language of the query, not necessarily the server’s user locale. If a language-specific thesaurus is not available, a neutral thesaurus (tseneu.xml) is used. Additional information about the thesaurus file and how to customize it is available at www.microsoft.com.

Language Auto Detection The Language Auto Detection (LAD) feature generates a best guess about the language of a text chunk based on the Unicode range and other language patterns. Basically, it’s used for relevance calculation by the index engine and in queries sent from the Advanced Search Web Part, where the user is able to specify constraints on the language of the documents returned by a query.

Did You Mean? The Did You Mean? feature is used by the query engine to catch possible spelling errors and to provide suggestions for queries. The Did You Mean? feature builds suggestions by using three components:

· Query log Information tracked in the query log includes the query terms used, when the search results were returned for search queries, and the pages that were viewed from search results. This search usage data helps you understand how people are using search and what information they are seeking. You can use this data to help determine how to improve the search experience for users.

· Dictionary lexicon A dictionary of most-used lexicons provided at installation time.

· Custom lexicon A collection of the most frequently occurring words in the corpus, built at query time by the query engine from indexed information.

The Did You Mean? suggestions are available only for English, French, German, and Spanish.
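For a feel of how a suggestion can be built from a lexicon, here is a minimal Python sketch using the standard library’s `difflib`. It is only an analogy for the feature described above: the `LEXICON` contents, the `did_you_mean` function and the cutoff value are my assumptions, and SharePoint’s own matching is not documented here.

```python
import difflib

# Hypothetical lexicon; in the description above it would come from the
# query log, the installed dictionary and the indexed content.
LEXICON = ["rabbit", "hutch", "holiday", "annual", "leave", "form"]


def did_you_mean(query, lexicon=LEXICON, cutoff=0.75):
    """Suggest a corrected query, or None if no word needed changing."""
    corrected = []
    changed = False
    for word in query.lower().split():
        if word in lexicon:
            corrected.append(word)
            continue
        close = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
        if close:
            corrected.append(close[0])
            changed = True
        else:
            # Unknown word with no near match: leave it alone.
            corrected.append(word)
    return " ".join(corrected) if changed else None


print(did_you_mean("rabit huch"))
```

Returning `None` when nothing changed matters: a suggestion should only appear when the system actually has something different to offer.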

Definition Extraction The Definition Extraction feature finds definitions for candidate terms and identifies acronyms and their expansions by examining the grammatical structure of sentences that have been indexed (for example, NASA, radar, modem, and so on). It is only available for English.

search
sharepoint


BCS IRSG – Search Solutions 2009

I’m going to “Innovations in Web and Enterprise Search” at BCS next week

Search Solutions is a special one-day event dedicated to the latest innovations in web and enterprise search. In contrast to other major industry events, Search Solutions aims to be highly interactive and collegial, with attendance limited to 60-80 delegates.

Provisional programme

09:30 – 10:00 Registration and coffee

Session 1: (Chair: Tony Russell-Rose)

* 10:00 Introduction – Alan Pollard, BCS President

* 10:10 “Enterprising Search” – Mike Taylor, Microsoft

* 10:35 Accessing Digital Memory: Yahoo! Search Pad – Vivian Lin Dufour, Yahoo

* 11:00 “How Google Ads Work” – Richard Russell, Google

11:25 – 11:45 COFFEE BREAK

Session 2: (Chair: Andy MacFarlane)

* 11:45 “Location-based services: Positioning, Geocontent and Location-aware Applications” – Dave Mountain, Placr

* 12:10 “Librarians, metadata, and search” – Alan Oliver, Ex Libris

* 12:35 “UI Design Patterns for Search & Information Discovery”- Tony Russell-Rose, Endeca

13:00 – 14:15 LUNCH

Session 3: (Chair: Leif Azzopardi)

* 14:15 “Search-Based Applications: the Maturation of Search” – Greg Grefenstette, Exalead

* 14:40 “How and why you need to calculate the true value of page 1 natural search engine positions” – Gary Jennings, WebOptimiser

* 15:05 “Search as a service with Xapian” – Richard Boulton, Lemur Consulting

15:30 – 16:00 TEA BREAK

Session 4: (Chair: Alex Bailey)

* 16:00 “The Benefits of Taxonomy in Content Management”, Andrew Maisey, Unified Solutions

* 16:25 Panel: “Interactive Information Retrieval” – details to follow

17:00 – 19:00 DRINKS RECEPTION

via BCS IRSG – Search Solutions 2009.

events
search


search forms on online shops

I’ve been thinking about the search functionality for our online shop this week. I’ll write up our approach to search properly at a later date but for now I thought I’d share the variety of search forms I’ve seen on other online shops.

E-commerce search forms: simple boxes

E-commerce search forms: labelled boxes

E-commerce search forms: scope drop-downs

E-commerce search forms: guidance text

Some things of note:

  • The longer search boxes were mostly on book sites.
  • 3 sites also offered “suggestions as you type” (Amazon, Borders, Ocado)
  • Only 1 site had an obvious link to an advanced search
  • All sites handled scopes with a dropdown

(Visio stencil is from GUUUI)

e-commerce
search


webinar on SEOMoz tools

I often refer back to the SEOMoz ranking factors article when I think teams are getting hung up on minor SEO issues.

Will from Distilled just ran a free webinar about the SEOMoz tools so it seemed a good opportunity to learn more about what is available from SEOMoz.

Will says that SEO tools (some free) give you three things:

  1. Quick research (basic understanding)
  2. Deep dive research (actionable insights)
  3. Making things pretty for boss/client (ever important)

The Pro tools aren’t particularly cheap, so it was useful to have someone talk you through what the return on that investment would actually be. In places the data looks a lot like the stuff you get from your web analytics tool e.g. Google Analytics. But remember this is data on your competitors as well as your own site.

Using AutoTrader as an example, Will talked about

  1. SEOToolbox: Free tools. Will likes and uses Firefox plugins instead of some of these. Still likes and uses Domain Age tool
  2. Term Target: free, aggregates data on a given page, identifies keyphrases
  3. Term Extractor tool: free, uses for competitor and keyword research. 3 word phrases might give you something new.
  4. Geotarget. Get Listed is an alternative.
  5. Popular searches. Particularly likes the Amazon content.
  6. Trifecta. Useful aggregator. Also has the comparison of your site to the rest of the web as a whole (possibly unique data).
  7. CrawlTest: pro-tool. Xenu is an alternative.
  8. JuicyLinkfinder: finds linking opportunities
  9. Keyword Difficulty: how hard a keyword is going to be to rank for, regardless of domain.
  10. Rank Tracker: Will keen to stress that individual keyword ranking isn’t the important thing. Often your boss will demand it. Makes little graphs and will export to CSV. Can combine with analytics data e.g. using Google Analytics API
  11. Firefox toolbar: Will loves this. Uses it more than any other SEO tool. Pro version better. Shows some pagerank-esque data for page and domain. Going up 1 MozRank point is equivalent to 8x stronger. So decimal points are important. MozTrust is similar but restricted to links from trusted sites. Page Analysis also part of the toolbar? Alternative is Bronco tools.
  12. Linkscape: the tool SEOMoz are heavily investing in. Web graph of which pages link to each other on the web. Will doesn’t see an alternative to this. Free version does basic stuff. Pro version produces more data and prettier data. Will recommends the Adv Link Intelligence Report. You can get data on who links with “nofollow” which Will thinks is unique data.
  13. Labs: Online Non-Linear Regression is scary. Visualizing Link Data is more mortal friendly. Link Acquisition Assistant helps you construct queries for search engines to find link opportunities. Other tools include Social Media Monitoring and Blogscape.

(As a side point, Will recommends learning Excel functions MATCH and LOOKUP. And pivot tables.)

Distilled are going to do more conference calls, including one on keyword research tactics. Could be useful. Free webinars are a useful alternative to conferences when budgets are tight but you need to keep learning.

analytics
search


keyword tools for seo and navigation design

There are lots of tools that help you choose terms to purchase in PPC campaigns and to target for SEO.

They can also be useful in helping you design navigation, choose your site name and even your company name.

Google provides all sorts of resources, some of which seem to do very similar things.

There are analytics specifically for your own site:

And some that anyone can use:

Of the ‘public’ tools I mostly use the Adwords Keyword Tool, in spite of not using Adwords.

Try searching for ‘phones’. From the results you can see whether ‘cell phone’, ‘wireless phone’ or ‘mobile phone’ is the dominant language in your area. When there are labels that my team is arguing about, I’ll sometimes see if the Keyword Tool can add evidence to the argument.

But beware, they can get addictive.

information architecture
navigation
search


testing site search: solutions

So you’ve tested your site search. You’ve submitted some bugs. You’ve probably got lots of responses to those bugs along the lines of “oh, that’s just a config setting”, “you don’t understand – that’s a feature of how this product works” and “the search is fine, you just need to get the authors to do their metadata properly”.

Now the config statement is fine. So long as changing the configurations actually sorts the problem. Don’t sit back at this point. Either make the recommended changes yourself or insist the supplier does. Don’t close the bug until they’ve proved the point.

Changes you can usually make to the configuration

  • change the crawled pages
  • change the indexed fields
  • default query syntax
  • change stop/noise words, stemming and the thesaurus
  • ranking parameters

Be very, very careful if you are changing the ranking parameters. In fact, I’d suggest this is a mini-project in its own right. You’ll need to be able to make one change at a time and compare the new results with the old, across a large set of queries. You probably want to do this with someone who has experience with the specific search engine.
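One way to compare old and new results across lots of queries is to score how much of each old top few survives the change. The Python below is a minimal sketch of that idea: the function names, the choice of a top-k overlap measure and the example URLs are all my own assumptions, not a feature of any particular engine.

```python
def top_k_overlap(old_results, new_results, k=5):
    """Fraction of the old top-k results that are still in the new top-k."""
    old_top = old_results[:k]
    new_top = set(new_results[:k])
    if not old_top:
        return 1.0
    return sum(1 for url in old_top if url in new_top) / len(old_top)


def compare_runs(old_run, new_run, k=5):
    """Pair each query with its overlap score, most-changed queries first."""
    scores = {
        query: top_k_overlap(results, new_run.get(query, []), k)
        for query, results in old_run.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Invented results from before and after a single ranking change.
old_run = {
    "rabbit hutch": ["/hutch", "/carrier", "/desk"],
    "holiday form": ["/leave", "/expenses"],
}
new_run = {
    "rabbit hutch": ["/hutch", "/desk", "/cage"],
    "holiday form": ["/leave", "/expenses"],
}

for query, overlap in compare_runs(old_run, new_run, k=3):
    print(f"{query}: {overlap:.2f}")
```

A low score doesn’t say whether the change was good or bad, only which queries to go and look at first.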

The other two scenarios/excuses are more problematic. If the search has a feature that you think makes the results bad you’ll need to see if you can get it switched off/removed. If you can’t, you may have chosen the wrong product.

If your supplier thinks that teaching authors to do metadata properly is a simple goal then you may need a new supplier. This is hardly the attitude that made Google the search masters.

(I’m not contradicting my Best Bets post here: I think there are scenarios where properly motivated and focused editorial staff can do a better job than natural search results. But I’m not thinking of your average author, I mean your central web or search team. I mean people paid to care about search.)

You can change the guidelines/training for authors. You can probably get the current batch of authors to listen to some simple tips and pointers. They might remember. They might pass them on. But be realistic, how much control do you have over the authors? Metadata education is often a thankless and futile task. The best solutions are those that don’t require the authors to think about search, whether that is technology or intervention by search specialists.

Where the natural results just aren’t good enough and the authors can’t help there are things you can do on the search results page to help the user out.

Not really about testing but still coming soonish:  Changing the interface


search


testing site search: running the tests

So you’ve prepared for testing site search.  Now you have to run the tests.

Set aside a reasonable block of time where you won’t be interrupted. Schedule later sessions bearing in mind the crawl timescales. If you make changes you’ll need to wait for the crawl to run before you can test again.

You need content in the system before you can test search.  The ideal scenario is to be testing search once a site or system is fully populated with real content but this often isn’t possible. Don’t wait for the system to be populated if that means you won’t be able to make any technical changes.

So allow time for content creation as part of testing. You’ll probably want a mix of real content and dummy content that has been specifically written to test an aspect of search.

You’ll need to record the results so you need a spreadsheet.

  • Set up columns something like this: the query (linked if you are running the tests from here), whether the results are ok, a description of the issues, hypotheses about causes, changes or adjustments made to validate, bugs reported, screenshots (where necessary)
  • Create new versions of the worksheet each time you test, and label accordingly. If you make changes to the content or the configuration then test again after the crawl has run
  • Add queries to the spreadsheet as you go. No matter how good your original lists, you’ll explore other issues as you actually use the system.
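If you would rather generate the starting worksheet than type it, a few lines of Python will do it. This is only a sketch: the column names are my shorthand for the columns listed above, and `new_worksheet` is a made-up helper.

```python
import csv
import io

# My shorthand for the columns described above.
COLUMNS = [
    "query", "results_ok", "issues", "hypotheses",
    "changes_made", "bugs_reported", "screenshots",
]


def new_worksheet(queries):
    """Return CSV text with one blank row per query, ready to fill in."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for query in queries:
        # Columns not supplied are left empty by DictWriter.
        writer.writerow({"query": query})
    return buffer.getvalue()


print(new_worksheet(["rabbit hutch", "holiday form"]))
```

Save the text to a dated file for each round of testing and open it in whatever spreadsheet you normally use.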

I’m not merely testing. I’m attempting to analyse and resolve the issues. You could argue that I shouldn’t need to do this, I could just log all the issues with the supplier and get them to resolve them. In my experience it is more successful to do as much as possible yourself.

So what does ok mean? Inevitably it is subjective and it is also qualitative. You could compare with benchmarking metrics for the existing site but some part of the testing usually relies on the subjective judgement of the expert tester. Where time for testing is fixed, I raise the bar with different rounds of testing i.e. round one could be focusing on results that are patently unacceptable, with later rounds raising the standard of quality.

(This testing is in no way meant to replace user testing; the intention is more to test that the functionality works as promised and to get the results to the sort of quality that it is worth putting in front of test participants!)

Mostly you’ll have no problem spotting bad results. Explaining the bad result is the challenge.

Possible sources of issues

Incomplete crawls. First check the search engine successfully completed a crawl. Testing is easiest if you can check yourself. Otherwise you’ll keep having to nag the suppliers/IT to tell you if the crawl went ok. Ask if there is an interface that shows how the crawl went and ask for access.

What is the default query syntax? This is a simple one to check off. If you thought the search was performing an OR search and it is actually running AND then that might well explain why you aren’t happy with the results. And vice versa.

Documents/pages that shouldn’t be crawled? Pages I’ve seen in the results that shouldn’t have been there include:

  • admin pages (in one case the blocked profanity list!)
  • permission controlled pages
  • quiz answers
  • form thank-you page
  • user profile information

You may need to get rid of a lot of these pages before you can see the true quality of the results.
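A quick way to spot these is to run your test queries and check the result URLs against patterns for pages that should never appear. The Python below is a sketch under assumptions: the URL patterns are invented and will need replacing with whatever your own site uses.

```python
import re

# Assumed URL patterns; adjust to however your own site names these pages.
BLOCKED_PATTERNS = [r"/admin/", r"/thank-you", r"/profile/", r"/quiz/answers"]


def unwanted_results(urls, patterns=BLOCKED_PATTERNS):
    """Return the result URLs that match any pattern we expect to be excluded."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [url for url in urls if any(c.search(url) for c in compiled)]


results = [
    "/products/rabbit-hutch",
    "/admin/profanity-list",
    "/contact/thank-you",
]
print(unwanted_results(results))
```

Anything this flags is a candidate for a crawl exclusion rule rather than a ranking fix.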

Documents/pages that should be crawled?

  • other specified domains in addition to your main site e.g. www.rnibcollege.ac.uk as well as www.rnib.org.uk
  • all sub-domains e.g. not just www.bbc.co.uk but also jobs.bbc.co.uk and news.bbc.co.uk.
  • pages regardless of their position in the site
  • Office and other documents
  • images, video, audio (depending on how you want these assets to appear)

What is being indexed within a document/page? You can check by creating a variety of dummy content and adding your test keyword to a different field on each piece of dummy content. Choose an unusual keyword that won’t be appearing in the rest of the content (I tend to use my mother’s Polish maiden name). Fields to check:

  • titles
  • URLs
  • meta descriptions and keywords
  • main page content
  • authors and other metadata relevant to your content set
  • navigation and page furniture (you’ll see this cause trouble more when the content set is small)
  • full content of Office document, pdfs etc?
  • metadata attached to multimedia assets
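The dummy content approach can be sketched in Python like this. The field names and the keyword are hypothetical, and the step of loading the pages into your system and running the search is left out because it depends entirely on your platform.

```python
# Hypothetical field names; map them to whatever your CMS calls them.
FIELDS = ["title", "url_slug", "meta_description", "meta_keywords", "body", "author"]


def make_dummy_pages(keyword, fields=FIELDS):
    """Create one dummy page per field, with the keyword in that field only."""
    pages = []
    for target in fields:
        page = {name: "filler text" for name in fields}
        page[target] = keyword
        page["keyword_field"] = target  # remember where we hid the keyword
        pages.append(page)
    return pages


def fields_not_indexed(pages, returned_fields):
    """List the fields whose dummy page did not come back for the keyword."""
    return [
        page["keyword_field"]
        for page in pages
        if page["keyword_field"] not in returned_fields
    ]


pages = make_dummy_pages("zyxwocka")
# Suppose only the title and body pages came back in the results:
print(fields_not_indexed(pages, {"title", "body"}))
```

Because each page holds the keyword in exactly one place, whichever pages are missing from the results tell you which fields aren’t being indexed.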

What filters are being applied? Check for:

  • stop words
  • stemming
  • thesaurus

Ask if there is an interface where you can view/edit these filters. If not, ask for copies of the actual files.

What is affecting the ranking? This is complicated to test with any ease as most systems use a variety of factors and there’s usually a level of mystery in the supplier communications. Consider:

  • where the keyword appears
  • how many times the keyword appears
  • the ratio of keywords/article length
  • type of document
  • links to the document, text of those links, authority/rank of the linking page
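For the first three factors you can at least measure your own test content before guessing at what the engine is doing. This Python sketch counts a keyword, notes where it first appears and works out the ratio to page length. It says nothing about how any real engine weights these things, and `keyword_stats` is my own helper.

```python
def keyword_stats(text, keyword):
    """Count a keyword, find where it first appears and work out its density."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    keyword = keyword.lower()
    count = words.count(keyword)
    return {
        "count": count,
        "first_position": words.index(keyword) if count else None,
        "length": len(words),
        "ratio": count / len(words) if words else 0.0,
    }


print(keyword_stats("Rabbit hutch care: clean the hutch weekly.", "hutch"))
```

Run it over two pages that rank very differently for the same query and you have some numbers to put in the hypotheses column.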

If you’ve been told that your search system utilises “previous user behaviour” to adjust ranking then this can make testing a bit tricky. It also gives the suppliers a black box to hide behind if you don’t think the search is working right.

I’ve been told “don’t worry about testing search, this is a learning system”. Which sounds lovely but on day one the search results still need to be good enough to go live and you’re going to have to really work hard to get a grip on how the system is working. And who says it is learning the right lessons? In this particular scenario I doubled the amount of time I had set aside for testing.

Next:  Solutions to try

search


testing site search: preparation

In last week’s post about Best Bets I commented that search software is “certainly not good enough without a lot of work. A lot of expensive work. If your supplier says ‘the search is really good, you don’t need to worry about it’ then you definitely need to worry about it.”

Worrying about and testing search systems has been a common theme in my working life: whether that involves benchmarking the performance of an existing system, testing a new one prior to launch or comparing vendors when choosing a new system.

I’ve had varying levels of exposure to APR Smartlogik, Google, Inktomi/Yahoo, Fast, Verity, Autonomy, SharePoint. At this moment I’m in the middle of testing and tweaking the search for a SharePoint powered website. The challenges are surprisingly similar to those I encountered when working with Muscat in 2001.

Having gone through such similar processes so many times, now seemed a good time to write it all down. I’ve divided my process into three stages: preparation, running the tests, and making changes.

Preparation


1. Ask the suppliers lots and lots of questions. You are after actual answers, testing their level of knowledge and letting them know that the quality of the search matters to you. Don’t rely wholly on the supplier’s answers. Find other users and do your own reading to validate what the supplier tells you.

Most important to find out:

  • Ranking criteria
  • What is configurable; of those configurations which have a graphical interface; and of those which have a user friendly graphical interface?

Other useful things to find out:

  • What query syntax is supported? What is the default syntax?
  • What are the stemming rules and which words are stop words? Ask for copies
  • Is there a default thesaurus? Ask for a copy
  • What will the crawl timescales be during testing?
  • How to construct queries using the URL query strings

2. Build a list of test queries. You really need hundreds. Good sources are:

  • Names of pages/articles on your current site or items in your catalogue
  • Real queries from your search logs or from a similar site if you can find someone willing to share
  • Obvious variants of these terms – thesaurus, misspellings, abbreviations
  • Known problems – ask for feedback from users
  • Include a range of specific items, broad topics and ambiguous queries

Your list could be a simple list of terms but you’ll find it easier to run many rounds of tests if you set your list up as http links that will run the query in your test search engine.
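Turning a plain list of terms into links is a small job in Python. In this sketch the search URL and the `q` parameter name are placeholders, since every engine builds its query strings differently; ask your supplier what yours looks like.

```python
from urllib.parse import urlencode

# Assumed search URL and parameter name; check how your own engine builds them.
SEARCH_URL = "https://test.example.com/search"
QUERY_PARAM = "q"


def query_link(query):
    """Build a link that runs one query against the test search engine."""
    return SEARCH_URL + "?" + urlencode({QUERY_PARAM: query})


for term in ["rabbit hutch", "annual leave form", "AT&T"]:
    print(query_link(term))
```

`urlencode` takes care of spaces and characters like `&` that would otherwise break the link.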

If you are testing multiple search engines and you have access to coding skills then you can set up the list to run automatically across the range of search engines and display your result back to you, saving lots of time. Or if you are running multiple rounds of testing on the same search system, an interface that checks to see if the results have changed since last time is invaluable.
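The "has anything changed since last time" check can be as simple as saving each round’s results and diffing them against the next. Here is a minimal Python sketch; how you collect the results for each query is up to your engine, and the function names and example URLs are my own.

```python
import json


def changed_queries(previous, current):
    """Return the queries whose ordered results differ from the previous run."""
    return sorted(
        query
        for query, results in current.items()
        if previous.get(query) != results
    )


def to_snapshot(results):
    """Serialise a run so it can be saved and compared next time."""
    return json.dumps(results, sort_keys=True, indent=2)


def from_snapshot(text):
    """Load a run that was saved with to_snapshot."""
    return json.loads(text)


# Invented results from two rounds of testing.
round_one = {"rabbit hutch": ["/carrier", "/desk", "/hutch"]}
round_two = {
    "rabbit hutch": ["/hutch", "/carrier", "/desk"],
    "holiday form": ["/leave"],
}

saved = to_snapshot(round_one)
print(changed_queries(from_snapshot(saved), round_two))
```

A query counts as changed if the order moves as well as the membership, and queries added since the last round show up too.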

But for most of us, we’ll be working from a list of queries and running them one by one.

Next: Running the tests

search
