ia play

the good life in a digital age

Archive for the ‘digital’ Category

for newspapers, content is (still) the problem

with one comment

I’m not exactly a digital native, more first generation immigrant. Nor am I an enthusiastic internet pirate.

I grew up with the habit of newspaper buying. I once worked for a national newspaper. I enthusiastically read the paper cover to cover.

Not anymore.

I stopped buying during the week, once we moved away from Finchley Central. The combination of the regular delays on the Northern line and a newspaper shop on the southbound platform meant a reasonably regular thought process of “sod it, might as well buy a paper while we wait”. The mere geography of the new tube station undermined the purchase process.

For a while it remained a weekend pleasure (with coffee and cats) but in the end I stopped that too.

I stopped because the content alienated me. I was disappointed with the bizarre fashion supplements, with the obsession with new media (biogs for authors that were nothing more than “who blogs at”) and some frustratingly elitist editorials (“Few people know nothing at all by Beethoven”). I’m still annoyed about the folksonomic zeitgeist.

And I felt like I knew little more when I put down the paper than when I had picked it up. I knew the gist of the news before I read it and I could guess what the columnists were going to say. There was never any real analysis, nothing that made me understand.

I tried other papers, even straying a long way from my political comfort zone. They all annoyed me. Oddly the Financial Times annoyed me least, perhaps because I had a lot to learn about their particular view of the world. And then I just gave up and saved the pounds.

These days I don’t normally get news from the internet, whether that be blogs, the BBC or newspapers. I get it from the radio.

I do go to newspaper websites (of all stripes) to read the comment stuff but mostly it just annoys me.  Reading it is irrational but I still do it. Paywalls will help me stop irritating myself.

I do still like the supplements (food, money, gardening and the like) but figured I might as well just buy a dedicated magazine. They’ll cover those subjects better anyway. And so we do. Shedloads of magazines still pass through our house. Proper dead tree media.

So perhaps we could move on from all this paywall business and complaining about the internet.  Maybe it is time to sort out the lazy, trite content instead?

Written by Karen

May 26th, 2010 at 6:25 am

Posted in digital

e-commerce project: the browse structure

without comments

This article is part of a series about our e-commerce redesign.

The browse structure of any website is always controversial within the organisation. I’m always struck by the discrepancy between how interested the organisation is in browse (as opposed to search) and how interested the users are. I’m not saying users don’t want a sensible, intuitive navigation scheme but they also want a really effective search engine. Most web design projects involve huge amounts of effort invested in agreeing the navigation and very few discussions of how search will work.

Partly this is because navigation is easy for stakeholders to visualise. We can show them a sitemap and they can instantly see where their content is going to sit. And they know the project team is perfectly capable of changing it if they can twist their arm. With search on the other hand, stakeholders often aren’t sure how they want it to work (until they use it) and they’re not sure if it is possible to change anyway (search being a mysterious technical thing).

Even forgetting search, the focus on navigation is almost always about primary navigation, with most stakeholders having very little interest in the cross-links or related journeys. The unspoken assumption is still that the important journey is arriving at the homepage and drilling down the hierarchy.

So I went into the e-commerce project assuming we’d need to spend a lot of time consulting around the navigation structure (but knowing that I’d need to make sure I put equal energy into site search, SEO and cross-linking, regardless of whether I was getting nagged about it).

A quick glance also showed that the navigation wasn’t going to be simple to put together. Some of my colleagues thought I wasn’t sufficiently worried but I’m used to the pain of categorising big diverse websites or herding cats as Martin puts it. I participated in at least three redesigns of the BBC’s category structure, which endeavours to provide a top-down view of the BBC’s several million pages on topics as diverse as Clifford the Big Red Dog, the War on Terror and Egg Fried Rice.

My new challenge was a simple, user friendly browse structure that would cover a huge book catalogue, RNIB publications, subscriptions to various services, magazines, and a very diverse product catalogue of mobility aids, cookware, electronics and stationery. And those bumpons, of course.

Card-sorting is usually the IA’s weapon of choice in these circumstances. Now I’ve got my doubts about card-sorting anyway, particularly where you are asking users to sort a large, diverse set of content of which they are only interested in a small part. Card-sorting for bbc.co.uk always came up with a very fair, balanced set of categories but one that didn’t really seem to match what the site was all about. It was too generous to the obscurer and less trafficked bits of the site and didn’t show due respect to the big guns. Users didn’t really use it, probably even the users who’d sorted it that way in the testing. My favourite card-sorting anecdote was the guy who sorted into two piles: “stuff I like” and “stuff I don’t like”. Which I think also alludes to why card-sorting isn’t always successful.

In any case, card-sorting isn’t going to be half as simple and cheap when your users can’t see.

We decided to put together our best stab at a structure and create a way for users to browse on screen. Again, not just any old prototyping method is going to work here – however the browse structure was created, it would need to be readable with a screenreader. So coded properly.

I wrote some principles for categories and circulated them to the stakeholders. Nothing controversial but it is helpful to agree the ground rules so you can refer back to them when disagreements occur later.

I reviewed the existing structure, which has been shaped over the years by technical constraints and the usual org structure influence.  I also looked at lots of proposed re-categorisations that various teams had worked on. I looked at which items and categories currently performed well. I reviewed the categorisation structures as part of the competitive review.

I basically gathered lots of information. And then stopped. And looked at it for a bit. And wondered what to do next.  Which is also pretty normal for this sort of problem.

(actually one of the things I did at this point was write up the bulk of this blog post – I find it really, really helpful to reset my thinking by writing up what I’m doing)

Somewhat inevitably I got the post-it notes out. I wrote out a post-it for each type of product and laid them out in groups based on similarity (close together for very similar products and further away as the relationship gets weaker). This is inevitably my sense of similarity but remember this is a first stab to test with users.

Where obvious groups developed I labelled them with a simple word, something like books or toys. If a group needed a more complex label then I broke it up or combined it until I felt I had very simple, easily understood labels (essentially a stab at “basic categories”).

There were too many groupings and there were also a scattering of items that didn’t fit any group (the inevitable miscellaneous group). I dug out the analytics for the shop to see how my grouping compared in terms of traffic. I made sure the busiest groups were kept and the less popular sections got grouped up or subsumed.

This gave me a first draft to share with the business units. Which we argued about. A lot.

I referred everyone back to the principles we’d agreed and the analytics used to make the decisions. Everyone smiled sweetly at me and carried on with the debate.

After some advice from my eminently sensible project manager, I conceded one of the major sticking points. As I reported on Twitter at the time:

“Have given in and allowed the addition of a 13th category. Will the gates of hell open?”

Luckily at this stage we were finally able to do some usability testing with some real users. Only four mind, but they all managed to navigate the site fine and actually said some nice stuff about the categories. One tester even thought there must be more products on the new site, in spite of us cutting the categories by two-thirds.

So if someone attempts to re-open the browse debate, hopefully we can let usability tester #2 have the last word as in her opinion the new shop is…

“very, very clearly divided up”

Enough navigation, time to concentrate on search….

Related posts:
Re-branding miscellaneous

Written by Karen

May 12th, 2010 at 6:50 am

tripped up by “you might also like”

without comments

My rabbit hutch purchasing has been an interesting vein of UX experiences. In the end I bought a hutch from JustRabbitHutches, whose website was mostly pleasant to use and whose service was great.

That said, once I’d added my hutch to the basket I noticed they’d been tripped up by recommendations. Under my basket were suggestions that I might enjoy. Unfortunately one of them was a “delivery surcharge”.

Surcharges are always so much fun

Now this isn’t as damaging as Walmart’s dodgy DVD recommendations but it’s another example of how careful you have to be.

You could also ask why JustRabbitHutches thought they needed a recommendation engine here. After all the clue is in the title. If I’m buying a rabbit hutch, how likely is it that they’ll be able to sell me another one?

Written by Karen

March 23rd, 2010 at 6:42 am

e-commerce project: competitive review

without comments

This article is part of a (rather drawn-out)  series about our e-commerce redesign.

Competitive reviews do what they say on the tin: they review what your competitors are doing. They are particularly useful in a busy, well-developed marketplace where you can find good matches for your site/product.

With our e-commerce project, my first step was to identify what I meant by competitors. The definition is much wider than just other charities for blind and partially sighted people with online shops. You are looking for sites that your audience will be familiar with, sites with similar product sets, sites with similar challenges and sites that may be interesting/innovative in general. They don’t have to be all of these things.

Some are easy to identify. If you are looking for market-leading e-commerce sites, you can probably reel them off yourself.

You can also:

  • ask your colleagues
  • ask your network (Twitter is pretty good for this)
  • do some Google searches (try searching for all the sites you’ve already thought of, this often brings up other people’s lists)
  • look for market reports from Nielsen, Forrester etc…

I then bookmark the websites, using delicious. This means I have quick access to the set as I can reopen all the websites in one go (or in smaller tagged sub-sets) by selecting “open all in tabs” (I think you need the Firefox plugin to do this, I can’t see a way from the main site).

My four main sub-sets for the e-commerce project were

  • mainstream shops
  • charity shops
  • alternative format bookstores
  • disability/mobility stores

1. Mainstream shops (link to delicious tag)
These are sites that UK webusers are likely to be familiar with e.g. Amazon, Argos and John Lewis. I chose some for the breadth of their catalogue (a problem we knew we were facing) and some for specific range matches e.g. Phones4U or WHSmiths

Where these sites consistently treat functionality or layout in a particular way, I considered that to be a standard pattern and therefore something the users might well be familiar and comfortable with.

(it is worth noting that we don’t have definitive data on the extent to which RNIB shop customers also use other online shops. On one hand their motivation to use online shopping may be stronger than average UK users as they may face more challenges in physical shops, but on the other hand the accessibility of mainstream shops may discourage them)

2. Charity shops

These sites are slightly less useful as competitors than it might appear at first. They were useful when considering elements like donations but in many cases the shops were targeted at supporters not beneficiaries and they carried much narrower ranges. There are however some very high quality sites where it is clear that a lot of thought, time and effort has been invested.

3. Alternative format bookstores

This included mass market audiobook stores and some that are targeted particularly at people with sight loss. Most of these sites were dated and a little awkward to use. I reviewed them briefly but mostly didn’t return to them.

4. Disability/mobility stores

There are quite a number of these sites. They often feel like print catalogues slung on a website and weren’t very sophisticated from an IA perspective. I did look in detail at the language they used to describe products as there was likely to be a heavy overlap with our product set.

I had a number of initial questions that I wanted to research.
1. The number of categories on the homepage
2. Other elements on the homepage
3. How they handled customer login

I created a spreadsheet and went through the sites one by one, recording what I found. It took me about 2 hours to review 60 sites against this limited set of criteria.

I did the original review ages ago but I went back to the sites reasonably regularly during our design phase, usually when we couldn’t reach agreement and we needed more evidence to help make a decision. Sometimes I would just add a column to an existing spreadsheet e.g. when checking which sites had a separate business login. At other times I created whole new spreadsheets e.g. when auditing how the search function worked.

These later reviews took less time, either because I was checking fewer criteria or because I dropped less relevant or low quality sites. I’m still going back to the competitive review even during testing, as various testers find their own favourite website and ask “why doesn’t it work like this?”. It is always useful to know if they are right that “normal” websites do X. The competitive review saves a lot of argument time.

Written by Karen

March 2nd, 2010 at 6:54 am

Posted in e-commerce,rnib

SharePoint search: more insights

without comments

Surprisingly this white paper on building multilingual solutions in SharePoint provides a good overview of how the search works, regardless of whether you are interested in the multilingual aspect.

White paper: Plan for building multilingual solutions.

Read page 15, titled “overview of the language features in search”, for a description of content crawling and search query extraction. Then pages 16-18 provide a good overview of individual features and what they are doing.

Word breakers A word breaker is a component used by the query and index engines to break compound words and phrases into individual words or tokens. If there is no word breaker for a specific language, the neutral word breaker is used, in which case word breaking occurs where there are white spaces between the words and phrases. At indexing time, if there is any locale information associated with the document (for example, a Word document contains locale information for each text chunk), the index engine will try to use the word breaker for that locale. If the document does not contain any locale information, the user locale of the computer the indexer is installed on is used instead. At query time, the locale (HTTP_ACCEPT_LANGUAGE) of the browser from which the query was sent is used to perform word breaking on the query. Additional information about the language availability of the word breaker component is available in Appendix B: Search Language Considerations.

Stemming Stemming is a feature of the word breaker component used only by the query engine to determine where the word boundaries are in the stream of characters in the query. A stemmer extracts the root form of a given word. For example, ”running,” ”ran,” and ”runner“ are all variants of the verb ”to run.” In some languages, a stemmer expands the root form of a word to alternate forms. Stemming is turned off by default. Stemmers are available only for languages that have morphological expansion; this means that, for languages where stemmers are not available, turning on this feature in the Search Result Page (CoreResult Web Part) will not have any effect. Additional information about language availability for the Stemmer feature is available in Appendix B: Search Language Considerations.

Noise words dictionary Noise words are words that do not add value to a query, such as ”and,” ”the,” and ”a.” The indexing engine filters them to save index space and to increase performance. Noise word files are customizable, language-specific text files. These files are a simple list of words, one per line. If a noise word file is changed, you must perform a full update of the index to incorporate the changes. Additional information about the noise words dictionary and how to customize it is available at www.microsoft.com.

Custom dictionary The custom dictionary file contains values that the search server must include at index and query times. Custom dictionary lists are customizable, language-specific text files. These files are used by Search in both the index and query processes to identify exceptions to the noise word dictionaries. A word such as “AT&T,” for example, will never be indexed by default because the word breaker breaks it into single noise words. To avoid this, the user can add ”AT&T” to the custom dictionary file; as a result, this word will be treated as an exception by the word breaker and will be indexed and queried. These files contain a simple list of words, one per line. If the custom dictionary file is changed, you must perform a full update of the index to incorporate the changes. By default, no custom dictionary file is installed during Office SharePoint Server 2007 Setup. Additional information about the custom dictionary file and how to customize it is available at www.microsoft.com.
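Taken together, the noise-word and custom-dictionary behaviour can be sketched in a few lines of Python. This is purely illustrative: the word lists and the tokenizer below are invented for the example, not SharePoint's real components.

```python
# Illustrative sketch only: how a custom dictionary rescues a term
# like "AT&T" that the word breaker would otherwise reduce to noise.
NOISE_WORDS = {"and", "the", "a", "at", "t"}   # one word per line in the noise file
CUSTOM_DICTIONARY = {"AT&T"}                   # exceptions that must be indexed

def tokenize(text):
    """Naive 'neutral word breaker': split on whitespace, then break
    punctuation-joined compounds, unless the chunk is a known exception."""
    tokens = []
    for chunk in text.split():
        if chunk in CUSTOM_DICTIONARY:
            tokens.append(chunk)  # kept whole, never broken up
        else:
            tokens.extend(t for t in chunk.replace("&", " ").split() if t)
    return tokens

def index_terms(text):
    """Drop noise words at index time, keeping custom-dictionary entries."""
    return [t for t in tokenize(text)
            if t in CUSTOM_DICTIONARY or t.lower() not in NOISE_WORDS]

print(index_terms("AT&T and the phone"))  # ['AT&T', 'phone']
```

Without the custom dictionary entry, "AT&T" breaks into "AT" and "T", both of which disappear as noise, which is exactly the problem the white paper describes.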

Thesaurus There is a configurable thesaurus file for each language that Search supports. Using the thesaurus, you can specify synonyms for words and also automatically replace words in a query with other words that you specify. The thesaurus used will always be in the language of the query, not necessarily the server’s user locale. If a language-specific thesaurus is not available, a neutral thesaurus (tseneu.xml) is used. Additional information about the thesaurus file and how to customize it is available at www.microsoft.com.
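The expansion/replacement distinction in the thesaurus can be illustrated with a small sketch. The word pairs here are invented; the real thesaurus is a per-language XML file on the server.

```python
# Illustrative sketch of thesaurus-style query handling:
# expansion adds synonyms alongside the original term,
# replacement substitutes the term outright. All pairs are made up.
EXPANSIONS = {"bumpons": ["tactile markers"]}
REPLACEMENTS = {"cellphone": "mobile phone"}

def apply_thesaurus(terms):
    out = []
    for term in terms:
        term = REPLACEMENTS.get(term, term)  # replacement drops the original
        out.append(term)
        out.extend(EXPANSIONS.get(term, []))  # expansion keeps it and adds more
    return out

print(apply_thesaurus(["bumpons", "cellphone"]))
# ['bumpons', 'tactile markers', 'mobile phone']
```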

Language Auto Detection The Language Auto Detection (LAD) feature generates a best guess about the language of a text chunk based on the Unicode range and other language patterns. Basically, it’s used for relevance calculation by the index engine and in queries sent from the Advanced Search Web Part, where the user is able to specify constraints on the language of the documents returned by a query.

Did You Mean? The Did You Mean? feature is used by the query engine to catch possible spelling errors and to provide suggestions for queries. The Did You Mean? feature builds suggestions by using three components:

· Query log Information tracked in the query log includes the query terms used, when the search results were returned for search queries, and the pages that were viewed from search results. This search usage data helps you understand how people are using search and what information they are seeking. You can use this data to help determine how to improve the search experience for users.

· Dictionary lexicon A dictionary of most-used lexicons provided at installation time.

· Custom lexicon A collection of the most frequently occurring words in the corpus, built at query time by the query engine from indexed information.

The Did You Mean? suggestions are available only for English, French, German, and Spanish.

Definition Extraction The Definition Extraction feature finds definitions for candidate terms and identifies acronyms and their expansions by examining the grammatical structure of sentences that have been indexed (for example, NASA, radar, modem, and so on). It is only available for English.

Written by Karen

September 30th, 2009 at 6:56 am

Posted in search,sharepoint

search forms on online shops

with 4 comments

I’ve been thinking about the search functionality for our online shop this week. I’ll write up our approach to search properly at a later date but for now I thought I’d share the variety of search forms I’ve seen on other online shops.

E-commerce search forms: simple boxes

E-commerce search forms: labelled boxes

E-commerce search forms: scope drop-downs

E-commerce search forms: guidance text

Some things of note:

  • The longer search boxes were mostly on book sites.
  • 3 sites also offered “suggestions as you type” (Amazon, Borders, Ocado)
  • Only 1 site had an obvious link to an advanced search
  • All sites handled scopes with a dropdown

(Visio stencil is from GUUUI)

Written by Karen

September 4th, 2009 at 6:34 am

Posted in e-commerce,search

content management resources

without comments

Debora emailed asking for resources about content management from an IA perspective.

I had a rummage around and created a quick list of content management books, presentations, and websites. Plus a short flurry of content strategy links as quite a few of the interesting structured content debates seem to have moved that way (is that a symptom of all IAs being UX designers these days?)

Written by Karen

August 21st, 2009 at 6:18 am

dodgy recommendations

without comments

I always like examples of recommendation engines and the like that have got a bit muddled. The WalMart Apes scandal remains the classic. In this case the book is Apocalypses: Prophecies, Cults and Millennial Beliefs Throughout the Ages and the sponsored link reads “Cheap Weber BBQs”.

dodgy recommendations

It would be nice to think that the suggestion that a customer interested in a book on apocalypses might also like a BBQ had some sort of ‘burn in hell’ connection but it appears to just be that the author is called “Weber”, which is a BBQ brand.

Which started me thinking about how to improve the recommendation engine with a bit of semantic insight about which fields to match upon. You could just not match on the author field but presumably some of the sponsored links are actually related to the author (I’m thinking the Gillian McKeiths and Deepak Chopras of the world). So you’d need some semantic information about the content of the sponsored link as well. Which could be a bit more challenging…
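As a thought experiment, a field-aware matcher might only allow author-field matches when the author is a known "brand" in their own right. Everything below is hypothetical: the data, the notable-author list and the field names are all invented to show the shape of the idea.

```python
# Hypothetical sketch of field-aware matching for sponsored links.
# Don't match purely on an author's name unless the author is notable,
# i.e. the link is plausibly *about* them rather than a homonym like
# Weber-the-author vs Weber-the-BBQ-brand.
NOTABLE_AUTHORS = {"Gillian McKeith", "Deepak Chopra"}

def link_is_relevant(book, sponsored_link):
    # Matching on subject keywords is always allowed.
    if set(book["subjects"]) & set(sponsored_link["keywords"]):
        return True
    # Matching on the author field only counts for notable authors.
    if book["author"] in NOTABLE_AUTHORS and book["author"] in sponsored_link["keywords"]:
        return True
    return False

book = {"author": "Weber", "subjects": ["apocalypses", "cults"]}
bbq_ad = {"keywords": ["Weber", "BBQ"]}
print(link_is_relevant(book, bbq_ad))  # False: the surname alone isn't enough
```

Of course, as the paragraph above notes, this just moves the problem: you still need semantic information about the sponsored link itself to build the notable-author list.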

Written by Karen

July 30th, 2009 at 6:18 am

conversion rates affected by CAPTCHAs

without comments

Interesting stuff on the impact of CAPTCHAs:

“From the data you can see that with CAPTCHA on, there was an 88% reduction in SPAM but there were 159 failed conversions. Those failed conversions could be SPAM, but they could also be people who couldn’t figure out the CAPTCHA and finally just gave up. With CAPTCHA’s on, SPAM and failed conversions accounted for 7.3% of all the conversions for the 3 month period. With CAPTCHA’s off, SPAM conversions accounted for 4.1% of all the conversions for the 3 month period. That possibly means when CAPTCHA’s are on, the company could lose out on 3.2% of all their conversions!

Given the fact that many clients count on conversions to make money, not receiving 3.2% of those conversions could put a dent in sales. Personally, I would rather sort through a few SPAM conversions instead of losing out on possible income.”

via SEOmoz | CAPTCHAs’ Effect on Conversion Rates.
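The arithmetic in the quote checks out: the potential loss is simply the difference between the junk share with CAPTCHA on and off.

```python
# Sanity-checking the figures quoted above.
share_with_captcha = 7.3     # % of conversions that were spam or failures, CAPTCHA on
share_without_captcha = 4.1  # % of conversions that were spam, CAPTCHA off
potential_loss = round(share_with_captcha - share_without_captcha, 1)
print(potential_loss)  # 3.2 - the share of real conversions possibly lost
```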

Written by Karen

July 23rd, 2009 at 6:05 am

SharePoint search: Inside the Index book ‘review’

without comments

Inside the Index and Search Engines is 624 pages of lovely SharePoint search info. It is the sort of book that sets me apart from my colleagues. I was delighted when it arrived; everyone else was sympathetic.

The audience is “administrators” and “developers”. I’m never sure how technical they are imagining when they say “administrators” so I waded in anyway. The book defines topics for administrators as: managing the index file; configuring the end-user experience; managing metadata; search usage reports; configuring BDC applications; monitoring performance; administering protocol handlers and iFilters. I skimmed through the content for developers and found some useful nuggets in there too.

1. Introducing Enterprise Search in SharePoint 2007
2. The End-User Search Experience
3. Customizing the Search User Interface
4. Search Usage Reports
5. Search Administration
6. Indexing and Searching Business Data
7. Search Deployment Considerations
8. Search APIs
9. Advanced Search Engine Topics
10. Searching with Windows SharePoint Services 3.0

The book begins by setting the scene, with lots of fluff about why search matters and some slightly awkward praise for Microsoft’s efforts. It gets much more interesting later, so you can probably skip most of the introduction.

Content I found useful:

Chapter 1. Introducing Enterprise Search in SharePoint 2007

p.28-33 includes a comparison of features for a quick overview of Search Server, Search Server Express and SharePoint Server.

“Queries that are submitted first go through layers of word breakers and stemmers before they are executed against the content index file. Word breaking is a technique for isolating the important words out of the content, and stemmers store the variations on a word” p.32

Keyword query syntax p.44

  • maximum query length 1024 characters
  • by default is not case sensitive
  • defaults to AND queries
  • phrase searches can be run with quote marks
  • wildcard searching is not supported at the level of keyword syntax search queries. Developers could build this functionality using CONTAINS in the SQL query syntax
  • exclude words with a leading minus sign e.g. -rnib
  • you can search for properties e.g. rnib author:loasby
  • property searches can include prefix searches e.g. author:loas
  • properties are ANDed unless the same property is repeated (which would run as an OR search)

Search URL parameters p.50

  • k = keyword query
  • s = the scope
  • v = sort e.g “&v=date”
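Those parameters make it easy to construct result links programmatically. A small sketch follows; note the results-page path varies per installation, so “/Search/results.aspx” is just a placeholder.

```python
# Building a search results URL from the k/s/v parameters noted above.
from urllib.parse import urlencode

def search_url(base, keywords, scope=None, sort=None):
    params = {"k": keywords}       # k = keyword query
    if scope:
        params["s"] = scope        # s = the scope
    if sort:
        params["v"] = sort         # v = sort order
    return base + "?" + urlencode(params)

print(search_url("/Search/results.aspx", "rnib author:loasby", sort="date"))
# /Search/results.aspx?k=rnib+author%3Aloasby&v=date
```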

Chapter 4: The Search Usage Reports

Search queries report contains:

  • number of queries
  • query origin site collections
  • number of queries per scope
  • query terms

Search results report contains:

  • search result destination pages (which URL was clicked by users)
  • queries with zero results
  • most clicked best bets
  • search results with zero best bets
  • queries with low clickthrough

Data can be exported to Excel (useful if I need to share the data in an accessible format).

You cannot view data beyond the 30 day data window. The suggested solution is to export every report!

Chapter 5: Search Administration

Can manage the crawl by:

  • create content sources
  • define crawl rules: exclude content (can use wildcard patterns), follow/noindex, crawl URLs with query strings
  • define crawl schedules
  • remove unwanted items with immediate effect
  • troubleshoot crawls

There’s a useful but off-topic box about file shares vs. sharepoint on p.225

Crawler can discover metadata from:

  • file properties e.g. name, extension, date and size
  • additional Microsoft Office properties
  • SharePoint list columns
  • meta tags in HTML
  • email subject and “to” fields
  • User profile properties

You can view the list of crawled properties via the Metadata Property Mappings link in the Configure Search Settings page. The Included In Index column indicates if the property is searchable.

Managed properties can be:

  • exposed in advanced search and in query syntax
  • displayed in search results
  • used in search scope rules
  • used in custom relevancy ranking

Adjusting the weight of properties in ranking is not an admin interface task and can only be done via the programming interface.

High Confidence Results: A different (more detailed?) result for results that the search engine believes are an exact match for the query.

Authoritative Pages

  • site central to high priority business process should be authoritative
  • sites that encourage collaboration and actions should be authoritative
  • external sites should not be authoritative

Thesaurus p.291

  • an XML file on the server with no admin interface
  • no need to include stemming variations
  • different language thesauri exist. The one used depends on the language specified by client apps sending requests
  • tseng.xml and tsenu.xml

Noise words p.294

  • language-specific plain text files, in the same directory as the thesaurus
  • for US English the file name is noiseenu.txt

Diacritic-sensitive search

  • off by default

Chapter 8 – Search APIs

Mostly too technical but buried in the middle of chapter 8 are the ranking parameters:

  • saturation constant for term frequency
  • saturation constant for click distance
  • weight of click distance for calculating relevance
  • saturation constant for URL depth
  • weight of URL depth for calculating relevance
  • weight for ranking applied to non-default language
  • weight of HTML, XML and TXT content type
  • weight of document content types (Word, PP, Excel and Outlook)
  • weight of list items content types

They’ll come in handy when I’m puzzling over some random ranking decisions that SP has made.

Chapter 9 – Advanced Search Engine Topics

Skipped through most of this but it does cover the Codeplex Faceted Search on p.574-585


A good percentage of the book was valuable to a non-developer, particularly one who is happy to skip over chunks of code. I’ve seen and heard a lot of waffle about what SharePoint search does and doesn’t do, so it was great to get some solid answers.
Inside the Index and Search Engines: Microsoft® Office SharePoint® Server 2007

Related posts
SharePoint search: some ranking factors
SharePoint search: good or bad?

Written by Karen

July 22nd, 2009 at 6:33 am

Posted in books,search,sharepoint