Following on from the controlled vocabulary resources, I dug out what I have on automatic classification.
Strangely, most of the information available on automatic indexing/classification/tagging is pretty dated (although it has been a couple of years since I was immersed in this stuff daily). The most detailed material seems to precede the arrival of folksonomies and user tagging. Perhaps the buzz around tagging sucked up all the available energy in the metadata space?
DM Review’s 2003 article on Automatic Classification is a good intro to the various types of auto-classification: rules-based, supervised learning and unsupervised learning.
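To make the distinction between the first two types concrete, here is a rough sketch of my own (not taken from the DM Review article). The categories, keywords, and function names are all made up for illustration, and the "supervised" part is a deliberately naive word-count scorer rather than a real learning algorithm. Unsupervised approaches (clustering documents without any labels) are left out.

```python
from collections import Counter, defaultdict

# Hypothetical rules: category -> keywords that trigger it.
RULES = {
    "finance": {"invoice", "budget", "tax"},
    "hr": {"payroll", "leave", "recruitment"},
}

def tokenize(text):
    """Lowercase the text and split on whitespace."""
    return text.lower().split()

def classify_by_rules(text):
    """Rules-based: return every category whose keywords appear in the text."""
    words = set(tokenize(text))
    return sorted(cat for cat, keywords in RULES.items() if words & keywords)

def train(examples):
    """Supervised: count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(tokenize(text))
    return counts

def classify_by_learning(counts, text):
    """Pick the label whose training word counts best match the text."""
    words = tokenize(text)
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

# Made-up labelled examples standing in for a hand-tagged training set.
EXAMPLES = [
    ("quarterly budget forecast", "finance"),
    ("tax invoice attached", "finance"),
    ("new recruitment drive", "hr"),
    ("payroll and leave policy", "hr"),
]

model = train(EXAMPLES)
rules_result = classify_by_rules("Annual budget and tax review")
learned_result = classify_by_learning(model, "leave policy update")
```

The practical difference is where the effort goes: with rules someone has to write and maintain the keyword lists, while with supervised learning someone has to supply enough correctly tagged examples.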
CMS Review has a good list of Metadata Tagging Tools and a list of other resources at the end.
Taxonomy Strategies provide a bibliography on info-retrieval that includes automatic classification articles.
From 2004 there’s the AMeGA project and Delphi’s white paper ‘Information Intelligence: Intelligent Classification and the Enterprise Taxonomy Practice’, which can be downloaded via Delphi’s whitepaper request form.
There must be more recent stuff than this. I’ll start gathering it on the automating metadata page.