Search
05/27/2014 - 15:20 to 16:00
Kesselhaus
long talk (40 min)
Session abstract:
- Understanding search quality: relevance, snippets, user interface(?)
 - How to measure search quality: metrics, request-by-request comparison of two search systems, classic evaluation of the top N, pairwise evaluation with a Swiss system (the cheapest way).
 - Examples of search quality problems.
 - Production system. Which data are available: clicks, queries, impressions in SERPs.
 - Text relevance ranking: different approaches, no silver bullet. BM25, tf*idf, using hits of different types, language models, quorum, word proximity in the query and the document.
 - How to effectively mix all signals: a hand-tuned linear model, a polynomial model, gradient-boosted decision trees, known implementations. Where to get labels?
 - Doing snippets well: candidate labeling, blind tests, infrastructure for candidate features and ranking, example features.
 - How to measure search quality using clicks?
 - Other signals: comments, likes.
 - Example project: Filesystem path classifier based on search results.
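To make the text-relevance bullet concrete, here is a minimal BM25 sketch. It is not the talk's implementation; the corpus, parameter values, and function name are illustrative assumptions, with the standard Okapi defaults k1=1.5 and b=0.75.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one document against a query with Okapi BM25.

    corpus is a list of token lists; doc_terms is one of them.
    k1 and b are the usual free parameters of the formula.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)
        # idf with the +0.5 smoothing from the Okapi formulation
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        freq = tf[term]
        # term-frequency saturation and document-length normalization
        norm = freq * (k1 + 1) / (
            freq + k1 * (1 - b + b * len(doc_terms) / avg_len)
        )
        score += idf * norm
    return score

# Toy corpus (hypothetical); the middle document matches both query terms.
corpus = [
    "the quick brown fox".split(),
    "search quality and ranking".split(),
    "measuring search relevance with clicks".split(),
]
scores = [bm25_score("search quality".split(), d, corpus) for d in corpus]
best = max(range(len(corpus)), key=lambda i: scores[i])
```

As the abstract notes, BM25 is one signal among many, not a silver bullet on its own.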
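The "mix all signals" bullet's simplest option, a hand-tuned linear model, can be sketched as a weighted sum of normalized signals. The signal names and weights below are made-up placeholders, not values from the talk.

```python
def mix_signals(signals, weights):
    """Hand-tuned linear mix: weighted sum of normalized relevance signals."""
    return sum(weights[name] * value for name, value in signals.items())

# Hypothetical weights, tuned by hand against labeled judgments.
weights = {"bm25": 0.6, "proximity": 0.25, "clicks": 0.15}

# Two candidate documents with normalized signal values in [0, 1].
doc_a = {"bm25": 0.8, "proximity": 0.5, "clicks": 0.2}
doc_b = {"bm25": 0.4, "proximity": 0.9, "clicks": 0.9}

ranked = sorted(
    [("a", doc_a), ("b", doc_b)],
    key=lambda kv: mix_signals(kv[1], weights),
    reverse=True,
)
```

A polynomial model or gradient-boosted trees would replace the weighted sum with a learned function, but both still need the labels the abstract asks about.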
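One simple answer to "how to measure search quality using clicks" is to compare the average rank of the first clicked result between two systems; lower is better. This sketch assumes a hypothetical log format of per-position click flags and ignores sessions without clicks.

```python
def mean_clicked_rank(sessions):
    """Average 1-based rank of the first clicked result; lower is better.

    Each session is a list of clicked-or-not flags per SERP position.
    Sessions without any click are skipped in this simplified sketch.
    """
    ranks = [s.index(True) + 1 for s in sessions if True in s]
    return sum(ranks) / len(ranks) if ranks else None

# Hypothetical click logs for two systems over the same queries.
system_a = [[True, False, False], [False, True, False]]
system_b = [[False, False, True], [False, True, False]]

rank_a = mean_clicked_rank(system_a)
rank_b = mean_clicked_rank(system_b)
```

Real click metrics must also handle position bias and abandoned sessions, which this sketch deliberately leaves out.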