Design for Interaction

This SIGMOD 2009 invited talk presents human-computer information retrieval (HCIR) as a general approach to some of the key challenges facing both the database and information retrieval research communities. It describes an interactive set retrieval approach that responds to queries with an overview of the user's current context and an organized set of options for incremental exploration.

Daniel Tunkelang

May 21, 2026

Transcript

  1. design for interaction

    Daniel Tunkelang, Chief Scientist, Endeca
    © 2009 Endeca Technologies, Inc. All rights reserved.
  2. about me

    Organizing SIGIR ’09 Industry Track in Boston on July 22nd!
  3. about endeca

    leading provider of search applications
    • 250M+ end users per month
    • 600+ customers
    • $100M+ annual sales
  4. what i hope you learn from this talk

    • the db and ir perspectives have a common thread
    • convergence may be upon us
    • but we need interaction to make it work
  5. overview

    • don't put all your eggs in one basket
    • design for interaction
    • human-computer information retrieval
  6. don’t put all your eggs in one basket

    [image: Still Life with Basket and Broken Eggs by Michael Edwards, 2008]
  7. the db approach: perfection in, perfection out

    [image: http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/]
  8. db usability researchers recognize the pain
  9. sql is hard

    Making Database Systems Usable [Jagadish et al., SIGMOD 2007]
    • labor-intensive query construction
    • lengthy query evaluation
    • high query reformulation cost
  10. data sucks and users are lazy

    Extracting Problems for Database and IR Researchers [Naughton, Spring 2008 North East DB/IR Day]
    • real data is
      – incomplete
      – inconsistent
      – incorrect
    • users don’t want to learn
      – data schemas
      – structured query languages
    we’re not gonna take it!
  11. the ir way: don’t worry, be happy

    [image: http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries]
  12. ir for db people: what would google do?

    [diagram: USER has an information need and issues a query; SYSTEM selects from results and ranks using an IR model (tf-idf, PageRank)]
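
The "rank using IR model" step on the slide above can be illustrated with a toy tf-idf ranker. This is a minimal sketch, not the talk's or any search engine's implementation: the function name `rank_tfidf`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration, and PageRank is not modeled.

```python
import math
from collections import Counter

def rank_tfidf(query, docs):
    """Return document indices ordered by a simple tf-idf score, best first."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Sum tf * idf over the query terms that occur in this document.
        score = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in query.lower().split()
            if term in tf
        )
        scores.append(score)
    # sorted() is stable, so tied documents keep their original order.
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)

# Hypothetical mini-collection.
docs = [
    "database systems and query languages",
    "cooking with eggs",
    "query processing in database systems for database people",
]
print(rank_tfidf("database query", docs))  # [2, 0, 1]
```

Note that the caller gets back only an ordering: nothing in the output explains why a document ranked where it did or offers a way to refine the query, which is the limitation the later slides focus on.
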
  13. assumptions of relevance-centric ir approach

    • self-awareness
    • self-expression
    • model knows best
    • answer is a document
    • one-shot query
  14. life is not a batch

    • db approach expects too much of user
    • ir approach expects too much of system
    both approaches act as if it all comes down to a single query
    is that your final answer question?
  15. design for interaction

    [image: The Future of Social Interaction by Jim Stoten]
  16. changes assumptions about what to optimize

    recall, precision, complexity, relevance, communication
  17. how do we optimize communication?

    • transparency
    • control
    • guidance
  18. ir offers a black box

    this is the box. the sheep you want is inside.
  19. db / set retrieval offers 2 out of 3

    • transparency
    • control
    • guidance
  20. but we need it all!

    • set retrieval is a failure in the ir world
      – though quite successful in the db world!
    • but ranked retrieval is inherently crippled
      – no transparency, control, or guidance!
    how do we optimize for communication?
  21. human-computer information retrieval

    • don’t just guess the user’s intent
    • increase user responsibility and control
    • require and reward human intellectual effort
    “Toward Human-Computer Information Retrieval”, Gary Marchionini
  22. treat query construction as a process

    A Case for Interaction [Koenemann and Belkin, 1996]
    • used term feedback to improve alerting queries
    • users select from suggested terms
    • 17–34% improvement in precision @ 30
    • users liked the feedback interface
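
The two ideas on the slide above, suggesting terms for the user to choose from and measuring precision at a cutoff, can be sketched as follows. This is an illustrative sketch only and does not reproduce the system or methodology of Koenemann and Belkin: `suggest_terms`, `precision_at_k`, the frequency-based suggestion heuristic, and the sample data are assumptions.

```python
from collections import Counter

def suggest_terms(relevant_docs, query_terms, n=5):
    """Suggest the most frequent terms in documents judged relevant,
    skipping terms already in the query (no stopword handling)."""
    counts = Counter(
        term
        for doc in relevant_docs
        for term in doc.lower().split()
        if term not in query_terms
    )
    return [term for term, _ in counts.most_common(n)]

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k

# Hypothetical session: the user's query is "database", and two
# documents have been judged relevant so far.
relevant_docs = ["relational database tuning", "database index tuning"]
print(suggest_terms(relevant_docs, {"database"}, n=1))  # ['tuning']

# The user, not the system, decides whether to add a suggested term.
print(precision_at_k([3, 1, 4, 2], {1, 2}, k=2))  # 0.5
```

The design point is that the system only proposes terms; the selection step stays with the user, which is what makes query construction a process rather than a one-shot guess.
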
  23. success in the lab and the field

    • favored in user studies by Marti Hearst
      – http://flamenco.berkeley.edu/
    • ubiquitous in ecommerce
      – amazon.com
      – eBay
      – endeca powers 42 of top 100 online retailers
    • taking over media, libraries, enterprise, etc.
  24. even a few db folks have drunk the kool-aid

    DataGuides [Goldman and Widom, VLDB 1997]
    • user-friendly schema summaries
    Magnet [Sinha and Karger, SIGMOD 2005]
    • navigation and refinement options
    common theme: semistructured
  25. what is semistructured data?

    • one universe
    • self-describing
    • blends data / meta-data
  26. data modeling flexibility

    • no a-priori schema
      – integrated sources without up-front schema design
    • richer modeling capabilities tame data complexity
      – hierarchy, multi-valued fields, sparse fields
    • schema flexibility eases schema evolution
      – new entity types, new data source
    [diagram of data sources: Databases; Content Management; ERP; Groupware and Collaboration; WWW; Internet; SOA, ESB, Web Service; File Systems]
  27. semantically direct queries

    <shirt>
      <sku>1234</sku>
      <sleeve>Long</sleeve>
      <desc>Classic end-on-end shirt</desc>
      <price>39.99</price>
      <salePrice>29.99</salePrice>
      <color>Blue</color>
      <color>Yellow</color>
      <color>White</color>
      ...
    </shirt>
    <trousers>
      <sku>1579</sku>
      <price>59.99</price>
      <color>Khaki</color>
      ...
    </trousers>

    which on-sale items are available in blue?

    <buyingGuide>
      <title>Selecting the right ski coat for you.</title>
      <file>skiguide.pdf</file>
      <keyword>ski</keyword>
      <keyword>coat</keyword>
      ...
    </buyingGuide>

    which attributes characterize on-sale blue items?
    price, sleeve, color, salePrice, brand, fabric, …
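
The two questions on the slide above can be sketched against schema-free records. This is a minimal sketch under stated assumptions: the records copy only the fields visible on the slide (elided fields are left out), the `kind` key, the function names, and the rule that "on sale" means "has a `salePrice` field" are all introduced here for illustration.

```python
from collections import Counter

# Each record carries its own attribute names; there is no shared schema.
# "kind" is a hypothetical field standing in for the XML element name.
items = [
    {
        "kind": "shirt",
        "sku": "1234",
        "sleeve": "Long",
        "desc": "Classic end-on-end shirt",
        "price": 39.99,
        "salePrice": 29.99,
        "color": ["Blue", "Yellow", "White"],  # multi-valued field
    },
    {"kind": "trousers", "sku": "1579", "price": 59.99, "color": ["Khaki"]},
    {
        "kind": "buyingGuide",
        "title": "Selecting the right ski coat for you.",
        "file": "skiguide.pdf",
        "keyword": ["ski", "coat"],
    },
]

def on_sale_in_color(records, color):
    """Which on-sale items are available in the given color?
    Assumes an item is on sale when it has a salePrice field."""
    return [
        r for r in records
        if "salePrice" in r and color in r.get("color", [])
    ]

def attribute_counts(records):
    """Which attributes characterize this result set, and how often?"""
    return Counter(attr for r in records for attr in r if attr != "kind")

blue_sale = on_sale_in_color(items, "Blue")
print([r["sku"] for r in blue_sale])       # ['1234']
print(sorted(attribute_counts(blue_sale)))
# ['color', 'desc', 'price', 'salePrice', 'sku', 'sleeve']
```

The second function is the interesting one: the attribute summary is computed from the current result set rather than from a fixed schema, so the refinement options offered to the user change as the set narrows.
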
  28. but let’s make this concrete

    Uh oh, I’m presenting at SIGMOD! Better find a good book about databases!
  29. i know, i’ll go to the library!

    #%@$!
  30. life in a semistructured world

    • search is a great starting point
      – users can’t / won’t initiate structured queries
    • ranked lists are an inadequate ending point
      – search queries are lossy projections of intent
    • hcir leads users down a garden path to structure
  31. lots of trade-offs

    “everything should be made as simple as possible, but no simpler”
    “speed of thought” vs. “going nowhere quickly”
    “to err is human, but to really foul things up requires a computer”
    simple interfaces don’t always yield satisfaction
  32. users want the triumvirate

    • transparency
    • control
    • guidance
    transparency and control are easy
    guidance requires cleverness
  33. in closing

    all of us want to help people access information
    the best help is to help them help themselves
    design for interaction through transparency, control, guidance
  34. thank you…and come to SIGIR!

    communication 1.0
      email: [email protected]
    communication 2.0
      blog: http://thenoisychannel.com
      twitter: http://twitter.com/dtunkelang
    SIGIR: July 19-23 in Boston
    Industry Track on July 22nd!