Efficient Spatial Sampling of Large Geographical Tables

90-minute presentation at the InfoCloud (cloud.kaust.edu.sa) group meeting, on "Efficient Spatial Sampling of Large Geographical Tables" by Anish Das Sarma et al., published in SIGMOD '12 and TODS '13.

Emaad Manzoor

March 10, 2014

Transcript

  1. Efficient Spatial Sampling of Large Geographical Tables (SIGMOD '12 / TODS '13). Anish Das Sarma, Hongrae Lee, Hector Gonzalez, Jayant Madhavan, Alon Halevy (Google Research). Presented by Emaad Ahmed Manzoor, March 10, 2014.
  2. Thinning

  3. Constraints, Objectives, Challenges

  4. None
  5. None
  6. Definitions

  7. Definitions

  8. Constraints: Visibility, Zoom Consistency, Adjacency

  9. None
  10. The Thinning Problem

  11. K = 1; M1 = { 4, 4, 4, 4, 4 }, M2 = { 1, 3, 4, 4, 4 }, M3 = { 2, 3, 4, 4, 4 }
  12. Objectives: Maximality, Fairness, Importance

  13. K = 1; M1 = { 4, 4, 4, 4, 4 }, M2 = { 1, 3, 4, 4, 4 }, M3 = { 2, 3, 4, 4, 4 }
  14. Problem. Objectives: Maximality, Fairness, Importance. Constraints: Visibility, Zoom Consistency, Adjacency.

  15. Problem. Objectives: Maximality, Fairness, Importance. Constraints: Visibility, Zoom Consistency, Adjacency. Optimization.
  16. Integer Programming

  17. Variables

  18. None
  19. Sampling Constraints

  20. Zoom Consistency & Visibility Constraints

  21. Thinning solution

  22. None
  23. Program Size

  24. None
  25. None
  26. None
  27. None
  28. None
  29. Critical nodes

  30. None
  31. Bounded Cover

  32. Critical nodes

  33. Program Size

  34. Relaxing Integer Constraints

  35. Objectives

  36. Maximality

  37. Strong Maximality: There does not exist M’ such that:

  38. K = 1; M1 = { 4, 4, 4, 4, 4 }, M2 = { 1, 3, 4, 4, 4 }, M3 = { 2, 3, 4, 4, 4 }, M4 = { 1, 4, 4, 4, 3 }
  39. Strong Maximality is NP-Hard

  40. Weak Maximality: There does not exist M’ such that: for some 1 <= i <= n
  41. K = 1; M1 = { 4, 4, 4, 4, 4 }, M2 = { 1, 3, 4, 4, 4 }, M3 = { 2, 3, 4, 4, 4 }, M4 = { 1, 4, 4, 4, 3 }
  42. None
  43. DFS

  44. None
  45. None
  46. K = 1; M2 = { 1, 3, 4, 4, 4 }
  47. Point-only Datasets

  48. None
  49. Experiments

  50. 2.67 GHz quad-core; 12 GB (starting at 1 GB, or 4 GB for the scalability tests); Java 1.6; Apache Simplex; K = 500. “Some plots were too big, so we threw them out.”
  51. Program Size

  52. None
  53. Integer Relaxation

  54. Scalability

  55. None
  56. Objectives

  57. None
  58. None
  59. Takeaways

  60. Use DFS if you care only about maximality. Otherwise use the minimised LP. The randomized points-only algorithm consumes constant memory and scales arbitrarily (not shown).
  61. .
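
The transcript names the visibility and zoom-consistency constraints (slides 8 and 20) and a randomized algorithm for point-only datasets (slides 47 and 60), but the slide text does not spell out how they work. The sketch below is one plausible reading, not the authors' implementation. It assumes points in the unit square, a grid of 2**zoom cells per side at each zoom level, and zoom levels numbered from 0 (coarsest). Each point gets a random priority, and every cell shows its K highest-priority points. The names `cell_of`, `thin_points` and `satisfies_visibility` are hypothetical. The sketch also keeps all points in memory, so it does not reproduce the constant-memory claim on slide 60.

```python
import random
from collections import defaultdict


def cell_of(x, y, zoom):
    """Grid cell holding (x, y) in the unit square; 2**zoom cells per side (assumed layout)."""
    n = 2 ** zoom
    return (min(int(x * n), n - 1), min(int(y * n), n - 1))


def thin_points(points, max_zoom, k, seed=0):
    """Return min_zoom[i], the coarsest zoom at which point i is shown.

    A value of max_zoom + 1 means the point is never shown.
    Storing one minimum zoom per point encodes zoom consistency:
    a point is visible at every zoom >= min_zoom[i].
    """
    rng = random.Random(seed)
    priority = [rng.random() for _ in points]  # assumed: uniform random priorities
    min_zoom = [max_zoom + 1] * len(points)
    # Walk from the finest zoom to the coarsest. A point in the top k of a
    # coarse cell is also in the top k of every finer cell that contains it,
    # so the last zoom written for a point is its coarsest visible zoom.
    for zoom in range(max_zoom, -1, -1):
        cells = defaultdict(list)
        for i, (x, y) in enumerate(points):
            cells[cell_of(x, y, zoom)].append(i)
        for members in cells.values():
            top = sorted(members, key=lambda i: priority[i], reverse=True)[:k]
            for i in top:
                min_zoom[i] = zoom
    return min_zoom


def satisfies_visibility(points, min_zoom, max_zoom, k):
    """Check that no cell shows more than k points at any zoom level."""
    for zoom in range(max_zoom + 1):
        counts = defaultdict(int)
        for (x, y), m in zip(points, min_zoom):
            if m <= zoom:
                counts[cell_of(x, y, zoom)] += 1
        if any(c > k for c in counts.values()):
            return False
    return True


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)]
    mz = thin_points(pts, max_zoom=2, k=1)
    print(mz, satisfies_visibility(pts, mz, max_zoom=2, k=1))
```

Because every cell ranks its points by the same priorities, the visible set can only grow as the zoom gets finer, and each point in a cell is equally likely to be shown. That is one way to read the fairness objective on slide 12 for point data.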