From Service to Platform: A Ranking System in Go

What started out as an experimental service to rank livestreams evolved into a platform powering all types of content recommendations. Go's stringent code philosophy paved the way for a modular, pipeline-based system for scatter-gather workflows that lets anyone add new ranking algorithms.

GopherCon Europe 2022, Berlin
https://www.youtube.com/watch?v=5jSyctW1rPg

GopherCon UK 2022, London
https://www.youtube.com/watch?v=TNyoKBLxfTM

Konrad Reiche

August 18, 2022

Transcript

  1. From Service To Platform A Ranking System in Go Konrad

    Reiche r/streetwear r/aww r/dogecoin r/golang r/yoga
  3. Redis Monitor

        1655767305.672853 [0 10.8.144.28:61122] "get" "post:3hg9w"
        1655767305.672862 [0 10.8.144.28:61122] "get" "post:5cm6n9"
        1655767305.672869 [0 10.8.144.28:61122] "get" "post:62s7bk"
        1655767305.672876 [0 10.8.144.28:61122] "get" "post:2zqpf"
        1655767305.672880 [0 10.8.144.28:61122] "get" "post:2fn65o"
        ...

    Columns: timestamp; database and network address; command and key.
  5. Redis Monitor

        1655767305.672853 [0 10.8.144.28:61122] "get" "deleted_posts"
        1655767305.672862 [0 10.8.144.28:61122] "get" "post:5cm6n9"
        1655767305.672869 [0 10.8.144.28:61122] "get" "deleted_posts"
        1655767305.672876 [0 10.8.144.28:61122] "get" "deleted_posts"
        1655767305.672880 [0 10.8.144.28:61122] "get" "post:2fn65o"
        ...
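  9. Parse flags and read the file into memory; parse each line, split it by column and count keys; convert to a slice of tuples, sort and print.

        package main

        import (
            "flag"
            "fmt"
            "log"
            "os"
            "sort"
            "strings"
        )

        // Flag names match the invocation `./find-hotkeys -file monitor.log -n 5`;
        // the default values are assumptions.
        var (
            path    = flag.String("file", "monitor.log", "path to the Redis MONITOR log")
            topKeys = flag.Int("n", 5, "number of keys to print")
        )

        func main() {
            flag.Parse()
            b, err := os.ReadFile(*path)
            if err != nil {
                log.Fatal(err)
            }

            // Count how often each key (fifth space-separated column) occurs.
            countByKey := make(map[string]int)
            for _, line := range strings.Split(string(b), "\n") {
                split := strings.Split(line, " ")
                if len(split) < 5 {
                    continue
                }
                countByKey[split[4]]++
            }

            type keyCount struct {
                key   string
                count int
            }
            counts := make([]keyCount, 0, len(countByKey))
            for key, count := range countByKey {
                counts = append(counts, keyCount{key: key, count: count})
            }
            sort.Slice(counts, func(i, j int) bool {
                return counts[i].count > counts[j].count
            })

            // Print at most topKeys entries, most frequent first.
            for i := 0; i < *topKeys && i < len(counts); i++ {
                fmt.Println(counts[i].count, counts[i].key)
            }
        }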
  10. Redis: Finding Hot Keys

        $ ./find-hotkeys -file monitor.log -n 5
        8252 "deleted_posts"
        1907 "post:2tk95"
        1756 "post:2xcv7"
        772 "post:3nasz"
        509 "post:2qjpg"
  11. UNIX Pipes

    The same can be achieved with existing programs and UNIX pipes.

        $ cat monitor.log | awk '{print $5}' | sort | uniq -c | sort -nr | head -n 5

    • cat monitor.log: output the log
    • awk '{print $5}': print the key column (the fifth whitespace-separated field in the MONITOR lines shown earlier)
    • sort | uniq -c: count unique lines
    • sort -nr: sort numerically, largest first
    • head -n 5: print the first 5 lines

    UNIX Toolbox Philosophy. Write programs that:
    • Do one thing well
    • Compose
    • Easily communicate
  17. What is a ranking (recommendation) system?

    A recommendation system helps users to find content they find compelling.

    1. Candidate Generation: Start from a potentially huge corpus and generate a much smaller subset of candidates.
    2. Filtering: Some candidates should be removed, for example content already watched or content the user marked as something they do not want to consume.
    3. Scoring: Assign scores to sort the candidates.
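The three steps (candidate generation, filtering, scoring) can be sketched as three small functions chained together. This is only an illustration: the `Post` type, the function names and the toy scoring rule (longer IDs score higher) are assumptions made for the example, not code from the talk.

```go
package main

import (
	"fmt"
	"sort"
)

// Post is a hypothetical candidate type used only for this sketch.
type Post struct {
	ID    string
	Score float64
}

// generateCandidates narrows a large corpus down to a small subset.
// Here it simply takes the first n entries.
func generateCandidates(corpus []string, n int) []Post {
	if n > len(corpus) {
		n = len(corpus)
	}
	candidates := make([]Post, 0, n)
	for _, id := range corpus[:n] {
		candidates = append(candidates, Post{ID: id})
	}
	return candidates
}

// filterSeen removes candidates the user has already viewed.
func filterSeen(candidates []Post, seen map[string]bool) []Post {
	var kept []Post
	for _, c := range candidates {
		if !seen[c.ID] {
			kept = append(kept, c)
		}
	}
	return kept
}

// scoreAndSort assigns a score to each candidate and sorts by score,
// highest first.
func scoreAndSort(candidates []Post, score func(Post) float64) []Post {
	for i := range candidates {
		candidates[i].Score = score(candidates[i])
	}
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].Score > candidates[j].Score
	})
	return candidates
}

func main() {
	corpus := []string{"a", "bb", "ccc", "dddd", "eeeee"}
	seen := map[string]bool{"bb": true}
	// Toy scoring function: longer IDs score higher.
	byLength := func(p Post) float64 { return float64(len(p.ID)) }

	ranked := scoreAndSort(filterSeen(generateCandidates(corpus, 4), seen), byLength)
	for _, p := range ranked {
		fmt.Println(p.ID, p.Score)
	}
}
```

In a real system the scoring function would typically be a model call, as in the ranking service example in the following slides.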
  26. Example: Ranking Service

    [Diagram: Popular Posts, Video Posts → Fetch posts (remove duplicates) → Filter posts (User Post Views) → Score posts (Model)]
  27. Example: Ranking Service

        func (s *Service) GetPopularFeed(ctx context.Context, req *pb.FeedRequest) (*pb.PopularFeed, error) {
            posts, err := s.fetchPopularAndVideoPosts(ctx)
            if err != nil {
                return nil, err
            }
            posts = s.filterPosts(posts)
            posts, scores, err := s.model.ScorePosts(ctx, req.UserID, posts)
            if err != nil {
                return nil, err
            }
            posts = s.sortPosts(posts, scores)
            return pb.NewPopularFeed(posts), nil
        }
  36. Example: Ranking Service

        func (s *Service) GetPopularFeed(ctx context.Context, req *pb.FeedRequest) (*pb.PopularFeed, error) {
            posts, err := s.fetchPopularAndVideoPosts(ctx)
            if err != nil {
                return nil, err
            }
            imagePosts, err := s.cache.FetchImagePosts(ctx)
            if err != nil {
                return nil, err
            }
            posts = s.filterPosts(posts, imagePosts)
            posts, scores, err := s.model.ScorePosts(ctx, req.UserID, posts)
            if err != nil {
                return nil, err
            }
            posts = s.sortPosts(posts, scores)
            return pb.NewPopularFeed(posts), nil
        }
  37. UNIX Toolbox Philosophy

    [Diagram: Candidate Generation → Filter → Score]
  41. UNIX Toolbox Philosophy

    [Diagram: Stage 1 → Stage 2 → …]

        type Stage interface {
            Rank(ctx context.Context, req *pb.Request) (*pb.Request, error)
        }

        type Request struct {
            Context    *Entity
            Candidates []*Entity
        }
  45. Request and Entity

        type Request struct {
            Context    *Entity
            Candidates []*Entity
        }

        type Entity struct {
            ID       string
            Features map[string]*Feature
            Score    float64
        }
  51. gRPC Protobuf Definition

        syntax = "proto3";

        service Ranking {
          rpc Rank (Request) returns (Request);
        }

        message Request {
          Entity context = 1;
          repeated Entity candidates = 2;
          RequestOptions options = 3;
        }
  52. gRPC Protobuf Definition

        message Entity {
          string id = 1;
          map<string, Feature> features = 2;
          double score = 3;
        }

        message Feature {
          oneof value {
            string as_string = 1;
            int64 as_int = 2;
            double as_float = 3;
            bool as_bool = 4;
            // ...
          }
        }
  53. gRPC Protobuf Request Example

        context: {
          id: "t2_bd5ts"
          features: {
            key: "geo_city"
            value: { as_string: "SAN_FRANCISCO" }
          }
          features: {
            key: "geo_country"
            value: { as_string: "US" }
          }
        }
        options: {
          method: "rank_popular_feed"
          limit: 20
        }
  54. gRPC Service

        type server struct {
            *grpc.Server
            stage stage.Stage
        }

        func (s *server) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            return s.stage.Rank(ctx, req)
        }
  55. A General-Purpose Ranking Service

    Quickly and flexibly perform complex scatter-gather ranking workflows at Reddit
  62. A General-Purpose Ranking Service

    [Diagram of the pipeline. Stages: Fetch Popular Posts, Fetch Video Posts, Fetch Image Posts, Merge Candidates, Fetch Recently Viewed Posts, Filter Recently Viewed Posts, Score Candidates, Sort Candidates. Meta-stages: Series, Parallel. Legend: Candidates, Features, Filtering, Meta-Stages.]
  64. Stage: Fetch Popular Posts

        type fetchPopularPosts struct {
            cache *store.PostCache
        }

        func FetchPopularPosts(cache *store.PostCache) *fetchPopularPosts {
            return &fetchPopularPosts{cache: cache}
        }

        func (s *fetchPopularPosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            postIDs, err := s.cache.FetchPopularPostIDs(ctx)
            if err != nil {
                return nil, err
            }
            // Append one candidate per popular post ID.
            for _, id := range postIDs {
                req.Candidates = append(req.Candidates, pb.NewCandidate(id))
            }
            return req, nil
        }
  70. Stage: Filtering Recently Viewed Posts

        func (s *filterRecentlyViewedPosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            seen := req.Context.Features["recently_viewed_post_ids"].GetAsBoolMap()
            var filtered []*pb.Entity
            for _, candidate := range req.Candidates {
                if !seen[candidate.Id] {
                    filtered = append(filtered, candidate)
                }
            }
            req.Candidates = filtered
            return req, nil
        }
  72. Stage: Filtering Recently Viewed Posts

        func (s *filterRecentlyViewedPosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            seen := req.Context.Features["recently_viewed_post_ids"].GetAsBoolMap()
            n := 0
            for _, candidate := range req.Candidates {
                if !seen[candidate.Id] {
                    req.Candidates[n] = candidate
                    n++
                }
            }
            req.Candidates = req.Candidates[:n] // in-place filtering
            return req, nil
        }
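The in-place filtering idiom from this slide can be tried in isolation. The sketch below applies it to a plain `[]string`; `filterInPlace` is a hypothetical helper name, not part of the talk's code.

```go
package main

import "fmt"

// filterInPlace keeps the elements for which keep returns true. It reuses
// the backing array of ids instead of allocating a new slice.
func filterInPlace(ids []string, keep func(string) bool) []string {
	n := 0
	for _, id := range ids {
		if keep(id) {
			ids[n] = id // n never exceeds the read position, so this is safe
			n++
		}
	}
	return ids[:n]
}

func main() {
	seen := map[string]bool{"post:2": true}
	ids := []string{"post:1", "post:2", "post:3"}
	ids = filterInPlace(ids, func(id string) bool { return !seen[id] })
	fmt.Println(ids) // [post:1 post:3]
}
```

The trade-off is that the caller's original slice is overwritten, so this only fits when no one else still needs the unfiltered contents.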
  75. Meta-Stage: Series

        type series struct {
            stages []Stage
        }

        func Series(stages ...Stage) *series {
            return &series{stages: stages}
        }

        func (s *series) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            var err error
            resp := req
            for _, stage := range s.stages {
                resp, err = stage.Rank(ctx, req)
                if err != nil {
                    return nil, err
                }
                req = resp
            }
            return resp, nil
        }
  80. Meta-Stage: Parallel

        func (s *parallel) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            resps := make([]*pb.Request, len(s.stages))
            g, groupCtx := errgroup.WithContext(ctx)
            for i := range s.stages {
                i := i
                g.Go(func() error {
                    defer log.CapturePanic(groupCtx)
                    resp, err := s.stages[i].Rank(groupCtx, pb.Copy(req))
                    if err != nil {
                        return err
                    }
                    resps[i] = resp
                    return nil
                })
            }
            if err := g.Wait(); err != nil {
                return nil, err
            }
            return s.merge(ctx, req, resps...)
        }

    errgroup is golang.org/x/sync/errgroup. g.Go calls the function in a goroutine; the first non-nil error to be returned cancels the group.
  86. Meta-Stage: If-Else

        type Selector func(context.Context, *pb.Request) bool

        func (s *ifElse) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) {
            if s.selector(ctx, req) {
                return s.ifStage.Rank(ctx, req)
            }
            return s.elseStage.Rank(ctx, req)
        }
  88. A General-Purpose Ranking Service

        func PopularFeed(d *service.Dependencies) stage.Stage {
            return stage.Series(
                stage.Parallel(merger.MergeCandidates,
                    stage.FetchPopularPosts(d.PostCache),
                    stage.FetchVideoPosts(d.PostCache),
                    stage.FetchImagePosts(d.PostCache),
                ),
                stage.FetchRecentlyViewedPosts(d.UserPostViews),
                stage.FilterRecentlyViewedPosts(),
                stage.ScoreCandidates(d.RankingModel),
                stage.SortCandidates(),
            )
        }
  90. From Service to Platform

    When does a service become a platform?

    • Service: customers are the users / product management
    • Platform: customers are engineers developing on the service
  91. What is your API?

    Path: /api/user/{id}.json
    Method: GET
    Description: Get a user object
  98. What is your API?

        package user

        type User struct {
            ID          string
            Name        string
            permissions []string
        }

    • Start with unexported identifiers
    • Make decisions for exporting identifiers based on how you want them to be consumed
  99. A General-Purpose Ranking Service

    Quickly and flexibly perform complex scatter-gather ranking workflows at Reddit func PopularFeed(d *service.Dependencies) stage.Stage { return stage.Series( stage.Parallel(merger.MergeCandidates, stage.FetchPopularPosts(d.PostCache), stage.FetchVideoPosts(d.PostCache), stage.FetchImagePosts(d.PostCache), ), stage.FetchRecentlyViewedPosts(d.UserPostViews), stage.FilterRecentlyViewedPosts(), stage.ScoreCandidates(d.RankingModel), stage.SortCandidates(), ) }
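The composition in PopularFeed relies on meta-stages such as Series. The following is a rough sketch of how a Series stage could work, not the talk's implementation. Request is a simplified stand-in for pb.Request, and fetchFixed and sortCandidates are hypothetical stages invented for the example.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// Request is a simplified, hypothetical stand-in for the talk's pb.Request.
type Request struct {
	Candidates []string
}

// Stage is the single-method interface every pipeline step implements.
type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

// series runs its stages in order, feeding each output into the next stage.
type series struct{ stages []Stage }

// Series is a sketch of the meta-stage used on the slide.
func Series(stages ...Stage) Stage { return &series{stages: stages} }

func (s *series) Rank(ctx context.Context, req *Request) (*Request, error) {
	for _, st := range s.stages {
		next, err := st.Rank(ctx, req)
		if err != nil {
			return nil, err // stop at the first failing stage
		}
		req = next
	}
	return req, nil
}

// fetchFixed appends a fixed list of IDs (stands in for a cache lookup).
type fetchFixed struct{ ids []string }

func (f fetchFixed) Rank(_ context.Context, req *Request) (*Request, error) {
	req.Candidates = append(req.Candidates, f.ids...)
	return req, nil
}

// sortCandidates orders candidates alphabetically (stands in for score ordering).
type sortCandidates struct{}

func (sortCandidates) Rank(_ context.Context, req *Request) (*Request, error) {
	sort.Strings(req.Candidates)
	return req, nil
}

// runPipeline builds and executes a two-stage pipeline.
func runPipeline() ([]string, error) {
	p := Series(fetchFixed{ids: []string{"post:c", "post:a", "post:b"}}, sortCandidates{})
	resp, err := p.Rank(context.Background(), &Request{})
	if err != nil {
		return nil, err
	}
	return resp.Candidates, nil
}

func main() {
	ids, err := runPipeline()
	fmt.Println(ids, err) // prints: [post:a post:b post:c] <nil>
}
```

Because Series itself satisfies Stage, it can be nested inside other meta-stages, which is what makes the declarative style on the slide possible.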
  100. Transition to Platform

    Ranking Service Livestream Feed
  101. Transition to Platform

    Ranking Service Livestream Feed …
  102. Transition to Platform

    Ranking Service Livestream Feed Popular Feed …
  103. Transition to Platform

    Ranking Service Livestream Feed Home Feed Popular Feed …
  104. Transition to Platform

    Ranking Service Livestream Feed Home Feed Popular Feed … …
  105. Transition to Platform

    Ranking Service Livestream Feed Home Feed Popular Feed … Ranking Platform …
  106. Maintaining the Platform: package stage

    Series, Parallel, Shuffle Candidates, Fetch Posts, Filter By Feature, …, If-Else, Fetch Subscriptions
  107. Maintaining the Platform: package stage

    Series, Parallel, Shuffle Candidates, Fetch Posts, Filter By Feature, …, If-Else, Fetch Subscriptions, Filter Private Posts, Filter by Timestamp, Shuffle Topics, …
  108. Creating A Toolbox: Four Principles for a Platform API

    Limited Scope, Clear Naming, Decoupling, Strive for Reuse
  109. Applying the Four Principles const shuffleProbability = 0.2 type imagePosts

    struct { cache *store.PostCache } func (s *imagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature("subreddit_ids") postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } s.shufflePostIDs(postIDs, shuffleProbability) for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil }
  112. Applying the Four Principles const shuffleProbability = 0.2 type imagePosts

    struct { cache *store.PostCache } func (s *imagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature("subreddit_ids") postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } s.shufflePostIDs(postIDs, shuffleProbability) for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Clear Naming
  113. Applying the Four Principles const shuffleProbability = 0.2 type fetchImagePosts

    struct { cache *store.PostCache } func (s *fetchImagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature("subreddit_ids") postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } s.shufflePostIDs(postIDs, shuffleProbability) for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Clear Naming
  114. Applying the Four Principles const shuffleProbability = 0.2 type fetchImagePosts

    struct { cache *store.PostCache } func (s *fetchImagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature("subreddit_ids") postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } s.shufflePostIDs(postIDs, shuffleProbability) for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Limited Scope
  115. Applying the Four Principles type fetchImagePosts struct { cache *store.PostCache

    } func (s *fetchImagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature("subreddit_ids") postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Limited Scope
  116. Applying the Four Principles type fetchImagePosts struct { cache *store.PostCache

    } func (s *fetchImagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature("subreddit_ids") postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Strive for Reuse
  118. Applying the Four Principles type fetchImagePosts struct { cache *store.PostCache

    subredditFeature string } func (s *fetchImagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature(s.subredditFeature) postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs) if err != nil { return nil, err } for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Decoupling
  119. Applying the Four Principles type fetchImagePosts struct { cache *store.PostCache

    subredditFeature string } func (s *fetchImagePosts) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { subredditIDs := req.Context.GetStringArrayFeature(s.subredditFeature) postIDs, err := s.cache.FetchImagePosts(ctx, subredditIDs...) if err != nil { return nil, err } for _, postID := range postIDs { req.Candidates = append(req.Candidates, pb.Candidate(postID)) } return req, nil } Decoupling
  120. Functions Implementing Interfaces “The bigger the interface, the weaker the

    abstraction” ― Rob Pike type RankFunc func(context.Context, *pb.Request) (*pb.Request, error)
  121. Functions Implementing Interfaces “The bigger the interface, the weaker the

    abstraction” ― Rob Pike type RankFunc func(context.Context, *pb.Request) (*pb.Request, error) func (f RankFunc) Rank(ctx context.Context, req *pb.Request) (*pb.Request, error) { return f(ctx, req) }
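The adapter above follows the same pattern as http.HandlerFunc in the standard library: a function type gets a method so that any function with the right signature satisfies the interface. A self-contained sketch, assuming a simplified Request in place of pb.Request and a hypothetical upperCase function:

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Request is a simplified, hypothetical stand-in for pb.Request.
type Request struct {
	Candidates []string
}

// Stage is the single-method interface.
type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

// RankFunc adapts an ordinary function to the Stage interface.
type RankFunc func(context.Context, *Request) (*Request, error)

// Rank calls the function itself.
func (f RankFunc) Rank(ctx context.Context, req *Request) (*Request, error) {
	return f(ctx, req)
}

// Compile-time check that RankFunc satisfies Stage.
var _ Stage = RankFunc(nil)

// upperCase is a plain function with the matching signature.
func upperCase(_ context.Context, req *Request) (*Request, error) {
	for i, c := range req.Candidates {
		req.Candidates[i] = strings.ToUpper(c)
	}
	return req, nil
}

// runStage executes any Stage against the given candidate IDs.
func runStage(s Stage, ids ...string) ([]string, error) {
	resp, err := s.Rank(context.Background(), &Request{Candidates: ids})
	if err != nil {
		return nil, err
	}
	return resp.Candidates, nil
}

func main() {
	// The conversion RankFunc(upperCase) is all that is needed to get a Stage.
	ids, err := runStage(RankFunc(upperCase), "a", "b")
	fmt.Println(ids, err) // prints: [A B] <nil>
}
```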
  123. func Pipeline(d *service.Dependencies) stage.Stage { return stage.Series( stage.FetchSubscriptions(d.SubscriptionService), stage.FetchPosts(d.Cache), stage.RankFunc(func(ctx context.Context,

    req *pb.Request) (*pb.Request, error) { if req.Context.Features["geo_country"] == "uk" { // ... } return req, nil }), stage.ShufflePosts(0.2), ) } Anonymous Interface Implementations
  124. Middlewares type Middleware func(stage Stage) Stage func ExampleMiddleware(next stage.Stage) stage.Stage

    { return stage.RankFunc(func(ctx context.Context, req *pb.Request) (*pb.Request, error) { // ... return next.Rank(ctx, req) }) }
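Since a middleware takes a stage and returns a stage, several of them can be stacked. The sketch below shows one possible way to do that and the resulting execution order. Chain and trace are hypothetical helpers, and Request is a simplified stand-in for pb.Request that records a trace instead of candidates.

```go
package main

import (
	"context"
	"fmt"
)

// Request is a hypothetical stand-in for pb.Request that records call order.
type Request struct {
	Trace []string
}

type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

type RankFunc func(context.Context, *Request) (*Request, error)

func (f RankFunc) Rank(ctx context.Context, req *Request) (*Request, error) {
	return f(ctx, req)
}

// Middleware wraps a stage with additional behavior.
type Middleware func(Stage) Stage

// Chain (hypothetical) wraps s so that the first middleware listed runs outermost.
func Chain(s Stage, mws ...Middleware) Stage {
	for i := len(mws) - 1; i >= 0; i-- {
		s = mws[i](s)
	}
	return s
}

// trace returns a middleware that records when it is entered and left.
func trace(name string) Middleware {
	return func(next Stage) Stage {
		return RankFunc(func(ctx context.Context, req *Request) (*Request, error) {
			req.Trace = append(req.Trace, name+":before")
			resp, err := next.Rank(ctx, req)
			if resp != nil {
				resp.Trace = append(resp.Trace, name+":after")
			}
			return resp, err
		})
	}
}

// runChain wraps a core stage in two middlewares and returns the recorded order.
func runChain() []string {
	core := RankFunc(func(_ context.Context, req *Request) (*Request, error) {
		req.Trace = append(req.Trace, "core")
		return req, nil
	})
	resp, _ := Chain(core, trace("log"), trace("monitor")).Rank(context.Background(), &Request{})
	return resp.Trace
}

func main() {
	fmt.Println(runChain())
	// prints: [log:before monitor:before core monitor:after log:after]
}
```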
  125. func Monitor(next stage.Stage) stage.Stage { return stage.RankFunc(func(ctx context.Context, req *pb.Request)

    (*pb.Request, error) { startedAt := time.Now() method := req.Options.Method defer func() { stageLatencySeconds.With(prometheus.Labels{ methodLabel: method, stageLabel: stage.Name(next), }).Observe(time.Since(startedAt).Seconds()) }() return next.Rank(ctx, req) }) } Middleware: Monitor
  126. func Monitor(next stage.Stage) stage.Stage { return stage.RankFunc(func(ctx context.Context, req *pb.Request)

    (*pb.Request, error) { startedAt := time.Now() method := req.Options.Method defer func() { stageLatencySeconds.With(prometheus.Labels{ methodLabel: method, stageLabel: stage.Name(next), }).Observe(time.Since(startedAt).Seconds()) }() return next.Rank(ctx, req) }) } Middleware: Monitor Record elapsed time in deferred statement Delegate to underlying stage
  127. Middleware: Log func Log(next stage.Stage) stage.Stage { return stage.RankFunc(func(ctx context.Context,

    req *pb.Request) (resp *pb.Request, err error) { defer func() { if err != nil { log.Errorw( "stage failed", "error", err, "request", req.JSON(), "response", resp.JSON(), "stage", stage.Name(next), ) } }() return next.Rank(ctx, req) }) }
  128. Middleware: Log func Log(next stage.Stage) stage.Stage { return stage.RankFunc(func(ctx context.Context,

    req *pb.Request) (resp *pb.Request, err error) { defer func() { if err != nil { log.Errorw( "stage failed", "error", err, "request", req.JSON(), "response", resp.JSON(), "stage", stage.Name(next), ) } }() return next.Rank(ctx, req) }) } func (r *Request) JSON() string { b, err := protojson.Marshal(r) if err != nil { log.Error(err) return "" } return string(b) }
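The Log middleware depends on named result parameters: the deferred closure can read err only because the return statement assigns to it before deferred functions run. A reduced sketch of that mechanism, assuming a hypothetical logErrors middleware that writes to a slice instead of a logger and a simplified Request type:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Request is a simplified, hypothetical stand-in for pb.Request.
type Request struct {
	Candidates []string
}

type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

type RankFunc func(context.Context, *Request) (*Request, error)

func (f RankFunc) Rank(ctx context.Context, req *Request) (*Request, error) {
	return f(ctx, req)
}

type Middleware func(Stage) Stage

// logErrors (hypothetical) appends a message to sink whenever the wrapped stage fails.
func logErrors(sink *[]string) Middleware {
	return func(next Stage) Stage {
		return RankFunc(func(ctx context.Context, req *Request) (resp *Request, err error) {
			// Named results let the deferred closure see the values being returned.
			defer func() {
				if err != nil {
					*sink = append(*sink, "stage failed: "+err.Error())
				}
			}()
			return next.Rank(ctx, req)
		})
	}
}

// runLogged runs one failing and one succeeding stage through the middleware.
func runLogged() []string {
	var logs []string
	failing := RankFunc(func(context.Context, *Request) (*Request, error) {
		return nil, errors.New("cache unavailable")
	})
	ok := RankFunc(func(_ context.Context, req *Request) (*Request, error) {
		return req, nil
	})
	mw := logErrors(&logs)
	_, _ = mw(failing).Rank(context.Background(), &Request{})
	_, _ = mw(ok).Rank(context.Background(), &Request{})
	return logs
}

func main() {
	fmt.Println(runLogged()) // prints: [stage failed: cache unavailable]
}
```

With unnamed results the deferred function would have no variable to inspect, so the error would have to be captured in a local before returning.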
  129. Middleware: Feature Flags for Incident Mitigation func FeatureFlag(next stage.Stage) stage.Stage

    { return stage.RankFunc(func(ctx context.Context, req *pb.Request) (*pb.Request, error) { key := "feature_flag.stage." + stage.Name(next) if !liveconfig.GetBool(key) { return req, nil } return next.Rank(ctx, req) }) }
  130. Middleware: Feature Flags for Incident Mitigation func FeatureFlag(next stage.Stage) stage.Stage

    { return stage.RankFunc(func(ctx context.Context, req *pb.Request) (*pb.Request, error) { key := "feature_flag.stage." + stage.Name(next) if !liveconfig.GetBool(key) { return req, nil } return next.Rank(ctx, req) }) } Skip stage entirely
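A runnable approximation of the feature-flag idea. The flags map is a hypothetical in-memory replacement for liveconfig, and the stage name is passed in explicitly because stage.Name is not shown in the deck. When the flag is off, the request passes through unchanged and the wrapped stage never runs.

```go
package main

import (
	"context"
	"fmt"
)

// Request is a simplified, hypothetical stand-in for pb.Request.
type Request struct {
	Candidates []string
}

type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

type RankFunc func(context.Context, *Request) (*Request, error)

func (f RankFunc) Rank(ctx context.Context, req *Request) (*Request, error) {
	return f(ctx, req)
}

type Middleware func(Stage) Stage

// flags is a hypothetical in-memory stand-in for a live configuration service.
type flags map[string]bool

// featureFlag skips the wrapped stage unless its flag is enabled.
func featureFlag(cfg flags, name string) Middleware {
	key := "feature_flag.stage." + name
	return func(next Stage) Stage {
		return RankFunc(func(ctx context.Context, req *Request) (*Request, error) {
			if !cfg[key] {
				return req, nil // flag off: hand the request back untouched
			}
			return next.Rank(ctx, req)
		})
	}
}

// runFlagged runs a fetch stage behind a flag and returns the candidates.
func runFlagged(enabled bool) []string {
	cfg := flags{"feature_flag.stage.FetchImagePosts": enabled}
	fetch := RankFunc(func(_ context.Context, req *Request) (*Request, error) {
		req.Candidates = append(req.Candidates, "image:1")
		return req, nil
	})
	s := featureFlag(cfg, "FetchImagePosts")(fetch)
	resp, _ := s.Rank(context.Background(), &Request{})
	return resp.Candidates
}

func main() {
	fmt.Println(runFlagged(true), runFlagged(false)) // prints: [image:1] []
}
```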
  131. A Framework for Refactoring • Being forced to write your

    business logic into small components helped to limit accidental complexity, but it did not eliminate the need for refactoring • Our ability to refactor increased because the stage interface provides a framework • This framework requires a set of principles for designing those components • Any code contributed to middlewares or meta-stages pays off due to their multiplier effect
  132. • Platform-centric thinking starts with the first developers outside of

    our team contributing code—this is not something you can simulate. • Providing an opinionated framework will create friction. • Ways to resolve confusion or disagreement: 1. Enforce the existing design 2. Quick-and-dirty workaround 3. Rethink the existing design Platform Building is a Discourse
  133. Meta-Stage: Parallel func (s *parallel) Rank(ctx context.Context, req *pb.Request) (*pb.Request,

    error) { resps := make([]*pb.Request, len(s.stages)) g, groupCtx := errgroup.WithContext(ctx) for i := range s.stages { i := i g.Go(func() error { defer log.CapturePanic(groupCtx) resp, err := s.stages[i].Rank(groupCtx, pb.Copy(req)) if err != nil { return err } resps[i] = resp return nil }) } if err := g.Wait(); err != nil { return nil, err } return s.merge(ctx, req, resps...) }
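The slide uses errgroup from golang.org/x/sync. To stay within the standard library, the sketch below swaps in sync.WaitGroup, so unlike errgroup.WithContext it does not cancel sibling stages on the first error. copyRequest stands in for pb.Copy and the concatenating merge is a naive placeholder for the talk's merger; both are assumptions.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Request is a simplified, hypothetical stand-in for pb.Request.
type Request struct {
	Candidates []string
}

type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

type RankFunc func(context.Context, *Request) (*Request, error)

func (f RankFunc) Rank(ctx context.Context, req *Request) (*Request, error) {
	return f(ctx, req)
}

// copyRequest stands in for pb.Copy so that stages never share a slice.
func copyRequest(r *Request) *Request {
	return &Request{Candidates: append([]string(nil), r.Candidates...)}
}

// parallel runs all stages concurrently on private copies of the request.
type parallel struct{ stages []Stage }

func (p *parallel) Rank(ctx context.Context, req *Request) (*Request, error) {
	resps := make([]*Request, len(p.stages))
	errs := make([]error, len(p.stages))
	var wg sync.WaitGroup
	for i, st := range p.stages {
		wg.Add(1)
		go func(i int, st Stage) {
			defer wg.Done()
			// Each goroutine writes only to its own index, so no mutex is needed.
			resps[i], errs[i] = st.Rank(ctx, copyRequest(req))
		}(i, st)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	// Naive merge: concatenate results in stage order for deterministic output.
	out := &Request{}
	for _, r := range resps {
		if r != nil {
			out.Candidates = append(out.Candidates, r.Candidates...)
		}
	}
	return out, nil
}

// fetch returns a stage that appends the given IDs.
func fetch(ids ...string) Stage {
	return RankFunc(func(_ context.Context, req *Request) (*Request, error) {
		req.Candidates = append(req.Candidates, ids...)
		return req, nil
	})
}

func runParallel() ([]string, error) {
	p := &parallel{stages: []Stage{fetch("popular:1"), fetch("video:1"), fetch("image:1")}}
	resp, err := p.Rank(context.Background(), &Request{})
	if err != nil {
		return nil, err
	}
	return resp.Candidates, nil
}

func main() {
	ids, err := runParallel()
	fmt.Println(ids, err) // prints: [popular:1 video:1 image:1] <nil>
}
```

Writing results into a pre-sized slice by index, as the slide also does, keeps the merge order independent of which goroutine finishes first.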
  134. What’s next? No-Code Abstraction --- name: PopularFeed stages: -

    name: Series stages: - name: Parallel stages: - name: FetchPopularPosts - name: FetchVideoPosts - name: FetchImagePosts - name: FetchRecentlyViewedPosts - name: FilterRecentlyViewedPosts - name: ScoreCandidates - name: SortCandidates
  135. What’s next? Pipelines at Runtime --- name: PopularFeed stages:

    - name: Series stages: - name: Parallel stages: - name: FetchPopularPosts - name: FetchVideoPosts - name: FetchImagePosts - name: FetchRecentlyViewedPosts - name: FilterRecentlyViewedPosts - name: ScoreCandidates - name: SortCandidates
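One way such a configuration could be turned into a pipeline at runtime is a registry that maps stage names to constructors. This is a speculative sketch, not something the talk describes: Config, registry, Build and appendID are all hypothetical, and YAML parsing is left out so the example only needs the standard library.

```go
package main

import (
	"context"
	"fmt"
)

// Request is a simplified, hypothetical stand-in for pb.Request.
type Request struct {
	Candidates []string
}

type Stage interface {
	Rank(ctx context.Context, req *Request) (*Request, error)
}

type RankFunc func(context.Context, *Request) (*Request, error)

func (f RankFunc) Rank(ctx context.Context, req *Request) (*Request, error) {
	return f(ctx, req)
}

// Config mirrors the shape of the YAML: a name plus optional nested stages.
type Config struct {
	Name   string
	Stages []Config
}

// constructor builds a stage from its already-built children.
type constructor func(children []Stage) Stage

// series runs the children in order.
func series(children []Stage) Stage {
	return RankFunc(func(ctx context.Context, req *Request) (*Request, error) {
		for _, c := range children {
			next, err := c.Rank(ctx, req)
			if err != nil {
				return nil, err
			}
			req = next
		}
		return req, nil
	})
}

// appendID returns a leaf constructor whose stage appends a fixed ID.
func appendID(id string) constructor {
	return func([]Stage) Stage {
		return RankFunc(func(_ context.Context, req *Request) (*Request, error) {
			req.Candidates = append(req.Candidates, id)
			return req, nil
		})
	}
}

// registry maps configuration names to constructors (hypothetical contents).
var registry = map[string]constructor{
	"Series":            series,
	"FetchPopularPosts": appendID("popular"),
	"FetchVideoPosts":   appendID("video"),
}

// Build turns a Config tree into a Stage tree, failing on unknown names.
func Build(c Config) (Stage, error) {
	ctor, ok := registry[c.Name]
	if !ok {
		return nil, fmt.Errorf("unknown stage %q", c.Name)
	}
	children := make([]Stage, 0, len(c.Stages))
	for _, child := range c.Stages {
		s, err := Build(child)
		if err != nil {
			return nil, err
		}
		children = append(children, s)
	}
	return ctor(children), nil
}

func runBuild() ([]string, error) {
	cfg := Config{Name: "Series", Stages: []Config{{Name: "FetchPopularPosts"}, {Name: "FetchVideoPosts"}}}
	s, err := Build(cfg)
	if err != nil {
		return nil, err
	}
	resp, err := s.Rank(context.Background(), &Request{})
	if err != nil {
		return nil, err
	}
	return resp.Candidates, nil
}

func main() {
	ids, err := runBuild()
	fmt.Println(ids, err) // prints: [popular video] <nil>
}
```

Validating names at build time means a misspelled stage in the configuration fails when the pipeline is loaded rather than when a request arrives.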
  136. Summary • Essential complexity describes a problem at its core,

    accidental complexity happens as part of solving the problem, creating unnecessary challenges; accidental complexity can be reduced • A recommendation system consists of a variety of different ranking flows with the goal to generate content that users find compelling • UNIX pipes as an inspiration to build a system of small, reusable components with the help of the single-method interface in Go • Reusability and clarity are competing concepts. This is Go: choose clarity.