Going Infinite, handling 1M websockets connections in Go

Twistlock
February 11, 2019

Twistlock's Eran Yanay presented this deck at #GopherConIL in February 2019



Transcript

  1. The goal

    Developing a high-load Go server that is able to manage millions of concurrent connections
    • How to write a webserver in Go?
    • How to handle persistent connections?
    • What limitations arise at scale?
    • How to handle persistent connections efficiently?
      ◦ OS limitations
      ◦ Hardware limitations
  2. How a Go web server works?

        package main

        import (
            "io"
            "net/http"
        )

        func main() {
            http.HandleFunc("/", hello)
            http.ListenAndServe(":8000", nil)
        }

        func hello(w http.ResponseWriter, r *http.Request) {
            io.WriteString(w, "Hello Gophercon!")
        }
  4. How a Go web server works?

        // Serve accepts incoming connections on the Listener l, creating a
        // new service goroutine for each. The service goroutines read requests and
        // then call srv.Handler to reply to them.
        func (srv *Server) Serve(l net.Listener) error {
            // ...
            for {
                rw, e := l.Accept()
                // ...
                c := srv.newConn(rw)
                c.setState(c.rwc, StateNew) // before Serve can return
                go c.serve(ctx)
            }
        }
  5. How a Go web server works?

        // Serve accepts incoming connections on the Listener l, creating a
        // new service goroutine for each. The service goroutines read requests and
        // then call srv.Handler to reply to them.
        func (srv *Server) Serve(l net.Listener) error {
            // ...
            for {
                rw, e := l.Accept()
                // ...
                c := srv.newConn(rw)
                c.setState(c.rwc, StateNew) // before Serve can return
                go c.serve(ctx)
            }
        }

        func hello(w http.ResponseWriter, r *http.Request) {
            io.WriteString(w, "Hello Gophercon!")
        }
  6. The need for persistent connections

    • Message queues
    • Chat applications
    • Notifications
    • Social feeds
    • Collaborative editing
    • Location updates
  7. What is a websocket?

    WebSockets maintain a full-duplex, persistent connection between a client and a server, over which either party can start sending data at any time, with low overhead and latency.

    Client:
        GET ws://websocket.example.com/ HTTP/1.1
        Connection: Upgrade
        Host: websocket.example.com
        Upgrade: websocket

    Server:
        HTTP/1.1 101 WebSocket Protocol Handshake
        Connection: Upgrade
        Upgrade: WebSocket
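The handshake on the slide is abbreviated. Under RFC 6455 the client also sends a `Sec-WebSocket-Key` header, and the server proves it understood the upgrade by answering with a `Sec-WebSocket-Accept` value derived from that key. A minimal sketch of the derivation, using only the standard library (the function name `acceptKey` is illustrative and not from the talk):

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// websocketGUID is the fixed GUID that RFC 6455 appends to the client key.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey derives the Sec-WebSocket-Accept header value from the
// client's Sec-WebSocket-Key: base64(SHA-1(key + GUID)).
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Sample key taken from the example in RFC 6455, section 1.3.
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="))
	// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
}
```

Libraries such as gorilla/websocket and gobwas/ws perform this step internally during the upgrade, so application code rarely needs it directly.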
  8. Websockets in Go

        func ws(w http.ResponseWriter, r *http.Request) {
            // Upgrade connection
            upgrader := websocket.Upgrader{}
            conn, err := upgrader.Upgrade(w, r, nil)
            if err != nil {
                return
            }

            for {
                _, msg, err := conn.ReadMessage()
                if err != nil {
                    log.Printf("Failed to read message %v", err)
                    conn.Close()
                    return
                }
                log.Println(string(msg))
            }
        }

        func main() {
            http.HandleFunc("/", ws)
            http.ListenAndServe(":8000", nil)
        }
  9. Too many open files

    • Every socket is represented by a file descriptor
    • The OS needs memory to manage each open file
    • Memory is a limited resource
    • The maximum number of open files can be changed via ulimits
  10. Resources limit

    Ulimit provides control over the resources available to processes
    • The kernel enforces the soft limit for the corresponding resource
    • The hard limit acts as a ceiling for the soft limit
    • An unprivileged process can raise its soft limit only up to the hard limit
    • A privileged process can make arbitrary changes to either limit
    • RLIMIT_NOFILE is the resource that caps the number of open files
  11. Resources limit in Go

        func SetUlimit() error {
            var rLimit syscall.Rlimit
            if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rLimit); err != nil {
                return err
            }
            rLimit.Cur = rLimit.Max
            return syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rLimit)
        }
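To see the effect of raising the soft limit, the same calls can be wrapped so that the values before and after are returned. This is a sketch assuming Linux; the helper name `raiseNoFileLimit` is hypothetical and not part of the talk's code:

```go
package main

import (
	"fmt"
	"log"
	"syscall"
)

// raiseNoFileLimit raises the soft RLIMIT_NOFILE limit to the hard limit
// and returns the limits observed before and after the change.
func raiseNoFileLimit() (before, after syscall.Rlimit, err error) {
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &before); err != nil {
		return before, after, err
	}
	want := before
	want.Cur = want.Max // the soft limit may go up to the hard limit
	if err = syscall.Setrlimit(syscall.RLIMIT_NOFILE, &want); err != nil {
		return before, after, err
	}
	// Read the limit back to confirm what the kernel actually applied.
	err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &after)
	return before, after, err
}

func main() {
	before, after, err := raiseNoFileLimit()
	if err != nil {
		log.Fatalf("could not raise RLIMIT_NOFILE: %v", err)
	}
	fmt.Printf("soft limit: %d -> %d (hard limit %d)\n", before.Cur, after.Cur, after.Max)
}
```

An unprivileged process cannot go beyond the hard limit this way. Reaching a million descriptors usually also requires raising the hard limit itself, which is done outside the program.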
  12. pprof

    Package pprof serves via its HTTP server runtime profiling data in the format expected by the pprof visualization tool.
    • Analyze heap memory: go tool pprof http://localhost:6060/debug/pprof/heap
    • Analyze goroutines: go tool pprof http://localhost:6060/debug/pprof/goroutine

        import _ "net/http/pprof"

        go func() {
            if err := http.ListenAndServe("localhost:6060", nil); err != nil {
                log.Fatalf("Pprof failed: %v", err)
            }
        }()
  13. Memory consumption

    Each connection in the naive solution consumes ~20KB:

        func ws(w http.ResponseWriter, r *http.Request) {
            // ...
        }
  14. Memory consumption

    Each connection in the naive solution consumes ~20KB:

        func ws(w http.ResponseWriter, r *http.Request) {
            // ...
        }

        upgrader := websocket.Upgrader{}
        conn, err := upgrader.Upgrade(w, r, nil)
        if err != nil {
            return
        }
  15. Memory consumption

    Each connection in the naive solution consumes ~20KB:

        func ws(w http.ResponseWriter, r *http.Request) {
            // ...
        }

        upgrader := websocket.Upgrader{}
        conn, err := upgrader.Upgrade(w, r, nil)
        if err != nil {
            return
        }

    Serving a million concurrent connections would consume over 20GB of RAM!
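The 20GB figure follows directly from the per-connection cost. A small sketch of the arithmetic, assuming "~20KB" means 20 × 1024 bytes and reporting the total in decimal gigabytes (both are assumptions, since the slide gives only rounded figures):

```go
package main

import "fmt"

// totalBytes returns the memory needed to hold conns connections
// when each one costs perConn bytes.
func totalBytes(conns, perConn int64) int64 {
	return conns * perConn
}

func main() {
	const (
		conns   = 1_000_000 // one million concurrent connections
		perConn = 20 * 1024 // ~20KB per connection, per the slide
	)
	total := totalBytes(conns, perConn)
	fmt.Printf("%.2f GB\n", float64(total)/1e9) // prints 20.48 GB
}
```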
  16. Optimizations

    If we could…
    • Optimize goroutines
    • Optimize net/http object allocations
    • Reuse allocated buffers across websocket reads and writes
  17. Optimization #1: goroutines

    Knowing when data exists on the wire would allow us to reuse goroutines and reduce memory footprint
    • goroutines
    • select / poll
    • epoll
  18. Optimization #1: goroutines

    Knowing when data exists on the wire would allow us to reuse goroutines and reduce memory footprint
    • goroutines
    • select / poll
    • epoll

        func ws(w http.ResponseWriter, r *http.Request) {
            // Upgrade connection
            …
            for {
                _, msg, err := conn.ReadMessage()
                if err != nil {
                    log.Printf("Failed to read message %v", err)
                    conn.Close()
                    return
                }
                log.Println(string(msg))
            }
        }
  19. Optimization #1: goroutines

    Knowing when data exists on the wire would allow us to reuse goroutines and reduce memory footprint
    • goroutines
    • select / poll
    • epoll

        // fdset is the *syscall.FdSet built from the descriptors in fds
        t := &syscall.Timeval{ /* timeout for the call */ }
        if _, err := syscall.Select(maxFD+1, fdset, nil, nil, t); err != nil {
            return nil, err
        }
        for _, fd := range fds {
            if fdIsSet(fdset, fd) {
                // Consume data
            }
        }
  20. Optimization #1: goroutines

    Knowing when data exists on the wire would allow us to reuse goroutines and reduce memory footprint
    • goroutines
    • select / poll
    • epoll

        epfd, _ := unix.EpollCreate1(0)
        _ = unix.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, fd,
            &unix.EpollEvent{Events: unix.POLLIN | unix.POLLHUP, Fd: int32(fd)})

        events := make([]unix.EpollEvent, 100)
        n, _ := unix.EpollWait(epfd, events, 100)
        for i := 0; i < n; i++ {
            // Consume data from the connection whose fd is events[i].Fd
        }
  21. Epoll - Demo!

        // Fragment 1: create the epoll instance
        fd, err := unix.EpollCreate1(0)
        if err != nil {
            return nil, err
        }

        // Fragment 2: register a websocket connection
        fd := websocketFD(conn)
        err := unix.EpollCtl(e.fd, syscall.EPOLL_CTL_ADD, fd,
            &unix.EpollEvent{Events: unix.POLLIN | unix.POLLHUP, Fd: int32(fd)})
        if err != nil {
            return err
        }
  22. Optimization #2: buffer allocations

    gorilla/websocket keeps a reference to the underlying buffers given by Hijack()

        var br *bufio.Reader
        if u.ReadBufferSize == 0 && bufioReaderSize(netConn, brw.Reader) > 256 {
            // Reuse hijacked buffered reader as connection reader.
            br = brw.Reader
        }

        buf := bufioWriterBuffer(netConn, brw.Writer)

        var writeBuf []byte
        if u.WriteBufferPool == nil && u.WriteBufferSize == 0 && len(buf) >= maxFrameHeaderSize+256 {
            // Reuse hijacked write buffer as connection buffer.
            writeBuf = buf
        }

        c := newConn(netConn, true, u.ReadBufferSize, u.WriteBufferSize, u.WriteBufferPool, br, writeBuf)
  23. Optimization #2: buffer allocations

    github.com/gobwas/ws - an alternative websocket library for Go
    • No intermediate allocations during I/O
    • Low-level API that lets you build your own packet-handling and buffering logic
    • Zero-copy upgrades

        import (
            "github.com/gobwas/ws"
            "github.com/gobwas/ws/wsutil"
        )

        func wsHandler(w http.ResponseWriter, r *http.Request) {
            conn, _, _, err := ws.UpgradeHTTP(r, w)
            if err != nil {
                return
            }
            // Add to epoll
        }

        // Elsewhere, in the epoll loop:
        for {
            // Fetch ready connections with epoll logic
            msg, _, err := wsutil.ReadClientData(conn)
            if err == nil {
                log.Printf("msg: %s", string(msg))
            } else {
                // Close connection
            }
        }
  24. Buffer allocations - Results

    We managed to reduce the memory usage by 97%
    The memory needed to serve over a million connections drops from ~20GB to ~600MB
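The 97% figure can be checked from the two totals on the slide. A short sketch of the arithmetic, taking the rounded values ~20GB and ~600MB at face value and working in megabytes (the function name `reductionPercent` is illustrative):

```go
package main

import "fmt"

// reductionPercent returns how much smaller after is than before, in percent.
func reductionPercent(before, after float64) float64 {
	return (before - after) * 100 / before
}

func main() {
	const (
		beforeMB = 20000.0 // ~20GB for a million connections (naive solution)
		afterMB  = 600.0   // ~600MB after the optimizations
	)
	fmt.Printf("%.0f%%\n", reductionPercent(beforeMB, afterMB)) // prints 97%
}
```

Spread over a million connections, ~600MB works out to roughly 600 bytes per connection, compared with ~20KB in the naive solution.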
  25. Recap..

    Premature optimization is the root of all evil, but if we must:
    • Ulimit: Increase the cap on the NOFILE resource
    • Epoll (Async I/O): Reduce the load of one goroutine per connection
    • Gobwas: A more performant ws library that reduces buffer allocations
    • Conntrack table: Increase the cap on total concurrent connections in the OS