The PhD Mind @Work: How my degree helps me today

Imperial ACM
December 06, 2013

In this talk I will discuss the impact that a PhD has on my life: more specifically, what drove me into a PhD, how it helped me get a job, and what difference it makes in my current role. This will include R&D challenges I face on a day-to-day basis and how they are tackled using an academic way of thinking. To that end, the talk will focus on methodology rather than technical knowledge and highlight why the PhD experience may be more important than you might think.


Transcript

  1. The PhD Mind @Work: How my degree helps me today
     Felipe Franciosi, Senior Software Performance R&D Engineer
     e-mail: [email protected] | freenode: felipef (#xen-api) | twitter: @franciozzy
  2. Agenda
     • My Personal Background
       ๏ How and why I got into a PhD
       ๏ How I got my job and what my current role is about
     • Solving Performance Problems
       ๏ The suspicious storage virtualisation overhead
       ๏ The case of the mysterious network throughput drop
     • Summary
       ๏ So why is the PhD important again?
       ๏ Q&A
  3. My Personal Background
     • 1998-2000: Electrical Engineering
     • 2001-2004: Computer Science (BSc)
     • 2005-2007: Computer Science (MSc)
     • At around the same time:
       ๏ Held sysadmin jobs (mainly managing storage systems)
       ๏ Ran a small consultancy (a lot of it relating to Xen)
     [Chart: expected vs. reality]
  4. My Personal Background
     • 2008-2011: My PhD was about:
       ๏ Data Management for Relative QoS in Virtualised Storage Systems
       ๏ In a nutshell:
         • Where to place your data in a multi-tiered storage system?
         • Regarding QoS attributes such as performance, reliability and space efficiency
     [Diagram: a virtual volume spanning SSD, HDD and slow-HDD tiers]
  5. My Personal Background
     • Some time later (2011), in the last months of my PhD:
       ๏ The DoC circulated an e-mail from Citrix
       ๏ Citrix had acquired XenSource in 2007
         • XenSource was the start-up founded by members of the Computer Laboratory at the University of Cambridge to develop an enterprise product based on Xen
       ๏ They were putting together a Performance Team
       ๏ They needed someone with a background in:
         • Virtualisation
         • Storage systems
         • Performance evaluation
  6. My Personal Background
     • Today I live and work in Cambridge:
       ๏ My current role is mostly about:
         • Making XenServer faster, with a focus on storage
         • That means enabling "better" virtualised storage for virtual machines (better: faster and consuming fewer resources)
       ๏ On a daily basis:
         • Research how different technologies behave in different environments
         • Create and experiment with new virtualisation protocols
         • Work on customer cases escalated from support/sales (usually interesting, giving us a chance to investigate different equipment)
  7. Suspicious Storage Virtualisation Overhead
     • How do most people measure storage performance? (A C equivalent of the dd test is sketched below.)

       # dd if=/dev/sda of=/dev/null bs=1M count=100 iflag=direct
       100+0 records in
       100+0 records out
       104857600 bytes (105 MB) copied, 0.269689 s, 389 MB/s

       # hdparm -t /dev/sda
       /dev/sda:
        Timing buffered disk reads: 1116 MB in 3.00 seconds = 371.70 MB/sec

     • When evaluating a virtualisation solution (e.g. Xen, KVM, VMware):
       ๏ People tend to install a regular OS and obtain some metrics as a baseline
       ๏ Then install the virtualisation solution, create a VM, and repeat the measurements
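To make the measurement above concrete, here is a minimal C sketch of what that dd invocation is doing: timed sequential O_DIRECT reads against a block device. The device path, block size and request count mirror the command above and are only illustrative.

```c
/* Minimal sketch of a dd-style direct-read throughput test.
 * Illustrative assumptions: /dev/sda as the device, 1 MiB reads, 100 requests. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    const size_t bs = 1 << 20;        /* 1 MiB per request, like bs=1M   */
    const int count = 100;            /* like count=100                  */
    void *buf;
    struct timespec t0, t1;

    /* O_DIRECT bypasses the page cache, so we measure the device, not RAM */
    int fd = open("/dev/sda", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires an aligned buffer */
    if (posix_memalign(&buf, 4096, bs)) { perror("posix_memalign"); return 1; }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < count; i++)
        if (read(fd, buf, bs) != (ssize_t)bs) { perror("read"); return 1; }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%zu bytes in %.3f s = %.1f MB/s\n",
           bs * count, secs, bs * count / secs / 1e6);

    free(buf);
    close(fd);
    return 0;
}
```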
  8. Suspicious Storage Virtualisation Overhead
     [Diagram: the bare-metal stack (user application dd → OS storage drivers → storage solution) compared with the Xen stack (dd in an unprivileged VM (domU) → virtual drivers → backend services and storage drivers in Virtual Machine Zero (dom0) → hypervisor (Xen) → storage solution)]
  9. Suspicious Storage Virtualisation Overhead
     • What kind of throughput can I get from dom0 to my device?
       ๏ Using 1 MiB reads, this host reports 118 MB/s from dom0
  10. Suspicious Storage Virtualisation Overhead
      • What kind of throughput can I get from domU to my device?
        ๏ Using 1 MiB reads, this host reports 117 MB/s from a VM
      • IMPERCEPTIBLE virtualisation overhead
  11. Suspicious Storage Virtualisation Overhead
      • That's not always the case...
        ๏ Same test on different hardware (from dom0): "my disks can do 700 MB/s!!!!"
  12. Suspicious Storage Virtualisation Overhead
      • That's not always the case...
        ๏ Same test on different hardware (from domU): "why is my VM only doing 300 MB/s???"
      • VISIBLE virtualisation overhead
  13. Suspicious Storage Virtualisation Overhead
      • Facing this issue is no different from facing issues during a PhD:
        ๏ There is a problem to be understood / solved
        ๏ No one has a clue about what is going on
        ๏ It is absolutely normal to be stuck without progress
  14. Suspicious Storage Virtualisation Overhead
      • First thing to do: take a step back and understand how things work
        ๏ There are different ways a user application can do storage I/O
        ๏ We will use simple read() and write() libc wrappers as examples (a runnable sketch follows this slide):
          1. char buf[4096];
          2. int fd = open("/dev/sda", O_RDONLY | O_DIRECT);
          3. read(fd, buf, 4096);
          4. buf now has the data!
      [Diagram: the read() travels from the user process through libc and sys_read(fd, buf, 4096) into kernel space, down vfs_read(), f_op->read(), the block layer and the device driver to the block device; a hardware interrupt signals completion]
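The four steps above, made runnable as a minimal C sketch. One caveat not shown on the slide: with O_DIRECT, Linux requires the read buffer to be suitably aligned, so a plain char buf[4096] on the stack is not guaranteed to work. The device path is illustrative.

```c
/* The slide's four steps, made runnable. The buffer is aligned explicitly
 * because O_DIRECT reads must be aligned (typically to the sector size). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* 1. a 4 KiB buffer, aligned for O_DIRECT */
    static char buf[4096] __attribute__((aligned(4096)));

    /* 2. open the block device, bypassing the page cache */
    int fd = open("/dev/sda", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* 3. issue the read; libc's read() ends up in sys_read() in the kernel */
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n < 0) { perror("read"); return 1; }

    /* 4. buf now has the data */
    printf("read %zd bytes; first byte: 0x%02x\n", n, (unsigned char)buf[0]);

    close(fd);
    return 0;
}
```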
  15. How Things Work in Xen
      • The "Upstream Xen" use case
        ๏ The virtual device in the guest is implemented by blkfront
        ๏ Blkfront connects to blkback, which handles the I/O in dom0
      [Diagram: in the domU, the user process's read() goes through libc and the block layer to blkfront; blkfront talks to blkback in dom0 over Xen's blkif protocol; blkback submits the I/O through dom0's block layer and device driver to the VDI on the physical block device]
  16. Suspicious Storage Virtualisation Overhead
      • Same host, different RAID0 logical volumes on a PERC H700
        ๏ All have 64 KiB stripes, adaptive read-ahead and write-back cache enabled
  17. • dom0 had:
        ๏ 4 vCPUs, pinned
        ๏ 4 GB of RAM
      • It becomes visible that certain back ends cope much better with larger block sizes.
      • This controller supports up to 128 KiB per request; above that, the Linux block layer splits the requests (a way to check this limit is sketched below).
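As a hedged sketch of how one might check that per-request limit on a given host: the Linux block layer exposes it through sysfs. The device name "sda" below is only an example.

```c
/* Sketch: query the Linux block layer for the largest request it will send
 * to a device. max_hw_sectors_kb is what the controller/driver can take;
 * max_sectors_kb is the current limit above which requests get split. */
#include <stdio.h>

static void print_limit(const char *path)
{
    char buf[64];
    FILE *f = fopen(path, "r");
    if (f && fgets(buf, sizeof(buf), f))
        printf("%s: %s", path, buf);   /* value is in KiB */
    if (f)
        fclose(f);
}

int main(void)
{
    print_limit("/sys/block/sda/queue/max_hw_sectors_kb");
    print_limit("/sys/block/sda/queue/max_sectors_kb");
    return 0;
}
```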
  18. • Seagate ST (SAS)
      • blkback is slower, but it catches up with big enough requests.
  19. • Seagate ST (SAS)
      • User-space back ends are so slow they never catch up, even with bigger requests.
      • This is not always true: if the disks were slower, they would catch up.
  20. • Intel DC S3700 (SSD)
      • When the disks are really fast, none of the technologies catch up.
  21. Suspicious Storage Virtualisation Overhead
      • There is another way to look at the data (worked example below):

        Latency (time/data) = 1 / Throughput (data/time)
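A worked example of that reciprocal view, using the 118 MB/s dom0 figure from slide 9 and assuming a single outstanding 1 MiB request at a time:

```latex
% Worked example: turning the dom0 throughput figure into a per-request latency.
% Assumes one outstanding 1 MiB request at a time, as in the dd test earlier.
\[
  \text{Latency} = \frac{1}{\text{Throughput}}
  \;\Rightarrow\;
  \frac{1\,\text{MiB}}{118\,\text{MB/s}} \approx \frac{1.05\,\text{MB}}{118\,\text{MB/s}}
  \approx 8.9\,\text{ms per request}
\]
```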
  22. • Intel DC S3700 (SSD)
      • The question now is: where is time being spent?
      • Compare time spent in:
        ๏ dom0
        ๏ blkback
        ๏ qdisk
  23. Summary
      • The point here is to first understand the problem
      • Then analyse the issue with whatever resources are available
      • Finally, tackle the problem in the most efficient way
      • Without a methodology, it's just like shooting in the dark
  24. Mysterious Network Throughput Drop
      • A Citrix product called Branch Repeater
        ๏ In this case, shipped as a virtualised instance
        ๏ That is, a VM that can accelerate network traffic by caching on both ends
      [Diagram of the test pod: a Client (CentOS, iperf -c ... -t 60), a XenServer running the BR VPX VM, a Delay Router that creates packet loss and increases latency, an SM88 bare-metal Branch Repeater appliance, and a Server (CentOS, iperf -s)]
  25. Mysterious Network Throughput Drop
      • Issue:
        ๏ When using XenServer 5.6FP1, throughput was 700 Mbps
        ๏ When using XenServer 6.0, throughput was 300 Mbps
      • Nothing changed but the virtualisation platform version
      [Diagram: same test pod as above]
  26. Mysterious Network Throughput Drop
      • Facts:
        ๏ Hardware is exactly the same
        ๏ The Branch Repeater VPX VM is exactly the same
        ๏ Client and server software at the extremities of the test pod are exactly the same
        ๏ The only difference is the XenServer version
      • So what differences can affect network throughput?
        ๏ Network card driver version
        ๏ Domain 0 kernel version (very unlikely)
        ๏ Hypervisor version (very unlikely)
      • Remember: this is 700 Mbps down to 300 Mbps. It's a huge drop!
  27. Mysterious Network Throughput Drop
      • Facing this issue is no different from facing issues during a PhD:
        ๏ There is a problem to be understood / solved
        ๏ No one has a clue about what is going on
        ๏ It is absolutely normal to be stuck without progress
  28. Mysterious Network Throughput Drop
      • First things first:
        ๏ Tried to make the environments as similar as possible
          • Replaced components one by one and re-evaluated the throughput
      • To our surprise:
        ๏ Replacing the network drivers did not make any difference
        ๏ Replacing the Domain 0 kernel did not make any difference
      • Eventually:
        ๏ Replacing the hypervisor made a difference
          • That made absolutely no sense! The hypervisor is only partially in the data path
          • Such a problem in the hypervisor would certainly have been spotted earlier
  29. Mysterious Network Throughput Drop
      • One more thing:
        ๏ In the BR VPX VM there is an application that mangles the network packets
        ๏ On XenServer 5.6FP1 (700 Mbps), this application consumes 80% of CPU
        ๏ On XenServer 6.0 (300 Mbps), this application consumes 100% of CPU
  30. Mysterious Network Throughput Drop
      • We then used a sampling profiler
        ๏ It interrupts the CPU every so many instructions
        ๏ It records which function was being executed
        ๏ By sampling, we can work out what is consuming most of the CPU time
      • By comparing the same workload on both hypervisors:
        ๏ We noticed the slow one was spending a lot of time reading the TSC
      • As it turns out, the user application within the VM relied on gettimeofday()
        ๏ In the fast scenario, the hypervisor was using the hardware TSC
        ๏ In the slow scenario, the TSC was being emulated
      (A quick way to spot this from inside the VM is sketched below.)
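A rough sketch (not part of the original deck) of how this can be spotted from inside the VM: time a tight loop of gettimeofday() calls and check the active clocksource. If the TSC is emulated, the per-call cost rises sharply.

```c
/* Sketch: a quick check for expensive timekeeping inside a VM. If the TSC
 * is emulated (or the kernel falls back to a slower clocksource), each
 * gettimeofday() call costs far more, which shows up in the average below. */
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

int main(void)
{
    const long iters = 1000000;
    struct timeval tv;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        gettimeofday(&tv, NULL);      /* the call the BR application relied on */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("gettimeofday: %.0f ns per call\n", ns / iters);

    /* The active clocksource also hints at whether the hardware TSC is in use */
    FILE *f = fopen("/sys/devices/system/clocksource/clocksource0/current_clocksource", "r");
    if (f) {
        char cs[32];
        if (fgets(cs, sizeof(cs), f))
            printf("current clocksource: %s", cs);
        fclose(f);
    }
    return 0;
}
```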
  31. Summary
      • When you face things you don't understand, you ask and learn
  32. Summary
      • If even bad publicity is good publicity, then definitely any experience is good experience