Cassandra Summit Tokyo 2017 Keynote

by Nate McCall at Cassandra Summit Tokyo 2017


CassandraCommunityJP

October 13, 2017

Transcript

  1. APACHE CASSANDRA WHERE WE ARE NOW AND THE ROAD AHEAD

  2. Nate McCall (@zznate), CTO, The Last Pickle; Apache Cassandra
     Committer and PMC Chair
  3. AGENDA • Community • Apache Cassandra project status • 4.0
     • The road ahead
  4. BUT FIRST...

  5. October 31st, 2010

  6. None
  7. None
  8. LET'S TALK ABOUT COMMUNITY

  9. PROJECTS WITH COMPANIES THAT PROVIDE DISTRIBUTIONS • Cloudera|Hortonworks|MapR: Hadoop •
     Databricks: Spark • Confluent: Kafka • Data Artisans: Flink • IBM: *.*
  10. EXAMPLE ASF VENDOR COMPATIBILITY MATRIX

  11. UNFORTUNATELY, WE ARE STARTING TO SEE THIS TOO!

  12. NO COMMUNITY SUPPORT IS A LONELY ROAD TO TRAVEL.

  13. WHAT DOES THIS MEAN FOR OUR COMMUNITY? • Most committers
      and community participants are running clusters • Users on the mailing lists tend to have a lot of experience • No corporate message in project communication • No external pressures on developing features for marketing timelines • Events will be by the community for the community (like this one!)
  14. https://twitter.com/jericevans/status/912765106739712001

  15. COMMUNITY STRENGTHS • Very active mailing lists • Progressively better
      JIRA grooming • Documentation improvements • Fast turnaround on serious bugs
  16. OTHER COMMUNITY HIGHLIGHTS Donation of build resources

  17. https://builds.apache.org/computer/

  18. OTHER COMMUNITY HIGHLIGHTS Contribution of cassandra-dtest project

  19. Donated to the ASF

  20. The takeaway: You don't need a vendor to run Apache

    Cassandra.
  21. APACHE CASSANDRA PROJECT STATUS

  22. "TICK TOCK" (YEAH, WE ALL GOT CONFUSED)

  23. TICK TOCK LEGACY: GOOD • Much larger emphasis on integration

      testing • Longer-term support of existing releases (2.1.19 just released!) • Commitment to "green test board" before release
  24. TICK TOCK LEGACY: BAD Committing to 3.0, 3.11, and trunk:

      git checkout cassandra-3.0
      git am -3 mypatch-3.0.patch
      git checkout cassandra-3.11
      git merge cassandra-3.0 -s ours
      git apply -3 mypatch-3.11.patch
      git commit --amend
      git checkout trunk
      git merge cassandra-3.11 -s ours
      git apply -3 mypatch-trunk.patch
      git commit --amend
      git push origin cassandra-3.0 cassandra-3.11 trunk --atomic
  27. GREAT. WHAT VERSION SHOULD I USE? 2.2 3.0 3.11

  28. IT'S TIME FOR REAL TALK.

  29. VERSION BREAKDOWN • 2.2: low risk, works well • 3.0:

      "stable" release of tick-tock • 3.11: the new stable (as of 3.11.1)
  31. BUT WHAT ABOUT MATERIALIZED VIEWS? • *JUST* fixed https://issues.apache.org/jira/browse/CASSANDRA-11500 •

      Partition deletion causes entire view partition to be held in memory (CASSANDRA-12783) • Can only filter on primary key columns (CASSANDRA-13826)
  32. MATERIALIZED VIEWS: THE HAPPY PATH IS VERY NARROW

  33. NEW IN 4.0

  34. NO MORE THRIFT :(

  35. FASTER, BETTER, CHEAPER REPAIR

  36. REPAIR BUG FIXES AND IMPROVEMENTS • Over 20 bugs fixed!

      • New metrics provide table- and keyspace-level visibility: CASSANDRA-13531, CASSANDRA-13598 • More repair information in nodetool tablestats: CASSANDRA-13774 • Incremental repair works more consistently • Substantial repair logging improvements: CASSANDRA-13468
  37. REPAIR STREAMING PREVIEW

      nodetool repair --preview

      Total estimated streaming: 41384 ranges, 10 sstables, 159.007MiB bytes
          sandwich_long.test_data - 41384 ranges, 10 sstables, 159.007MiB bytes
              /127.0.0.2 -> /127.0.0.1: 10346 ranges, 3 sstables, 62.888MiB bytes
              /127.0.0.1 -> /127.0.0.2: 10346 ranges, 2 sstables, 16.616MiB bytes
              /127.0.0.1 -> /127.0.0.3: 10346 ranges, 2 sstables, 16.616MiB bytes
              /127.0.0.3 -> /127.0.0.1: 10346 ranges, 3 sstables, 62.887MiB bytes
  39. PULL REPAIR

      nodetool repair --pull -hosts 127.0.0.1,127.0.0.2 -st -3074457345618258603 -et 0 --preview

      Total estimated streaming: 3550 ranges, 3 sstables, 10.525MiB bytes
          sandwich_long.test_data - 3550 ranges, 3 sstables, 10.525MiB bytes
              /127.0.0.2 -> /127.0.0.1: 1775 ranges, 3 sstables, 10.525MiB bytes
              /127.0.0.1 -> /127.0.0.2: 1775 ranges, 0 sstables, 0.000KiB bytes
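The `-st`/`-et` flags bound the repair to a single token range; Murmur3 tokens span [-2^63, 2^63-1], so the range (-3074457345618258603, 0] used above covers roughly a sixth of the ring. A small Python sketch of the range check (the (start, end] convention matches how Cassandra describes token ranges; the function name is mine, not Cassandra source):

```python
# Murmur3 tokens span [-2**63, 2**63 - 1]; a range is (start, end]
# and may wrap around the end of the ring.
RING_SIZE = 2**64

def in_range(token, start, end):
    """True if token falls in the ring range (start, end]."""
    if start < end:
        return start < token <= end
    return token > start or token <= end  # wrapping range

st, et = -3074457345618258603, 0  # the -st/-et values from the slide
print(in_range(-1, st, et))       # True: token -1 is inside the range
print(in_range(1, st, et))        # False: just past the end bound
print(round((et - st) / RING_SIZE, 3))  # 0.167: about a sixth of the ring
```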
  45. NEW CONVENIENCE FUNCTIONS CQL ADDITIONS

  46. TIMESTAMP FUNCTIONS

      CREATE TABLE times (
          pk int,
          time date,
          val int,
          PRIMARY KEY (pk, time));
  48. TIMESTAMP FUNCTIONS

      SELECT * FROM times WHERE pk = 1;

       pk | time       | val
      ----+------------+-----
        1 | 2016-09-28 |   2
        1 | 2016-09-30 |   2

      SELECT * FROM times WHERE pk = 1 AND time < '2017-10-01' - 1y2d;

       pk | time       | val
      ----+------------+-----
        1 | 2016-09-28 |   2
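The duration literal `1y2d` steps back one calendar year and two days from '2017-10-01', giving a cutoff of 2016-09-29, which is why only the 2016-09-28 row matches. A minimal Python sketch of the same calendar arithmetic (an illustration, not Cassandra's implementation):

```python
from datetime import date, timedelta

def minus_duration(d, years=0, days=0):
    """Subtract calendar years, then days, mirroring how a CQL
    duration literal like '1y2d' is applied unit by unit."""
    # Step back whole years first (safe here: Oct 1 exists every year).
    d = d.replace(year=d.year - years)
    # Then step back the remaining days.
    return d - timedelta(days=days)

cutoff = minus_duration(date(2017, 10, 1), years=1, days=2)
print(cutoff)  # 2016-09-29, so only the 2016-09-28 row passes time < cutoff
```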
  52. ARITHMETIC OPERATORS

      CREATE TABLE arithmetic (
          key text,
          ts bigint,
          val text,
          PRIMARY KEY (key, ts));
  54. ARITHMETIC OPERATORS

      SELECT * FROM arithmetic WHERE key = 'mykey';

       key   | ts | val
      -------+----+------------
       mykey | 10 | some value
  55. ARITHMETIC OPERATORS

      SELECT ts * 10 FROM arithmetic WHERE key = 'mykey';

       ts * 10
      ---------
           100

      SELECT ts % 10 FROM arithmetic WHERE key = 'mykey';

       ts % 10
      ---------
             0
  56. ARITHMETIC OPERATORS 56 https://github.com/apache/cassandra/blob/trunk/doc/source/cql/operators.rst

  57. INTERNODE MESSAGING REWRITE

  58. INTERNODE MESSAGING REWRITE @15:45 "Achievement Unblocked: Switching Apache Cassandra
      to Netty and non-blocking I/O - A followup"
  59. EASIER TO IMPLEMENT CLUSTER SECURITY

  60. EASIER CLUSTER SECURITY Because: it is really easy to
      attack an unprotected cluster

  61. EASIER CLUSTER SECURITY HA! A bin-packed message format with
      no source verification!* Ease of scalability comes with a price
      * <currently reading o.a.c.net.MessageIn#read>
  62. EASIER CLUSTER SECURITY

      nmap -Pn -p7000 -oG logs/cass.gnmap 54.88.0.0/14

      HA! A bin-packed message format with no source verification!*
  63. IMPORTANT BECAUSE... Fun fact 1: it takes a single
      Message to insert an admin account into the system table

  64. IMPORTANT BECAUSE... Fun fact 2: it takes a single
      Message to truncate a table
  65. IMPORTANT BECAUSE... How to steal writes in real time:
      -Dcassandra.write_survey=true
  66. THE EASY FIX: node-to-node encryption and SSL client certificate
      authentication for cluster traffic (CASSANDRA-10404)
      Bonus: can be done with NO downtime!!!
      http://thelastpickle.com/blog/2015/09/30/hardening-cassandra-step-by-step-part-1-server-to-server.html
  69. THE EASY FIX: when you are done it should look like this
      (don't forget this part!!!)
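The finished configuration shown on the slide is presumably the server_encryption_options block of cassandra.yaml (that is what the linked hardening guide walks through); a minimal sketch, where paths and passwords are placeholders and option defaults vary by release:

```yaml
server_encryption_options:
    internode_encryption: all     # encrypt all node-to-node traffic
    keystore: /etc/cassandra/conf/server-keystore.jks
    keystore_password: changeit
    truststore: /etc/cassandra/conf/server-truststore.jks
    truststore_password: changeit
    require_client_auth: true     # easy to miss: verify peer certificates,
                                  # not just encrypt the link
```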
  71. WHAT COMES NEXT?

  72. VIRTUAL TABLES

  73. cqlsh> SELECT * FROM virtual.tables5 WHERE keyspace_name = 'tlp_ks'
             AND metric > 'memtable' AND metric < 'memtableZ' ALLOW FILTERING;

       keyspace_name | table_name         | metric               | value
      ---------------+--------------------+----------------------+-----------------
       tlp_ks        | monitoring_example | memtableOnHeapSize   | {"value":95201}
       tlp_ks        | monitoring_example | memtableOffHeapSize  | {"value":44811}
       tlp_ks        | monitoring_example | memtableLiveDataSize | {"value":42128}
       tlp_ks        | monitoring_example | memtableColumnsCount | {"value":248}
       tlp_ks        | monitoring_example | memtableSwitchCount  | {"count":4}
       ...
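The `> 'memtable'` / `< 'memtableZ'` bounds work because metric names compare lexicographically: every name starting with "memtable" sorts inside that window. A small Python sketch of the same string-range filter (metric names copied from the slide; the virtual-table schema itself was still a prototype at the time):

```python
metrics = {
    "memtableOnHeapSize": 95201,
    "memtableOffHeapSize": 44811,
    "memtableLiveDataSize": 42128,
    "memtableColumnsCount": 248,
    "memtableSwitchCount": 4,
    "readLatency": 123,  # outside the range, should be filtered out
}

# Same predicate as the CQL WHERE clause: a lexicographic string range.
selected = {name: v for name, v in metrics.items()
            if "memtable" < name < "memtableZ"}

print(sorted(selected))  # the five memtable* metrics; readLatency excluded
```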
  75. (image-only slide)
  76. CASSANDRA-7622, CASSANDRA-9233

  77. PLUGGABLE STORAGE ENGINE

  78. https://www.slideshare.net/sunsuk7tp/mycassandra

  79. https://www.slideshare.net/sunsuk7tp/mycassandra CASSANDRA-2995

  80. ROCKS DB INTEGRATION

  81. ROCKS DB INTEGRATION

  82. ROCKS DB INTEGRATION Thanks to Dikang Gu and the Instagram

    team!
  83. (image-only slide)
  84. (image-only slide)
  85. (image-only slide)
  86. (image-only slide)
  87. NOT A PANACEA • Some challenges with streaming • Big
      surface area of changes
  88. (image-only slide)

  89. PLUGGABLE STORAGE AND ROCKSDB - FOLLOW THESE ISSUES: • CASSANDRA-13474
      • CASSANDRA-13475 • CASSANDRA-13476 • CASSANDRA-13553
  90. EXPERIMENTAL FEATURES OFF BY DEFAULT MV keys on
      non-primary-key columns (from CASSANDRA-13826):
      "-Dcassandra.mv.allow_filtering_nonkey_columns_unsafe=true"
  91. FIXING BUGS! (DISCOVERED BY REAL WORKLOADS) Our developers are
      running clusters. Sometimes *very* big clusters.
  92. Future Cassandra Operators

  93. THANKS! @ZZNATE