Upgrade to Pro — share decks privately, control downloads, hide ads and more …

SunshinePHP 2017 - Making the most out of MySQL

SunshinePHP 2017 - Making the most out of MySQL

The new JSON fields are some of the most talking about new features in MySQL 5.7. But they are by no means the only awesome things this version has to offer. MySQL 5.7 is a year old, so this talk won't be an introduction to this version. We will be digging into 5.7 to see how to make the most of the tools available in it. Want to tackle important practical problem solving for your data, make your query performance analysis more efficient or look at how virtual columns can help you index data? This talk is for you!

Avatar for Gabriela D'Ávila Ferrara

Gabriela D'Ávila Ferrara

February 04, 2017
Tweet

More Decks by Gabriela D'Ávila Ferrara

Other Decks in Programming

Transcript

  1. What to Expect? • DDL Changes • SQL Modes •

    Generated Columns • JSON Data Type • `sys` Schema 3
  2. DDL Changes • In Place (ALGORITHM=INPLACE) • Rename Index •

    VARCHAR from 1B to 255B • VARCHAR from 256B to 65353B* • Add a Virtual Column • Table-copy (ALGORITHM=COPY) • Add a Column • VARCHAR between 255B and >=256 • Decreasing VARCHAR size • Type Conversion 5 More Info
  3. SQL modes (SELECT @@GLOBAL.sql_mode;) 7 On MySQL 5.6: +--------------------------------------------+ |

    @@GLOBAL.sql_mode | +--------------------------------------------+ | STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION | +--------------------------------------------+ 1 row in set (0.00 sec) On MySQL 5.7: +-----------------------------------------------------------------------+ | @@GLOBAL.sql_mode | +-----------------------------------------------------------------------+ | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE, |
 | ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION | +-----------------------------------------------------------------------+ 1 row in set (0.00 sec)
  4. SQL modes comparison On MySQL 5.6.35: • STRICT_TRANS_TABLES (Bug #69743)

    • NO_ENGINE_SUBSTITUTION On MySQL 5.7.17: • STRICT_TRANS_TABLES • NO_ENGINE_SUBSTITUTION • ONLY_FULL_GROUP_BY • NO_ZERO_IN_DATE* • NO_ZERO_DATE* • ERROR_FOR_DIVISION_BY_ZERO* 8 Full list of sql_mode
  5. YYYY-MM-DD • NO_ZERO_DATE • 0000-00-00 👎 • 0000-00-00 00:00:00 👎

    • NO_ZERO_IN_DATE • 2017-01-00 👎 • 2017-00-04 👎 • 0000-01-04 👍 9
  6. ERROR_FOR_DIVISION_BY_ZERO 10 mysql> UPDATE users SET id = id/0 WHERE

    id = 2; ERROR 1365 (22012): Division by 0 • Write Operation: UPDATE mysql> SELECT id, mod(id,0) FROM users WHERE id = 2; +----+-----------+ | id | mod(id,0) | +----+-----------+ | 2 | NULL | +----+-----------+ 1 row in set, 1 warning (0.00 sec) • Read Operation: SELECT mysql> SHOW WARNINGS; +---------+------+---------------+ | Level | Code | Message | +---------+------+---------------+ | Warning | 1365 | Division by 0 | +---------+------+---------------+ 1 row in set (0.00 sec)
  7. Table `users` Field Type Null Key Default Extra id int(10)

    unsigned NO PRI auto_increment first_name varchar(255) NO NULL last_name varchar(255) NO NULL email varchar(255) NO NULL twitter_info json YES created_at datetime NO CURRENT_TIMESTAMP updated_at datetime NO CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP 13
  8. Repeated emails SELECT * FROM users GROUP BY email HAVING

    count(*) > 1 +-------+------------+-----------+---------------------+-------------------+---------------------+---------------------+ | id | first_name | last_name | email | twitter_info | created_at | updated_at | +-------+------------+-----------+---------------------+-------------------+---------------------+---------------------+ | 8965 | Conor | Quitzon | [email protected] | {"id": 100838242, | 2017-01-02 20:57:39 | 2017-01-02 20:57:39 | | 69772 | Junius | Mante | [email protected] | {"id": 20039476, | 2017-01-02 20:59:34 | 2017-01-02 20:59:34 | | 66525 | Katelynn | Feil | [email protected] | {"id": 111644778, | 2017-01-02 20:59:29 | 2017-01-02 20:59:29 | | 92577 | Tillman | Nienow | [email protected] | {"id": 1359073920,| 2017-01-02 21:00:14 | 2017-01-02 21:00:14 | | 18046 | Guillermo | Lebsack | [email protected] | {"id": 120064855, | 2017-01-02 20:57:59 | 2017-01-02 20:57:59 | +-------+------------+-----------+---------------------+-------------------+---------------------+---------------------+ 5 rows in set (0.65 sec) 5.6 14
  9. Repeated emails SELECT * FROM users WHERE email = '[email protected]'

    +-------+------------+-----------+------------------+--------------+---------------------+---------------------+ | id | first_name | last_name | email | twitter_info | created_at | updated_at | +-------+------------+-----------+------------------+--------------+---------------------+---------------------+ | 18046 | Guillermo | Lebsack | [email protected] | NULL | 2017-01-02 14:57:59 | 2017-01-02 14:57:59 | | 20559 | Marietta | Quitzon | [email protected] | NULL | 2017-01-02 14:58:03 | 2017-01-02 14:58:03 | +-------+------------+-----------+------------------+--------------+---------------------+---------------------+ 2 rows in set (0.04 sec) 15
  10. Repeated emails SELECT * FROM users GROUP BY email HAVING

    count(*) > 1 mysql> SELECT * FROM `users` GROUP BY `email` HAVING count(*) > 1; ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column ‘store.users.id’ which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by 5.7 16
  11. Why? "Reject queries for which the select list, HAVING condition,

    or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on (uniquely determined by) GROUP BY columns." 17
  12. Translation • This happens because the query is grouping by

    email and MySQL is magically trying to “guess” which field it should bring as a result to the other columns (id, first_name, etc.) since there are more than one result per row. • This mode is known as ONLY_FULL_GROUP_BY. 18 More Info
  13. Fix

  14. MySQL Aggregators • min(column) • max(column) • group_concat(column) • count(column),

    count(DISTINCT column) or count(*) • sum(column) 21 More Info
  15. How are we used to doing it SELECT * FROM

    users GROUP BY email HAVING count(*) > 1 <?php // Doctrine $qb = $entityManager->createQueryBuilder(); $repeatedUsers = $qb->select('*') ->from('users', 'u') ->groupBy('u.email') ->having('count(*) > 1'); // Eloquent $repeatedUsers = Capsule::table('users') ->select('*') ->groupBy('email') ->havingRaw('count(*) > 1'); 22
  16. 1st step: expanding the query 24 SELECT id, first_name, last_name,

    email, created_at, updated_at FROM users GROUP BY email HAVING COUNT(*) > 1; 5.6 5.7 Works Doesn’t Work
  17. 2nd step: adding aggregators 25 SELECT group_concat(id) AS all_users_id, first_name,

    last_name, email, created_at, updated_at FROM users GROUP BY email HAVING COUNT(*) > 1; 5.6 5.7 Works Doesn’t Work
  18. 2nd step - continued 26 SELECT group_concat(id) AS all_users_id, min(first_name)

    AS first_name, last_name, email, created_at, updated_at FROM users GROUP BY email HAVING COUNT(*) > 1; 5.6 5.7 Works Doesn’t Work
  19. 3rd step 27 SELECT group_concat(id) AS all_user_ids, min(first_name) AS first_name,

    min(last_name) AS last_name, email, min(created_at) AS first_record_at, max(updated_at) AS recent_changed_at FROM users GROUP BY email HAVING COUNT(*) > 1; 5.6 5.7 Works Works
  20. Execution result 28 +--------------+------------+------------+---------------------+---------------------+---------------------+ | all_user_ids | first_name | last_name

    | email | first_record_at | recent_changed_at | +--------------+------------+------------+---------------------+---------------------+---------------------+ | 20559,18046 | Guillermo | Lebsack | [email protected] | 2017-01-02 14:57:59 | 2017-01-02 14:58:03 | | 99426,69772 | Alayna | Mante | [email protected] | 2017-01-02 14:59:34 | 2017-01-02 15:00:28 | | 95937,66525 | Braxton | Cormier | [email protected] | 2017-01-02 14:59:29 | 2017-01-02 15:00:21 | | 8965,48487 | Alfred | Quitzon | [email protected] | 2017-01-02 14:57:39 | 2017-01-02 14:58:56 | | 93458,92577 | Christy | McLaughlin | [email protected] | 2017-01-02 15:00:14 | 2017-01-02 15:00:16 | +--------------+------------+------------+---------------------+---------------------+---------------------+ 5 rows in set (0.05 sec)
  21. 3rd step - alternative: ANY_VALUE() 29 SELECT group_concat(id) AS all_user_ids,

    any_value(first_name) AS any_first_name, any_value(last_name) AS any_last_name, email, min(created_at) AS first_record_at, max(updated_at) AS recent_changed_at FROM users GROUP BY email HAVING COUNT(*) > 1; 5.6 5.7 Doesn’t Work Works
  22. Virtual and Stored Columns • Can be indexed • Allows

    expressions to be used such as: • Operators: * , /, -, + • Built-in functions: string functions, date time functions • Literals: "A", 2, first_name 33
  23. Virtual Columns • No disk space • INPLACE creation •

    Values are generated on demand and BEFORE triggers 34
  24. Stored Columns • Uses disk space • Uses table COPY

    operation on ADD COLUMN and DROP COLUMN • Value updated on INSERT and UPDATE 35
  25. Limitations on Generated Columns • NOT NULL by default •

    You must specify a type and be aware of the limitations • Specifically changing the value of a generated column causes an error • Subqueries and custom functions ARE NOT allowed 36
  26. Table `users` Field Type Null Key Default Extra id int(10)

    unsigned NO PRI auto_increment first_name varchar(255) NO NULL last_name varchar(255) NO NULL email varchar(255) NO NULL twitter_info json YES created_at datetime NO CURRENT_TIMESTAMP updated_at datetime NO CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP 38
  27. -- Virtual ALTER TABLE `users` ADD COLUMN `full_name` VARCHAR (500)

    GENERATED ALWAYS AS (CONCAT_WS(' ', `first_name`, `last_name`)) AFTER `last_name`; -- Executed in 124ms -- Stored ALTER TABLE `users` ADD COLUMN `full_name` VARCHAR (500) GENERATED ALWAYS AS (CONCAT_WS(' ', `first_name`, `last_name`)) STORED AFTER `last_name`; -- 10000 rows affected. Executed in 2s 356ms 39
  28. Table `users` Field Type Null Key Default Extra id int(10)

    unsigned NO PRI auto_increment first_name varchar(255) NO NULL last_name varchar(255) NO NULL full_name varchar(500) YES STORED GENERATED email varchar(255) NO NULL twitter_info json YES created_at datetime NO CURRENT_TIMESTAMP updated_at datetime NO CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP 40
  29. Other expressions GENERATED ALWAYS AS ( • price * quantity

    • SUBSTRING_INDEX(email, '@', -1) • FROM_UNIXTIME(created_at) • JSON_UNQUOTE(JSON_EXTRACT(twitter_info, ‘$.location’)) • twitter_info->>’$.location’ ) 41
  30. JSON • Stored as Binary, not TEXT • Indexing is

    possible only through Generated Columns (Virtual or Stored) • Accessible by $key->'$.path' (JSON_EXTRACT alias) • Modified by $key->>'$.path' returning utf8mb4 (JSON_UNQUOTE alias) 43
  31. Converting TEXT to JSON • It is costly, uses table

    COPY operation • All rows must be valid, else an error occur • Use JSON_VALID() before 44
  32. Virtual Column for JSON: extraction 47 ALTER TABLE `users` ADD

    COLUMN `profile_img` TEXT GENERATED ALWAYS AS (`twitter_info`->>’$.profile_image_url’) AFTER `email`;
  33. Stored Column: creation & index for JSON 48 ALTER TABLE

    `users` ADD COLUMN `location` VARCHAR (255) GENERATED ALWAYS AS (`twitter_info`->>'$.location') STORED AFTER `email`, ADD INDEX `ix_location` (`location`);
  34. Table `users` Field Type Null Key Default Extra … …

    … … email varchar(255) NO NULL location varchar(255) YES INDEX STORED GENERATED profile_img text YES VIRTUAL GENERATED twitter_info json YES … … … … 49
  35. Table `users` SELECT id, first_name, last_name, full_name, location, profile_img FROM

    `users` LIMIT 5; +----+------------+-----------+------------------+-------------------------+---------------------+ | id | first_name | last_name | full_name | location | profile_img | +----+------------+-----------+------------------+-------------------------+---------------------+ | 1 | Davey | Shafik | Davey Shafik | Seattle, Washington | http://pbs.twimg... | | 2 | Gabriela | D'Avila | Gabriela D'Avila | Vitória, Brazil | http://pbs.twimg... | | 3 | Anthony | Ferrara | Anthony Ferrara | New Jersey, USA | http://pbs.twimg... | | 4 | Frank | de Jonge | Frank de Jonge | Amsterdam, North Holland| http://pbs.twimg... | | 5 | Amanda | Folson | Amanda Folson | Probably on a plane. | http://pbs.twimg... | +----+------------+-----------+------------------+-------------------------+---------------------+ 5 rows in set (0.00 sec) 50
  36. MySQL `sys` schema • Needs to be installed on MySQL

    5.6 • Installed by default on MySQL 5.7 • MySQL Workbench 6.3 ships with a client for it • In production should be used in critical query analysis case 52
  37. What can I do with it 55 *High Cost SQL

    statements select * from sys.`x$statement_analysis` Top 5% slower queries select * from sys.`x$statements_with_runtimes_in_95th_percentile` Use temporary tables select * from sys.`statements_with_temp_tables` Unused Indexes select * from sys.`schema_unused_indexes` Full table scans select * from sys.`schema_tables_with_full_table_scans` *x$ is a prefix for a view of the same name on raw format, for instance time is in picoseconds (10-12s) More Info
  38. Other Changes • Passwords can now have expiration date •

    NO_ZERO_DATE, NO_ZERO_IN_DATE, ERROR_FOR_DIVISION_BY_ZERO are deprecated and being default in STRICT mode in future versions (8.0?) • Tables now support more than one trigger by event • YEAR(2) was deprecated on 5.6, removed on 5.7 (no more YY-MM-DD!) 57
  39. Thank you! • Twitter: @gabidavila • Blog: http://gabriela.io • Freenode:

    gabidavila • Feedback: https://joind.in/talk/cc6d1 • References: MySQL documentation 58