GDG Google Cloud Databases and Big Data Next Ext 19

Cloud SQL
Cloud Data Catalog
Cloud Data Fusion
Cloud Dataflow
Cloud Dataproc
BigQuery
Cloud Composer

cncf-canada-meetups

May 15, 2019

Transcript

  1. Speaker: Stéphane Fréchette @sfrechette

    Google Cloud Solution Architect @ Pythian. Formerly Cloud Solution Architect - Data Platform @ Microsoft.

    Pythian excels at helping businesses use their data to transform how they compete and win in this ever-changing environment by delivering advanced on-premises, hybrid, and multi-cloud solutions to solve the toughest data challenges faster and better than anyone else.
  2. Databases & Big Data announcements: Enterprise databases, managed for you.

    Fully managed database services. New products and features to help you manage enterprise workloads in the ways you’re used to, and make your data work for you:

    • Cloud SQL for Microsoft SQL Server (sneak preview, coming soon; request access): bring your existing SQL Server workloads to GCP and run them in a fully managed database service.
    • Cloud SQL for PostgreSQL, now with version 11 support: includes useful new features like partitioning improvements, stored procedures, and more parallelism.
    • Cloud Bigtable multi-region replication (GA): multi-region replication is now generally available.
  3. Databases & Big Data announcements: Bringing the best of open source to Google Cloud customers.

    Google is taking its commitment to open source to the next level by announcing strategic partnerships with leading open source-centric companies in the areas of data management and analytics, including:

    • Confluent
    • DataStax
    • Elastic
    • InfluxData
    • MongoDB
    • Neo4j
    • Redis Labs
  4. Databases & Big Data announcements: Cloud Data Fusion (beta)

    Cloud Data Fusion is a fully managed, cloud native, enterprise data integration service for quickly building and managing data pipelines. Cloud Data Fusion provides a graphical interface to increase time efficiency and reduce complexity. Now business users, developers, and data scientists can easily and reliably build scalable data integration solutions to cleanse, prepare, blend, transfer, and transform data, without having to wrestle with infrastructure.

    • Code-free self service
    • Collaborative data engineering
    • GCP-native
    • Enterprise-grade security
    • Integration metadata and lineage
    • Seamless operations
    • Comprehensive integration toolkit
    • Hybrid enablement
  5. Databases & Big Data announcements

    Cloud Dataflow is a managed service for executing a wide variety of data processing patterns.

    • Cloud Dataflow SQL (public alpha) lets you build pipelines using familiar Standard SQL for unified batch and stream data processing.
    • Dataflow Flexible Resource Scheduling (FlexRS), in beta, helps you flexibly schedule batch processing jobs for cost savings.
  6. Databases & Big Data announcements

    Cloud Dataproc is a managed Apache Spark and Apache Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning.

    • Cloud Dataproc autoscaling (beta) removes the user burden associated with provisioning and decommissioning Hadoop and Spark clusters on Google Cloud Platform, providing you the same serverless convenience that you find in the rest of our data analytics platform.
    • Dataproc Presto job type (beta) helps you write simpler ad hoc Presto queries against disparate data sources like Cloud Storage and Hive metastore. Now both queries and scripts run as part of the native Dataproc API.
    • Dataproc Kerberos TLC (beta) enables Hadoop secure mode on Dataproc through thorough API support for Kerberos. This new integration gives you cross-realm trust, RPC and SSL encryption, and KDC administrator configuration capabilities.
  7. Databases & Big Data announcements

    BigQuery is Google's fully managed, petabyte scale, low cost analytics data warehouse.

    • BigQuery BI Engine, in beta, is a fully-managed in-memory analysis service that powers visual analytics over big data with sub-second query response, high-concurrency, simplified BI architecture, and smart performance tuning.
  8. Databases & Big Data announcements

    BigQuery is Google's fully managed, petabyte scale, low cost analytics data warehouse.

    • Connected sheets are a new type of spreadsheet that combines the simplicity of a spreadsheet interface with the power of BigQuery. With a few clicks, you can access BigQuery data in Sheets and securely share it with anyone in your organization.
    • BigQuery DTS now supports 100+ SaaS apps, enabling you to lay the foundation for a data warehouse without writing a single line of code.
  9. Databases & Big Data announcements

    BigQuery is Google's fully managed, petabyte scale, low cost analytics data warehouse.

    • BigQuery ML is now generally available with new model types you can call with SQL queries.
    • BigQuery: k-means clustering ML (beta) helps you establish groupings of data points based on axes or attributes that you specify, straight from Standard SQL in BigQuery.
    • BigQuery: import TensorFlow models (alpha) lets you import your TensorFlow models and call them straight from BigQuery to create classifier and predictive models right from BigQuery.
    • BigQuery: TensorFlow DNN classifier helps you classify your data, based on a large number of features or signals. You can train and deploy a DNN model of your choosing straight from BigQuery’s Standard SQL interface.
    • BigQuery: TensorFlow DNN regressor lets you design a regression in TensorFlow and then call it to generate a trend line for your data in BigQuery.
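The k-means announcement on slide 9 says clusters can be built straight from Standard SQL. As a rough illustration, the sketch below assembles a BigQuery ML `CREATE MODEL` statement with `model_type='kmeans'` and `num_clusters`. The helper name `kmeans_create_model_sql` and the dataset, table, and column names are hypothetical and do not come from the deck; the code only builds the SQL text and does not contact BigQuery.

```python
def kmeans_create_model_sql(model, source_table, feature_columns, num_clusters):
    """Return a BigQuery ML CREATE MODEL statement for a k-means model.

    This helper is illustrative only: it formats SQL text and never runs it.
    """
    if num_clusters < 2:
        raise ValueError("num_clusters must be at least 2")
    if not feature_columns:
        raise ValueError("at least one feature column is required")
    # One feature column per line keeps the generated SQL readable.
    columns = ",\n  ".join(feature_columns)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS(model_type='kmeans', num_clusters={num_clusters}) AS\n"
        f"SELECT\n  {columns}\nFROM `{source_table}`"
    )


# Hypothetical dataset, table, and column names, for illustration only.
sql = kmeans_create_model_sql(
    model="demo_dataset.trip_clusters",
    source_table="demo_dataset.trips",
    feature_columns=["duration_minutes", "distance_km"],
    num_clusters=4,
)
print(sql)
```

The generated statement could then be pasted into the BigQuery console or submitted through a client library; once a model is trained, cluster assignments are typically retrieved with `ML.PREDICT`.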
  10. Databases & Big Data announcements: Cloud Data Catalog (beta)

    Data Catalog is a fully managed and scalable metadata management service that empowers organizations to quickly discover, manage, and understand all their data in Google Cloud. It offers a simple and easy-to-use search interface for data discovery, a flexible and powerful cataloging system for capturing both technical and business metadata, and a strong security and compliance foundation.

    • Serverless
    • Metadata-as-a-service
    • Central catalog
    • Search and discovery
    • Schematized metadata
    • Cloud DLP integration
    • Cloud IAM integration
    • Governance
  11. Databases & Big Data announcements: Cloud Composer (GA)

    Cloud Composer (generally available) helps you orchestrate your workloads across multiple clouds with a managed Apache Airflow service. Cloud Composer is a managed Apache Airflow service that helps you create, schedule, monitor and manage workflows. Cloud Composer automation helps you create Airflow environments quickly and use Airflow-native tools, such as the powerful Airflow web interface and command line tools, so you can focus on your workflows and not your infrastructure.
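Airflow, which Cloud Composer manages, models a workflow as a directed acyclic graph (DAG) of tasks. The sketch below is not Airflow or Cloud Composer code: it is a plain-Python illustration, using Kahn's algorithm, of how task dependencies determine a valid run order. The function name `execution_order` and the task names are hypothetical.

```python
from collections import deque


def execution_order(dependencies):
    """Order tasks so each one runs after all of its upstream tasks.

    `dependencies` maps a task name to the list of tasks it depends on.
    Uses Kahn's algorithm; raises ValueError if the graph has a cycle.
    """
    # Collect every task, including ones that appear only as an upstream.
    tasks = set(dependencies)
    for upstream in dependencies.values():
        tasks.update(upstream)

    # Count unmet upstream tasks and record each task's downstream tasks.
    remaining = {task: len(set(dependencies.get(task, []))) for task in tasks}
    downstream = {task: [] for task in tasks}
    for task, upstream in dependencies.items():
        for parent in set(upstream):
            downstream[parent].append(task)

    # Start with tasks that have no upstream; sort for a deterministic order.
    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(downstream[task]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("cycle detected: a workflow must be acyclic")
    return order


# Hypothetical pipeline: extract, then two transforms, then load.
pipeline = {
    "extract": [],
    "clean": ["extract"],
    "enrich": ["extract"],
    "load": ["clean", "enrich"],
}
print(execution_order(pipeline))
```

A real Composer environment would express the same dependencies in an Airflow DAG file and leave scheduling, retries, and monitoring to Airflow.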