What's New in TensorFlow and Keras (By: Mamoona Riaz) - Keras Community Day 2023

What’s new in TensorFlow and Keras? TensorFlow, Keras

Section 01 Create state-of-the-art models, in just a few lines
of code. KerasCV & KerasNLP Unlock the power of data and model parallelism together, so you can scale up with confidence. DTensor Cross-framework compatibility that’s simple and easy. JAX2TF Flexible, fine-grained control over model size like never before for ML development that’s cheaper and faster. TF Quantization API (preview) Building for a changing landscape.

Building for a changing landscape.

Building for a changing landscape. 540B PaLM-540 Parameters (2022) 340M
BERT-Large Parameters (2018)

540B PaLM-540 Parameters (2022) 340M BERT-Large Parameters (2018) 355M petaFLOPs
Training compute for Google LaMDA Building for a changing landscape.

540B PaLM-540 Parameters (2022) 340M BERT-Large Parameters (2018) 355M petaFLOPs
Training compute for Google LaMDA 5B+ Devices running ML in 100k+ apps Building for a changing landscape.

355M petaFLOPs Training compute for Google LaMDA 540B PaLM-540 Parameters
(2022) 340M BERT-Large Parameters (2018) 5B+ Devices running ML in 100k+ apps 16M+ TensorFlow monthly downloads Building for a changing landscape.

The cutting edge of machine learning, right at your fingertips.
Applied ML with KerasCV & KerasNLP Section 02

Image Classification Object Detection Data Augmentation Text Classification Text Generation
Image Generation KerasCV KerasNLP Libraries for state of the art computer vision and natural language processing. From idea to implementation in just a few lines of code! Section 02 What can you do with KerasCV and KerasNLP?

Section 02 BERT, GPT-2, Stable Diffusion, ResNet, RetinaNet, etc. SOTA
models, written in minutes TFLite, DTensor, XLA, TPUs, and beyond Integrated with the TF Ecosystem Readable and modular design with great documentation Easy to get started Why KerasCV and KerasNLP?

model = StableDiffusion( img_width=512, img_height=512, ) images = model.text_to_image( "photograph
of an astronaut " "riding a horse", batch_size=3, ) model = BertClassifier.from_preset( "bert_base_en_uncased", num_classes=2, ) model.compile(...) model.fit(movie_review_dataset) model.predict([ "What an amazing movie!", ]) Here’s a quick look! Want to learn more? Take a deep dive in our full talk on KerasCV/NLP! Classify text Text to image…and much more! Section 02

Model and data parallelism, all in one place. Machine Learning
at Scale with DTensor Section 03

• data parallelism: Traditionally, ML developers have scaled up models
through data parallelism, which splits up your data and feeds it to horizontally-scaled model instances. But it requires that the model ﬁts within a single hardware device. • Across the device: developers need to be able to scale their models across hardware devices. Dtensor

One toolkit, for both data and model parallelism Flexible Safely
split work across multiple machines Efficient An API that abstracts across TPU/GPU/CPU Device Agnostic Models are getting bigger and bigger. And as model size grows, so does the complexity of training and serving. That’s where DTensor can help! Section 03

Device:0 Device:2 Device:1 Device:3 Batch Split up your data and
feed it to horizontally-scaled model instances. Data Parallelism, with DTensor. Section 03

Device:0 Device:2 Device:1 Device:3 Split up your data and feed
it to horizontally-scaled model instances. Data Parallelism, with DTensor. Data shard 0 Data shard 1 Data shard 3 Data shard 2 Batch Section 03

Split up your data and feed it to horizontally-scaled model
instances. Data Parallelism, with DTensor. Device:0 Device:2 Device:1 Device:3 Data shard 0 Data shard 1 Data shard 3 Data shard 2 Model replica 0 Model replica 1 Model replica 3 Model replica 2 Batch Section 03

Device:0 Device:2 Device:1 Device:3 Model Split up your data and
feed it to horizontally-scaled model instances. Data Parallelism, with DTensor. Section 03

it to horizontally-scaled model instances. Data Parallelism, with DTensor. Model shard 0 Model shard 1 Model shard 3 Model shard 2 Model Section 03

it to horizontally-scaled model instances. Data Parallelism, with DTensor. Model shard 0 Model shard 1 Model shard 3 Model shard 2 Data replica 0 Data replica 1 Data replica 3 Data replica 2 Model Section 03

All together now! Data and model parallelism, with DTensor. Device:0
Device:1 Device:2 Device:3 Data Model Section 03

Device:1 Device:2 Device:3 Data Model Model R0 S0 Model R1 S0 Model R1 S1 Model R0 S1 Section 03

Device:1 Device:2 Device:3 Data Model Model R0 S0 Model R1 S0 Model R1 S1 Model R0 S1 Data R0 S0 Data R0 S1 Data R1 S1 Data R1 S0 Section 03

# OPT training setup via KerasNLP, before DTensor opt_lm =
keras_nlp.models.OPTCasualLM.from_preset("opt_6.7b_en") opt_lm.compile(...) opt_lm.fit(wiki_text_dataset)

# DTensor-enabled training! mesh_dims = [("batch", 2), ("model", 4)] mesh
= dtensor.create_distributed_mesh(mesh_dims, device_type="GPU") dtensor.initialize_accelerator_system("GPU") layout_map = keras_nlp.models.OPTCausalLM.create_layout_map(mesh) with layout_map.scope(): opt_lm = keras_nlp.models.OPTCasualLM.from_preset("opt_6.7b_en") opt_lm.compile(...) opt_lm.fit(wiki_text_dataset)

Performance Already inline with NVIDIA’s Megatron for transformer training, Mesh
TensorFlow, and JAX. Even further increases coming soon! Learn more! https://www.tensorflow.org/guide/dtensor_overview For more details on Keras integrations, check out the guides at: keras.io What’s next? Complete integration with Keras and tf.distribute. One strategy for TPU/GPU/CPU. Automatic determination of layouts. Pipelining support. DTensor tf.distribute + Unified Parallelism Section 03 Built for today, ready for tomorrow

From Research to Production with JAX2TF Bring JAX Models into
the TensorFlow Ecosystem with one line of code. Section 04

15+ modular, specialized libraries built on JAX’s core >200% PyPi
download growth (3 months) Section 04 What’s JAX? An open-source framework for high-performance, ML research. Bringing JAX’s development power into production has been hard – until now.

+ Model Fusion Serving (server or on-device) Fine-tuning Section 04
What’s JAX2TF? A simple, lightweight API to give JAX models access to the full strength of TensorFlow ecosystem. From Research to Production

# Save seamlessly in the form of a TensorFlow SavedModel
model = JAXModel() state = model.init(...) tf_model = TFModel(state, model) tf.saved_model.save(tf_model, "./") self.loss_fn = jax2tf.convert(model.loss, ...) self.predict_fn = jax2tf.convert(model.predict, ...)

31 Section 04 Flexibility and fidelity. Whether you train just
in JAX, or use JAX2TF to fine-tune and finish training in TF, it’s just as accurate, and converges just as fast.

Making training and deploying models faster, easier, and cheaper. Nimble
Machine Learning: Quantization in TensorFlow Section 05

Introducing the TF Quantization API Adjust model size, easily. Smaller
models are faster to run and require fewer resources Reduces memory, latency, compute and battery costs Section 05

For mobile, server, microcontrollers, and more. Flexible Introducing the TF
Quantization API Adjust model size, easily. Smaller models are faster to run and require fewer resources Reduces memory, latency, compute and battery costs Section 05

For mobile, server, microcontrollers, and more. Flexible Tools that just
work, without model rewrites. Easy Introducing the TF Quantization API Adjust model size, easily. Smaller models are faster to run and require fewer resources Reduces memory, latency, compute and battery costs Section 05

Fundamentally better than before! For mobile, server, microcontrollers, and more.
Flexible Tools that just work, without model rewrites. Easy Reduced memory, latency, compute and battery costs Efficient Introducing the TF Quantization API Adjust model size, easily. Smaller models are faster to run and require fewer resources Reduces memory, latency, compute and battery costs Section 05

# This is all it takes to create a quantization-aware
model! tf.quantization.apply_quantization_on_model(model, config_map, …) # From here, you can train and save just as always. model.fit() model.save() # You can also export to TFLite, without any changes! converter = tf.lite.TFLiteConverter.from_keras_model(model) converter.optimizations = [tf.lite.Optimize.DEFAULT] tflite_model = converter.convert()

• Post-Training Quantization (PTQ): Convert to a quantized model after
training. This is as simple as it gets and most readily accessible, but there can be a small quality drop. • Quantization-Aware Training (QAT): Simulate quantization during just the forward pass, providing for maximal ﬂexibility with a minimal quality tradeoff. • Quantized Training: Quantize all computations while training. This is still nascent, and needs a lot more testing, but is a powerful tool we want to make sure TensorFlow users have access to. Quantization

16.56X P7 EdgeTPU quantized serving throughput, versus baseline floating point
MobileNetV2. Model: MobileNetV2 Device: Pixel 7 Serving throughput vs. float32 CPU baseline: CPU with XNNPack (1 thread): 2.24x Edge-TPU: 16.56x Performance and quality Section 05

16.56X P7 EdgeTPU quantized serving throughput, versus baseline floating point
MobileNetV2. Model: MobileNetV2 Device: Pixel 7 Serving throughput vs. float32 CPU baseline: CPU with XNNPack (1 thread): 2.24x Edge-TPU: 16.56x All without noticeable detriment to accuracy. float32: 73% int8: Still 73%! Performance and quality. Section 05

Available later this year! Coming soon, to a model near
you.

…and there’s so much more!

Thank You Mani Varadarajan Director of Engineering

Important Announcements

Program details at a glance: • Title: Campus Ambassador •
Duration: 1 Academic Year • Eligibility Criteria: BSCS/BSSE/BSIT students in 3rd - 6th Semester only • Monthly Stipend Registration deadline: 1st October, 2023 Campus Ambassador Program 2023-2024

What's New in TensorFlow and Keras (By: Mamoona...

What's New in TensorFlow and Keras (By: Mamoona Riaz) - Keras Community Day 2023

More Decks by GDG Lahore

Other Decks in Programming

Featured

Transcript