New APIs in TensorFlow

Sourabh
April 28, 2018

A talk about TensorFlow eager execution and the tf.data Dataset API: the first half covers how to move code from graph mode to eager execution, and the second half covers the Dataset API for writing performant input pipelines.

Transcript

  1. New Features in TensorFlow Sourabh Bajaj Software Engineer, Google Brain

    @sb2nov
  2. Features in Active Development Everything Subject to Change Please do

    not tweet/blog/publicize these slides :)
  3. + TensorFlow Eager Mode + TensorFlow Datasets

  4. TensorFlow Eager Execution As simple as possible

  5. Graphs

  6. Delayed feedback • Error reporting much after graph construction •

    Not friendly to host-language debugger/tools Graphs can be annoying
  7. Metaprogramming • Control flow concepts (tf.while_loop) are different than the

    host language • Can’t use Python data structures easily Graphs can be annoying
  8. x = tf.placeholder(tf.float32, shape=[1, 1]) m = tf.matmul(x, x) print(m)

    # Tensor("MatMul:0", shape=(1, 1), dtype=float32) with tf.Session() as sess: m_out = sess.run(m, feed_dict={x: [[2.]]}) print(m_out) # [[4.]] Boilerplate Code like this...
  9. x = [[2.]] m = tf.matmul(x, x) print(m) # tf.Tensor([[4.]],

    dtype=float32, shape=(1,1)) Boilerplate Becomes this
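
    The eager snippet above only works if eager execution has been switched on at program start. A minimal sketch, assuming a TensorFlow 1.x release that exposes tf.enable_eager_execution() (the preview builds from the time of this talk expose the same call via tf.contrib.eager):

      import tensorflow as tf

      # Must be called once, before any other TensorFlow op, ideally right after the import.
      tf.enable_eager_execution()

      x = [[2.]]
      m = tf.matmul(x, x)  # runs immediately, no Session required
      print(m)             # tf.Tensor([[4.]], shape=(1, 1), dtype=float32)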
  10. x = tf.gather([0, 1, 2], 7) InvalidArgumentError: indices = 7

    is not in [0, 3) [Op:Gather] Instant Errors
  11. x = tf.random_uniform([2, 2]) with tf.Session() as sess: for i

    in range(x.shape[0]): for j in range(x.shape[1]): print(sess.run(x[i, j])) Metaprogramming Each iteration adds nodes to the graph
  12. x = tf.random_uniform([2, 2]) for i in range(x.shape[0]): for j

    in range(x.shape[1]): print(x[i, j]) Metaprogramming
  13. a = tf.constant(6) while a != 1: if a %

    2 == 0: a = a / 2 else: a = 3 * a + 1 print(a) Python Control Flow # Outputs tf.Tensor(3, dtype=int32) tf.Tensor(10, dtype=int32) tf.Tensor(5, dtype=int32) tf.Tensor(16, dtype=int32) tf.Tensor(8, dtype=int32) tf.Tensor(4, dtype=int32) tf.Tensor(2, dtype=int32) tf.Tensor(1, dtype=int32)
  14. Gradients

  15. • Operations executed are recorded on a tape • Tape

    is played back to compute gradients Gradients
  16. def square(x): return tf.multiply(x, x) # Or x * x

    grad = tfe.gradients_function(square) print(square(3.)) # tf.Tensor(9., dtype=tf.float32) print(grad(3.)) # [tf.Tensor(6., dtype=tf.float32)] Gradients
  17. def square(x): return tf.multiply(x, x) # Or x * x

    grad = tfe.gradients_function(square) gradgrad = tfe.gradients_function(lambda x: grad(x)[0]) print(square(3.)) # tf.Tensor(9., dtype=tf.float32) print(grad(3.)) # [tf.Tensor(6., dtype=tf.float32)] print(gradgrad(3.)) # [tf.Tensor(2., dtype=tf.float32)] Gradients
  18. It’s not that different

  19. TensorFlow = Operation Kernels + Composition • Session: One way

    to compose operations • Eager execution: Compose using Python
  20. tf.device() for manual placement with tf.device("/gpu:0"): x = tf.random_uniform([10, 10])

    y = tf.matmul(x, x) # x and y reside in GPU memory Using GPUs
  21. The same APIs as graph building (tf.layers, tf.train.Optimizer, tf.data etc.)

    model = tf.layers.Dense(units=1, use_bias=True) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) Building Models
  22. model = tf.layers.Dense(units=1, use_bias=True) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) # Define a

    loss function def loss(x, y): return tf.reduce_mean(tf.square(y - model(x))) Building Models
  23. Compute and apply gradients for (x, y) in get_next_batch(): optimizer.apply_gradients(grad_fn(x,

    y)) Training Models
  24. Compute and apply gradients grad_fn = tfe.implicit_gradients(loss) for (x, y)

    in get_next_batch(): optimizer.apply_gradients(grad_fn(x, y)) Training Models
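
    Slides 21-24 build the training code up in pieces; below is a consolidated sketch of the same loop. The data source get_next_batch() is hypothetical (it yields synthetic batches drawn from y = 3x + 2 plus noise); the model, loss, and tfe.implicit_gradients parts follow the slides, assuming eager execution is enabled and tfe is tf.contrib.eager:

      import tensorflow as tf
      import tensorflow.contrib.eager as tfe

      tf.enable_eager_execution()

      model = tf.layers.Dense(units=1, use_bias=True)
      optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

      def loss(x, y):
          return tf.reduce_mean(tf.square(y - model(x)))

      # Differentiates the loss with respect to every variable used inside it.
      grad_fn = tfe.implicit_gradients(loss)

      # Hypothetical data source: synthetic batches drawn from y = 3x + 2 plus noise.
      def get_next_batch(num_batches=100, batch_size=32):
          for _ in range(num_batches):
              x = tf.random_normal([batch_size, 1])
              y = 3. * x + 2. + tf.random_normal([batch_size, 1], stddev=0.1)
              yield x, y

      for (x, y) in get_next_batch():
          optimizer.apply_gradients(grad_fn(x, y))  # list of (gradient, variable) pairs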
  25. No more graphs then?

  26. Optimizable • Automatic buffer reuse • Constant folding • Inter-op

    parallelism • Automatic trade-off between compute and memory Graphs are
  27. Deployable • TensorFlow Serving • Mobile • Any other C++/Java/other

    program Without loss in translation between runtimes Graphs are
  28. Transformable • Carve out subgraphs to offload to accelerators •

    Train with quantization in mind Graphs are
  29. Graph Functions

  30. “Compile” Python functions into graphs • Mix eager execution with

    calls to “compiled” graphs • Differentiate through graphs Graph Functions
  31. def lstm_cell(x, w, h, c): xhw = tf.matmul(tf.concat([x, h], axis=1),

    w) y = tf.split(xhw, 4, axis=1) in_value = tf.tanh(y[0]) in_gate, forget_gate, out_gate = [tf.sigmoid(x) for x in y[1:]] c = (forget_gate * c) + (in_gate * in_value) h = out_gate * tf.tanh(c) return h, c h, c = lstm_cell(x, w, h, c) print(h) LSTM Cell
  32. @tfe.graph_function def lstm_cell(x, w, h, c): xhw = tf.matmul(tf.concat([x, h],

    axis=1), w) y = tf.split(xhw, 4, axis=1) in_value = tf.tanh(y[0]) in_gate, forget_gate, out_gate = [tf.sigmoid(x) for x in y[1:]] c = (forget_gate * c) + (in_gate * in_value) h = out_gate * tf.tanh(c) return h, c h, c = lstm_cell(x, w, h, c) print(h) LSTM Cell
  33. @tfe.graph_function def lstm_cell(x, w, h, c): xhw = tf.matmul(tf.concat([x, h],

    axis=1), w) y = tf.split(xhw, 4, axis=1) in_value = tf.tanh(y[0]) in_gate, forget_gate, out_gate = [tf.sigmoid(x) for x in y[1:]] c = (forget_gate * c) + (in_gate * in_value) h = out_gate * tf.tanh(c) return h, c h, c = lstm_cell(x, w, h, c) print(h) LSTM Cell tanh executed in-place
  34. @tfe.graph_function def inception(image): logits = inception.inception_v3(image, num_classes=1001, is_training=False)[0] return tf.softmax(logits)

    inception.restore("/path/to/checkpoint") print(len(inception.variables)) Use existing graph code
  35. None
  36. TensorFlow Datasets Data Infeed Made Simple

  37. Input data is the lifeblood of machine learning Modern accelerators

    need faster input pipelines Getting your data into TensorFlow can be painful Why are we here?
  38. Feeding sess.run(…,

    feed_dict={x: features, y: labels}) All the flexibility of Python… …and all the performance
  39. Queues files = string_input_producer(…) record = TFRecordReader().read(files) parsed = parse_example(record,

    …) batch = batch(parsed, 32) Uses TensorFlow ops to perform preprocessing, but driven by client threads. “Starting the queue runners”
  40. None
  41. How do I switch between training and validation data? How

    do I detect the end of an epoch? How do I handle malformed data?
  42. Data elements have the same type Dataset might be too

    large to materialize all at once… or infinite Compose functions like map() and filter() to preprocess Input pipelines = lazy lists Functional programming to the rescue!
  43. A well-studied area, applied in existing languages. • C# LINQ,

    Scala collections, Java Streams Huge literature on optimization (stream fusion etc.) Input pipelines = lazy lists Functional programming to the rescue!
  44. Introducing tf.data Functional input pipelines in TensorFlow

  45. Create a Dataset from one or more tf.Tensor objects: Dataset.from_tensors((features,

    labels)) Dataset.from_tensor_slices((features, labels)) TextLineDataset(filenames) The Dataset interface Data sources and functional transformations
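
    The difference between the first two constructors is easy to miss: from_tensors treats its argument as a single element, while from_tensor_slices slices it along the first dimension. A minimal sketch with made-up NumPy arrays:

      import numpy as np
      import tensorflow as tf

      features = np.zeros([100, 28, 28])  # hypothetical inputs
      labels = np.zeros([100])            # hypothetical labels

      # One element: the whole (features, labels) pair.
      whole = tf.data.Dataset.from_tensors((features, labels))

      # 100 elements: one (feature, label) pair per row.
      sliced = tf.data.Dataset.from_tensor_slices((features, labels))

      print(whole.output_shapes)   # per-element shapes: (100, 28, 28) and (100,)
      print(sliced.output_shapes)  # per-element shapes: (28, 28) and ()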
  46. Or create a Dataset from another Dataset: dataset.map(lambda x: tf.decode_jpeg(x))

    dataset.repeat(NUM_EPOCHS) dataset.batch(BATCH_SIZE) ...and many more. The Dataset interface Data sources and functional transformations
  47. Or (in TensorFlow 1.4) create a Dataset from a Python

    generator: def generator(): while True: yield ... Dataset.from_generator(generator, tf.int32) The Dataset interface Data sources and functional transformations
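
    The generator body on the slide is elided; here is a minimal runnable sketch, assuming TensorFlow 1.4+ where Dataset.from_generator takes the callable, the element dtype, and optionally the element shape:

      import tensorflow as tf

      def generator():
          i = 0
          while True:  # an infinite stream of scalars
              yield i
              i += 1

      dataset = tf.data.Dataset.from_generator(generator, tf.int32, tf.TensorShape([]))
      dataset = dataset.batch(4)

      iterator = dataset.make_one_shot_iterator()
      next_element = iterator.get_next()

      with tf.Session() as sess:
          print(sess.run(next_element))  # [0 1 2 3]
          print(sess.run(next_element))  # [4 5 6 7]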
  48. # Read records from a list of files. dataset =

    TFRecordDataset(["file1.tfrecord", "file2.tfrecord", …]) # Parse string values into tensors. dataset = dataset.map(lambda record: tf.parse_single_example(record, …)) # Randomly shuffle using a buffer of 10000 examples. dataset = dataset.shuffle(10000) # Repeat for 100 epochs. dataset = dataset.repeat(100) # Combine 128 consecutive elements into a batch. dataset = dataset.batch(128)
  49. Create an Iterator from a Dataset: dataset.make_one_shot_iterator() dataset.make_initializable_iterator() The Iterator

    interface Sequential access to Dataset elements
  50. Get the next element from the Iterator: next_element = iterator.get_next()

    while …: sess.run(next_element) The Iterator interface Sequential access to Dataset elements
  51. dataset = … # A one-shot iterator automatically initializes itself

    on first use. iterator = dataset.make_one_shot_iterator() # The return value of get_next() matches the dataset element type. images, labels = iterator.get_next() train_op = model_and_optimizer(images, labels) # Loop until all elements have been consumed. try: while True: sess.run(train_op) except tf.errors.OutOfRangeError: pass
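
    One way to answer the earlier "how do I switch between training and validation data?" question is to parameterize the pipeline and use an initializable iterator, which can be re-initialized on different data. A minimal sketch using Dataset.range so it runs as-is; in practice the placeholder would typically carry a list of filenames:

      import tensorflow as tf

      # Parameterize the dataset with a placeholder so the same iterator can be
      # re-initialized on different data (e.g. training vs. validation inputs).
      max_value = tf.placeholder(tf.int64, shape=[])
      dataset = tf.data.Dataset.range(max_value)

      iterator = dataset.make_initializable_iterator()
      next_element = iterator.get_next()

      with tf.Session() as sess:
          # "Training" pass.
          sess.run(iterator.initializer, feed_dict={max_value: 10})
          try:
              while True:
                  sess.run(next_element)
          except tf.errors.OutOfRangeError:  # end of epoch, as on the slide above
              pass

          # "Validation" pass: re-initialize the same iterator on different data.
          sess.run(iterator.initializer, feed_dict={max_value: 5})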
  52. def input_fn(): dataset = … # A one-shot iterator automatically

    initializes itself on first use. iterator = dataset.make_one_shot_iterator() # The return value of get_next() matches the dataset element type. images, labels = iterator.get_next() return images, labels # The input_fn can be used as a regular Estimator input function. estimator = tf.estimator.Estimator(…) estimator.train(input_fn=input_fn, …)
  53. tf.data.Dataset Represents input pipeline using functional transformations tf.data.Iterator Provides sequential

    access to elements of a Dataset tf.data API
  54. Thank You Reach out at @sb2nov for questions