New APIs in TensorFlow

Sourabh
April 28, 2018

A talk about TensorFlow eager execution and the tf.data Dataset API: the first half covers how to move code from graph mode to eager execution, and the second half covers the Dataset API for writing performant input pipelines.

Transcript

  1. New Features in TensorFlow Sourabh Bajaj Software Engineer, Google Brain

    @sb2nov
  2. Features in Active Development Everything Subject to Change Please do

    not tweet/blog/publicize these slides :)
  3. + TensorFlow Eager Mode + TensorFlow Datasets

  4. TensorFlow Eager Execution As simple as possible

  5. Graphs

  6. Delayed feedback • Error reporting much after graph construction •

    Not friendly to host-language debugger/tools Graphs can be annoying
  7. Metaprogramming • Control flow concepts (tf.while_loop) are different than the

    host language • Can’t use Python data structures easily Graphs can be annoying
  8. x = tf.placeholder(tf.float32, shape=[1, 1]) m = tf.matmul(x, x) print(m)

    # Tensor("MatMul:0", shape=(1, 1), dtype=float32) with tf.Session() as sess: m_out = sess.run(m, feed_dict={x: [[2.]]}) print(m_out) # [[4.]] Boilerplate Code like this...
  9. x = [[2.]] m = tf.matmul(x, x) print(m) # tf.Tensor([[4.]],

    dtype=float32, shape=(1,1)) Boilerplate Becomes this
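
    The eager snippet above only works if eager execution has been switched on at program start. A minimal sketch, assuming a TensorFlow 1.x release that exposes tf.enable_eager_execution() (the preview builds from the time of this talk expose the same call via tf.contrib.eager):

      import tensorflow as tf

      # Must be called once, before any other TensorFlow op, ideally right after the import.
      tf.enable_eager_execution()

      x = [[2.]]
      m = tf.matmul(x, x)  # runs immediately, no Session required
      print(m)             # tf.Tensor([[4.]], shape=(1, 1), dtype=float32)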
  10. x = tf.gather([0, 1, 2], 7) InvalidArgumentError: indices = 7

    is not in [0, 3) [Op:Gather] Instant Errors
  11. x = tf.random_uniform([2, 2]) with tf.Session() as sess: for i

    in range(x.shape[0]): for j in range(x.shape[1]): print(sess.run(x[i, j])) Metaprogramming Each iteration adds nodes to the graph
  12. x = tf.random_uniform([2, 2]) for i in range(x.shape[0]): for j

    in range(x.shape[1]): print(x[i, j]) Metaprogramming
  13. a = tf.constant(6) while a != 1: if a %

    2 == 0: a = a / 2 else: a = 3 * a + 1 print(a) Python Control Flow # Outputs tf.Tensor(3, dtype=int32) tf.Tensor(10, dtype=int32) tf.Tensor(5, dtype=int32) tf.Tensor(16, dtype=int32) tf.Tensor(8, dtype=int32) tf.Tensor(4, dtype=int32) tf.Tensor(2, dtype=int32) tf.Tensor(1, dtype=int32)
  14. Gradients

  15. • Operations executed are recorded on a tape • Tape

    is played back to compute gradients Gradients
  16. def square(x): return tf.multiply(x, x) # Or x * x

    grad = tfe.gradients_function(square) print(square(3.)) # tf.Tensor(9., dtype=tf.float32) print(grad(3.)) # [tf.Tensor(6., dtype=tf.float32)] Gradients
  17. def square(x): return tf.multiply(x, x) # Or x * x

    grad = tfe.gradients_function(square) gradgrad = tfe.gradients_function(lambda x: grad(x)[0]) print(square(3.)) # tf.Tensor(9., dtype=tf.float32) print(grad(3.)) # [tf.Tensor(6., dtype=tf.float32)] print(gradgrad(3.)) # [tf.Tensor(2., dtype=tf.float32)] Gradients
  18. It’s not that different

  19. TensorFlow = Operation Kernels + Composition • Session: One way

    to compose operations • Eager execution: Compose using Python
  20. tf.device() for manual placement with tf.device("/gpu:0"): x = tf.random_uniform([10, 10])

    y = tf.matmul(x, x) # x and y reside in GPU memory Using GPUs
  21. The same APIs as graph building (tf.layers, tf.train.Optimizer, tf.data etc.)

    model = tf.layers.Dense(units=1, use_bias=True) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) Building Models
  22. model = tf.layers.Dense(units=1, use_bias=True) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) # Define a

    loss function def loss(x, y): return tf.reduce_mean(tf.square(y - model(x))) Building Models
  23. Compute and apply gradients for (x, y) in get_next_batch(): optimizer.apply_gradients(grad_fn(x,

    y)) Training Models
  24. Compute and apply gradients grad_fn = tfe.implicit_gradients(loss) for (x, y)

    in get_next_batch(): optimizer.apply_gradients(grad_fn(x, y)) Training Models
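
    Slides 21-24 build the training code up in pieces; below is a consolidated sketch of the same loop. The data source get_next_batch() is hypothetical (it yields synthetic batches drawn from y = 3x + 2 plus noise); the model, loss, and tfe.implicit_gradients parts follow the slides, assuming eager execution is enabled and tfe is tf.contrib.eager:

      import tensorflow as tf
      import tensorflow.contrib.eager as tfe

      tf.enable_eager_execution()

      model = tf.layers.Dense(units=1, use_bias=True)
      optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

      def loss(x, y):
          return tf.reduce_mean(tf.square(y - model(x)))

      # Differentiates the loss with respect to every variable used inside it.
      grad_fn = tfe.implicit_gradients(loss)

      # Hypothetical data source: synthetic batches drawn from y = 3x + 2 plus noise.
      def get_next_batch(num_batches=100, batch_size=32):
          for _ in range(num_batches):
              x = tf.random_normal([batch_size, 1])
              y = 3. * x + 2. + tf.random_normal([batch_size, 1], stddev=0.1)
              yield x, y

      for (x, y) in get_next_batch():
          optimizer.apply_gradients(grad_fn(x, y))  # list of (gradient, variable) pairs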
  25. No more graphs then?

  26. Optimizable • Automatic buffer reuse • Constant folding • Inter-op

    parallelism • Automatic trade-off between compute and memory Graphs are
  27. Deployable • TensorFlow Serving • Mobile • Any other C++/Java/other

    program Without loss in translation between runtimes Graphs are
  28. Transformable • Carve out subgraphs to offload to accelerators •

    Train with quantization in mind Graphs are
  29. Graph Functions

  30. “Compile” Python functions into graphs • Mix eager execution with

    calls to “compiled” graphs • Differentiate through graphs Graph Functions
  31. def lstm_cell(x, w, h, c): xhw = tf.matmul(tf.concat([x, h], axis=1),

    w) y = tf.split(xhw, 4, axis=1) in_value = tf.tanh(y[0]) in_gate, forget_gate, out_gate = [tf.sigmoid(x) for x in y[1:]] c = (forget_gate * c) + (in_gate * in_value) h = out_gate * tf.tanh(c) return h, c h, c = lstm_cell(x, w, h, c) print(h) LSTM Cell
  32. @tfe.graph_function def lstm_cell(x, w, h, c): xhw = tf.matmul(tf.concat([x, h],

    axis=1), w) y = tf.split(xhw, 4, axis=1) in_value = tf.tanh(y[0]) in_gate, forget_gate, out_gate = [tf.sigmoid(x) for x in y[1:]] c = (forget_gate * c) + (in_gate * in_value) h = out_gate * tf.tanh(c) return h, c h, c = lstm_cell(x, w, h, c) print(h) LSTM Cell
  33. @tfe.graph_function def lstm_cell(x, w, h, c): xhw = tf.matmul(tf.concat([x, h],

    axis=1), w) y = tf.split(xhw, 4, axis=1) in_value = tf.tanh(y[0]) in_gate, forget_gate, out_gate = [tf.sigmoid(x) for x in y[1:]] c = (forget_gate * c) + (in_gate * in_value) h = out_gate * tf.tanh(c) return h, c h, c = lstm_cell(x, w, h, c) print(h) LSTM Cell tanh executed in-place
  34. @tfe.graph_function def inception(image): logits = inception.inception_v3(image, num_classes=1001, is_training=False)[0] return tf.softmax(logits)

    inception.restore("/path/to/checkpoint") print(len(inception.variables)) Use existing graph code
  35. None
  36. TensorFlow Datasets Data Infeed Made Simple

  37. Input data is the lifeblood of machine learning Modern accelerators

    need faster input pipelines Getting your data into TensorFlow can be painful Why are we here?
  38. Feeding sess.run(…,

    feed_dict={x: features, y: labels}) All the flexibility of Python… …and all the performance
  39. Queues files = string_input_producer(…) record = TFRecordReader().read(files) parsed = parse_example(record,

    …) batch = batch(parsed, 32) Uses TensorFlow ops to perform preprocessing, but driven by client threads. “Starting the queue runners”
  40. None
  41. How do I switch between training and validation data? How

    do I detect the end of an epoch? How do I handle malformed data?
  42. Data elements have the same type Dataset might be too

    large to materialize all at once… or infinite Compose functions like map() and filter() to preprocess Input pipelines = lazy lists Functional programming to the rescue!
  43. A well-studied area, applied in existing languages. • C# LINQ,

    Scala collections, Java Streams Huge literature on optimization (stream fusion etc.) Input pipelines = lazy lists Functional programming to the rescue!
  44. Introducing tf.data Functional input pipelines in TensorFlow

  45. Create a Dataset from one or more tf.Tensor objects: Dataset.from_tensors((features,

    labels)) Dataset.from_tensor_slices((features, labels)) TextLineDataset(filenames) The Dataset interface Data sources and functional transformations
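
    The difference between the first two constructors is easy to miss: from_tensors treats its argument as a single element, while from_tensor_slices slices it along the first dimension. A minimal sketch with made-up NumPy arrays:

      import numpy as np
      import tensorflow as tf

      features = np.zeros([100, 28, 28])  # hypothetical inputs
      labels = np.zeros([100])            # hypothetical labels

      # One element: the whole (features, labels) pair.
      whole = tf.data.Dataset.from_tensors((features, labels))

      # 100 elements: one (feature, label) pair per row.
      sliced = tf.data.Dataset.from_tensor_slices((features, labels))

      print(whole.output_shapes)   # per-element shapes: (100, 28, 28) and (100,)
      print(sliced.output_shapes)  # per-element shapes: (28, 28) and ()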
  46. Or create a Dataset from another Dataset: dataset.map(lambda x: tf.decode_jpeg(x))

    dataset.repeat(NUM_EPOCHS) dataset.batch(BATCH_SIZE) ...and many more. The Dataset interface Data sources and functional transformations
  47. Or (in TensorFlow 1.4) create a Dataset from a Python

    generator: def generator(): while True: yield ... Dataset.from_generator(generator, tf.int32) The Dataset interface Data sources and functional transformations
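
    The generator body on the slide is elided; here is a minimal runnable sketch, assuming TensorFlow 1.4+ where Dataset.from_generator takes the callable, the element dtype, and optionally the element shape:

      import tensorflow as tf

      def generator():
          i = 0
          while True:  # an infinite stream of scalars
              yield i
              i += 1

      dataset = tf.data.Dataset.from_generator(generator, tf.int32, tf.TensorShape([]))
      dataset = dataset.batch(4)

      iterator = dataset.make_one_shot_iterator()
      next_element = iterator.get_next()

      with tf.Session() as sess:
          print(sess.run(next_element))  # [0 1 2 3]
          print(sess.run(next_element))  # [4 5 6 7]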
  48. # Read records from a list of files. dataset =

    TFRecordDataset(["file1.tfrecord", "file2.tfrecord", …]) # Parse string values into tensors. dataset = dataset.map(lambda record: tf.parse_single_example(record, …)) # Randomly shuffle using a buffer of 10000 examples. dataset = dataset.shuffle(10000) # Repeat for 100 epochs. dataset = dataset.repeat(100) # Combine 128 consecutive elements into a batch. dataset = dataset.batch(128)
  49. Create an Iterator from a Dataset: dataset.make_one_shot_iterator() dataset.make_initializable_iterator() The Iterator

    interface Sequential access to Dataset elements
  50. Get the next element from the Iterator: next_element = iterator.get_next()

    while …: sess.run(next_element) The Iterator interface Sequential access to Dataset elements
  51. dataset = … # A one-shot iterator automatically initializes itself

    on first use. iterator = dataset.make_one_shot_iterator() # The return value of get_next() matches the dataset element type. images, labels = iterator.get_next() train_op = model_and_optimizer(images, labels) # Loop until all elements have been consumed. try: while True: sess.run(train_op) except tf.errors.OutOfRangeError: pass
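
    One way to answer the earlier "how do I switch between training and validation data?" question is to parameterize the pipeline and use an initializable iterator, which can be re-initialized on different data. A minimal sketch using Dataset.range so it runs as-is; in practice the placeholder would typically carry a list of filenames:

      import tensorflow as tf

      # Parameterize the dataset with a placeholder so the same iterator can be
      # re-initialized on different data (e.g. training vs. validation inputs).
      max_value = tf.placeholder(tf.int64, shape=[])
      dataset = tf.data.Dataset.range(max_value)

      iterator = dataset.make_initializable_iterator()
      next_element = iterator.get_next()

      with tf.Session() as sess:
          # "Training" pass.
          sess.run(iterator.initializer, feed_dict={max_value: 10})
          try:
              while True:
                  sess.run(next_element)
          except tf.errors.OutOfRangeError:  # end of epoch, as on the slide above
              pass

          # "Validation" pass: re-initialize the same iterator on different data.
          sess.run(iterator.initializer, feed_dict={max_value: 5})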
  52. def input_fn(): dataset = … # A one-shot iterator automatically

    initializes itself on first use. iterator = dataset.make_one_shot_iterator() # The return value of get_next() matches the dataset element type. images, labels = iterator.get_next() return images, labels # The input_fn can be used as a regular Estimator input function. estimator = tf.estimator.Estimator(…) estimator.train(input_fn=input_fn, …)
  53. tf.data.Dataset Represents input pipeline using functional transformations tf.data.Iterator Provides sequential

    access to elements of a Dataset tf.data API
  54. Thank You Reach out at @sb2nov for questions