本文已参与「新人创作礼」活动，一起开启掘金创作之路。

tensorflow作为一个软件系统，必不可少的包含了很多所有软件系统共有的模块。这此模块如监控，配置，调试，分布式，状态存储，单元测试，是大多数软件都有的。只是实现方式不同。本文主要介绍这些围绕tensorflow的系统展开。至于tensorflow的核心功能，先不作详细展开。在了解了这此周边系统后，再看核心系统，会容易的多。本文代码样例来自tensorboard官网。 编辑在开始之间，先来一个最简单的手机数字识别模型，把核心功能串起来。并在后续代码中逐步加入各种指标监控来查看训练状态。没错，就是tensorflw官网入门教程里第一个例子。建议初学者要把这段程序背会。不需要理解为什么，因为tensorflow框架使用有这样固定的格式。再如学习flink, spark也是这个道理。

In [5]:


import tensorflow as tf
#直接使用kerasAPI加载mnist手写数字数据集
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
#构建模型
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])
#编译模型，指定了优化器，损失函数，和监控指标。
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
#训练模型
model.fit(x_train, y_train, epochs=5)
#在测试集上评估模型准确率
model.evaluate(x_test,  y_test, verbose=2)

#假如这是三个新数据，作出预测
x_new = x_test[:3]
y_new = y_test[:3]
y_pred = model.predict(x_new)
print("predict:", tf.argmax(y_pred,1))
print("real:", y_new)

Epoch 1/5
1875/1875 [==============================] - 2s 752us/step - loss: 0.4839 - accuracy: 0.8597
Epoch 2/5
1875/1875 [==============================] - 1s 726us/step - loss: 0.1447 - accuracy: 0.9568
Epoch 3/5
1875/1875 [==============================] - 1s 759us/step - loss: 0.1069 - accuracy: 0.9672
Epoch 4/5
1875/1875 [==============================] - 1s 643us/step - loss: 0.0909 - accuracy: 0.9721
Epoch 5/5
1875/1875 [==============================] - 1s 672us/step - loss: 0.0751 - accuracy: 0.9773
313/313 - 0s - loss: 0.0721 - accuracy: 0.9774
predict: tf.Tensor([7 2 1], shape=(3,), dtype=int64)
real: [7 2 1]

监控

在一个复杂系统中，监控是必不可少的。在系统运行过程中，需要实时进行状态监控，甚至状态干预。tensorflow提供了tf.summary模块，tf.metrics模块，配合以tensorboard可视化工具来完成系统监控。 ##tf.summary 此模块可以用来保存一些信息到文件，以方便进行数据分析，或者用tensorboard来进行可视化展示。 ##tf.metrics, tensorboard, keras tensorboard callbacks

#创建并设置metrics writer
logdir = "../data/summary/logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir + "/metrics")
file_writer.set_as_default()
#自定义的function
def lr_schedule(epoch):
  """
  Returns a custom learning rate that decreases as epochs progress.
  """
  learning_rate = 0.2
  if epoch > 10:
    learning_rate = 0.02
  if epoch > 20:
    learning_rate = 0.01
  if epoch > 50:
    learning_rate = 0.005
  #使用
  tf.summary.scalar('lr', data=learning_rate, step=epoch)
  return learning_rate
#创建callback
lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule)
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
#callback中加入
model.fit(x_train, y_train, epochs=15, callbacks=[tensorboard_callback, lr_callback])

tensorboard

In [7]:


import tensorflow as tf
import tensorflow.io as tfio
tfio.gfile.rmtree("../data/summary/")
tfio.gfile.mkdir("../data/summary")
tf.summary.trace_on(
    graph=True, profiler=False
)
#创建writer
writer = tf.summary.create_file_writer("../data/summary/logs")
with writer.as_default():
    for step in range(20):
        x = step
        y = x * x / 4 + x/8 + 1
        tf.summary.scalar("step1", y, x)
    tf.summary.text('first_text', 'hello world!', step=0)
    tf.summary.text('first_text', 'nice to meet you!', step=1)
    #直方图展示每个值出现的次数。
    tf.summary.histogram('histogram_of_weights', tf.constant([5,4,3,4,6,5,5,5,5,5, 8,8,8,8,8, 9,9,9,9,9]), step=20)
    writer.flush()

print(tfio.gfile.listdir("../data/summary/logs/"))

#%load_ext tensorboard
#%tensorboard --logdir=../data/summary/logs
#http://localhost:6006/#scalars

['events.out.tfevents.1627026138.HUOZAI.11284.1372.v2']

上边代码运行后，启动tensorboard可以打印出如图所示曲线编辑编辑

In [9]:


import tensorflow as tf
@tf.function
def my_func(x, y):
  # A simple hand-rolled layer.
  return tf.nn.relu(tf.matmul(x, y))

# Set up logging.
logdir = '../data/summary/trace'
writer = tf.summary.create_file_writer(logdir)

# Sample data for your function.
x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True, profiler=True)
# Call only one tf.function when tracing.
z = my_func(x, y)
with writer.as_default():
  tf.summary.trace_export(
      name="my_func_trace",
      step=0,
      profiler_outdir=logdir)
print(z)

tf.Tensor(
[[1.0760099  0.8215246  0.60553014]
 [1.350822   0.9441446  0.8681546 ]
 [1.5068175  0.98877287 1.1240329 ]], shape=(3, 3), dtype=float32)

In [13]:


from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
from packaging import version

import tensorflow as tf
from tensorflow import keras

import numpy as np

print("TensorFlow version: ", tf.__version__)
assert version.parse(tf.__version__).release[0] >= 2, \
    "This notebook requires TensorFlow 2.0 or above."
data_size = 1000
# 80% 的数据用来训练
train_pct = 0.8

train_size = int(data_size * train_pct)

# 创建在(-1,1)范围内的随机数作为输入
x = np.linspace(-1, 1, data_size)
np.random.shuffle(x)

# 生成输出数据
# y = 0.5x + 2 + noise
y = 0.5 * x + 2 + np.random.normal(0, 0.05, (data_size, ))

# 将数据分成训练和测试集
x_train, y_train = x[:train_size], y[:train_size]
x_test, y_test = x[train_size:], y[train_size:]

#创建并设置metrics writer
logdir = "../data/summary/logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir + "/metrics")
file_writer.set_as_default()

def lr_schedule(epoch):
  """
  Returns a custom learning rate that decreases as epochs progress.
  """
  learning_rate = 0.2
  if epoch > 10:
    learning_rate = 0.02
  if epoch > 20:
    learning_rate = 0.01
  if epoch > 50:
    learning_rate = 0.005
  #使用
  tf.summary.scalar('learning rate', data=learning_rate, step=epoch)
  return learning_rate

lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule)
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)

model = keras.models.Sequential([
    keras.layers.Dense(16, input_dim=1),
    keras.layers.Dense(1),
])

model.compile(
    loss='mse', # keras.losses.mean_squared_error
    optimizer=keras.optimizers.SGD(),
)

training_history = model.fit(
    x_train, # input
    y_train, # output
    batch_size=train_size,
    verbose=0, # Suppress chatty output; use Tensorboard instead
    epochs=100,
    validation_data=(x_test, y_test),
    callbacks=[tensorboard_callback, lr_callback],
)

TensorFlow version:  2.4.2

In [15]:

import tensorflow as tf
#创建并设置metrics writer
logdir = "../data/summary/logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir + "/metrics")
file_writer.set_as_default()
def lr_schedule(epoch):
  """
  Returns a custom learning rate that decreases as epochs progress.
  """
  learning_rate = 0.2
  if epoch > 10:
    learning_rate = 0.02
  if epoch > 20:
    learning_rate = 0.01
  if epoch > 50:
    learning_rate = 0.005
  #使用
  tf.summary.scalar('lr', data=learning_rate, step=epoch)
  return learning_rate
lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule)
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)

#直接使用kerasAPI加载mnist手写数字数据集
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
#构建模型
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])
#编译模型，指定了优化器，损失函数，和监控指标。
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
#训练模型
model.fit(x_train, y_train, epochs=15, callbacks=[tensorboard_callback, lr_callback])
#在测试集上评估模型准确率
model.evaluate(x_test,  y_test, verbose=2)

#假如这是三个新数据，作出预测
x_new = x_test[:3]
y_new = y_test[:3]
y_pred = model.predict(x_new)
print("predict:", tf.argmax(y_pred,1))
print("real:", y_new)

Epoch 1/15
1875/1875 [==============================] - 2s 884us/step - loss: 3.5876 - accuracy: 0.1808
Epoch 2/15
1875/1875 [==============================] - 2s 833us/step - loss: 2.2668 - accuracy: 0.1363
Epoch 3/15
1875/1875 [==============================] - 2s 951us/step - loss: 2.2424 - accuracy: 0.1384
Epoch 4/15
1875/1875 [==============================] - 2s 823us/step - loss: 2.2890 - accuracy: 0.1264
Epoch 5/15
1875/1875 [==============================] - 1s 674us/step - loss: 2.3162 - accuracy: 0.1292
Epoch 6/15
1875/1875 [==============================] - 1s 690us/step - loss: 2.3393 - accuracy: 0.1312
Epoch 7/15
1875/1875 [==============================] - 1s 674us/step - loss: 2.2469 - accuracy: 0.1381
Epoch 8/15
1875/1875 [==============================] - 1s 699us/step - loss: 2.2639 - accuracy: 0.1263
Epoch 9/15
1875/1875 [==============================] - 1s 685us/step - loss: 2.3123 - accuracy: 0.1103
Epoch 10/15
1875/1875 [==============================] - 1s 665us/step - loss: 2.3101 - accuracy: 0.1339
Epoch 11/15
1875/1875 [==============================] - 1s 664us/step - loss: 2.2541 - accuracy: 0.1303
Epoch 12/15
1875/1875 [==============================] - 1s 686us/step - loss: 2.2471 - accuracy: 0.1240
Epoch 13/15
1875/1875 [==============================] - 1s 693us/step - loss: 2.2307 - accuracy: 0.1329
Epoch 14/15
1875/1875 [==============================] - 1s 747us/step - loss: 2.2237 - accuracy: 0.1358
Epoch 15/15
1875/1875 [==============================] - 1s 750us/step - loss: 2.2190 - accuracy: 0.1347
313/313 - 0s - loss: 2.2072 - accuracy: 0.1521
predict: tf.Tensor([2 2 2], shape=(3,), dtype=int64)
real: [7 2 1]

图像数据

tf.summary.image(
    name, data, step=None, max_outputs=3, description=None
)

In [21]:


from datetime import datetime
import io
import itertools
from packaging import version
from six.moves import range

import tensorflow as tf
from tensorflow import keras

import matplotlib.pyplot as plt
import numpy as np
import sklearn.metrics

print("TensorFlow version: ", tf.__version__)
assert version.parse(tf.__version__).release[0] >= 2, \
    "This notebook requires TensorFlow 2.0 or above."
w = tf.summary.create_file_writer('../data/summary/logs')
with w.as_default():
  image1 = tf.random.uniform(shape=[8, 8, 1])
  image2 = tf.random.uniform(shape=[8, 8, 1])
  tf.summary.image("grayscale_noise", [image1, image2], step=0)


# Download the data. The data is already divided into train and test.
# The labels are integers representing classes.
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = \
    fashion_mnist.load_data()

# Names of the integer classes, i.e., 0 -> T-short/top, 1 -> Trouser, etc.
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print("Shape: ", train_images[0].shape)
print("Label: ", train_labels[0], "->", class_names[train_labels[0]])
# Reshape the image for the Summary API.
img = np.reshape(train_images[0], (-1, 28, 28, 1))

# Sets up a timestamped log directory.
logdir = "../data/summary/logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")
# Creates a file writer for the log directory.
file_writer = tf.summary.create_file_writer(logdir)

# Using the file writer, log the reshaped image.
with file_writer.as_default():
  tf.summary.image("Training data", img, step=0)
with file_writer.as_default():
  # Don't forget to reshape.
  images = np.reshape(train_images[0:25], (-1, 28, 28, 1))
  tf.summary.image("25 training data examples", images, max_outputs=25, step=0)




logdir = "../data/summary/logs" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir)

def plot_to_image(figure):
  """Converts the matplotlib plot specified by 'figure' to a PNG image and
  returns it. The supplied figure is closed and inaccessible after this call."""
  # Save the plot to a PNG in memory.
  buf = io.BytesIO()
  plt.savefig(buf, format='png')
  # Closing the figure prevents it from being displayed directly inside
  # the notebook.
  plt.close(figure)
  buf.seek(0)
  # Convert PNG buffer to TF image
  image = tf.image.decode_png(buf.getvalue(), channels=4)
  # Add the batch dimension
  image = tf.expand_dims(image, 0)
  return image

def image_grid():
  """Return a 5x5 grid of the MNIST images as a matplotlib figure."""
  # Create a figure to contain the plot.
  figure = plt.figure(figsize=(10,10))
  for i in range(25):
    # Start next subplot.
    plt.subplot(5, 5, i + 1, title=class_names[train_labels[i]])
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)

  return figure

# Prepare the plot
figure = image_grid()
# Convert to image and log
with file_writer.as_default():
  tf.summary.image("Training data", plot_to_image(figure), step=0)

TensorFlow version:  2.4.2
Shape:  (28, 28)
Label:  9 -> Ankle boot

用tensorboard进行超参数调优

In [24]:

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp
fashion_mnist = tf.keras.datasets.fashion_mnist

(x_train, y_train),(x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
HP_NUM_UNITS = hp.HParam('num_units', hp.Discrete([16, 32]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.1, 0.2))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd']))

METRIC_ACCURACY = 'accuracy'

with tf.summary.create_file_writer('../data/summary/logs/').as_default():
  hp.hparams_config(
    hparams=[HP_NUM_UNITS, HP_DROPOUT, HP_OPTIMIZER],
    metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
  )

def train_test_model(hparams):
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(hparams[HP_NUM_UNITS], activation=tf.nn.relu),
    tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
  ])
  model.compile(
      optimizer=hparams[HP_OPTIMIZER],
      loss='sparse_categorical_crossentropy',
      metrics=['accuracy'],
  )

  model.fit(x_train, y_train, epochs=1) # Run with 1 epoch to speed things up for demo purposes
  _, accuracy = model.evaluate(x_test, y_test)
  return accuracy

def run(run_dir, hparams):
  with tf.summary.create_file_writer(run_dir).as_default():
    hp.hparams(hparams)  # record the values used in this trial
    accuracy = train_test_model(hparams)
    tf.summary.scalar(METRIC_ACCURACY, accuracy, step=1)

session_num = 0

for num_units in HP_NUM_UNITS.domain.values:
  for dropout_rate in (HP_DROPOUT.domain.min_value, HP_DROPOUT.domain.max_value):
    for optimizer in HP_OPTIMIZER.domain.values:
      hparams = {
          HP_NUM_UNITS: num_units,
          HP_DROPOUT: dropout_rate,
          HP_OPTIMIZER: optimizer,
      }
      run_name = "run-%d" % session_num
      print('--- Starting trial: %s' % run_name)
      print({h.name: hparams[h] for h in hparams})
      run('../data/summary/logs/' + run_name, hparams)
      session_num += 1

--- Starting trial: run-0
{'num_units': 16, 'dropout': 0.1, 'optimizer': 'adam'}
1875/1875 [==============================] - 2s 690us/step - loss: 0.9210 - accuracy: 0.6806
313/313 [==============================] - 0s 518us/step - loss: 0.5077 - accuracy: 0.8251
--- Starting trial: run-1
{'num_units': 16, 'dropout': 0.1, 'optimizer': 'sgd'}
1875/1875 [==============================] - 1s 587us/step - loss: 1.3483 - accuracy: 0.5347
313/313 [==============================] - 0s 511us/step - loss: 0.6940 - accuracy: 0.7591
--- Starting trial: run-2
{'num_units': 16, 'dropout': 0.2, 'optimizer': 'adam'}
1875/1875 [==============================] - 2s 769us/step - loss: 1.1718 - accuracy: 0.5709
313/313 [==============================] - 0s 531us/step - loss: 0.5638 - accuracy: 0.8052
--- Starting trial: run-3
{'num_units': 16, 'dropout': 0.2, 'optimizer': 'sgd'}
1875/1875 [==============================] - 1s 628us/step - loss: 1.3929 - accuracy: 0.5049
313/313 [==============================] - 0s 547us/step - loss: 0.6941 - accuracy: 0.7673
--- Starting trial: run-4
{'num_units': 32, 'dropout': 0.1, 'optimizer': 'adam'}
1875/1875 [==============================] - 2s 695us/step - loss: 0.8180 - accuracy: 0.7159
313/313 [==============================] - 0s 492us/step - loss: 0.4628 - accuracy: 0.8344
--- Starting trial: run-5
{'num_units': 32, 'dropout': 0.1, 'optimizer': 'sgd'}
1875/1875 [==============================] - 1s 625us/step - loss: 1.1590 - accuracy: 0.6134
313/313 [==============================] - 0s 515us/step - loss: 0.6200 - accuracy: 0.7923
--- Starting trial: run-6
{'num_units': 32, 'dropout': 0.2, 'optimizer': 'adam'}
1875/1875 [==============================] - 1s 602us/step - loss: 0.8887 - accuracy: 0.6941
313/313 [==============================] - 0s 534us/step - loss: 0.4889 - accuracy: 0.8226
--- Starting trial: run-7
{'num_units': 32, 'dropout': 0.2, 'optimizer': 'sgd'}
1875/1875 [==============================] - 1s 696us/step - loss: 1.2020 - accuracy: 0.5845
313/313 [==============================] - 0s 486us/step - loss: 0.6257 - accuracy: 0.7844

embedding展示

In [27]:

import os
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorboard.plugins import projector
(train_data, test_data), info = tfds.load(
    "imdb_reviews/subwords8k",
    split=(tfds.Split.TRAIN, tfds.Split.TEST),
    with_info=True,
    as_supervised=True,
)
encoder = info.features["text"].encoder

# Shuffle and pad the data.
train_batches = train_data.shuffle(1000).padded_batch(
    10, padded_shapes=((None,), ())
)
test_batches = test_data.shuffle(1000).padded_batch(
    10, padded_shapes=((None,), ())
)
train_batch, train_labels = next(iter(train_batches))
# Create an embedding layer.
embedding_dim = 16
embedding = tf.keras.layers.Embedding(encoder.vocab_size, embedding_dim)
# Configure the embedding layer as part of a keras model.
model = tf.keras.Sequential(
    [
        embedding, # The embedding layer should be the first layer in a model.
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ]
)

# Compile model.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Train model for one epoch.
history = model.fit(
    train_batches, epochs=1, validation_data=test_batches, validation_steps=20
)
# Set up a logs directory, so Tensorboard knows where to look for files.
log_dir='../data/summary/logs/'
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

# Save Labels separately on a line-by-line manner.
with open(os.path.join(log_dir, 'metadata.tsv'), "w") as f:
  for subwords in encoder.subwords:
    f.write("{}\n".format(subwords))
  # Fill in the rest of the labels with "unknown".
  for unknown in range(1, encoder.vocab_size - len(encoder.subwords)):
    f.write("unknown #{}\n".format(unknown))


# Save the weights we want to analyze as a variable. Note that the first
# value represents any unknown word, which is not in the metadata, here
# we will remove this value.
weights = tf.Variable(model.layers[0].get_weights()[0][1:])
# Create a checkpoint from embedding, the filename and key are the
# name of the tensor.
checkpoint = tf.train.Checkpoint(embedding=weights)
checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

# Set up config.
config = projector.ProjectorConfig()
embedding = config.embeddings.add()
# The name of the tensor will be suffixed by `/.ATTRIBUTES/VARIABLE_VALUE`.
embedding.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
embedding.metadata_path = 'metadata.tsv'
projector.visualize_embeddings(log_dir, config)
WARNING:absl:TFDS datasets with text encoding are deprecated and will be removed in a future version. Instead, you should use the plain text version and tokenize the text using `tensorflow_text` (See: https://www.tensorflow.org/tutorials/tensorflow_text/intro#tfdata_example)
ERROR:absl:Failed to construct dataset imdb_reviews

本文代码来自tensorboard官网样例。tensorboard还提供了what-if工具，公平性指标，性能剖析工具，debug NaN问题工具需要时可以自由使用。如果想自己做可视化工具，还可以通过直接获取数据的方式，自己绘制图形。

import tensorboard as tb
experiment_id = "c1KCv3X3QvGwaXfgX1c4tg"
experiment = tb.data.experimental.ExperimentFromDev(experiment_id)
df = experiment.get_scalars()
df

In [30]:


#性能分析
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
from packaging import version

import functools
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.python.keras import backend
from tensorflow.python.keras import layers

import numpy as np

print("TensorFlow version: ", tf.__version__)
BATCH_NORM_DECAY = 0.997
BATCH_NORM_EPSILON = 1e-5
L2_WEIGHT_DECAY = 2e-4


def identity_building_block(input_tensor,
                            kernel_size,
                            filters,
                            stage,
                            block,
                            training=None):

  """标识块是一种在捷径上没有卷积层的块。

  参数：
    input_tensor：输入张量
    kernel_size：默认为3，内核大小为
        主路径上的中间卷积层
    过滤器：整数列表，主路径上3个卷积层的过滤器
    stage：整数，当前阶段标签，用于生成层名称
    block：当前块标签，用于生成层名称
    training：仅在使用 Estimator 训练 keras 模型时使用。 在其他情况下，它是自动处理的。

  返回值：
    输出块的张量。
  """
  filters1, filters2 = filters
  if tf.keras.backend.image_data_format() == 'channels_last':
    bn_axis = 3
  else:
    bn_axis = 1
  conv_name_base = 'res' + str(stage) + block + '_branch'
  bn_name_base = 'bn' + str(stage) + block + '_branch'

  x = tf.keras.layers.Conv2D(filters1, kernel_size,
                             padding='same',
                             kernel_initializer='he_normal',
                             kernel_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             bias_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             name=conv_name_base + '2a')(input_tensor)
  x = tf.keras.layers.BatchNormalization(axis=bn_axis,
                                         name=bn_name_base + '2a',
                                         momentum=BATCH_NORM_DECAY,
                                         epsilon=BATCH_NORM_EPSILON)(
                                             x, training=training)
  x = tf.keras.layers.Activation('relu')(x)

  x = tf.keras.layers.Conv2D(filters2, kernel_size,
                             padding='same',
                             kernel_initializer='he_normal',
                             kernel_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             bias_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             name=conv_name_base + '2b')(x)
  x = tf.keras.layers.BatchNormalization(axis=bn_axis,
                                         name=bn_name_base + '2b',
                                         momentum=BATCH_NORM_DECAY,
                                         epsilon=BATCH_NORM_EPSILON)(
                                             x, training=training)

  x = tf.keras.layers.add([x, input_tensor])
  x = tf.keras.layers.Activation('relu')(x)
  return x


def conv_building_block(input_tensor,
                        kernel_size,
                        filters,
                        stage,
                        block,
                        strides=(2, 2),
                        training=None):
  """在捷径中具有卷积层的块。

  参数：
    input_tensor：输入张量
    kernel_size：默认为3，内核大小为
        主路径上的中间卷积层
    filters：整数列表，主路径上3个卷积层的过滤器
    stage：整数，当前阶段标签，用于生成层名称
    block：当前块标签，用于生成层名称
    training：仅在使用 Estimator 训练 keras 模型时使用。在其他情况下，它是自动处理的。

  返回值：
    输出块的张量。

  请注意，从第3阶段开始，
  主路径上的第一个卷积层的步长=（2，2）
  而且捷径的步长=（2，2）
  """
  filters1, filters2 = filters
  if tf.keras.backend.image_data_format() == 'channels_last':
    bn_axis = 3
  else:
    bn_axis = 1
  conv_name_base = 'res' + str(stage) + block + '_branch'
  bn_name_base = 'bn' + str(stage) + block + '_branch'

  x = tf.keras.layers.Conv2D(filters1, kernel_size, strides=strides,
                             padding='same',
                             kernel_initializer='he_normal',
                             kernel_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             bias_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             name=conv_name_base + '2a')(input_tensor)
  x = tf.keras.layers.BatchNormalization(axis=bn_axis,
                                         name=bn_name_base + '2a',
                                         momentum=BATCH_NORM_DECAY,
                                         epsilon=BATCH_NORM_EPSILON)(
                                             x, training=training)
  x = tf.keras.layers.Activation('relu')(x)

  x = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same',
                             kernel_initializer='he_normal',
                             kernel_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             bias_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             name=conv_name_base + '2b')(x)
  x = tf.keras.layers.BatchNormalization(axis=bn_axis,
                                         name=bn_name_base + '2b',
                                         momentum=BATCH_NORM_DECAY,
                                         epsilon=BATCH_NORM_EPSILON)(
                                             x, training=training)

  shortcut = tf.keras.layers.Conv2D(filters2, (1, 1), strides=strides,
                                    kernel_initializer='he_normal',
                                    kernel_regularizer=
                                    tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                                    bias_regularizer=
                                    tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                                    name=conv_name_base + '1')(input_tensor)
  shortcut = tf.keras.layers.BatchNormalization(
      axis=bn_axis, name=bn_name_base + '1',
      momentum=BATCH_NORM_DECAY, epsilon=BATCH_NORM_EPSILON)(
          shortcut, training=training)

  x = tf.keras.layers.add([x, shortcut])
  x = tf.keras.layers.Activation('relu')(x)
  return x


def resnet_block(input_tensor,
                 size,
                 kernel_size,
                 filters,
                 stage,
                 conv_strides=(2, 2),
                 training=None):
  """一个应用层后跟多个标识块的块。

  参数：
    input_tensor：输入张量
    size：整数，构成转化卷积/身份块的数量。
    一个卷积层使用后，再跟（size-1）个身份块。
    kernel_size：默认为3，内核大小为
        主路径上的中间卷积层
    filters：整数列表，主路径上3个卷积层的过滤器
    stage：整数，当前阶段标签，用于生成层名称
    conv_strides：块中第一个卷积层的步长。
    training：仅在使用 Estimator 训练 keras 模型时使用。其他情况它会自动处理。  

  返回值：
    应用层和身份块后的输出张量。
  """

  x = conv_building_block(input_tensor, kernel_size, filters, stage=stage,
                          strides=conv_strides, block='block_0',
                          training=training)
  for i in range(size - 1):
    x = identity_building_block(x, kernel_size, filters, stage=stage,
                                block='block_%d' % (i + 1), training=training)
  return x

def resnet(num_blocks, classes=10, training=None):
  """实例化ResNet体系结构。

  参数：
    num_blocks：整数，每个块中的卷积/身份块的数量。
      ResNet 包含3个块，每个块包含一个卷积块
      后面跟着(layers_per_block - 1) 个身份块数。 每
      卷积/理想度块具有2个卷积层。 用输入
      卷积层和池化层至最后，这带来了
      网络的总大小为（6 * num_blocks + 2）
    classes：将图像分类为的可选类数
    training：仅在使用 Estimator 训练 keras 模型时使用。其他情况下它会自动处理。

  返回值：
    Keras模型实例。
  """

  input_shape = (32, 32, 3)
  img_input = layers.Input(shape=input_shape)

  if backend.image_data_format() == 'channels_first':
    x = layers.Lambda(lambda x: backend.permute_dimensions(x, (0, 3, 1, 2)),
                      name='transpose')(img_input)
    bn_axis = 1
  else:  # channel_last
    x = img_input
    bn_axis = 3

  x = tf.keras.layers.ZeroPadding2D(padding=(1, 1), name='conv1_pad')(x)
  x = tf.keras.layers.Conv2D(16, (3, 3),
                             strides=(1, 1),
                             padding='valid',
                             kernel_initializer='he_normal',
                             kernel_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             bias_regularizer=
                             tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                             name='conv1')(x)
  x = tf.keras.layers.BatchNormalization(axis=bn_axis, name='bn_conv1',
                                         momentum=BATCH_NORM_DECAY,
                                         epsilon=BATCH_NORM_EPSILON)(
                                             x, training=training)
  x = tf.keras.layers.Activation('relu')(x)

  x = resnet_block(x, size=num_blocks, kernel_size=3, filters=[16, 16],
                   stage=2, conv_strides=(1, 1), training=training)

  x = resnet_block(x, size=num_blocks, kernel_size=3, filters=[32, 32],
                   stage=3, conv_strides=(2, 2), training=training)

  x = resnet_block(x, size=num_blocks, kernel_size=3, filters=[64, 64],
                   stage=4, conv_strides=(2, 2), training=training)

  x = tf.keras.layers.GlobalAveragePooling2D(name='avg_pool')(x)
  x = tf.keras.layers.Dense(classes, activation='softmax',
                            kernel_initializer='he_normal',
                            kernel_regularizer=
                            tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                            bias_regularizer=
                            tf.keras.regularizers.l2(L2_WEIGHT_DECAY),
                            name='fc10')(x)

  inputs = img_input
  # 创建模型
  model = tf.keras.models.Model(inputs, x, name='resnet56')

  return model


resnet20 = functools.partial(resnet, num_blocks=3)
resnet32 = functools.partial(resnet, num_blocks=5)
resnet56 = functools.partial(resnet, num_blocks=9)
resnet110 = functools.partial(resnet, num_blocks=18)
cifar_builder = tfds.builder('cifar10')
cifar_builder.download_and_prepare()




HEIGHT = 32
WIDTH = 32
NUM_CHANNELS = 3
NUM_CLASSES = 10
BATCH_SIZE = 128

def preprocess_data(record):
  image = record['image']
  label = record['label']

  # 调整图像大小以在每侧增加四个额外的像素。
  image = tf.image.resize_with_crop_or_pad(
      image, HEIGHT + 8, WIDTH + 8)

  # 随机裁剪图像的 [HEIGHT，WIDTH] 部分。
  image = tf.image.random_crop(image, [HEIGHT, WIDTH, NUM_CHANNELS])

  # 随机水平翻转图像。
  image = tf.image.random_flip_left_right(image)

  # 减去均值并除以像素方差。
  image = tf.image.per_image_standardization(image)

  label = tf.compat.v1.sparse_to_dense(label, (NUM_CLASSES,), 1)
  return image, label

train_data = cifar_builder.as_dataset(split=tfds.Split.TRAIN)
train_data = train_data.repeat()
train_data = train_data.map(
    lambda value: preprocess_data(value))
train_data = train_data.shuffle(1024)

train_data = train_data.batch(BATCH_SIZE)

model = resnet56(classes=NUM_CLASSES)

model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])


log_dir="../data/summary/logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, profile_batch = 3)
model.fit(train_data,
          steps_per_epoch=20,
          epochs=5, 
          callbacks=[tensorboard_callback])
TensorFlow version:  2.4.2

tf.metrics

Classes class AUC: Approximates the AUC (Area under the curve) of the ROC or PR curves.

class Accuracy: Calculates how often predictions equal labels.

class BinaryAccuracy: Calculates how often predictions match binary labels.

class BinaryCrossentropy: Computes the crossentropy metric between the labels and predictions.

class CategoricalAccuracy: Calculates how often predictions match one-hot labels.

class CategoricalCrossentropy: Computes the crossentropy metric between the labels and predictions.

class CategoricalHinge: Computes the categorical hinge metric between y_true and y_pred.

class CosineSimilarity: Computes the cosine similarity between the labels and predictions.

class FalseNegatives: Calculates the number of false negatives.

class FalsePositives: Calculates the number of false positives.

class Hinge: Computes the hinge metric between y_true and y_pred.

class KLDivergence: Computes Kullback-Leibler divergence metric between y_true and y_pred.

class LogCoshError: Computes the logarithm of the hyperbolic cosine of the prediction error.

class Mean: Computes the (weighted) mean of the given values.

class MeanAbsoluteError: Computes the mean absolute error between the labels and predictions.

class MeanAbsolutePercentageError: Computes the mean absolute percentage error between y_true and y_pred.

class MeanIoU: Computes the mean Intersection-Over-Union metric.

class MeanMetricWrapper: Wraps a stateless metric function with the Mean metric.

class MeanRelativeError: Computes the mean relative error by normalizing with the given values.

class MeanSquaredError: Computes the mean squared error between y_true and y_pred.

class MeanSquaredLogarithmicError: Computes the mean squared logarithmic error between y_true and y_pred.

class MeanTensor: Computes the element-wise (weighted) mean of the given tensors.

class Metric: Encapsulates metric logic and state.

class Poisson: Computes the Poisson metric between y_true and y_pred.

class Precision: Computes the precision of the predictions with respect to the labels.

class PrecisionAtRecall: Computes best precision where recall is >= specified value.

class Recall: Computes the recall of the predictions with respect to the labels.

class RecallAtPrecision: Computes best recall where precision is >= specified value.

class RootMeanSquaredError: Computes root mean squared error metric between y_true and y_pred.

class SensitivityAtSpecificity: Computes best sensitivity where specificity is >= specified value.

class SparseCategoricalAccuracy: Calculates how often predictions match integer labels.

class SparseCategoricalCrossentropy: Computes the crossentropy metric between the labels and predictions.

class SparseTopKCategoricalAccuracy: Computes how often integer targets are in the top K predictions.

class SpecificityAtSensitivity: Computes best specificity where sensitivity is >= specified value.

class SquaredHinge: Computes the squared hinge metric between y_true and y_pred.

class Sum: Computes the (weighted) sum of the given values.

class TopKCategoricalAccuracy: Computes how often targets are in the top K predictions.

class TrueNegatives: Calculates the number of true negatives.

class TruePositives: Calculates the number of true positives.

Functions KLD(...): Computes Kullback-Leibler divergence loss between y_true and y_pred.

MAE(...): Computes the mean absolute error between labels and predictions.

MAPE(...): Computes the mean absolute percentage error between y_true and y_pred.

MSE(...): Computes the mean squared error between labels and predictions.

MSLE(...): Computes the mean squared logarithmic error between y_true and y_pred.

binary_accuracy(...): Calculates how often predictions match binary labels.

binary_crossentropy(...): Computes the binary crossentropy loss.

categorical_accuracy(...): Calculates how often predictions match one-hot labels.

categorical_crossentropy(...): Computes the categorical crossentropy loss.

deserialize(...): Deserializes a serialized metric class/function instance.

get(...): Retrieves a Keras metric as a function/Metric class instance.

hinge(...): Computes the hinge loss between y_true and y_pred.

kl_divergence(...): Computes Kullback-Leibler divergence loss between y_true and y_pred.

kld(...): Computes Kullback-Leibler divergence loss between y_true and y_pred.

kullback_leibler_divergence(...): Computes Kullback-Leibler divergence loss between y_true and y_pred.

log_cosh(...): Logarithm of the hyperbolic cosine of the prediction error.

logcosh(...): Logarithm of the hyperbolic cosine of the prediction error.

mae(...): Computes the mean absolute error between labels and predictions.

mape(...): Computes the mean absolute percentage error between y_true and y_pred.

mean_absolute_error(...): Computes the mean absolute error between labels and predictions.

mean_absolute_percentage_error(...): Computes the mean absolute percentage error between y_true and y_pred.

mean_squared_error(...): Computes the mean squared error between labels and predictions.

mean_squared_logarithmic_error(...): Computes the mean squared logarithmic error between y_true and y_pred.

mse(...): Computes the mean squared error between labels and predictions.

msle(...): Computes the mean squared logarithmic error between y_true and y_pred.

poisson(...): Computes the Poisson loss between y_true and y_pred.

serialize(...): Serializes metric function or Metric instance.

sparse_categorical_accuracy(...): Calculates how often predictions match integer labels.

sparse_categorical_crossentropy(...): Computes the sparse categorical crossentropy loss.

sparse_top_k_categorical_accuracy(...): Computes how often integer targets are in the top K predictions.

squared_hinge(...): Computes the squared hinge loss between y_true and y_pred.

top_k_categorical_accuracy(...): Computes how often targets are in the top K predictions.

tensorflow周边系统--tensorboard & metrics & summary

监控

tensorboard

图像数据

用tensorboard进行超参数调优

embedding展示

tf.metrics