Implementing Unit Tests · TensorFlow Machine Learning Cookbook, Second Edition (Chinese Edition)

# Implementing Unit Tests

Testing code speeds up prototyping, makes debugging more efficient, lets us make changes faster, and makes it easier to share code. TensorFlow offers several simple ways to implement unit tests, and we cover them in this recipe.

## Getting ready

When writing a TensorFlow model, it helps to have unit tests that check the program's functionality. When we later change a unit of the program, the tests ensure that the change does not break the model in unknown ways. In this recipe, we create a simple CNN that relies on the `MNIST` data. With it, we implement three different types of unit tests to illustrate how to write them in TensorFlow.

> Note that Python has a great testing library called Nose. TensorFlow also has built-in testing functionality, which we look at here, that makes it easier to test the values of Tensor objects without having to evaluate them in a session.

1. First, we need to load the necessary libraries and format the data, as follows:

```py
import sys
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Start a graph session
sess = tf.Session()

# Load data
data_dir = 'temp'
mnist = tf.keras.datasets.mnist
(train_xdata, train_labels), (test_xdata, test_labels) = mnist.load_data()
train_xdata = train_xdata / 255.0
test_xdata = test_xdata / 255.0

# Set model parameters
batch_size = 100
learning_rate = 0.005
evaluation_size = 100
image_width = train_xdata[0].shape[0]
image_height = train_xdata[0].shape[1]
target_size = max(train_labels) + 1
num_channels = 1  # greyscale = 1 channel
generations = 100
eval_every = 5
conv1_features = 25
conv2_features = 50
max_pool_size1 = 2  # NxN window for 1st max pool layer
max_pool_size2 = 2  # NxN window for 2nd max pool layer
fully_connected_size1 = 100
dropout_prob = 0.75
```

2. Then, we need to declare our placeholders, variables, and model formula, as follows:

```py
# Declare model placeholders
x_input_shape = (batch_size, image_width, image_height, num_channels)
x_input = tf.placeholder(tf.float32, shape=x_input_shape)
y_target = tf.placeholder(tf.int32, shape=(batch_size))
eval_input_shape = (evaluation_size, image_width, image_height, num_channels)
eval_input = tf.placeholder(tf.float32, shape=eval_input_shape)
eval_target = tf.placeholder(tf.int32, shape=(evaluation_size))
dropout = tf.placeholder(tf.float32, shape=())

# Declare model parameters
conv1_weight = tf.Variable(tf.truncated_normal([4, 4, num_channels, conv1_features],
                                               stddev=0.1, dtype=tf.float32))
conv1_bias = tf.Variable(tf.zeros([conv1_features], dtype=tf.float32))
conv2_weight = tf.Variable(tf.truncated_normal([4, 4, conv1_features, conv2_features],
                                               stddev=0.1, dtype=tf.float32))
conv2_bias = tf.Variable(tf.zeros([conv2_features], dtype=tf.float32))

# fully connected variables
resulting_width = image_width // (max_pool_size1 * max_pool_size2)
resulting_height = image_height // (max_pool_size1 * max_pool_size2)
full1_input_size = resulting_width * resulting_height * conv2_features
full1_weight = tf.Variable(tf.truncated_normal([full1_input_size, fully_connected_size1],
                                               stddev=0.1, dtype=tf.float32))
full1_bias = tf.Variable(tf.truncated_normal([fully_connected_size1],
                                             stddev=0.1, dtype=tf.float32))
full2_weight = tf.Variable(tf.truncated_normal([fully_connected_size1, target_size],
                                               stddev=0.1, dtype=tf.float32))
full2_bias = tf.Variable(tf.truncated_normal([target_size], stddev=0.1, dtype=tf.float32))


# Initialize Model Operations
def my_conv_net(input_data):
    # First Conv-ReLU-MaxPool Layer
    conv1 = tf.nn.conv2d(input_data, conv1_weight, strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_bias))
    max_pool1 = tf.nn.max_pool(relu1, ksize=[1, max_pool_size1, max_pool_size1, 1],
                               strides=[1, max_pool_size1, max_pool_size1, 1], padding='SAME')

    # Second Conv-ReLU-MaxPool Layer
    conv2 = tf.nn.conv2d(max_pool1, conv2_weight, strides=[1, 1, 1, 1], padding='SAME')
    relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_bias))
    max_pool2 = tf.nn.max_pool(relu2, ksize=[1, max_pool_size2, max_pool_size2, 1],
                               strides=[1, max_pool_size2, max_pool_size2, 1], padding='SAME')

    # Transform Output into a 1xN layer for next fully connected layer
    final_conv_shape = max_pool2.get_shape().as_list()
    final_shape = final_conv_shape[1] * final_conv_shape[2] * final_conv_shape[3]
    flat_output = tf.reshape(max_pool2, [final_conv_shape[0], final_shape])

    # First Fully Connected Layer
    fully_connected1 = tf.nn.relu(tf.add(tf.matmul(flat_output, full1_weight), full1_bias))

    # Second Fully Connected Layer
    final_model_output = tf.add(tf.matmul(fully_connected1, full2_weight), full2_bias)

    # Add dropout
    final_model_output = tf.nn.dropout(final_model_output, dropout)
    return final_model_output


model_output = my_conv_net(x_input)
test_model_output = my_conv_net(eval_input)
```

3. Next, we create our loss function along with our prediction and accuracy operations. Then, we initialize the model variables, as follows:

```py
# Declare Loss Function (softmax cross entropy)
# The logits and labels must be passed as keyword arguments
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=model_output, labels=y_target))

# Create a prediction function
prediction = tf.nn.softmax(model_output)
test_prediction = tf.nn.softmax(test_model_output)


# Create accuracy function
def get_accuracy(logits, targets):
    batch_predictions = np.argmax(logits, axis=1)
    num_correct = np.sum(np.equal(batch_predictions, targets))
    return 100. * num_correct / batch_predictions.shape[0]


# Create an optimizer
my_optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9)
train_step = my_optimizer.minimize(loss)

# Initialize Variables
init = tf.global_variables_initializer()
sess.run(init)
```

4. For our first unit test, we use the class `tf.test.TestCase` and create a method that tests the value of a placeholder (or variable). For this test case, we make sure that the dropout probability (the probability of keeping a unit) is greater than `0.25`, so that the model is never changed to train with more than 75% dropout. Test methods must have names that begin with `test`; otherwise the test runner does not collect them. A placeholder holds no value until one is fed, so the test feeds `dropout_prob` through the session created in step 1, as follows:

```py
# Check values of tensors!
class DropOutTest(tf.test.TestCase):
    # Make sure that we don't drop too much
    def test_dropout_greaterthan(self):
        # Feed the configured keep probability into the placeholder and read it back
        keep_prob = sess.run(dropout, feed_dict={dropout: dropout_prob})
        self.assertGreater(keep_prob, 0.25)
```

5. Next, we need to test that our `get_accuracy` function behaves as expected. To do so, we create a sample array of probabilities and the targets we expect, and then make sure the test accuracy returns 100%. Because `get_accuracy` works on NumPy arrays and returns a plain number, we compare its result directly, as follows:

```py
# Test accuracy function
class AccuracyTest(tf.test.TestCase):
    # Make sure accuracy function behaves correctly
    def test_accuracy_exact(self):
        with self.test_session():
            test_preds = [[0.9, 0.1], [0.01, 0.99]]
            test_targets = [0, 1]
            test_acc = get_accuracy(test_preds, test_targets)
            # get_accuracy returns a NumPy number, not a tensor, so no eval() is needed
            self.assertEqual(test_acc, 100.)
```

6. We can also make sure that a Tensor object has the shape we expect. To test that the model output has the expected shape of `batch_size` by `target_size`, enter the following code:

```py
# Test tensorshape
class ShapeTest(tf.test.TestCase):
    # Make sure our model output is size [batch_size, num_classes]
    def test_output_shape(self):
        with self.test_session():
            numpy_array = np.ones([batch_size, target_size])
            self.assertShapeEqual(numpy_array, model_output)
```

7. Now we need a `main()` function in our script to tell TensorFlow which application we are running. The script is as follows:

```py
def main(argv):
    # Start training loop
    train_loss = []
    train_acc = []
    test_acc = []
    for i in range(generations):
        rand_index = np.random.choice(len(train_xdata), size=batch_size)
        rand_x = train_xdata[rand_index]
        rand_x = np.expand_dims(rand_x, 3)
        rand_y = train_labels[rand_index]
        train_dict = {x_input: rand_x, y_target: rand_y, dropout: dropout_prob}

        sess.run(train_step, feed_dict=train_dict)
        temp_train_loss, temp_train_preds = sess.run([loss, prediction], feed_dict=train_dict)
        temp_train_acc = get_accuracy(temp_train_preds, rand_y)

        if (i + 1) % eval_every == 0:
            eval_index = np.random.choice(len(test_xdata), size=evaluation_size)
            eval_x = test_xdata[eval_index]
            eval_x = np.expand_dims(eval_x, 3)
            eval_y = test_labels[eval_index]
            test_dict = {eval_input: eval_x, eval_target: eval_y, dropout: 1.0}
            test_preds = sess.run(test_prediction, feed_dict=test_dict)
            temp_test_acc = get_accuracy(test_preds, eval_y)

            # Record and print results
            train_loss.append(temp_train_loss)
            train_acc.append(temp_train_acc)
            test_acc.append(temp_test_acc)
            acc_and_loss = [(i + 1), temp_train_loss, temp_train_acc, temp_test_acc]
            acc_and_loss = [np.round(x, 2) for x in acc_and_loss]
            print('Generation # {}. Train Loss: {:.2f}. Train Acc (Test Acc): {:.2f} ({:.2f})'.format(*acc_and_loss))
```

8. To have our script either run the tests or train the model, we need to call it differently from the command line. The following snippet is the main program code. If the program receives the argument `test`, it executes the tests; otherwise, it runs the training:

```py
if __name__ == '__main__':
    cmd_args = sys.argv
    if len(cmd_args) > 1 and cmd_args[1] == 'test':
        # Perform unit-tests
        tf.test.main(argv=cmd_args[1:])
    else:
        # Run the TensorFlow app
        tf.app.run(main=None, argv=cmd_args)
```

9. If we run the program with the `test` argument on the command line, we should get the following output:

```py
$ python3 implementing_unit_tests.py test
...
----------------------------------------------------------------------
Ran 3 tests in 0.001s

OK
```

The complete program described in the preceding steps can be found in the book's GitHub repository at [https://github.com/nfmcclure/tensorflow_cookbook/](https://github.com/nfmcclure/tensorflow_cookbook/) and in the Packt repository at [https://github.com/PacktPublishing/TensorFlow-Machine-Learning-Cookbook-Second-Edition](https://github.com/PacktPublishing/TensorFlow-Machine-Learning-Cookbook-Second-Edition).

## How it works

In this section, we implemented three types of unit tests: tensor values, operation outputs, and tensor shapes. TensorFlow has more types of unit test functions, which can be found here: [https://www.tensorflow.org/versions/master/api_docs/python/test.html](https://www.tensorflow.org/versions/master/api_docs/python/test.html). Remember that unit tests help ensure that code runs as expected, give us confidence when sharing code, and make reproducibility more attainable.
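
The operation-output test in this recipe can also be tried without TensorFlow. The following is a minimal sketch that uses only Python's standard `unittest` module, which `tf.test.TestCase` extends. It is an illustration under stated assumptions, not the book's code: `get_accuracy` here is a pure-Python stand-in for the recipe's NumPy version, and `run_accuracy_tests` and the test method names are hypothetical, introduced only for this example. The sketch also shows that the runner collects only methods whose names begin with `test`.

```python
import unittest


def get_accuracy(probabilities, targets):
    """Return the percentage of rows whose most probable class equals the target.

    Pure-Python stand-in (assumption) for the NumPy-based function in the recipe.
    """
    predictions = [row.index(max(row)) for row in probabilities]
    num_correct = sum(p == t for p, t in zip(predictions, targets))
    return 100.0 * num_correct / len(predictions)


class AccuracyTest(unittest.TestCase):
    # Collected by the runner: the method name starts with "test".
    def test_accuracy_exact(self):
        preds = [[0.9, 0.1], [0.01, 0.99]]
        self.assertEqual(get_accuracy(preds, [0, 1]), 100.0)

    # Collected by the runner: one of the two predictions is wrong.
    def test_accuracy_half(self):
        preds = [[0.9, 0.1], [0.8, 0.2]]
        self.assertEqual(get_accuracy(preds, [0, 1]), 50.0)

    # Not collected: the name lacks the "test" prefix, so this never runs.
    def accuracy_never_checked(self):
        self.fail("unittest does not call this method")


def run_accuracy_tests():
    """Load and run AccuracyTest, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccuracyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_accuracy_tests()
    print(result.testsRun, result.wasSuccessful())
```

Checking `result.testsRun` after a run is a cheap way to confirm that every intended test method was actually collected, since a misnamed method is skipped silently rather than reported as a failure.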