Implementing a One-Layer Neural Network · TensorFlow Machine Learning Cookbook, Second Edition (Chinese Translation)

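# Implementing a One-Layer Neural Network

We now have all the tools needed to implement a neural network that operates on real data, so in this section we will create a neural network with one hidden layer that operates on the `Iris` dataset.

## Getting ready

In this section, we will implement a neural network with one hidden layer. It is important to understand that a fully connected neural network is based mostly on matrix multiplication, so the dimensions of the data and of the weight matrices must line up correctly. Since this is a regression problem, we will use mean squared error (MSE) as the loss function.

## How to do it...

We proceed with the recipe as follows:

1. To create the computational graph, we start by loading the necessary libraries:

    ```py
    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf
    from sklearn import datasets
    ```

2. Now we load the `Iris` data. The first three columns (sepal length, sepal width, and petal length) are the features, and the fourth column (petal width) is stored as the target value. We then start a graph session. The code in this recipe uses the TensorFlow 1.x graph API (`tf.Session`, `tf.placeholder`):

    ```py
    iris = datasets.load_iris()
    x_vals = np.array([x[0:3] for x in iris.data])  # sepal length, sepal width, petal length
    y_vals = np.array([x[3] for x in iris.data])    # petal width (regression target)
    sess = tf.Session()
    ```

3. Since the dataset is small, we set the random seeds to make the results reproducible:

    ```py
    seed = 2
    tf.set_random_seed(seed)
    np.random.seed(seed)
    ```

4. To prepare the data, we create an 80-20 train-test split (120 training rows and 30 test rows) and scale the x features to lie between 0 and 1 with min-max scaling. Note that each split is scaled using its own column minima and maxima:

    ```py
    train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
    test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
    x_vals_train = x_vals[train_indices]
    x_vals_test = x_vals[test_indices]
    y_vals_train = y_vals[train_indices]
    y_vals_test = y_vals[test_indices]

    def normalize_cols(m):
        # Min-max scale every column of m to the range [0, 1]
        col_max = m.max(axis=0)
        col_min = m.min(axis=0)
        return (m-col_min) / (col_max - col_min)

    # nan_to_num guards against columns where max == min (division by zero)
    x_vals_train = np.nan_to_num(normalize_cols(x_vals_train))
    x_vals_test = np.nan_to_num(normalize_cols(x_vals_test))
    ```

5. Now we declare the batch size and the placeholders for the data and the target:

    ```py
    batch_size = 50
    x_data = tf.placeholder(shape=[None, 3], dtype=tf.float32)
    y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
    ```

6. It is important to declare the model variables with the appropriate shapes. The hidden layer can be any size we like; in the following code block, we give it five hidden nodes:

    ```py
    hidden_layer_nodes = 5
    A1 = tf.Variable(tf.random_normal(shape=[3,hidden_layer_nodes]))  # input -> hidden weights
    b1 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes]))    # hidden biases
    A2 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes,1]))  # hidden -> output weights
    b2 = tf.Variable(tf.random_normal(shape=[1]))                     # output bias
    ```

7. We now declare the model in two steps. The first step creates the hidden layer output, and the second creates the model's `final_output`:

    > Note that the model goes from three input features to five hidden nodes, and finally to one output value.

    ```py
    hidden_output = tf.nn.relu(tf.add(tf.matmul(x_data, A1), b1))
    final_output = tf.nn.relu(tf.add(tf.matmul(hidden_output, A2), b2))
    ```

8. The mean squared error that serves as our `loss` function is as follows:

    ```py
    loss = tf.reduce_mean(tf.square(y_target - final_output))
    ```

9. Now we declare the optimization algorithm (gradient descent with a learning rate of 0.005) and initialize the variables:

    ```py
    my_opt = tf.train.GradientDescentOptimizer(0.005)
    train_step = my_opt.minimize(loss)
    init = tf.global_variables_initializer()
    sess.run(init)
    ```

10. Next, we loop through the training iterations. We also initialize two lists in which to store the training and test losses. In each iteration, we randomly select a batch from the training data to fit the model:

    ```py
    # First we initialize the loss vectors for storage.
    loss_vec = []
    test_loss = []
    for i in range(500):
        # We select a random set of indices for the batch
        # (sampled with replacement, which is NumPy's default).
        rand_index = np.random.choice(len(x_vals_train), size=batch_size)
        # We then select the training values
        rand_x = x_vals_train[rand_index]
        rand_y = np.transpose([y_vals_train[rand_index]])
        # Now we run the training step
        sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
        # We save the training loss (the square root of the MSE is stored)
        temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
        loss_vec.append(np.sqrt(temp_loss))
        # Finally, we run the test-set loss and save it.
        test_temp_loss = sess.run(loss, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
        test_loss.append(np.sqrt(test_temp_loss))
        if (i+1)%50==0:
            print('Generation: ' + str(i+1) + '. Loss = ' + str(temp_loss))
    ```

11. We can plot the losses with `matplotlib` and the following code. Because the training loop stores the square root of each loss, the plotted values are root-mean-square errors, even though the plot title says MSE:

    ```py
    plt.plot(loss_vec, 'k-', label='Train Loss')
    plt.plot(test_loss, 'r--', label='Test Loss')
    plt.title('Loss (MSE) per Generation')
    plt.xlabel('Generation')
    plt.ylabel('Loss')
    plt.legend(loc='upper right')
    plt.show()
    ```

    The code produces the following plot:

    ![](https://img.kancloud.cn/e1/5a/e15a8e89257a6650e9a536895a0c4c67_393x281.png)

    Figure 4: The loss (MSE) on the training and test sets. Note that the model starts to overfit slightly after about 200 generations, because the test MSE stops decreasing while the training MSE continues to fall.

## How it works...

Our model can now be visualized as a neural network diagram, as shown in the following figure:

![](https://img.kancloud.cn/c2/e6/c2e658044491f1ecec5d35838399ca73_905x436.png)

Figure 5: A visualization of our neural network, which has five nodes in the hidden layer. We feed in three values: sepal length (S.L.), sepal width (S.W.), and petal length (P.L.). The target is the petal width. In total, the model has 26 variables.

## There's more...

Note that by comparing the `loss` function on the test and training sets, we can determine when the model starts to overfit the training data. We can also see that the training loss is not as smooth as the test loss. There are two reasons for this. First, the training loss is computed on a different random batch of 50 samples at every generation, whereas the test loss is always computed on the same 30 held-out samples. (The batch is in fact larger than the test set, so batch size alone does not explain the difference.) Second, we are training on the training set, while the test set has no effect on the model's variables.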
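
The count of 26 variables and the requirement that the matrix dimensions line up can both be checked without TensorFlow. The following is a minimal NumPy sketch, not the recipe's own code: it reuses the shapes of `A1`, `b1`, `A2`, and `b2` from the steps above, while the helper names `relu` and `forward`, the random input batch, and the seed are assumptions made purely for illustration.

```python
import numpy as np

# Sizes taken from the recipe: 3 input features, 5 hidden nodes, 1 output.
n_inputs, hidden_layer_nodes, n_outputs = 3, 5, 1

rng = np.random.default_rng(2)  # arbitrary seed, chosen only for this sketch

# Plain NumPy arrays with the same shapes as the recipe's TensorFlow variables.
A1 = rng.normal(size=(n_inputs, hidden_layer_nodes))   # 3 * 5 = 15 weights
b1 = rng.normal(size=(hidden_layer_nodes,))            # 5 biases
A2 = rng.normal(size=(hidden_layer_nodes, n_outputs))  # 5 * 1 = 5 weights
b2 = rng.normal(size=(n_outputs,))                     # 1 bias


def relu(z):
    """Element-wise rectified linear unit."""
    return np.maximum(z, 0.0)


def forward(x):
    """Hypothetical helper mirroring hidden_output and final_output."""
    hidden = relu(x @ A1 + b1)     # (batch, 3) @ (3, 5) -> (batch, 5)
    return relu(hidden @ A2 + b2)  # (batch, 5) @ (5, 1) -> (batch, 1)


# A made-up batch of 50 rows with 3 features each, values in [0, 1).
batch = rng.random(size=(50, n_inputs))
output = forward(batch)

n_variables = sum(p.size for p in (A1, b1, A2, b2))
print(output.shape)  # (50, 1)
print(n_variables)   # 26 = 15 + 5 + 5 + 1
```

Changing `hidden_layer_nodes` to `h` changes the total to `3*h + h + h + 1`, which gives a quick way to see how the hidden layer size drives the number of trainable values.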