Iris Flower Data Classification · TensorflowJS

# 1. Iris Data Classification

## 1.1 Loading the data

```python
from sklearn.datasets import load_iris
```

```python
x_data = load_iris().data    # features
y_data = load_iris().target  # labels
```

```python
x_data.shape
```

    (150, 4)

```python
y_data.shape
```

    (150,)

```python
type(x_data)
```

    numpy.ndarray

```python
x_data[0]
```

    array([5.1, 3.5, 1.4, 0.2])

```python
y_data
```

    array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

## 1.2 Splitting the dataset

```python
import tensorflow as tf
```

### 1.2.1 Shuffling the data first

```python
import numpy as np

# Use the same random seed for both shuffles so that features and labels stay paired
np.random.seed(19)
np.random.shuffle(x_data)
np.random.seed(19)
np.random.shuffle(y_data)
```

### 1.2.2 Splitting into training and validation sets

Split the dataset 2:8 into a validation set and a training set. With 150 samples, that gives the first 120 rows for training and the last 30 for validation (named `x_test` and `y_test` in the code).

```python
x_train = x_data[:-30]
x_test = x_data[-30:]
y_train = y_data[:-30]
y_test = y_data[-30:]
```

Convert the data types:

```python
x_train = tf.cast(x_train, tf.float32)
x_test = tf.cast(x_test, tf.float32)
```

Pair the input features with their labels and pack them into batches of 32:

```python
train_db = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
```

## 1.3 Defining the neural network

### 1.3.1 Defining the trainable parameters

```python
w1 = tf.Variable(tf.random.truncated_normal([4, 3], stddev=0.1, seed=1))
b1 = tf.Variable(tf.random.truncated_normal([3], stddev=0.1, seed=1))
```

### 1.3.2 Computing gradients and updating the parameters

```python
# Hyperparameters
lr = 0.02
epoch = 500

for e in range(epoch):
    loss_all = 0
    # Batch variables are named x_batch/y_batch so they do not shadow x_train/y_train
    for step, (x_batch, y_batch) in enumerate(train_db):
        with tf.GradientTape() as tape:  # record operations for differentiation
            # Forward pass to compute y
            y = tf.matmul(x_batch, w1) + b1
            y = tf.nn.softmax(y)
            # One-hot encode the true training labels so they are 3-dimensional too
            real_y = tf.one_hot(y_batch, depth=3)
            # Compute the loss (mean squared error)
            loss = tf.reduce_mean(tf.square(real_y - y))
        loss_all += loss.numpy()
        # Compute the gradients
        grads = tape.gradient(loss, [w1, b1])
        # Gradient descent update
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])

    # Print the mean loss for this epoch (120 samples / batch size 32 -> 4 batches)
    print("Epoch: {}, loss: {}".format(e, loss_all / 4))

    # Evaluation
    total_correct, total_number = 0, 0
    for x_batch, y_batch in test_db:
        # Predict with the current w1 and b1
        y = tf.matmul(x_batch, w1) + b1
        y = tf.nn.softmax(y)
        pred = tf.argmax(y, axis=1)
        # Cast to the label dtype, then compare with the true values
        pred = tf.cast(pred, dtype=y_batch.dtype)
        # 1 where the prediction is correct, 0 otherwise
        correct = tf.cast(tf.equal(pred, y_batch), dtype=tf.int32)
        # Sum over the batch
        correct = tf.reduce_sum(correct)
        # Accumulate correct predictions over all batches
        total_correct += int(correct)
        # Total number of samples
        total_number += x_batch.shape[0]
    # Overall accuracy
    acc = total_correct / total_number
    print("Test accuracy: {}".format(acc))
```

Result:

```
Epoch: 0, loss: 0.18507226184010506
Test accuracy: 0.5
Epoch: 1, loss: 0.18281254917383194
Test accuracy: 0.5
...
Epoch: 499, loss: 0.06610946264117956
Test accuracy: 0.9666666666666667
```

## 1.4 Defining the network with Dense

The previous section computed the gradients by hand. Because the training data is 120×4 and the task is a 3-class classification problem, it used a single-layer network, that is: (120×4) · (4×3) => a 120×3 matrix.

```python
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(3, input_shape=(4, ), activation="softmax"))
model.summary()
```

Summary output:

```
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_2 (Dense)              (None, 3)                 15
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
```

Specify the optimizer, loss function, and metric. Note that this uses a cross-entropy loss, unlike the mean squared error in the hand-written loop:

```python
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',  # labels are integer-encoded, so use the sparse cross-entropy
    metrics=['accuracy']
)
```

Fit the model. This call trains on all 150 samples (`x_data`, `y_data`), which is why each epoch runs 5 batches of 32, and the accuracy it reports is training accuracy rather than held-out accuracy:

```python
model.fit(x_data, y_data, epochs=500, batch_size=32)
```

Result:

```
Epoch 1/500
5/5 [==============================] - 0s 1ms/step - loss: 1.1094 - accuracy: 0.2400
...
Epoch 500/500
5/5 [==============================] - 0s 1ms/step - loss: 0.3297 - accuracy: 0.9667

<tensorflow.python.keras.callbacks.History at 0x207e5b65438>
```
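
The shape arithmetic behind the single Dense layer, and the 15 parameters reported by `model.summary()`, can be checked without TensorFlow. The following is a minimal NumPy sketch, not the post's own code: the helper names `softmax` and `dense_softmax` are hypothetical, and the inputs are random placeholders rather than the Iris data.

```python
import numpy as np


def softmax(z):
    # Subtract the row-wise max before exponentiating, for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def dense_softmax(x, w, b):
    # One dense layer: (n, 4) @ (4, 3) + (3,) -> (n, 3), then a row-wise softmax.
    return softmax(x @ w + b)


rng = np.random.default_rng(0)          # arbitrary seed (assumption), for reproducibility
x = rng.random((120, 4))                # placeholder features, NOT the Iris data
labels = rng.integers(0, 3, size=120)   # placeholder integer labels in {0, 1, 2}

w = rng.normal(scale=0.1, size=(4, 3))  # weights: 4 * 3 = 12 parameters
b = rng.normal(scale=0.1, size=3)       # biases: 3 parameters

probs = dense_softmax(x, w, b)
one_hot = np.eye(3)[labels]             # NumPy counterpart of tf.one_hot(labels, depth=3)
mse = np.mean((one_hot - probs) ** 2)   # same loss form as the hand-written training loop
n_params = w.size + b.size

print(probs.shape, n_params)            # prints: (120, 3) 15
```

The 12 weights plus 3 biases account for the `Param #` value of 15, and each row of `probs` sums to 1, which is what lets `argmax` over a row be read as the predicted class.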