LSTM Text Generation in TensorFlow · Mastering TensorFlow 1.x

# LSTM Text Generation in TensorFlow

You can follow along with the code for this section in the Jupyter notebook `ch-08b_RNN_Text_TensorFlow`.

We implement the text generation LSTM in TensorFlow with the following steps:

1. Define the parameters and the placeholders for `x` and `y`:

```py
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# text8 (the dataset object) is assumed to be loaded earlier in the notebook
batch_size = 128
n_x = 5  # number of input words
n_y = 1  # number of output words
n_x_vars = 1  # in case of our text, there is only 1 variable at each timestep
n_y_vars = text8.vocab_len
state_size = 128
learning_rate = 0.001
x_p = tf.placeholder(tf.float32, [None, n_x, n_x_vars], name='x_p')
y_p = tf.placeholder(tf.float32, [None, n_y_vars], name='y_p')
```

For the input, we use the integer representation of each word, so `n_x_vars` is 1. For the output, we use one-hot encoded values, so the number of outputs equals the vocabulary length.

2. Next, create a list of tensors of length `n_x`:

```py
x_in = tf.unstack(x_p, axis=1, name='x_in')
```

3. Next, create an LSTM cell and a static RNN network from the inputs and the cell:

```py
cell = tf.nn.rnn_cell.LSTMCell(state_size)
rnn_outputs, final_states = tf.nn.static_rnn(cell, x_in, dtype=tf.float32)
```

4. Next, define the weights, biases, and formula for the final layer. The final layer only needs to produce the output for the sixth word, so we apply the formula to the last RNN output only:

```py
# output node parameters
w = tf.get_variable('w', [state_size, n_y_vars],
                    initializer=tf.random_normal_initializer())
b = tf.get_variable('b', [n_y_vars],
                    initializer=tf.constant_initializer(0.0))
# use only the output of the last timestep
y_out = tf.matmul(rnn_outputs[-1], w) + b
```

5. Next, create the loss function and the optimizer:

```py
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=y_out, labels=y_p))
optimizer = tf.train.AdamOptimizer(
    learning_rate=learning_rate).minimize(loss)
```

6. Create the accuracy function, which we can run in the session block to check the accuracy during training:

```py
n_correct_pred = tf.equal(tf.argmax(y_out, 1), tf.argmax(y_p, 1))
accuracy = tf.reduce_mean(tf.cast(n_correct_pred, tf.float32))
```

7. Finally, train the model for 1,000 epochs and print the results every 100 epochs. Every 100 epochs we also print the text generated from the seed strings described above.

LSTM and RNN networks need to be trained for a large number of epochs on large datasets to produce better results. Try loading the complete dataset and running 50,000 or 80,000 epochs on your computer, and experiment with other hyperparameters to improve the results.

```py
# dsu, random5, first5 and id2string are assumed to come from earlier
# cells of the notebook
n_epochs = 1000
learning_rate = 0.001
text8.reset_index_in_epoch()
n_batches = text8.n_batches_seq(batch_size=batch_size, n_tx=n_x, n_ty=n_y)
n_epochs_display = 100

with tf.Session() as tfs:
    tf.global_variables_initializer().run()

    for epoch in range(n_epochs):
        epoch_loss = 0
        epoch_accuracy = 0
        for step in range(n_batches):
            x_batch, y_batch = text8.next_batch_seq(batch_size=batch_size,
                                                    n_tx=n_x, n_ty=n_y)
            y_batch = dsu.to2d(y_batch, unit_axis=1)
            # one-hot encode the target word ids
            y_onehot = np.zeros(shape=[batch_size, text8.vocab_len],
                                dtype=np.float32)
            for i in range(batch_size):
                y_onehot[i, y_batch[i]] = 1

            feed_dict = {x_p: x_batch.reshape(-1, n_x, n_x_vars),
                         y_p: y_onehot}
            _, batch_accuracy, batch_loss = tfs.run(
                [optimizer, accuracy, loss], feed_dict=feed_dict)
            epoch_loss += batch_loss
            epoch_accuracy += batch_accuracy

        if (epoch + 1) % n_epochs_display == 0:
            epoch_loss = epoch_loss / n_batches
            epoch_accuracy = epoch_accuracy / n_batches
            print('\nEpoch {0:}, Average loss:{1:}, Average accuracy:{2:}'.
                  format(epoch, epoch_loss, epoch_accuracy))

            y_pred_r5 = np.empty([10])
            y_pred_f5 = np.empty([10])

            x_test_r5 = random5.copy()
            x_test_f5 = first5.copy()
            # let us generate text of 10 words after feeding 5 words
            for i in range(10):
                for x, y in zip([x_test_r5, x_test_f5],
                                [y_pred_r5, y_pred_f5]):
                    x_input = x.copy()
                    feed_dict = {x_p: x_input.reshape(-1, n_x, n_x_vars)}
                    y_pred = tfs.run(y_out, feed_dict=feed_dict)
                    # np.argmax gives the same id as tf.argmax(...).eval()
                    # without adding a new op to the graph on every call
                    y_pred_id = int(np.argmax(y_pred, axis=1)[0])
                    y[i] = y_pred_id
                    # slide the window: drop the oldest word, append the new one
                    x[:-1] = x[1:]
                    x[-1] = y_pred_id
            print(' Random 5 prediction:', id2string(y_pred_r5))
            print(' First 5 prediction:', id2string(y_pred_f5))
```

The results are as follows:

```
Epoch 99, Average loss:1.3972469369570415, Average accuracy:0.8489583333333334
 Random 5 prediction: labor warren together strongly profits strongly supported supported co without
 First 5 prediction: market own self free together strongly profits strongly supported supported

Epoch 199, Average loss:0.7894854595263799, Average accuracy:0.9186197916666666
 Random 5 prediction: syndicalists spanish class movements also also anarcho anarcho anarchist was
 First 5 prediction: five civil association class movements also anarcho anarcho anarcho anarcho

Epoch 299, Average loss:1.360412875811259, Average accuracy:0.865234375
 Random 5 prediction: anarchistic beginnings influenced true tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy
 First 5 prediction: early civil movement be for was two most most most

Epoch 399, Average loss:1.1692512730757396, Average accuracy:0.8645833333333334
 Random 5 prediction: including war than than revolutionary than than war than than
 First 5 prediction: left including including including other other other other other other

Epoch 499, Average loss:0.5921860883633295, Average accuracy:0.923828125
 Random 5 prediction: ever edited interested interested variety variety variety variety variety variety
 First 5 prediction: english market herbert strongly price interested variety variety variety variety

Epoch 599, Average loss:0.8356450994809469, Average accuracy:0.8958333333333334
 Random 5 prediction: management allow trabajo trabajo national national mag mag ricardo ricardo
 First 5 prediction: spain prior am working n war war war self self

Epoch 699, Average loss:0.7057955612738928, Average accuracy:0.8971354166666666
 Random 5 prediction: teachings can directive tend resist obey christianity author christianity christianity
 First 5 prediction: early early called social called social social social social social

Epoch 799, Average loss:0.772875706354777, Average accuracy:0.90234375
 Random 5 prediction: associated war than revolutionary revolutionary revolutionary than than revolutionary revolutionary
 First 5 prediction: political been hierarchy war than see anti anti anti anti

Epoch 899, Average loss:0.43675946692625683, Average accuracy:0.9375
 Random 5 prediction: individualist which which individualist warren warren tucker benjamin how tucker
 First 5 prediction: four at warren individualist warren published considered considered considered considered

Epoch 999, Average loss:0.23202441136042276, Average accuracy:0.9602864583333334
 Random 5 prediction: allow allow trabajo you you you you you you you
 First 5 prediction: labour spanish they they they movement movement anarcho anarcho two
```

Repeated words are common in the generated text and should diminish as the model is trained better. Although the model's accuracy rose to 96%, that is still not enough to generate legible text. Try increasing the number of LSTM cells or hidden layers, and run the model for a larger number of epochs on a larger dataset.

Next, we build the same model in Keras.
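Before turning to Keras, two data-handling details of the TensorFlow training loop above can be isolated in plain NumPy: building the one-hot targets, and sliding the five-word window during generation. The following is a minimal sketch, not the notebook's code. The names `one_hot`, `generate`, and `predict_next` are hypothetical, and `predict_next` only stands in for running `y_out` in the session and taking the argmax.

```python
import numpy as np


def one_hot(ids, vocab_len):
    """Vectorized equivalent of the per-row loop that fills y_onehot."""
    ids = np.asarray(ids, dtype=np.int64).reshape(-1)
    encoded = np.zeros((ids.shape[0], vocab_len), dtype=np.float32)
    # set exactly one entry per row using integer array indexing
    encoded[np.arange(ids.shape[0]), ids] = 1.0
    return encoded


def generate(seed_ids, predict_next, n_words=10):
    """Greedy generation with a sliding window of word ids.

    predict_next is a hypothetical callable that maps the current window
    of word ids to the id of the next word. In the notebook that role is
    played by the trained network; here any function can be plugged in.
    """
    window = np.array(seed_ids, dtype=np.int64)  # copy, so the seed is untouched
    generated = np.empty(n_words, dtype=np.int64)
    for i in range(n_words):
        next_id = int(predict_next(window))
        generated[i] = next_id
        window[:-1] = window[1:]  # drop the oldest word
        window[-1] = next_id      # append the predicted word
    return generated


if __name__ == "__main__":
    # Toy stand-in for the model: next id = (last id + 1) mod 7
    def toy_predict(window):
        return (window[-1] + 1) % 7

    print(one_hot([2, 0], vocab_len=3))
    print(generate([0, 1, 2, 3, 4], toy_predict, n_words=4))
```

Because each prediction is fed back into the window, a model that maps a window to a word already in it can easily lock into a repeating pattern, which is consistent with the repeated words seen in the output above.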