Keras 中的 LSTM 文本生成 · 精通 TensorFlow 1.x

# Keras 中的 LSTM 文本生成您可以在 Jupyter 筆記本`ch-08b_RNN_Text_Keras`中按照本節的代碼進行操作。我們在 Keras 實現文本生成 LSTM，步驟如下： 1. 首先，我們將所有數據轉換為兩個張量，張量`x`有五列，因為我們一次輸入五個字，張量`y`只有一列輸出。我們將`y`或標簽張量轉換為單熱編碼表示。請記住，在大型數據集的實踐中，您將使用 word2vec 嵌入而不是單熱表示。 ```py # get the data x_train, y_train = text8.seq_to_xy(seq=text8.part['train'],n_tx=n_x,n_ty=n_y) # reshape input to be [samples, time steps, features] x_train = x_train.reshape(x_train.shape[0], x_train.shape[1],1) y_onehot = np.zeros(shape=[y_train.shape[0],text8.vocab_len],dtype=np.float32) for i in range(y_train.shape[0]): y_onehot[i,y_train[i]]=1 ``` 1. 接下來，僅使用一個隱藏的 LSTM 層定義 LSTM 模型。由于我們的輸出不是序列，我們還將`return_sequences`設置為`False`： ```py n_epochs = 1000 batch_size=128 state_size=128 n_epochs_display=100 # create and fit the LSTM model model = Sequential() model.add(LSTM(units=state_size, input_shape=(x_train.shape[1], x_train.shape[2]), return_sequences=False ) ) model.add(Dense(text8.vocab_len)) model.add(Activation('softmax')) model.compile(loss='categorical_crossentropy', optimizer='adam') model.summary() ``` 該模型如下所示： ```py Layer (type) Output Shape Param # ================================================================= lstm_1 (LSTM) (None, 128) 66560 _________________________________________________________________ dense_1 (Dense) (None, 1457) 187953 _________________________________________________________________ activation_1 (Activation) (None, 1457) 0 ================================================================= Total params: 254,513 Trainable params: 254,513 Non-trainable params: 0 _________________________________________________________________ ``` 1. 對于 Keras，我們運行一個循環來運行 10 次，在每次迭代中訓練 100 個周期的模型并打印文本生成的結果。以下是訓練模型和生成文本的完整代碼： ```py for j in range(n_epochs // n_epochs_display): model.fit(x_train, y_onehot, epochs=n_epochs_display, batch_size=batch_size,verbose=0) # generate text y_pred_r5 = np.empty([10]) y_pred_f5 = np.empty([10]) x_test_r5 = random5.copy() x_test_f5 = first5.copy() # let us generate text of 10 words after feeding 5 words for i in range(10): for x,y in zip([x_test_r5,x_test_f5], [y_pred_r5,y_pred_f5]): x_input = x.copy() x_input = x_input.reshape(-1, n_x, n_x_vars) y_pred = model.predict(x_input)[0] y_pred_id = np.argmax(y_pred) y[i]=y_pred_id x[:-1] = x[1:] x[-1] = y_pred_id print('Epoch: ',((j+1) * n_epochs_display)-1) print(' Random5 prediction:',id2string(y_pred_r5)) print(' First5 prediction:',id2string(y_pred_f5)) ``` 1. 輸出并不奇怪，從重復單詞開始，模型有所改進，但是可以通過更多 LSTM 層，更多數據，更多訓練迭代和其他超參數調整來進一步提高。 ```py Random 5 words: free bolshevik be n another First 5 words: anarchism originated as a term ``` 預測的輸出如下： ```py Epoch: 99 Random5 prediction: anarchistic anarchistic wrote wrote wrote wrote wrote wrote wrote wrote First5 prediction: right philosophy than than than than than than than than Epoch: 199 Random5 prediction: anarchistic anarchistic wrote wrote wrote wrote wrote wrote wrote wrote First5 prediction: term i revolutionary than war war french french french french Epoch: 299 Random5 prediction: anarchistic anarchistic wrote wrote wrote wrote wrote wrote wrote wrote First5 prediction: term i revolutionary revolutionary revolutionary revolutionary revolutionary revolutionary revolutionary revolutionary Epoch: 399 Random5 prediction: anarchistic anarchistic wrote wrote wrote wrote wrote wrote wrote wrote First5 prediction: term i revolutionary labor had had french french french french Epoch: 499 Random5 prediction: anarchistic anarchistic amongst wrote wrote wrote wrote wrote wrote wrote First5 prediction: term i revolutionary labor individualist had had french french french Epoch: 599 Random5 prediction: tolstoy wrote tolstoy wrote wrote wrote wrote wrote wrote wrote First5 prediction: term i revolutionary labor individualist had had had had had Epoch: 699 Random5 prediction: tolstoy wrote tolstoy wrote wrote wrote wrote wrote wrote wrote First5 prediction: term i revolutionary labor individualist had had had had had Epoch: 799 Random5 prediction: tolstoy wrote tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy First5 prediction: term i revolutionary labor individualist had had had had had Epoch: 899 Random5 prediction: tolstoy wrote tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy First5 prediction: term i revolutionary labor should warren warren warren warren warren Epoch: 999 Random5 prediction: tolstoy wrote tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy tolstoy First5 prediction: term i individualist labor should warren warren warren warren warren ``` 如果您注意到我們在 LSTM 模型的輸出中有重復的單詞用于文本生成。雖然超參數和網絡調整可以消除一些重復，但還有其他方法可以解決這個問題。我們得到重復單詞的原因是模型總是從單詞的概率分布中選擇具有最高概率的單詞。這可以改變以選擇諸如在連續單詞之間引入更大可變性的單詞。