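# How to Tune LSTM Hyperparameters with Keras for Time Series Forecasting

> Original: [https://machinelearningmastery.com/tune-lstm-hyperparameters-keras-time-series-forecasting/](https://machinelearningmastery.com/tune-lstm-hyperparameters-keras-time-series-forecasting/)

Configuring neural networks is difficult because there is no good theory on how to do it.

You must be systematic and explore different configurations from both a dynamical and an objective-results point of view to try to understand what is going on for a given predictive modeling problem.

In this tutorial, you will discover how to explore the configuration of an LSTM network on a time series forecasting problem.

After completing this tutorial, you will know:

* How to tune and interpret the results of the number of training epochs.
* How to tune and interpret the results of the size of training batches.
* How to tune and interpret the results of the number of neurons.

Let's get started.

![How to Tune LSTM Hyperparameters with Keras for Time Series Forecasting](https://img.kancloud.cn/df/46/df46847ae39422fb3ac58ff9ad4874e9_640x360.jpg)

How to Tune LSTM Hyperparameters with Keras for Time Series Forecasting. Photo by [David Saddler](https://www.flickr.com/photos/80502454@N00/6585205675/), some rights reserved.

## Tutorial Overview

This tutorial is divided into 6 parts; they are:

1. Shampoo Sales Dataset
2. Experimental Test Harness
3. Tune the Number of Epochs
4. Tune the Batch Size
5. Tune the Number of Neurons
6. Summary of Results

### Environment

This tutorial assumes you have a Python SciPy environment installed. You can use either Python 2 or 3 with this example.

This tutorial assumes you have Keras v2.0 or higher installed with either the TensorFlow or Theano backend.

This tutorial also assumes you have scikit-learn, Pandas, NumPy, and Matplotlib installed.

If you need help setting up your Python environment, see this post:

* [How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda](http://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/)

## Shampoo Sales Dataset

This dataset describes the monthly number of sales of shampoo over a 3-year period.

The units are a sales count and there are 36 observations. The original dataset is credited to Makridakis, Wheelwright, and Hyndman (1998).

[You can download and learn more about the dataset here](https://datamarket.com/data/set/22r0/sales-of-shampoo-over-a-three-year-period).

The example below loads the dataset and creates a plot of it.

```py
# load and plot dataset
from pandas import read_csv
from datetime import datetime
from matplotlib import pyplot

# load dataset
def parser(x):
    return datetime.strptime('190'+x, '%Y-%m')

series = read_csv('shampoo-sales.csv', header=0, parse_dates=[0], index_col=0, squeeze=True, date_parser=parser)
# summarize first few rows
print(series.head())
# line plot
series.plot()
pyplot.show()
```

Running the example loads the dataset as a Pandas Series and prints the first 5 rows.

```py
Month
1901-01-01    266.0
1901-02-01    145.9
1901-03-01    183.1
1901-04-01    119.3
1901-05-01    180.3
Name: Sales, dtype: float64
```

A line plot of the series is then created, showing a clear increasing trend.

![Line Plot of Shampoo Sales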
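Dataset](https://img.kancloud.cn/11/f1/11f11d2a2ec40c7c0724e4e09f11a4ca_640x480.jpg)

Line Plot of Shampoo Sales Dataset

Next, we will take a look at the LSTM configuration and test harness used in the experiments.

## Experimental Test Harness

This section describes the test harness used in this tutorial.

### Data Split

We will split the Shampoo Sales dataset into two parts: a training set and a test set.

The first two years of data will be taken for the training dataset and the remaining one year of data will be used for the test set.

Models will be developed using the training dataset and will make predictions on the test dataset.

The persistence forecast (naive forecast) on the test dataset achieves an error of 136.761 monthly shampoo sales. This provides a lower acceptable bound of performance on the test set.

### Model Evaluation

A rolling-forecast scenario will be used, also called walk-forward model validation.

Each time step of the test dataset will be walked one at a time. A model will be used to make a forecast for the time step, then the actual expected value from the test set will be taken and made available to the model for the forecast on the next time step.

This mimics a real-world scenario where new shampoo sales observations would be available each month and used in the forecasting of the following month.

This will be simulated by the structure of the train and test datasets. We will make all of the forecasts in a one-shot method.

All forecasts on the test dataset will be collected and an error score calculated to summarize the skill of the model. The root mean squared error (RMSE) will be used because it punishes large errors and results in a score that is in the same units as the forecast data, namely monthly shampoo sales.

### Data Preparation

Before we can fit an LSTM model to the dataset, we must transform the data.

The following three data transforms are performed on the dataset prior to fitting a model and making a forecast.

1. Transform the time series data so that it is stationary. Specifically, a lag=1 differencing to remove the increasing trend in the data.
2. Transform the time series into a supervised learning problem. Specifically, the organization of data into input and output patterns where the observation at the previous time step is used as an input to forecast the observation at the current time step.
3. Transform the observations to have a specific scale. Specifically, rescale the data to values between -1 and 1 to meet the default hyperbolic tangent activation function of the LSTM model.

These transforms are inverted on the forecasts to return them to their original scale before an error score is calculated.

### Experimental Runs

Each experimental scenario will be run 10 times.

The reason for this is that the random initial conditions of an LSTM network can lead to very different results each time a given configuration is trained.

A diagnostic approach will be used to investigate model configurations. This is where line plots of model skill over time (training iterations called epochs) are created and studied for insight into how a given configuration performs and how it may be adjusted to elicit better performance.

The model will be evaluated on both the train and the test datasets at the end of each epoch and the RMSE scores saved.

The train and test RMSE scores at the end of each scenario are printed to give an indication of progress.

The series of train and test RMSE scores are plotted at the end of a run as a line plot. Train scores are colored blue and test scores are colored orange.

Let's dive into the results.

## Tune the Number of Epochs

The first LSTM parameter we will look at tuning is the number of training epochs.

The model will use a batch size of 4 and a single neuron. We will explore the effect of training this configuration for different numbers of training epochs.

### Diagnostic of 500 Epochs

The complete code listing for this diagnostic is listed below.

The code is reasonably well commented and should be easy to follow. This code will be the basis for all future experiments in this tutorial, and only the changes made in each subsequent experiment will be listed.

```py
from pandas import DataFrame
from pandas import Series
from pandas import concat
from pandas import read_csv
from datetime import datetime
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sqrt
import matplotlib
# be able to save images on server
matplotlib.use('Agg')
from matplotlib import pyplot
import numpy

# date-time parsing function for loading the dataset
def parser(x):
    return datetime.strptime('190'+x, '%Y-%m')

# frame a sequence as a supervised learning problem
def timeseries_to_supervised(data, lag=1):
    df = DataFrame(data)
    columns = [df.shift(i) for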
        i in range(1, lag+1)]
    columns.append(df)
    df = concat(columns, axis=1)
    df = df.drop(0)
    return df

# create a differenced series
def difference(dataset, interval=1):
    diff = list()
    for i in range(interval, len(dataset)):
        value = dataset[i] - dataset[i - interval]
        diff.append(value)
    return Series(diff)

# scale train and test data to [-1, 1]
def scale(train, test):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaler = scaler.fit(train)
    # transform train
    train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)
    # transform test
    test = test.reshape(test.shape[0], test.shape[1])
    test_scaled = scaler.transform(test)
    return scaler, train_scaled, test_scaled

# inverse scaling for a forecasted value
def invert_scale(scaler, X, yhat):
    new_row = [x for x in X] + [yhat]
    array = numpy.array(new_row)
    array = array.reshape(1, len(array))
    inverted = scaler.inverse_transform(array)
    return inverted[0, -1]

# evaluate the model on a dataset; forecasts are inverted first,
# so the returned RMSE is in the original (untransformed) units
def evaluate(model, raw_data, scaled_dataset, scaler, offset, batch_size):
    # separate
    X, y = scaled_dataset[:,0:-1], scaled_dataset[:,-1]
    # reshape
    reshaped = X.reshape(len(X), 1, 1)
    # forecast dataset
    output = model.predict(reshaped, batch_size=batch_size)
    # invert data transforms on forecast
    predictions = list()
    for i in range(len(output)):
        yhat = output[i,0]
        # invert scaling
        yhat = invert_scale(scaler, X[i], yhat)
        # invert differencing
        yhat = yhat + raw_data[i]
        # store forecast
        predictions.append(yhat)
    # report performance
    rmse = sqrt(mean_squared_error(raw_data[1:], predictions))
    return rmse

# fit an LSTM network to training data
def fit_lstm(train, test, raw, scaler, batch_size, nb_epoch, neurons):
    X, y = train[:, 0:-1], train[:, -1]
    X = X.reshape(X.shape[0], 1, X.shape[1])
    # prepare model
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error',
                  optimizer='adam')
    # fit model
    train_rmse, test_rmse = list(), list()
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
        model.reset_states()
        # evaluate model on train data
        raw_train = raw[-(len(train)+len(test)+1):-len(test)]
        train_rmse.append(evaluate(model, raw_train, train, scaler, 0, batch_size))
        model.reset_states()
        # evaluate model on test data
        raw_test = raw[-(len(test)+1):]
        test_rmse.append(evaluate(model, raw_test, test, scaler, 0, batch_size))
        model.reset_states()
    history = DataFrame()
    history['train'], history['test'] = train_rmse, test_rmse
    return history

# run diagnostic experiments
def run():
    # load dataset
    series = read_csv('shampoo-sales.csv', header=0, parse_dates=[0], index_col=0, squeeze=True, date_parser=parser)
    # transform data to be stationary
    raw_values = series.values
    diff_values = difference(raw_values, 1)
    # transform data to be supervised learning
    supervised = timeseries_to_supervised(diff_values, 1)
    supervised_values = supervised.values
    # split data into train and test-sets
    train, test = supervised_values[0:-12], supervised_values[-12:]
    # transform the scale of the data
    scaler, train_scaled, test_scaled = scale(train, test)
    # fit and evaluate model
    train_trimmed = train_scaled[2:, :]
    # config
    repeats = 10
    n_batch = 4
    n_epochs = 500
    n_neurons = 1
    # run diagnostic tests
    for i in range(repeats):
        history = fit_lstm(train_trimmed, test_scaled, raw_values, scaler, n_batch, n_epochs, n_neurons)
        pyplot.plot(history['train'], color='blue')
        pyplot.plot(history['test'], color='orange')
        print('%d) TrainRMSE=%f, TestRMSE=%f' % (i, history['train'].iloc[-1], history['test'].iloc[-1]))
    pyplot.savefig('epochs_diagnostic.png')

# entry point
run()
```

Running the experiment prints the RMSE for the train and test sets at the end of each of the 10 experimental runs.

```py
0) TrainRMSE=63.495594, TestRMSE=113.472643
1) TrainRMSE=60.446307, TestRMSE=100.147470
2) TrainRMSE=59.879681, TestRMSE=95.112331
3) TrainRMSE=66.115269, TestRMSE=106.444401
4) TrainRMSE=61.878702,
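TestRMSE=86.572920
5) TrainRMSE=73.519382, TestRMSE=103.551694
6) TrainRMSE=64.407033, TestRMSE=98.849227
7) TrainRMSE=72.684834, TestRMSE=98.499976
8) TrainRMSE=77.593773, TestRMSE=124.404747
9) TrainRMSE=71.749335, TestRMSE=126.396615
```

A line plot of the series of RMSE scores on the train and test sets after each training epoch is also created.

![Diagnostic Results with 500 Epochs](https://img.kancloud.cn/41/7b/417bccab9335b37dfa444f89ca7b4f7c_640x480.jpg)

Diagnostic Results with 500 Epochs

The results clearly show a downward trend in RMSE over the training epochs for almost all of the experimental runs.

This is a good sign, as it shows the model is learning the problem and has some predictive skill. In fact, all of the final test scores are below the error of a simple persistence model (naive forecast), which achieves an RMSE of 136.761 on this problem.

The results suggest that more training epochs will result in a more skillful model.

Let's try doubling the number of epochs from 500 to 1000.

### Diagnostic of 1000 Epochs

In this section, we use the same experimental setup and fit the model over 1000 training epochs.

Specifically, the _n_epochs_ parameter is set to _1000_ in the _run()_ function.

```py
n_epochs = 1000
```

Running the example prints the RMSE for the train and test sets from the final epoch.

```py
0) TrainRMSE=69.242394, TestRMSE=90.832025
1) TrainRMSE=65.445810, TestRMSE=113.013681
2) TrainRMSE=57.949335, TestRMSE=103.727228
3) TrainRMSE=61.808586, TestRMSE=89.071392
4) TrainRMSE=68.127167, TestRMSE=88.122807
5) TrainRMSE=61.030678, TestRMSE=93.526607
6) TrainRMSE=61.144466, TestRMSE=97.963895
7) TrainRMSE=59.922150, TestRMSE=94.291120
8) TrainRMSE=60.170052, TestRMSE=90.076229
9) TrainRMSE=62.232470, TestRMSE=98.174839
```

A line plot of the test and train RMSE scores at each epoch is also created.

![Diagnostic Results with 1000 Epochs](https://img.kancloud.cn/89/c7/89c7f804b7706d0e6f091aed6e7f7744_640x480.jpg)

Diagnostic Results with 1000 Epochs

We can see that the downward trend of model error does continue and appears to slow.

The lines for the train and test cases become more horizontal, but still generally show a downward trend, although at a lower rate of change. Some examples of test error show a possible inflection point at around 600 epochs and may show a rising trend.

It is worth extending the epochs further. We are interested in the average performance on the test set continuing to improve, and this may continue.

Let's try doubling the number of epochs from 1000 to 2000.

### Diagnostic of 2000 Epochs

In this section, we use the same experimental setup and fit the model over 2000 training epochs.

Specifically, the _n_epochs_ parameter is set to _2000_ in the _run()_ function.

```py
n_epochs = 2000
```

Running the example prints the RMSE for the train and test sets from the final epoch.

```py
0) TrainRMSE=67.292970, TestRMSE=83.096856
1) TrainRMSE=55.098951, TestRMSE=104.211509
2) TrainRMSE=69.237206, TestRMSE=117.392007
3) TrainRMSE=61.319941, TestRMSE=115.868142
4) TrainRMSE=60.147575, TestRMSE=87.793270
5) TrainRMSE=59.424241, TestRMSE=99.000790
6) TrainRMSE=66.990082, TestRMSE=80.490660
7) TrainRMSE=56.467012, TestRMSE=97.799062
8)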
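TrainRMSE=60.386380, TestRMSE=103.810569
9) TrainRMSE=58.250862, TestRMSE=86.212094
```

A line plot of the test and train RMSE scores at each epoch is also created.

![Diagnostic Results with 2000 Epochs](https://img.kancloud.cn/c2/d0/c2d0b6e7bd80817efe20e992fb634bfc_640x480.jpg)

Diagnostic Results with 2000 Epochs

As one might have guessed, the downward trend in error does continue over the additional 1000 epochs for both the train and test datasets.

Of note, about half of the cases continue to decrease in error all the way to the end of the run, whereas the rest show signs of an increasing trend.

The increasing trend is a sign of overfitting. This is when the model overfits the training dataset at the cost of worse performance on the test dataset. It is exemplified by continued improvement on the training dataset, while the test dataset improves up to an inflection point and then worsens in skill. A little less than half of the runs show the beginnings of this type of pattern on the test dataset.

Nevertheless, the final epoch results on the test dataset are very good. If there is a chance we can see further gains by even longer training, we must explore it.

Let's try doubling the number of epochs from 2000 to 4000.

### Diagnostic of 4000 Epochs

In this section, we use the same experimental setup and fit the model over 4000 training epochs.

Specifically, the _n_epochs_ parameter is set to _4000_ in the _run()_ function.

```py
n_epochs = 4000
```

Running the example prints the RMSE for the train and test sets from the final epoch.

```py
0) TrainRMSE=58.889277, TestRMSE=99.121765
1) TrainRMSE=56.839065, TestRMSE=95.144846
2) TrainRMSE=58.522271, TestRMSE=87.671309
3) TrainRMSE=53.873962, TestRMSE=113.920076
4) TrainRMSE=66.386299, TestRMSE=77.523432
5) TrainRMSE=58.996230, TestRMSE=136.367014
6) TrainRMSE=55.725800, TestRMSE=113.206607
7) TrainRMSE=57.334604, TestRMSE=90.814642
8) TrainRMSE=54.593069, TestRMSE=105.724825
9) TrainRMSE=56.678498, TestRMSE=83.082262
```

A line plot of the test and train RMSE scores at each epoch is also created.

![Diagnostic Results with 4000 Epochs](https://img.kancloud.cn/cd/3f/cd3fdf5e16b04c8a6dcf17a3045f4cfa_640x480.jpg)

Diagnostic Results with 4000 Epochs

A similar pattern continues.

There is a general trend of improving performance, even over the 4000 epochs. There is one case of severe overfitting where test error rises sharply.

Again, most runs end with a "good" (better than persistence) final test error.

### Summary of Results

The diagnostic runs above are helpful to explore the dynamical behavior of the model, but fall short of an objective and comparable mean performance.

We can address this by repeating the same experiments and calculating and comparing summary statistics for each configuration. In this case, 30 runs were completed for each of the epoch values 500, 1000, 2000, 4000, and 6000.

The idea is to compare the configurations using summary statistics over a larger number of runs and see exactly which of the configurations might perform better on average.

The complete code example is listed below.

```py
from pandas import DataFrame
from pandas import Series
from pandas import concat
from pandas import read_csv
from datetime import datetime
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sqrt
import matplotlib
# be able to save images on server
matplotlib.use('Agg')
from matplotlib \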
import pyplot
import numpy

# date-time parsing function for loading the dataset
def parser(x):
    return datetime.strptime('190'+x, '%Y-%m')

# frame a sequence as a supervised learning problem
def timeseries_to_supervised(data, lag=1):
    df = DataFrame(data)
    columns = [df.shift(i) for i in range(1, lag+1)]
    columns.append(df)
    df = concat(columns, axis=1)
    df = df.drop(0)
    return df

# create a differenced series
def difference(dataset, interval=1):
    diff = list()
    for i in range(interval, len(dataset)):
        value = dataset[i] - dataset[i - interval]
        diff.append(value)
    return Series(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
    return yhat + history[-interval]

# scale train and test data to [-1, 1]
def scale(train, test):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaler = scaler.fit(train)
    # transform train
    train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)
    # transform test
    test = test.reshape(test.shape[0], test.shape[1])
    test_scaled = scaler.transform(test)
    return scaler, train_scaled, test_scaled

# inverse scaling for a forecasted value
def invert_scale(scaler, X, yhat):
    new_row = [x for x in X] + [yhat]
    array = numpy.array(new_row)
    array = array.reshape(1, len(array))
    inverted = scaler.inverse_transform(array)
    return inverted[0, -1]

# fit an LSTM network to training data
def fit_lstm(train, batch_size, nb_epoch, neurons):
    X, y = train[:, 0:-1], train[:, -1]
    X = X.reshape(X.shape[0], 1, X.shape[1])
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
        model.reset_states()
    return model

# run a repeated experiment
def experiment(repeats, series, epochs):
    # transform data to be stationary
    raw_values = series.values
    diff_values = \
        difference(raw_values, 1)
    # transform data to be supervised learning
    supervised = timeseries_to_supervised(diff_values, 1)
    supervised_values = supervised.values
    # split data into train and test-sets
    train, test = supervised_values[0:-12], supervised_values[-12:]
    # transform the scale of the data
    scaler, train_scaled, test_scaled = scale(train, test)
    # run experiment
    error_scores = list()
    for r in range(repeats):
        # fit the model
        batch_size = 4
        train_trimmed = train_scaled[2:, :]
        lstm_model = fit_lstm(train_trimmed, batch_size, epochs, 1)
        # forecast the entire training dataset to build up state for forecasting
        train_reshaped = train_trimmed[:, 0].reshape(len(train_trimmed), 1, 1)
        lstm_model.predict(train_reshaped, batch_size=batch_size)
        # forecast test dataset
        test_reshaped = test_scaled[:,0:-1]
        test_reshaped = test_reshaped.reshape(len(test_reshaped), 1, 1)
        output = lstm_model.predict(test_reshaped, batch_size=batch_size)
        predictions = list()
        for i in range(len(output)):
            yhat = output[i,0]
            X = test_scaled[i, 0:-1]
            # invert scaling
            yhat = invert_scale(scaler, X, yhat)
            # invert differencing
            yhat = inverse_difference(raw_values, yhat, len(test_scaled)+1-i)
            # store forecast
            predictions.append(yhat)
        # report performance
        rmse = sqrt(mean_squared_error(raw_values[-12:], predictions))
        print('%d) Test RMSE: %.3f' % (r+1, rmse))
        error_scores.append(rmse)
    return error_scores

# load dataset
series = read_csv('shampoo-sales.csv', header=0, parse_dates=[0], index_col=0, squeeze=True, date_parser=parser)
# experiment
repeats = 30
results = DataFrame()
# vary training epochs
epochs = [500, 1000, 2000, 4000, 6000]
for e in epochs:
    results[str(e)] = experiment(repeats, series, e)
# summarize results
print(results.describe())
# save boxplot
results.boxplot()
pyplot.savefig('boxplot_epochs.png')
```

Running the code first prints summary statistics for each of the 5 configurations. Notably, this includes the mean and standard deviation of the RMSE scores from each population of results.

The mean gives an idea of the average expected performance of a configuration, whereas the standard deviation gives an idea of the variance. The min and max RMSE scores also give an idea of the range of best and worst case examples that might be expected.

Looking at just the mean RMSE scores, the results suggest that 1000 epochs may be the better configuration. The results also suggest that further investigation may be warranted of epoch values between 1000 and
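2000.

```py
              500        1000        2000        4000        6000
count   30.000000   30.000000   30.000000   30.000000   30.000000
mean   109.439203  104.566259  107.882390  116.339792  127.618305
std     14.874031   19.097098   22.083335   21.590424   24.866763
min     87.747708   81.621783   75.327883   77.399968   90.512409
25%     96.484568   87.686776   86.753694  102.127451  105.861881
50%    110.891939   98.942264  116.264027  121.898248  125.273050
75%    121.067498  119.248849  125.518589  130.107772  150.832313
max    138.879278  139.928055  146.840997  157.026562  166.111151
```

The distributions are also shown on a box and whisker plot. This is helpful to see how the distributions directly compare.

The green line shows the median and the box shows the 25th and 75th percentiles, or the middle 50% of the data. This comparison also shows that setting epochs to 1000 is better than the tested alternatives. It also shows that the best possible performance may be achieved with 2000 or 4000 epochs, at the cost of worse performance on average.

![Box and Whisker Plot Summarizing Epoch Results](https://img.kancloud.cn/f6/ce/f6cefa3dc2bd0e2c81a2b33b51cbf5ec_640x480.jpg)

Box and Whisker Plot Summarizing Epoch Results

Next, we will look at the effect of batch size.

## Tune the Batch Size

Batch size controls how often the weights of the network are updated.

Importantly in Keras, for the stateful LSTM used here, the batch size must be a factor of the size of both the test and the training dataset.

In the previous section exploring the number of training epochs, the batch size was fixed at 4, which divides cleanly into the test dataset (of size 12) and into a truncated version of the training dataset (of size 20).

In this section, we will explore the effect of varying the batch size. We will hold the number of training epochs constant at 1000.

### Diagnostic of 1000 Epochs and Batch Size of 4

As a reminder, the previous section evaluated a batch size of 4 in the second experiment, with 1000 epochs.

The results showed a downward trend in error that continued for most runs all the way to the final training epoch.

![Diagnostic Results with 1000 Epochs](https://img.kancloud.cn/89/c7/89c7f804b7706d0e6f091aed6e7f7744_640x480.jpg)

Diagnostic Results with 1000 Epochs

### Diagnostic of 1000 Epochs and Batch Size of 2

In this section, we look at halving the batch size from 4 to 2.

This change is made to the _n_batch_ parameter in the _run()_ function; for example:

```py
n_batch = 2
```

Running the example shows the same general trend in performance as a batch size of 4, perhaps with a higher RMSE on the final epoch.

The runs may show the RMSE stabilizing sooner rather than seeming to continue the downward trend.

The RMSE scores from the final epoch of each run are listed below.

```py
0) TrainRMSE=63.510219, TestRMSE=115.855819
1) TrainRMSE=58.336003, TestRMSE=97.954374
2) TrainRMSE=69.163685, TestRMSE=96.721446
3) TrainRMSE=65.201764, TestRMSE=110.104828
4) TrainRMSE=62.146057, TestRMSE=112.153553
5) TrainRMSE=58.253952, TestRMSE=98.442715
6) TrainRMSE=67.306530, TestRMSE=108.132021
7) TrainRMSE=63.545292, TestRMSE=102.821356
8) TrainRMSE=61.693847, TestRMSE=99.859398
9) TrainRMSE=58.348250, TestRMSE=99.682159
```

A line plot of the test and train RMSE scores at each epoch is also created.

![Diagnostic Results with 1000 Epochs and Batch Size of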
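2](https://img.kancloud.cn/01/56/0156c75129934c472fb737ecfce79ae8_640x480.jpg)

Diagnostic Results with 1000 Epochs and Batch Size of 2

Let's try halving the batch size again.

### Diagnostic of 1000 Epochs and Batch Size of 1

A batch size of 1 is technically performing online learning.

That is where the network is updated after each training pattern. This can be contrasted with batch learning, where the weights are only updated at the end of each epoch.

We can change the _n_batch_ parameter in the _run()_ function; for example:

```py
n_batch = 1
```

Again, running the example prints the RMSE scores from the final epoch of each run.

```py
0) TrainRMSE=60.349798, TestRMSE=100.182293
1) TrainRMSE=62.624106, TestRMSE=95.716070
2) TrainRMSE=64.091859, TestRMSE=98.598958
3) TrainRMSE=59.929993, TestRMSE=96.139427
4) TrainRMSE=59.890593, TestRMSE=94.173619
5) TrainRMSE=55.944968, TestRMSE=106.644275
6) TrainRMSE=60.570245, TestRMSE=99.981562
7) TrainRMSE=56.704995, TestRMSE=111.404182
8) TrainRMSE=59.909065, TestRMSE=90.238473
9) TrainRMSE=60.863807, TestRMSE=105.331214
```

A line plot of the test and train RMSE scores at each epoch is also created.

The plot suggests more variability in the test RMSE over time and perhaps a train RMSE that stabilizes sooner than with larger batch sizes. The increased variability in the test RMSE is to be expected, given the large number of changes made to the network with so little feedback for each update.

The plot also suggests that the decreasing trend in RMSE may continue if the configuration were given more training epochs.

![Diagnostic Results with 1000 Epochs and Batch Size of 1](https://img.kancloud.cn/2e/aa/2eaaca54d0713da2bfe66d4f501c3afa_640x480.jpg)

Diagnostic Results with 1000 Epochs and Batch Size of 1

### Summary of Results

As with training epochs, we can objectively compare the performance of the network given different batch sizes.

Each configuration was run 30 times and summary statistics were calculated on the final results.

```py
...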
# run a repeated experiment
def experiment(repeats, series, batch_size):
    # transform data to be stationary
    raw_values = series.values
    diff_values = difference(raw_values, 1)
    # transform data to be supervised learning
    supervised = timeseries_to_supervised(diff_values, 1)
    supervised_values = supervised.values
    # split data into train and test-sets
    train, test = supervised_values[0:-12], supervised_values[-12:]
    # transform the scale of the data
    scaler, train_scaled, test_scaled = scale(train, test)
    # run experiment
    error_scores = list()
    for r in range(repeats):
        # fit the model
        train_trimmed = train_scaled[2:, :]
        lstm_model = fit_lstm(train_trimmed, batch_size, 1000, 1)
        # forecast the entire training dataset to build up state for forecasting
        train_reshaped = train_trimmed[:, 0].reshape(len(train_trimmed), 1, 1)
        lstm_model.predict(train_reshaped, batch_size=batch_size)
        # forecast test dataset
        test_reshaped = test_scaled[:,0:-1]
        test_reshaped = test_reshaped.reshape(len(test_reshaped), 1, 1)
        output = lstm_model.predict(test_reshaped, batch_size=batch_size)
        predictions = list()
        for i in range(len(output)):
            yhat = output[i,0]
            X = test_scaled[i, 0:-1]
            # invert scaling
            yhat = invert_scale(scaler, X, yhat)
            # invert differencing
            yhat = inverse_difference(raw_values, yhat, len(test_scaled)+1-i)
            # store forecast
            predictions.append(yhat)
        # report performance
        rmse = sqrt(mean_squared_error(raw_values[-12:], predictions))
        print('%d) Test RMSE: %.3f' % (r+1, rmse))
        error_scores.append(rmse)
    return error_scores

# load dataset
series = read_csv('shampoo-sales.csv', header=0, parse_dates=[0], index_col=0, squeeze=True, date_parser=parser)
# experiment
repeats = 30
results = DataFrame()
# vary training batches
batches = [1, 2, 4]
for b in batches:
    results[str(b)] = experiment(repeats, series, b)
# summarize results
print(results.describe())
# save boxplot
results.boxplot()
pyplot.savefig('boxplot_batches.png')
```

Looking at the mean performance alone, the results suggest a lower RMSE with a batch size of 1. As noted in the previous section, this may be improved further with more training epochs.

```py
                1           2           4
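count   30.000000   30.000000   30.000000
mean    98.697017  102.642594  100.320203
std     12.227885    9.144163   15.957767
min     85.172215   85.072441   83.636365
25%     92.023175   96.834628   87.671461
50%     95.981688  101.139527   91.628144
75%    102.009268  110.171802  114.660192
max    147.688818  120.038036  135.290829
```

A box and whisker plot of the data was also created to help compare the distributions graphically. The plot shows the median performance as a green line, where a batch size of 4 shows both the largest variability and the lowest median RMSE.

Tuning a neural network is a tradeoff between average performance and the variability of that performance. The ideal result has a low mean error with low variability, meaning that it is generally good and reproducible.

![Box and Whisker Plot Summarizing Batch Size Results](https://img.kancloud.cn/f8/e8/f8e8aa61322281e2b148c524aaa3d194_640x480.jpg)

Box and Whisker Plot Summarizing Batch Size Results

## Tune the Number of Neurons

In this section, we will investigate the effect of varying the number of neurons in the network.

The number of neurons affects the learning capacity of the network. Generally, more neurons can learn more structure from the problem at the cost of longer training time. More learning capacity also creates the risk of overfitting the training data.

We will use a batch size of 4 and 1000 training epochs.

### Diagnostic of 1000 Epochs and 1 Neuron

We will start with 1 neuron.

As a reminder, this is the second configuration tested in the epochs experiments.

![Diagnostic Results with 1000 Epochs](https://img.kancloud.cn/89/c7/89c7f804b7706d0e6f091aed6e7f7744_640x480.jpg)

Diagnostic Results with 1000 Epochs

### Diagnostic of 1000 Epochs and 2 Neurons

We can increase the number of neurons from 1 to 2. This would be expected to improve the learning capacity of the network.

We can do this by changing the _n_neurons_ variable in the _run()_ function.

```py
n_neurons = 2
```

Running this configuration prints the RMSE scores from the final epoch of each run.

The results suggest a good, but not great, general performance.

```py
0) TrainRMSE=59.466223, TestRMSE=95.554547
1) TrainRMSE=58.752515, TestRMSE=101.908449
2) TrainRMSE=58.061139, TestRMSE=86.589039
3) TrainRMSE=55.883708, TestRMSE=94.747927
4) TrainRMSE=58.700290, TestRMSE=86.393213
5) TrainRMSE=60.564511, TestRMSE=101.956549
6) TrainRMSE=63.160916, TestRMSE=98.925108
7) TrainRMSE=60.148595, TestRMSE=95.082825
8) TrainRMSE=63.029242, TestRMSE=89.285092
9) TrainRMSE=57.794717, TestRMSE=91.425071
```

A line plot of the test and train RMSE scores at each epoch is also created.

This is more telling. It shows a rapid decrease in test RMSE to about epoch 500-750, where an inflection point shows a rise in test RMSE almost across the board on all runs. Meanwhile, the training dataset shows a continued decrease to the final epoch.

These are good signs of overfitting of the training dataset.

![Diagnostic Results with 1000 Epochs and 2 Neurons](https://img.kancloud.cn/d7/97/d797d2ba55acbc9fb1d335fcf54accb0_640x480.jpg)

Diagnostic Results with 1000 Epochs and 2 Neurons

Let's see if this trend continues with even more neurons.

### Diagnostic of 1000 Epochs and 3 Neurons

This section looks at the same configuration with the number of neurons increased to 3.

We can do this by setting the _n_neurons_ variable in the _run()_ function.

```py
n_neurons = 3
```

Running this configuration prints the RMSE scores from the final epoch of each run.

The results are similar to the previous section; we do not see much general difference between the final epoch test scores for 2 or 3 neurons. The final train scores do appear to be lower with 3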
neurons, perhaps showing an acceleration of overfitting.

The inflection point in the training dataset seems to happen sooner than in the 2-neuron experiment, perhaps at epoch 300-400.

These increases in the number of neurons may benefit from additional changes to slow down the rate of learning, for example regularization methods like dropout, a smaller batch size, and fewer training epochs.

```py
0) TrainRMSE=55.686242, TestRMSE=90.955555
1) TrainRMSE=55.198617, TestRMSE=124.989622
2) TrainRMSE=55.767668, TestRMSE=104.751183
3) TrainRMSE=60.716046, TestRMSE=93.566307
4) TrainRMSE=57.703663, TestRMSE=110.813226
5) TrainRMSE=56.874231, TestRMSE=98.588524
6) TrainRMSE=57.206756, TestRMSE=94.386134
7) TrainRMSE=55.770377, TestRMSE=124.949862
8) TrainRMSE=56.876467, TestRMSE=95.059656
9) TrainRMSE=57.067810, TestRMSE=94.123620
```

A line plot of the test and train RMSE scores at each epoch is also created.

![Diagnostic Results with 1000 Epochs and 3 Neurons](https://img.kancloud.cn/dd/4b/dd4b6a706af12400e53e080b92235030_640x480.jpg)

Diagnostic Results with 1000 Epochs and 3 Neurons

### Summary of Results

Again, we can objectively compare the impact of increasing the number of neurons while keeping all other network configurations fixed.

In this section, we repeat each experiment 30 times and compare the average test RMSE performance with the number of neurons ranging from 1 to 5.

```py
...
# run a repeated experiment
def experiment(repeats, series, neurons):
    # transform data to be stationary
    raw_values = series.values
    diff_values = difference(raw_values, 1)
    # transform data to be supervised learning
    supervised = timeseries_to_supervised(diff_values, 1)
    supervised_values = supervised.values
    # split data into train and test-sets
    train, test = supervised_values[0:-12], supervised_values[-12:]
    # transform the scale of the data
    scaler, train_scaled, test_scaled = scale(train, test)
    # run experiment
    error_scores = list()
    for r in range(repeats):
        # fit the model
        batch_size = 4
        train_trimmed = train_scaled[2:, :]
        lstm_model = fit_lstm(train_trimmed, batch_size, 1000, neurons)
        # forecast the entire training dataset to build up state for forecasting
        train_reshaped = train_trimmed[:, 0].reshape(len(train_trimmed), 1, 1)
        lstm_model.predict(train_reshaped, batch_size=batch_size)
        # forecast test dataset
        test_reshaped = test_scaled[:,0:-1]
        test_reshaped = test_reshaped.reshape(len(test_reshaped), 1, 1)
        output = lstm_model.predict(test_reshaped, batch_size=batch_size)
        predictions = list()
        for i in range(len(output)):
            yhat = \
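                output[i,0]
            X = test_scaled[i, 0:-1]
            # invert scaling
            yhat = invert_scale(scaler, X, yhat)
            # invert differencing
            yhat = inverse_difference(raw_values, yhat, len(test_scaled)+1-i)
            # store forecast
            predictions.append(yhat)
        # report performance
        rmse = sqrt(mean_squared_error(raw_values[-12:], predictions))
        print('%d) Test RMSE: %.3f' % (r+1, rmse))
        error_scores.append(rmse)
    return error_scores

# load dataset
series = read_csv('shampoo-sales.csv', header=0, parse_dates=[0], index_col=0, squeeze=True, date_parser=parser)
# experiment
repeats = 30
results = DataFrame()
# vary neurons
neurons = [1, 2, 3, 4, 5]
for n in neurons:
    results[str(n)] = experiment(repeats, series, n)
# summarize results
print(results.describe())
# save boxplot
results.boxplot()
pyplot.savefig('boxplot_neurons.png')
```

Running the experiment prints summary statistics for each configuration.

Looking at the average performance alone, the results suggest that a network configuration with 1 neuron has the best performance over 1000 epochs with a batch size of 4. This configuration also shows one of the tightest spreads; only the 3-neuron configuration has a slightly lower standard deviation.

```py
                1           2           3           4           5
count   30.000000   30.000000   30.000000   30.000000   30.000000
mean    98.344696  103.268147  102.726894  112.453766  122.843032
std     13.538599   14.720989   12.905631   16.296657   25.586013
min     81.764721   87.731385   77.545899   85.632492   85.955093
25%     88.524334   94.040807   95.152752  102.477366  104.192588
50%     93.543948  100.330678  103.622600  110.906970  117.022724
75%    102.944050  105.087384  110.235754  118.653850  133.343669
max    132.934054  152.588092  130.551521  162.889845  184.678185
```

The box and whisker plot shows a clear trend in the median test set performance, where an increase in neurons results in a corresponding increase in the test RMSE.

![Box and Whisker Plot Summarizing Neuron Results](https://img.kancloud.cn/a8/98/a898a1d1a3b127d8bff71fe385290c78_640x480.jpg)

Box and Whisker Plot Summarizing Neuron Results

## Summary of All Results

We completed a lot of LSTM experiments on the Shampoo Sales dataset in this tutorial.

Generally, it seems that a stateful LSTM configured with 1 neuron, a batch size of 4, and trained for 1000 epochs might be a good configuration.

The results also suggest that this configuration with a batch size of 1 and fit for more epochs may be worth exploring further.

Tuning neural networks is difficult empirical work, and LSTMs are no exception.

This tutorial demonstrated the benefit of both diagnostic studies of configuration behavior over time and objective studies of test RMSE.

Nevertheless, there are always more studies that could be performed. Some ideas are listed in the next section.

### Extensions

This section lists some ideas for extensions to the experiments performed in this tutorial.

If you explore any of these, report your results in the comments; I'd love to see what you come up with.

* **Dropout**. Slow down learning with regularization methods, like dropout on the recurrent LSTM connections.
* **Layers**. Explore additional hierarchical learning capacity by adding more layers and varied numbers of neurons in each layer.
* **Regularization**. Explore how weight regularization, such as L1 and L2, can be used to slow down learning and overfitting of the network on some configurations.
* **Optimization Algorithm**. Explore the use of [alternate optimization algorithms](https://keras.io/optimizers/), such as classical gradient descent, to see if specific configurations that speed up or slow down learning can lead to benefits.
* **Loss Function**. Explore the use of [alternative loss functions](https://keras.io/objectives/) to see if these can be used to lift performance.
* **Features and Timesteps**. Explore the use of lag observations as input features and as input time steps of the feature to see if their presence as input can improve the learning and/or predictive capability of the model.
* **Larger Batch Size**. Explore batch sizes larger than 4, perhaps requiring further manipulation of the size of the training and test datasets.

## Summary

In this tutorial, you discovered how to systematically investigate the configuration of an LSTM network for time series forecasting.

Specifically, you learned:

* How to design a systematic test harness for evaluating model configurations.
* How to use model diagnostics over time, as well as objective prediction error, to interpret model behavior.
* How to explore and interpret the effects of the number of training epochs, batch size, and number of neurons.

Do you have any questions about tuning LSTMs, or about this tutorial? Ask your questions in the comments below and I will do my best to answer.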
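
As a supplement to the data preparation steps described in the test harness section, the following is a minimal sketch in plain Python (no Pandas, scikit-learn, or Keras) of the three transforms, their inversion, and a persistence baseline. The helper names (`to_supervised`, `fit_minmax`, `rmse`) are illustrative assumptions and are not part of the tutorial's code, and the data is only the first five observations printed in the loading example, so the numbers it produces are not comparable to the results reported above.

```python
from math import sqrt

def difference(values, interval=1):
    """Return the differenced series at the given lag (removes a trend)."""
    return [values[i] - values[i - interval] for i in range(interval, len(values))]

def to_supervised(values):
    """Hypothetical helper: pair each observation with the previous one as (input, output)."""
    return [(values[i - 1], values[i]) for i in range(1, len(values))]

def fit_minmax(values, low=-1.0, high=1.0):
    """Hypothetical helper: return (scale, unscale) functions mapping [min, max] onto [low, high]."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin

    def scale(v):
        return low + (v - vmin) * (high - low) / span

    def unscale(s):
        return vmin + (s - low) * span / (high - low)

    return scale, unscale

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# First five observations shown in the tutorial's loading example.
sales = [266.0, 145.9, 183.1, 119.3, 180.3]

# 1. make the series stationary with lag-1 differencing
diffs = difference(sales)
# 2. frame the differenced series as (previous value, current value) pairs
pairs = to_supervised(diffs)
# 3. rescale the differenced values to the range [-1, 1]
scale, unscale = fit_minmax(diffs)
scaled = [scale(d) for d in diffs]

# Invert the transforms: undo the scaling, then add back the previous raw observation.
restored = [unscale(s) + sales[i] for i, s in enumerate(scaled)]

# Persistence (naive) forecast: predict each value as the one before it.
persistence_rmse = rmse(sales[1:], sales[:-1])

print(pairs)
print(restored)
print(persistence_rmse)
```

The round trip in `restored` recovers the original observations from the second one onward, which is the same order of operations (invert scaling, then invert differencing) that the tutorial applies to each forecast before computing RMSE.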