Multivariate Time Series Forecasting with LSTMs in Keras · Machine Learning Mastery blog post translation

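# Multivariate Time Series Forecasting with LSTMs in Keras

> Original: [https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/](https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/)

Neural networks like Long Short-Term Memory (LSTM) recurrent neural networks are able to almost seamlessly model problems with multiple input variables.

This is a great benefit in time series forecasting, where classical linear methods can be difficult to adapt to multivariate or multiple-input forecasting problems.

In this tutorial, you will discover how to develop an LSTM model for multivariate time series forecasting in the Keras deep learning library.

After completing this tutorial, you will know:

* How to transform a raw dataset into something we can use for time series forecasting.
* How to prepare data and fit an LSTM for a multivariate time series forecasting problem.
* How to make a forecast and rescale the result back into the original units.

Let's get started.

* **Update Aug/2017**: Fixed a bug where the forecast was compared to the obs at the previous time step when calculating the final RMSE. Thanks, Songbin Xu and David Righart.
* **Update Oct/2017**: Added a new example showing how to train on multiple prior time steps, due to popular demand.
* **Update Sep/2018**: Updated the link to the dataset.

## Tutorial Overview

This tutorial is divided into 3 parts; they are:

1. Air Pollution Forecasting
2. Basic Data Preparation
3. Multivariate LSTM Forecast Model

### Python Environment

This tutorial assumes you have a Python SciPy environment installed. You can use either Python 2 or 3 with this tutorial.

You must have Keras (2.0 or higher) installed with either the TensorFlow or Theano backend.

The tutorial also assumes you have scikit-learn, Pandas, NumPy and Matplotlib installed.

If you need help with your environment, see this post:

* [How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda](http://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/)

## 1. Air Pollution Forecasting

In this tutorial, we are going to use the Air Quality dataset.

This is a dataset that reports on the weather and the level of pollution each hour for five years at the US embassy in Beijing, China.

The data includes the date-time, the pollution called PM2.5 concentration, and the weather information including dew point, temperature, pressure, wind direction, wind speed and the cumulative number of hours of snow and rain. The complete feature list in the raw data is as follows:

1. **No**: row number
2. **year**: year of data in this row
3. **month**: month of data in this row
4. **day**: day of data in this row
5. **hour**: hour of data in this row
6. **pm2.5**: PM2.5 concentration
7. **DEWP**: dew point
8. **TEMP**: temperature
9. **PRES**: pressure
10. **cbwd**: combined wind direction
11. **Iws**: cumulated wind speed
12. **Is**: cumulated hours of snow
13. **Ir**: cumulated hours of rain

We can use this data and frame a forecasting problem where, given the weather conditions and pollution for prior hours, we forecast the pollution at the next hour.

This dataset can be used to frame other forecasting problems. Do you have good ideas? Let me know in the comments below.

You can download the dataset from the UCI Machine Learning Repository.

**Update**: I have mirrored the dataset here because UCI has become unreliable:

* [Beijing PM2.5 Data Set](https://raw.githubusercontent.com/jbrownlee/Datasets/master/pollution.csv)

Download the dataset and place it in your current working directory with the filename "_raw.csv_".

## 2. Basic Data Preparation

The data is not ready to use. We must prepare it first.

Below are the first few rows of the raw dataset.

```py
No,year,month,day,hour,pm2.5,DEWP,TEMP,PRES,cbwd,Iws,Is,Ir
1,2010,1,1,0,NA,-21,-11,1021,NW,1.79,0,0
2,2010,1,1,1,NA,-21,-12,1020,NW,4.92,0,0
3,2010,1,1,2,NA,-21,-11,1019,NW,6.71,0,0
4,2010,1,1,3,NA,-21,-14,1019,NW,9.84,0,0
5,2010,1,1,4,NA,-20,-12,1018,NW,12.97,0,0
```

The first step is to consolidate the date-time information into a single date-time so that we can use it as an index in Pandas.

A quick check reveals NA values for pm2.5 for the first 24 hours. We will, therefore, need to remove the first 24 rows of data. There are also a few scattered "NA" values later in the dataset; we can mark them with 0 values for now.

The script below loads the raw dataset and parses the date-time information as the Pandas DataFrame index. The "No" column is dropped and then clearer names are specified for each column. Finally, the NA values are replaced with "0" values and the first 24 hours are removed.

```py
from pandas import read_csv
from pandas import to_datetime

# load data
dataset = read_csv('raw.csv')
# combine the separate year/month/day/hour columns into one datetime index
# (this avoids the date_parser argument, which newer pandas versions deprecate)
dataset.index = to_datetime(dataset[['year', 'month', 'day', 'hour']])
# drop the row number and the now-redundant date-time columns
dataset.drop(['No', 'year', 'month', 'day', 'hour'], axis=1, inplace=True)
# manually specify column names
dataset.columns = ['pollution', 'dew', 'temp', 'press', 'wnd_dir', 'wnd_spd', 'snow', 'rain']
dataset.index.name = 'date'
# mark all NA values with 0
dataset['pollution'] = dataset['pollution'].fillna(0)
# drop the first 24 hours
dataset = dataset[24:]
# summarize first 5 rows
print(dataset.head(5))
# save to file
dataset.to_csv('pollution.csv')
```

Running the example prints the first 5 rows of the transformed dataset and saves the dataset to "_pollution.csv_".

```py
                     pollution  dew  temp   press wnd_dir  wnd_spd  snow  rain
date
2010-01-02 00:00:00      129.0  -16  -4.0  1020.0      SE     1.79     0     0
2010-01-02 01:00:00      148.0  -15  -4.0  1020.0      SE     2.68     0     0
2010-01-02 02:00:00      159.0  -11  -5.0  1021.0      SE     3.57     0     0
2010-01-02 03:00:00      181.0   -7  -5.0  1022.0      SE     5.36     1     0
2010-01-02 04:00:00      138.0   -7  -5.0  1022.0      SE     6.25     2     0
```

Now that we have the data in an easy-to-use form, we can create a quick plot of each series and see what we have.

The code below loads the new " _pollution.csv_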
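" file and plots each series as a separate subplot, except wind direction, which is categorical.

```py
from pandas import read_csv
from matplotlib import pyplot
# load dataset
dataset = read_csv('pollution.csv', header=0, index_col=0)
values = dataset.values
# specify columns to plot (column 4, wind direction, is skipped)
groups = [0, 1, 2, 3, 5, 6, 7]
i = 1
# plot each column
pyplot.figure()
for group in groups:
    pyplot.subplot(len(groups), 1, i)
    pyplot.plot(values[:, group])
    pyplot.title(dataset.columns[group], y=0.5, loc='right')
    i += 1
pyplot.show()
```

Running the example creates a plot with 7 subplots showing the 5 years of data for each variable.

![Line Plots of Air Pollution Time Series](https://img.kancloud.cn/64/11/6411dd82ffa220e872016eec56809ece_1024x768.jpg)

Line Plots of Air Pollution Time Series

## 3. Multivariate LSTM Forecast Model

In this section, we will fit an LSTM to the problem.

### LSTM Data Preparation

The first step is to prepare the pollution dataset for the LSTM.

This involves framing the dataset as a supervised learning problem and normalizing the input variables.

We will frame the supervised learning problem as predicting the pollution at the current hour (t) given the pollution measurement and weather conditions at the prior time step.

This formulation is straightforward and chosen just for this demonstration. Some alternate formulations you could explore include:

* Predict the pollution for the next hour based on the weather conditions and pollution over the last 24 hours.
* Predict the pollution for the next hour as above, given the "expected" weather conditions for the next hour.

We can transform the dataset using the _series_to_supervised()_ function developed in the blog post:

* [How to Convert a Time Series to a Supervised Learning Problem in Python](http://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/)

First, the "_pollution.csv_" dataset is loaded. The wind direction feature is label encoded (integer encoded). This could further be one-hot encoded in the future if you are interested in exploring it.

Next, all features are normalized, then the dataset is transformed into a supervised learning problem. The weather variables for the hour to be predicted (t) are then removed.

The complete code listing is provided below.

```py
from pandas import read_csv
from pandas import DataFrame
from pandas import concat
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder

# convert series to supervised learning
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j+1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg

# load dataset
dataset = read_csv('pollution.csv', header=0, index_col=0)
values = dataset.values
# integer encode direction
encoder = LabelEncoder()
values[:,4] = encoder.fit_transform(values[:,4])
# ensure all data is float
values = values.astype('float32')
# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# frame as supervised learning
reframed = series_to_supervised(scaled, 1, 1)
# drop columns we don't want to predict
reframed.drop(reframed.columns[[9,10,11,12,13,14,15]], axis=1, inplace=True)
print(reframed.head())
```

Running the example prints the first 5 rows of the transformed dataset. We can see the 8 input variables (input series) and the 1 output variable (pollution level at the current hour).

```py
   var1(t-1)  var2(t-1)  var3(t-1)  var4(t-1)  var5(t-1)  var6(t-1)  \
1   0.129779   0.352941   0.245902   0.527273   0.666667   0.002290
2   0.148893   0.367647   0.245902   0.527273   0.666667   0.003811
3   0.159960   0.426471   0.229508   0.545454   0.666667   0.005332
4   0.182093   0.485294   0.229508   0.563637   0.666667   0.008391
5   0.138833   0.485294   0.229508   0.563637   0.666667   0.009912

   var7(t-1)  var8(t-1)   var1(t)
1   0.000000        0.0  0.148893
2   0.000000        0.0  0.159960
3   0.000000        0.0  0.182093
4   0.037037        0.0  0.138833
5   0.074074        0.0  0.109658
```

This data preparation is simple and there is more we could explore. Some ideas you could look at include:

* One-hot encoding wind direction.
* Making all series stationary with differencing and seasonal adjustment.
* Providing more than 1 hour of input time steps.

This last point is perhaps the most important given the use of backpropagation through time by LSTMs when learning sequence prediction problems.

### Define and Fit Model

In this section, we will fit an LSTM on the multivariate input data.

First, we must split the prepared dataset into train and test sets. To speed up the training of the model for this demonstration, we will only fit the model on the first year of data, then evaluate it on the remaining 4 years of data. If you have time, consider exploring the inverted version of this test harness.

The example below splits the dataset into train and test sets, then splits the train and test sets into input and output variables. Finally, the inputs (X) are reshaped into the 3D format expected by LSTMs, namely [samples, timesteps, features].

```py
# split into train and test sets
values = reframed.values
n_train_hours = 365 * 24
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
# split into input and outputs
train_X, train_y = train[:, :-1], train[:, -1]
test_X, test_y = test[:, :-1], test[:, -1]
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)
```

Running this example prints the shape of the train and test input and output sets, with about 9K hours of data for training and about 35K hours for testing.

```py
(8760, 1, 8) (8760,) (35039, 1, 8) (35039,)
```

Now we can define and fit our LSTM model.

We will define the LSTM with 50 neurons in the first hidden layer and 1 neuron in the output layer for predicting pollution. The input shape will be 1 time step with 8 features.

We will use the Mean Absolute Error (MAE) loss function and the efficient Adam version of stochastic gradient descent.

The model will be fit for 50 training epochs with a batch size of 72. Remember that the internal state of the LSTM in Keras is reset at the end of each batch, so an internal state that is a function of a number of days may be helpful (try testing this).

Finally, we keep track of both the training and test loss during training by setting the _validation_data_ argument in the fit() function. At the end of the run, both the training and test loss are plotted.

```py
# design network
model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=72, validation_data=(test_X, test_y), verbose=2, shuffle=False)
# plot history
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()
```

### Evaluate Model

After the model is fit, we can forecast for the entire test dataset.

We combine the forecast with the test dataset and invert the scaling. We also invert scaling on the test dataset with the expected pollution numbers. The scaler was fit on all 8 columns, so each column of forecasts or expected values is padded with the 7 remaining input features before calling _inverse_transform()_, and only the first column is kept afterwards.

With forecasts and actual values in their original scale, we can then calculate an error score for the model. In this case, we calculate the Root Mean Squared Error (RMSE), which gives error in the same units as the variable itself.

```py
# make a prediction
yhat = model.predict(test_X)
test_X = test_X.reshape((test_X.shape[0], test_X.shape[2]))
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X[:, 1:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:,0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, 1:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:,0]
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
```

### Complete Example

The complete example is listed below.

**Note**: This example assumes you have prepared the data correctly, e.g. converted the downloaded " _raw.csv_ " to the prepared "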
_pollution.csv_ ". See the first part of this tutorial.

```py
from math import sqrt
from numpy import concatenate
from matplotlib import pyplot
from pandas import read_csv
from pandas import DataFrame
from pandas import concat
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

# convert series to supervised learning
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j+1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg

# load dataset
dataset = read_csv('pollution.csv', header=0, index_col=0)
values = dataset.values
# integer encode direction
encoder = LabelEncoder()
values[:,4] = encoder.fit_transform(values[:,4])
# ensure all data is float
values = values.astype('float32')
# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# frame as supervised learning
reframed = series_to_supervised(scaled, 1, 1)
# drop columns we don't want to predict
reframed.drop(reframed.columns[[9,10,11,12,13,14,15]], axis=1, inplace=True)
print(reframed.head())

# split into train and test sets
values = reframed.values
n_train_hours = 365 * 24
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
# split into input and outputs
train_X, train_y = train[:, :-1], train[:, -1]
test_X, test_y = test[:, :-1], test[:, -1]
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

# design network
model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=72, validation_data=(test_X, test_y), verbose=2, shuffle=False)
# plot history
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()

# make a prediction
yhat = model.predict(test_X)
test_X = test_X.reshape((test_X.shape[0], test_X.shape[2]))
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X[:, 1:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:,0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, 1:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:,0]
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
```

Running the example first creates a plot showing the train and test loss during training.

Interestingly, we can see that test loss drops below training loss. The model may be overfitting the training data. Measuring and plotting RMSE during training may shed more light on this.

![Line Plot of Train and Test Loss from the Multivariate LSTM During Training](https://img.kancloud.cn/d5/01/d501042cc16a9b00c417be00f76c698d_1024x768.jpg)

Line Plot of Train and Test Loss from the Multivariate LSTM During Training

The train and test loss are printed at the end of each training epoch. At the end of the run, the final RMSE of the model on the test dataset is printed.

We can see that the model achieves a respectable RMSE of 26.496, which is lower than an RMSE of 30 found with a persistence model.

```py
...
Epoch 46/50
0s - loss: 0.0143 - val_loss: 0.0133
Epoch 47/50
0s - loss: 0.0143 - val_loss: 0.0133
Epoch 48/50
0s - loss: 0.0144 - val_loss: 0.0133
Epoch 49/50
0s - loss: 0.0143 - val_loss: 0.0133
Epoch 50/50
0s - loss: 0.0144 - val_loss: 0.0133
Test RMSE: 26.496
```

This model is not tuned. Can you do better? Let me know your problem framing, model configuration, and RMSE in the comments below.

## Update: Train On Multiple Lag Timesteps Example

There have been many requests for advice on how to adapt the above example to train the model on multiple previous time steps.

I had tried this and a myriad of other configurations when writing the original post and decided not to include them because they did not lift model skill.

Nevertheless, I have included this example below as a reference template that you could adapt for your own problems.

The changes needed to train the model on multiple previous time steps are quite minimal, as follows:

First, you must frame the problem suitably when calling series_to_supervised(). We will use 3 hours of data as input. Also note that we no longer explicitly drop the columns for all of the other fields at ob(t).

```py
# specify the number of lag hours
n_hours = 3
n_features = 8
# frame as supervised learning
reframed = series_to_supervised(scaled, n_hours, 1)
```

Next, we need to be more careful in specifying the input and output columns.

We have 3 * 8 + 8 columns in our framed dataset. We will take 3 * 8 or 24 columns as input for the obs of all features across the previous 3 hours. We will take just the pollution variable as output at the following hour, which is the first of the last 8 columns, hence the index _-n_features_:

```py
# split into input and outputs
n_obs = n_hours * n_features
train_X, train_y = train[:, :n_obs], train[:, -n_features]
test_X, test_y = test[:, :n_obs], test[:, -n_features]
print(train_X.shape, len(train_X), train_y.shape)
```

Next, we can reshape our input data correctly to reflect the time steps and features.

```py
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))
```

Fitting the model is the same.

The only other small change is in how to evaluate the model. Specifically, in how we reconstruct the rows with 8 columns suitable for reversing the scaling operation to get the y and yhat back into the original scale so that we can calculate the RMSE.

The gist of the change is that we concatenate the y or yhat column with the last 7 features of the test dataset in order to invert the scaling, as follows:

```py
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X[:, -7:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:,0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, -7:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:,0]
```

We can tie all of these modifications to the above example together. The complete example of multivariate time series forecasting with multiple lag inputs is listed below:

```py
from math import sqrt
from numpy import concatenate
from matplotlib import pyplot
from pandas import read_csv
from pandas import DataFrame
from pandas import concat
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

# convert series to supervised learning
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j+1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg

# load dataset
dataset = read_csv('pollution.csv', header=0, index_col=0)
values = dataset.values
# integer encode direction
encoder = LabelEncoder()
values[:,4] = encoder.fit_transform(values[:,4])
# ensure all data is float
values = values.astype('float32')
# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# specify the number of lag hours
n_hours = 3
n_features = 8
# frame as supervised learning
reframed = series_to_supervised(scaled, n_hours, 1)
print(reframed.shape)

# split into train and test sets
values = reframed.values
n_train_hours = 365 * 24
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
# split into input and outputs
n_obs = n_hours * n_features
train_X, train_y = train[:, :n_obs], train[:, -n_features]
test_X, test_y = test[:, :n_obs], test[:, -n_features]
print(train_X.shape, len(train_X), train_y.shape)
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

# design network
model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=72, validation_data=(test_X, test_y), verbose=2, shuffle=False)
# plot history
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()

# make a prediction
yhat = model.predict(test_X)
test_X = test_X.reshape((test_X.shape[0], n_hours*n_features))
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X[:, -7:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:,0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, -7:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:,0]
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
```

The model is fit as before in a minute or two.

```py
...
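Epoch 45/50
1s - loss: 0.0143 - val_loss: 0.0154
Epoch 46/50
1s - loss: 0.0143 - val_loss: 0.0148
Epoch 47/50
1s - loss: 0.0143 - val_loss: 0.0152
Epoch 48/50
1s - loss: 0.0143 - val_loss: 0.0151
Epoch 49/50
1s - loss: 0.0143 - val_loss: 0.0152
Epoch 50/50
1s - loss: 0.0144 - val_loss: 0.0149
```

A plot of train and test loss over the epochs is created.

![Plot of Loss on the Train and Test Datasets](https://img.kancloud.cn/0a/f3/0af3989a67817b2eb5e40a81763187d0_1280x960.jpg)

Plot of Loss on the Train and Test Datasets

Finally, the test RMSE is printed, not really showing any advantage in skill, at least on this problem.

```py
Test RMSE: 27.177
```

I would add that the LSTM [does not appear to be suitable for autoregression type problems](https://machinelearningmastery.com/suitability-long-short-term-memory-networks-time-series-forecasting/) and that you may be better off exploring an MLP with a large window.

I hope this example helps you with your own time series forecasting experiments.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

* [Beijing PM2.5 Data Set on the UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data)
* [The 5 Step Life-Cycle for Long Short-Term Memory Models in Keras](http://machinelearningmastery.com/5-step-life-cycle-long-short-term-memory-models-keras/)
* [Time Series Forecasting with the Long Short-Term Memory Network in Python](http://machinelearningmastery.com/time-series-forecasting-long-short-term-memory-network-python/)
* [Multi-step Time Series Forecasting with Long Short-Term Memory Networks in Python](http://machinelearningmastery.com/multi-step-time-series-forecasting-long-short-term-memory-networks-python/)

## Summary

In this tutorial, you discovered how to fit an LSTM to a multivariate time series forecasting problem.

Specifically, you learned:

* How to transform a raw dataset into something we can use for time series forecasting.
* How to prepare data and fit an LSTM for a multivariate time series forecasting problem.
* How to make a forecast and rescale the result back into the original units.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.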
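
The evaluation of the first model compares its RMSE against a persistence model, but the tutorial does not show how such a baseline is computed. The following is a minimal sketch and not the author's code: it assumes persistence means "forecast each hour with the value observed one hour earlier", and `persistence_rmse` is a hypothetical helper name. The toy input is the first five pollution readings from the prepared dataset shown earlier, used only for illustration.

```python
from math import sqrt

def persistence_rmse(series):
    """RMSE of forecasting each value with the value one step earlier."""
    # pair each observation y(t) with its predecessor y(t-1) and square the error
    squared_errors = [(actual - previous) ** 2
                      for previous, actual in zip(series[:-1], series[1:])]
    return sqrt(sum(squared_errors) / len(squared_errors))

# first five hourly pollution readings from the prepared dataset (toy input)
pm25 = [129.0, 148.0, 159.0, 181.0, 138.0]
print('Persistence RMSE: %.3f' % persistence_rmse(pm25))
```

To get a baseline comparable to the LSTM scores, you would pass the unscaled pollution values for the test period rather than this toy list, since the score is in the same units as the input series.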