
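# How to Develop LSTM Models for Time Series Forecasting

> Original: [https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/](https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/)

Long Short-Term Memory networks, or LSTMs for short, can be applied to time series forecasting.

There are many types of LSTM models, each suited to a specific type of time series forecasting problem.

In this tutorial, you will discover how to develop a suite of LSTM models for a range of standard time series forecasting problems.

The objective of this tutorial is to provide a standalone example of each model on each type of time series problem, as a template that you can copy and adapt for your own forecasting problem.

After completing this tutorial, you will know:

* How to develop LSTM models for univariate time series forecasting.
* How to develop LSTM models for multivariate time series forecasting.
* How to develop LSTM models for multi-step time series forecasting.

This is a large and important post; you may want to bookmark it for future reference.

Let's get started.

![How to Develop LSTM Models for Time Series Forecasting](https://img.kancloud.cn/bc/00/bc00222249faf5286544b0680912e1c0_640x360.jpg)

How to Develop LSTM Models for Time Series Forecasting. Photo by [N i c o l a](https://www.flickr.com/photos/15216811@N06/6704346543/), some rights reserved.

## Tutorial Overview

In this tutorial, we will explore how to develop a suite of different types of LSTM models for time series forecasting.

The models are demonstrated on small contrived time series problems intended to give the flavor of the type of problem being addressed. The chosen model configurations are arbitrary and not optimized for each problem; that was not the goal.

This tutorial is divided into four parts; they are:

1. Univariate LSTM Models
2. Multivariate LSTM Models
3. Multi-Step LSTM Models
4. Multivariate Multi-Step LSTM Models

## Univariate LSTM Models

LSTMs can be used to model univariate time series forecasting problems.

These are problems comprised of a single series of observations, where a model is required to learn from the series of past observations to predict the next value in the sequence.

We will demonstrate a number of variations of the LSTM model for univariate time series forecasting.

This section is divided into six parts; they are:

1. Data Preparation
2. Vanilla LSTM
3. Stacked LSTM
4. Bidirectional LSTM
5. CNN LSTM
6. ConvLSTM

Each of these models is demonstrated for one-step univariate time series forecasting, but can easily be adapted and used as the input part of a model for other types of time series forecasting problems.

### Data Preparation

Before a univariate series can be modeled, it must be prepared.

The LSTM model will learn a function that maps a sequence of past observations as input to an output observation. As such, the sequence of observations must be transformed into multiple examples from which the LSTM can learn.

Consider a given univariate sequence:

```py
[10, 20, 30, 40, 50, 60, 70, 80, 90]
```

We can divide the sequence into multiple input/output patterns called samples, where three time steps are used as input and one time step is used as output for the one-step prediction being learned.

```py
X,              y
10, 20, 30      40
20, 30, 40      50
30, 40, 50      60
...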
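```

The _split_sequence()_ function below implements this behavior. It splits a given univariate sequence into multiple samples, where each sample has a specified number of time steps and the output is a single time step.

```py
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
```

We can demonstrate this function on the small contrived dataset above.

The complete example is listed below.

```py
# univariate data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])
```

Running the example splits the univariate series into six samples, where each sample has three input time steps and one output time step.

```py
[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90
```

Now that we know how to prepare a univariate series for modeling, let's look at developing LSTM models that can learn the mapping of inputs to outputs, starting with a Vanilla LSTM.

### Vanilla LSTM

A Vanilla LSTM is an LSTM model that has a single hidden layer of LSTM units and an output layer used to make a prediction.

We can define a Vanilla LSTM for univariate time series forecasting as follows.

```py
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
```

Key to the definition is the shape of the input; that is what the model expects as input for each sample, in terms of the number of time steps and the number of features.

We are working with a univariate series, so the number of features is one, for one variable.

The number of time steps as input is the number we chose when preparing our dataset, as an argument to the _split_sequence()_ function.

The shape of the input for each sample is specified in the _input_shape_ argument on the definition of the first hidden layer.

We almost always have multiple samples; therefore, the model will expect the input component of training data to have the dimensions or shape:

```py
[samples, timesteps, features]
```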
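To make this layout concrete, here is a minimal, dependency-free sketch in plain Python, with nested lists standing in for arrays. The helpers `add_feature_axis` and `shape_of` are hypothetical names introduced only for this illustration; they are not part of NumPy or Keras.

```python
# A dependency-free sketch of the [samples, timesteps, features] layout.
# `add_feature_axis` and `shape_of` are hypothetical helpers for illustration only.

def add_feature_axis(samples):
    """Wrap each scalar observation in its own list:
    [samples, timesteps] -> [samples, timesteps, 1]."""
    return [[[value] for value in sample] for sample in samples]

def shape_of(nested):
    """Return the shape of a regular (non-ragged, non-empty) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

# three samples, each with three time steps of a single variable
X_2d = [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
X_3d = add_feature_axis(X_2d)

print(shape_of(X_2d))  # (3, 3)
print(shape_of(X_3d))  # (3, 3, 1)
print(X_3d[0])         # [[10], [20], [30]]
```

Each time step becomes a one-element list, which is all the extra "features" dimension means for a univariate series.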
Our _split_sequence()_ function in the previous section outputs X with the shape [_samples, timesteps_], so we can easily reshape it to add a dimension for the one feature.

```py
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
```

In this case, we define a model with 50 LSTM units in the hidden layer and an output layer that predicts a single numerical value.

The model is fit using the efficient [Adam version of stochastic gradient descent](https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/) and optimized using the mean squared error, or '_mse_', loss function.

Once the model is defined, we can fit it on the training dataset.

```py
# fit model
model.fit(X, y, epochs=200, verbose=0)
```

After the model is fit, we can use it to make a prediction.

We can predict the next value in the sequence by providing the input:

```py
[70, 80, 90]
```

And expecting the model to predict something like:

```py
[100]
```

The model expects the input to be three-dimensional with the shape [_samples, timesteps, features_]; therefore, we must reshape the single input sample before making the prediction.

```py
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
```

We can tie all of this together and demonstrate how to develop a Vanilla LSTM for univariate time series forecasting and make a single prediction.

```py
# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example prepares the data, fits the model, and makes a prediction.

Your results may vary given the stochastic nature of the algorithm; try running the example a few times.

We can see that the model predicts the next value in the sequence.

```py
[[102.09213]]
```

### Stacked LSTM

Multiple hidden LSTM layers can be stacked one on top of another in what is referred to as a Stacked LSTM model.

An LSTM layer requires a three-dimensional input, and by default an LSTM produces a two-dimensional output as an interpretation from the end of the sequence.

We can address this by having the LSTM output a value for each time step in the input data, by setting the _return_sequences=True_ argument on the layer. This gives us a 3D output from the hidden LSTM layer that can be used as input to the next LSTM layer.

We can therefore define a Stacked LSTM as follows.

```py
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
```

We can tie this together; the complete code example is listed below.

```py
# univariate stacked lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example predicts the next value in the sequence, which we expect to be 100.

```py
[[102.47341]]
```

### Bidirectional LSTM

On some sequence prediction problems, it can be beneficial to allow the LSTM model to learn the input sequence both forward and backward and to concatenate both interpretations.

This is called a [Bidirectional LSTM](https://machinelearningmastery.com/develop-bidirectional-lstm-sequence-classification-python-keras/).

We can implement a Bidirectional LSTM for univariate time series forecasting by wrapping the first hidden layer in a wrapper layer called Bidirectional.

An example of defining a Bidirectional LSTM to read the input both forward and backward is as follows.

```py
# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
```

The complete example of the Bidirectional LSTM for univariate time series forecasting is listed below.

```py
# univariate bidirectional lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Bidirectional

# split a univariate sequence
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example predicts the next value in the sequence, which we expect to be 100.

```py
[[101.48093]]
```
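The idea of reading a sequence in both directions and concatenating the two interpretations can be sketched with a toy example. This is not an LSTM and not Keras code: `toy_recurrence` is a hypothetical stand-in for a recurrent layer, with an arbitrary `decay` parameter, used only to show how the forward and backward passes produce two different summaries that are then joined.

```python
# Toy illustration of the bidirectional idea; NOT an LSTM and NOT Keras code.
# `toy_recurrence` and `bidirectional_summary` are hypothetical helpers.

def toy_recurrence(sequence, decay=0.5):
    """Fold a sequence into one state value using h = decay * h + x."""
    h = 0.0
    for x in sequence:
        h = decay * h + x
    return h

def bidirectional_summary(sequence):
    """Read the sequence forward and backward, then concatenate both summaries."""
    forward = toy_recurrence(sequence)
    backward = toy_recurrence(list(reversed(sequence)))
    return [forward, backward]

print(bidirectional_summary([70, 80, 90]))  # [147.5, 132.5]
```

The forward pass weights the most recent value (90) most heavily, while the backward pass weights the earliest value (70) most heavily, so the two summaries differ. In Keras, the `Bidirectional` wrapper concatenates the two directions by default (`merge_mode='concat'`), so wrapping a 50-unit LSTM passes 100 values to the next layer.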
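### CNN LSTM

A convolutional neural network, or CNN for short, is a type of neural network developed for working with two-dimensional image data.

A CNN can also be very effective at automatically extracting and learning features from one-dimensional sequence data, such as univariate time series data.

A CNN model can be used in a hybrid model with an LSTM backend, where the CNN interprets subsequences of the input that are then provided together as a sequence to an LSTM model to interpret. [This hybrid model is called a CNN-LSTM](https://machinelearningmastery.com/cnn-long-short-term-memory-networks/).

The first step is to split the input sequences into subsequences that can be processed by the CNN model. For example, we can first split our univariate time series data into input/output samples with four steps as input and one as output. Each sample can then be split into two sub-samples, each with two time steps. The CNN interprets each subsequence of two time steps and provides a time series of subsequence interpretations to the LSTM model to process as input.

We can parameterize this and define the number of subsequences as _n_seq_ and the number of time steps per subsequence as _n_steps_. The input data can then be reshaped to have the required structure:

```py
[samples, subsequences, timesteps, features]
```

For example:

```py
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))
```

We want to reuse the same CNN model when reading each subsequence of data separately.

This can be achieved by wrapping the entire CNN model in a [TimeDistributed wrapper](https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/), which applies the entire model once per input, in this case once per input subsequence.

The CNN model first has a convolutional layer for reading across the subsequence, which requires a number of filters and a kernel size to be specified. The number of filters is the number of reads or interpretations of the input sequence. The kernel size is the number of time steps included in each "read" operation of the input sequence.

The convolutional layer is followed by a max pooling layer that distills the filter maps down to half of their size (with _pool_size=2_), keeping the most salient features. These structures are then flattened into a single one-dimensional vector to be used as a single input time step to the LSTM layer.

```py
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
```

Next, we can define the LSTM part of the model, which interprets the CNN model's read of the input sequence and makes a prediction.

```py
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
```

We can tie all of this together; the complete example of a CNN-LSTM model for univariate time series forecasting is listed below.

```py
# univariate cnn lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import TimeDistributed
from keras.layers import Conv1D
from keras.layers import MaxPooling1D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example predicts the next value in the sequence, which we expect to be 100.

```py
[[101.69263]]
```

### ConvLSTM

A type of LSTM related to the CNN-LSTM is the ConvLSTM, where the convolutional reading of input is built directly into each LSTM unit.

The ConvLSTM was developed for reading two-dimensional spatial-temporal data, but can be adapted for use with univariate time series forecasting.

The layer expects input as a sequence of two-dimensional images; therefore, the shape of the input data must be:

```py
[samples, timesteps, rows, columns, features]
```

For our purposes, we can split each sample into subsequences, where timesteps becomes the number of subsequences, or _n_seq_, and columns becomes the number of time steps in each subsequence, or _n_steps_. The number of rows is fixed at 1, as we are working with one-dimensional data.

We can now reshape the prepared samples into the required structure.

```py
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
```

In terms of the number of filters, we can define the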
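ConvLSTM as a single layer, with a two-dimensional kernel size given as (rows, columns). As we are working with a one-dimensional series, the number of rows in the kernel is always fixed at 1.

The output of the model must then be flattened before it can be interpreted and a prediction made.

```py
model.add(ConvLSTM2D(filters=64, kernel_size=(1,2), activation='relu', input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())
```

The complete example of a ConvLSTM for one-step univariate time series forecasting is listed below.

```py
# univariate convlstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import ConvLSTM2D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
# define model
model = Sequential()
model.add(ConvLSTM2D(filters=64, kernel_size=(1,2), activation='relu', input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, 1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example predicts the next value in the sequence, which we expect to be 100.

```py
[[103.68166]]
```

Now that we have looked at LSTM models for univariate data, let's turn our attention to multivariate data.

## Multivariate LSTM Models

Multivariate time series data means data where there is more than one observation for each time step.

There are two main models that we may require with multivariate time series data; they are:

1. Multiple Input Series.
2. Multiple Parallel Series.

Let's take a look at each in turn.

### Multiple Input Series

A problem may have two or more parallel input time series and an output time series that is dependent on the input time series.

The input time series are parallel because each series has an observation at the same time steps.

We can demonstrate this with a simple example of two parallel input time series where the output series is the simple addition of the input series.

```py
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
```

We can reshape these three arrays of data into a single dataset where each row is a time step and each column is a separate time series. This is a standard way of storing parallel time series in a CSV file.

```py
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
```

The complete example is listed below.

```py
# multivariate data preparation
from numpy import array
from numpy import hstack
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
print(dataset)
```

Running the example prints the dataset with one row per time step and one column for each of the two input and one output parallel time series.

```py
[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]
```

As with the univariate time series, we must structure these data into samples with input and output elements.

An LSTM model needs sufficient context to learn a mapping from an input sequence to an output value. LSTMs can support parallel input time series as separate variables or features. Therefore, we need to split the data into samples while maintaining the order of observations across the two input sequences.

If we choose three input time steps, then the first sample would look as follows:

Input:

```py
10, 15
20, 25
30, 35
```

Output:

```py
65
```

That is, the first three time steps of each parallel series are provided as input to the model, and the model associates this with the value in the output series at the third time step, in this case, 65.

We can see that, in transforming the time series into input/output samples to train the model, we will have to discard some values from the output time series where we do not have values in the input time series at prior time steps. In turn, the choice of the number of input time steps will have an important effect on how much of the training data is used.

We can define a function named _split_sequences()_ that takes a dataset as we have defined it, with rows for time steps and columns for parallel series, and returns input/output samples.

```py
from numpy import array

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()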
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
```

We can test this function on our dataset using three time steps of each input time series as input.

The complete example is listed below.

```py
# multivariate data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])
```

Running the example first prints the shape of the X and y components.

We can see that the X component has a three-dimensional structure.

The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. The last dimension is the number of parallel time series, or the number of variables, in this case 2 for the two parallel series.

This is the exact three-dimensional structure expected by an LSTM as input. The data is ready to use without further reshaping.

We can then see the input and output of each sample printed, showing the three time steps of each of the two input series and the associated output for each sample.

```py
(7, 3, 2) (7,)

[[10 15]
 [20 25]
 [30 35]] 65
[[20 25]
 [30 35]
 [40 45]] 85
[[30 35]
 [40 45]
 [50 55]] 105
[[40 45]
 [50 55]
 [60 65]] 125
[[50 55]
 [60 65]
 [70 75]] 145
[[60 65]
 [70 75]
 [80 85]] 165
[[70 75]
 [80 85]
 [90 95]] 185
```

We are now ready to fit an LSTM model on this data.

Any of the varieties of LSTMs in the previous section can be used, such as a Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM model.

We will use a Vanilla LSTM, where the number of time steps and parallel series (features) are specified for the input layer via the _input_shape_ argument.

```py
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
```

When making a prediction, the model expects three time steps for the two input time series.

We can predict the next value in the output series by providing the input values of:

```py
80,	 85
90,	 95
100, 105
```

The shape of the one sample with three time steps and two variables must be [1, 3, 2].

We would expect the next value in the sequence to be 100 + 105, or 205.

```py
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
```

The complete example is listed below.

```py
# multivariate lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example prepares the data, fits the model, and makes a prediction.

```py
[[208.13531]]
```

### Multiple Parallel Series

An alternate time series problem is the case where there are multiple parallel time series and a value must be predicted for each.

For example, given the data from the previous section:

```py
[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]
```

We may want to predict the value for each of the three time series for the next time step.

This might be referred to as multivariate forecasting.

Again, the data must be split into input/output samples in order to train a model.

The first sample of this dataset would be:

Input:

```py
10, 15, 25
20, 25, 45
30, 35, 65
```

Output:

```py
40, 45, 85
```

The _split_sequences()_ function below splits multiple parallel time series, with rows for time steps and one series per column, into the required input/output shape.

```py
from numpy import array

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
```

We can demonstrate this on the contrived problem; the complete example is listed below.

```py
# multivariate output data prep
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])
```

Running the example first prints the shape of the prepared X and y components.

The shape of X is three-dimensional, comprising the number of samples (6), the number of time steps chosen per sample (3), and the number of parallel time series or features (3).

The shape of y is two-dimensional, as we might expect, comprising the number of samples (6) and the number of time variables per sample to be predicted (3).

The data is ready to use in an LSTM model that expects a three-dimensional input and a two-dimensional output shape for the X and y components of each sample.

Then, each of the samples is printed, showing the input and output components of each sample.

```py
(6, 3, 3) (6, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [40 45 85]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [ 50  55 105]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [ 60  65 125]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [ 70  75 145]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [ 80  85 165]
[[ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]] [ 90  95 185]
```

We are now ready to fit an LSTM model on this data.

Any of the varieties of LSTMs in the previous section can be used, such as a Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM model.

We will use a Stacked LSTM, where the number of time steps and parallel series (features) are specified for the input layer via the _input_shape_ argument. The number of parallel series is also used to specify the number of values the model predicts in the output layer; again, this is three.

```py
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
```

We can predict the next value in each of the three parallel series by providing an input of three time steps for each series.

```py
70, 75, 145
80, 85, 165
90, 95, 185
```

The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3].

```py
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
```

We would expect the vector output to be:

```py
[100, 105, 205]
```

We can tie all of this together and demonstrate a Stacked LSTM for multivariate output time series forecasting below.

```py
# multivariate output stacked lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=400, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example prepares the data, fits the model, and makes a prediction.

```py
[[101.76599 108.730484 206.63577 ]]
```

## Multi-Step LSTM Models

A time series forecasting problem that requires a prediction of multiple time steps into the future can be referred to as multi-step time series forecasting.

Specifically, these are problems where the forecast horizon or interval is more than one time step.

There are two main types of LSTM models that can be used for multi-step forecasting; they are:

1. Vector Output Model
2. Encoder-Decoder Model

Before we look at these models, let's first look at the preparation of data for multi-step forecasting.

### Data Preparation

As with one-step forecasting, a time series used for multi-step time series forecasting must be split into samples with input and output components.

Both the input and output components will be comprised of multiple time steps and may or may not have the same number of steps.

For example, given the univariate time series:

```py
[10, 20, 30, 40, 50, 60, 70, 80, 90]
```

We could use the last three time steps as input and forecast the next two time steps.

The first sample would look as follows:

Input:

```py
[10, 20, 30]
```

Output:

```py
[40, 50]
```

The _split_sequence()_ function below implements this behavior and splits a given univariate time series into samples with a specified number of input and output time steps.

```py
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
```

We can demonstrate this function on the small contrived dataset.

The complete example is listed below.

```py
# multi-step data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])
```

Running the example splits the univariate series into input and output time steps and prints the input and output components of each sample.

```py
[10 20 30] [40 50]
[20 30 40] [50 60]
[30 40 50] [60 70]
[40 50 60] [70 80]
[50 60 70] [80 90]
```

Now that we know how to prepare data for multi-step forecasting, let's look at some LSTM models that can learn this mapping.

### Vector Output Model

Like other types of neural network models, the LSTM can output a vector directly that can be interpreted as a multi-step forecast.

This approach was seen in the previous section, where one time step of each output time series was forecast as a vector.

As with the LSTMs for univariate data in the prior section,
the prepared samples must first be reshaped. The LSTM expects data to have a three-dimensional structure of [_samples, timesteps, features_], and in this case we only have one feature, so the reshape is straightforward.

```py
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
```

With the number of input and output steps specified in the _n_steps_in_ and _n_steps_out_ variables, we can define a multi-step time series forecasting model.

Any of the presented LSTM model types could be used, such as Vanilla, Stacked, Bidirectional, CNN-LSTM, or ConvLSTM. Below we define a Stacked LSTM for multi-step forecasting.

```py
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
```

The model can make a prediction for a single sample. We can predict the next two steps beyond the end of the dataset by providing the input:

```py
[70, 80, 90]
```

We would expect the predicted output to be:

```py
[100, 110]
```

As expected by the model, the shape of the single sample of input data when making the prediction must be [1, 3, 1], for the 1 sample, 3 time steps of input, and the single feature.

```py
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
```

Tying all of this together, the Stacked LSTM for multi-step forecasting with a univariate time series is listed below.

```py
# univariate multi-step vector-output stacked lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=50, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example forecasts and prints the next two time steps in the sequence.

```py
[[100.98096 113.28924]]
```

### Encoder-Decoder Model

A model specifically developed for forecasting variable-length output sequences is called the [Encoder-Decoder LSTM](https://machinelearningmastery.com/encoder-decoder-long-short-term-memory-networks/).

The model was designed for prediction problems where there are both input and output sequences, so-called sequence-to-sequence, or seq2seq, problems, such as translating text from one language to another.

This model can be used for multi-step time series forecasting.

As its name suggests, the model is comprised of two sub-models: the encoder and the decoder.

The encoder is a model responsible for reading and interpreting the input sequence. The output of the encoder is a fixed-length vector that represents the model's interpretation of the sequence. The encoder is traditionally a Vanilla LSTM model, although other encoder models can be used, such as Stacked, Bidirectional, and CNN models.

```py
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
```

The decoder uses the output of the encoder as an input.

First, the fixed-length output of the encoder is repeated, once for each required time step in the output sequence.

```py
model.add(RepeatVector(n_steps_out))
```

This sequence is then provided to an LSTM decoder model. The model must output a value for each time step in the output sequence, which can be interpreted by a single output model.

```py
model.add(LSTM(100, activation='relu', return_sequences=True))
```

We can use the same output layer or layers to make each one-step prediction in the output sequence. This can be achieved by wrapping the output part of the model in a [TimeDistributed wrapper](https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/).

```py
model.add(TimeDistributed(Dense(1)))
```

The full definition of an Encoder-Decoder model for multi-step time series forecasting is listed below.

```py
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
```

As with other LSTM models, the input data must be reshaped into the expected three-dimensional shape of [_samples, timesteps, features_].

```py
X = X.reshape((X.shape[0], X.shape[1], n_features))
```

In the case of the Encoder-Decoder model, the output, or y part, of the training dataset
part of the training dataset must also have this shape. This is because the model will predict a given number of time steps with a given number of features for each input sample.

```py
y = y.reshape((y.shape[0], y.shape[1], n_features))
```

The complete example of an Encoder-Decoder LSTM for multi-step time series forecasting is listed below.

```py
# univariate multi-step encoder-decoder lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
y = y.reshape((y.shape[0], y.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=100, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example forecasts and prints the next two time steps in the sequence.

```py
[[[101.9736  ]
  [116.213615]]]
```

## Multivariate Multi-Step LSTM Models

In the previous sections, we looked at univariate, multivariate, and multi-step time series forecasting. It is possible to mix and match the different types of LSTM
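models presented so far for the different problems. This also applies to time series forecasting problems that involve both multivariate and multi-step forecasting, although these may be more challenging. In this section, we provide short examples of data preparation and modeling for multivariate multi-step time series forecasting as templates to ease this challenge, specifically:

1. Multiple Input Multi-Step Output.
2. Multiple Parallel Input and Multi-Step Output.

Perhaps the biggest stumbling block is the preparation of the data, so this is where we will focus our attention.

### Multiple Input Multi-Step Output

There are multivariate time series forecasting problems where the output series is separate from, but dependent upon, the input time series, and multiple time steps are required for the output series. For example, consider our multivariate time series from a prior section:

```py
[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]
```

We may use three prior time steps of each of the two input time series to predict two time steps of the output time series.

Input:

```py
10, 15
20, 25
30, 35
```

Output:

```py
65
85
```

Note that, in this framing, the first output value (65) belongs to the same time step as the last input row (30, 35), which is why the slicing below starts the output at _end_ix-1_.

The _split_sequences()_ function below implements this behavior.

```py
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
```

We can demonstrate this on our contrived dataset. The complete example is listed below.

```py
# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = \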
out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])
```

Running the example first prints the shape of the prepared training data. We can see that the input portion of the samples is three-dimensional, made up of six samples, each with three time steps and two variables for the two input time series. The output portion of the samples is two-dimensional: six samples, each with the two time steps to be predicted. The prepared samples are then printed to confirm that the data was prepared as we specified.

```py
(6, 3, 2) (6, 2)

[[10 15]
 [20 25]
 [30 35]] [65 85]
[[20 25]
 [30 35]
 [40 45]] [ 85 105]
[[30 35]
 [40 45]
 [50 55]] [105 125]
[[40 45]
 [50 55]
 [60 65]] [125 145]
[[50 55]
 [60 65]
 [70 75]] [145 165]
[[60 65]
 [70 75]
 [80 85]] [165 185]
```

We can now develop an LSTM model for multi-step predictions. Either a vector-output or an encoder-decoder model could be used. In this case, we will demonstrate vector output with a Stacked LSTM. The complete example is listed below.

```py
# multivariate multi-step stacked lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time
# steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example fits the model and predicts the next two time steps of the output sequence beyond the dataset. We would expect the next two steps to be [185, 205]. This is a challenging framing of the problem with very little data, and the arbitrarily configured version of the model gets close.

```py
[[188.70619 210.16513]]
```

### Multiple Parallel Input and Multi-Step Output

A problem with parallel time series may require the prediction of multiple time steps of each time series. For example, consider our multivariate time series from a prior section:

```py
[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]
```

We may use the last three time steps from each of the three time series as input to the model and predict the next two time steps of each of the three time series as output. The first sample in the training dataset would be the following.

Input:

```py
10, 15, 25
20, 25, 45
30, 35, 65
```

Output:

```py
40, 45, 85
50, 55, 105
```

The _split_sequences()_ function below implements this behavior.

```py
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
```

We can demonstrate this function on the small contrived dataset. The complete example is listed below.

```py
# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])
```

Running the example first prints the shape of the prepared training dataset. We can see that both the input (X) and output (y) elements of the dataset are three-dimensional, with dimensions for the number of samples, the time steps, and the variables or parallel time series, respectively. The input and output elements of each sample are then printed side by side so that we can confirm that the data was prepared as we expected.

```py
(5, 3, 3) (5, 2, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [[ 40  45  85]
 [ 50  55 105]]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [[ 50  55 105]
 [ 60  65 125]]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [[ 60  65 125]
 [ 70  75 145]]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [[ 70  75 145]
 [ 80  85 165]]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [[ 80  85 165]
 [ 90  95 185]]
```

We can use either the vector-output or the encoder-decoder LSTM to model this problem. In this case, we will use the Encoder-Decoder model. The complete example is listed below.

```py
# multivariate multi-step encoder-decoder lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import \
TimeDistributed

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# the dataset knows the number of features, e.g.
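# 3 in this example (three parallel series)
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=300, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
```

Running the example fits the model and predicts the values of each of the three time series for the two time steps that follow the input sample. We would expect the values for these series and time steps to be as follows:

```py
90, 95, 185
100, 105, 205
```

We can see that the model forecast gets reasonably close to the expected values.

```py
[[[ 91.86044   97.77231  189.66768 ]
  [103.299355 109.18123  212.6863  ]]]
```

## Summary

In this tutorial, you discovered how to develop a suite of LSTM models for a range of standard time series forecasting problems. Specifically, you learned:

* How to develop LSTM models for univariate time series forecasting.
* How to develop LSTM models for multivariate time series forecasting.
* How to develop LSTM models for multi-step time series forecasting.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.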
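
As a quick sanity check on the multi-step data preparation used in the examples above, the sketch below reimplements the sliding-window split in plain Python (no NumPy or Keras) and confirms how many samples to expect: a series of length n with n_steps_in input steps and n_steps_out output steps should yield n - n_steps_in - n_steps_out + 1 samples. The helper name `sliding_windows` is hypothetical and is not part of the tutorial's code; it only mirrors the logic of the univariate multi-step _split_sequence()_ under that assumption.

```python
# Hypothetical helper (not from the tutorial): pure-Python sliding windows
# that mirror the univariate multi-step split_sequence() logic.
def sliding_windows(series, n_steps_in, n_steps_out):
    """Return (inputs, outputs) as lists of windows over a 1-D sequence."""
    inputs, outputs = [], []
    # number of complete input/output windows that fit in the series
    n_samples = len(series) - n_steps_in - n_steps_out + 1
    for start in range(max(n_samples, 0)):
        end_ix = start + n_steps_in
        out_end_ix = end_ix + n_steps_out
        inputs.append(series[start:end_ix])
        outputs.append(series[end_ix:out_end_ix])
    return inputs, outputs


raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
X, y = sliding_windows(raw_seq, n_steps_in=3, n_steps_out=2)

# 9 observations with 3 input and 2 output steps give 9 - 3 - 2 + 1 = 5 samples
print(len(X), len(y))
print(X[0], y[0])
print(X[-1], y[-1])
```

The same count accounts for the five samples, with shapes (5, 3, 3) and (5, 2, 3), printed for the parallel multi-step dataset, which also has nine rows. The multiple-input case yields six samples instead because its output window starts on the last input time step.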