
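# How to Develop LSTM Models for Multi-Step Time Series Forecasting of Household Power Consumption

> Original: [https://machinelearningmastery.com/how-to-develop-lstm-models-for-multi-step-time-series-forecasting-of-household-power-consumption/](https://machinelearningmastery.com/how-to-develop-lstm-models-for-multi-step-time-series-forecasting-of-household-power-consumption/)

Given the rise of smart electricity meters and the wide adoption of electricity generation technology such as solar panels, a wealth of electricity usage data is available. This data represents a multivariate time series of power-related variables that can in turn be used to model and even forecast future electricity consumption.

Unlike other machine learning algorithms, long short-term memory (LSTM) recurrent neural networks can automatically learn features from sequence data, support multivariate data, and can output variable-length sequences that can be used for multi-step forecasting.

In this tutorial, you will discover how to develop LSTM recurrent neural networks for multi-step time series forecasting of household power consumption.

After completing this tutorial, you will know:

* How to develop and evaluate univariate and multivariate encoder-decoder LSTMs for multi-step time series forecasting.
* How to develop and evaluate a CNN-LSTM encoder-decoder model for multi-step time series forecasting.
* How to develop and evaluate a ConvLSTM encoder-decoder model for multi-step time series forecasting.

Let's get started.

**Note 1**: This post is an excerpt from "[Deep Learning for Time Series Forecasting](https://machinelearningmastery.com/deep-learning-for-time-series-forecasting/)". Take a look if you want more step-by-step tutorials on getting the most out of deep learning methods for time series forecasting problems.

**Note 2**: This is a reasonably advanced tutorial. If you are new to time series forecasting in Python, [start here](https://machinelearningmastery.com/start-here/#timeseries). If you are new to deep learning for time series, [start here](https://machinelearningmastery.com/start-here/#deep_learning_time_series). If you really want to get started with LSTMs for time series, [start here](https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/).

![How to Develop LSTM Models for Multi-Step Time Series Forecasting of Household Power Consumption](https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2018/10/How-to-Develop-LSTM-Models-for-Multi-Step-Time-Series-Forecasting-of-Household-Power-Consumption.jpg)

How to Develop LSTM Models for Multi-Step Time Series Forecasting of Household Power Consumption. Photo by [Ian Muttoo](https://www.flickr.com/photos/imuttoo/4257813689/), some rights reserved.

## Tutorial Overview

This tutorial is divided into nine parts; they are:

1. Problem Description
2. Load and Prepare Dataset
3. Model Evaluation
4. LSTMs for Multi-Step Forecasting
5. LSTM Model With Univariate Input and Vector Output
6. Encoder-Decoder LSTM Model With Univariate Input
7. Encoder-Decoder LSTM Model With Multivariate Input
8. CNN-LSTM Encoder-Decoder Model With Univariate Input
9. ConvLSTM Encoder-Decoder Model With Univariate Input

### Python Environment

This tutorial assumes you have a Python SciPy environment installed, ideally with Python 3. You must have Keras (2.2 or higher) installed with either the TensorFlow or Theano backend. The tutorial also assumes you have scikit-learn, Pandas, NumPy, and Matplotlib installed.

If you need help with your environment, see this tutorial:

* [How to Set Up a Python Environment for Machine Learning and Deep Learning](https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/)

A GPU is not required for this tutorial, but you can access GPUs cheaply on Amazon Web Services. Learn how in this tutorial:

* [How to Set Up Amazon AWS EC2 GPUs to Train Keras Deep Learning Models](https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/)

Let's dive in.

## Problem Description

The '[Household Power Consumption](https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption)' dataset is a multivariate time series dataset that describes the electricity consumption of a single household over four years.

For more about this dataset, see the post:

* [How to Load and Explore Household Electricity Usage Data](https://machinelearningmastery.com/how-to-load-and-explore-household-electricity-usage-data/)

The data was collected between December 2006 and November 2010, with observations of power consumption within the household recorded every minute.

It is a multivariate series made up of seven variables (besides the date and time); they are:

* **global_active_power**: The total active power consumed by the household (kilowatts).
* **global_reactive_power**: The total reactive power consumed by the household (kilowatts).
* **voltage**: Average voltage (volts).
* **global_intensity**: Average current intensity (amps).
* **sub_metering_1**: Active energy for the kitchen (watt-hours of active energy).
* **sub_metering_2**: Active energy for laundry (watt-hours of active energy).
* **sub_metering_3**: Active energy for climate control systems (watt-hours of active energy).

Active and reactive energy refer to the technical details of [alternating current](https://en.wikipedia.org/wiki/AC_power).

A fourth sub-metering variable can be created by subtracting the sum of the three defined sub-metering variables from the total active energy, as follows:

```py
sub_metering_remainder = (global_active_power * 1000 / 60) - (sub_metering_1 + sub_metering_2 + sub_metering_3)
```

## Load and Prepare Dataset

The dataset can be downloaded from the UCI Machine Learning Repository as a single 20-megabyte .zip file:

* [household_power_consumption.zip](https://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip)

Download the dataset and unzip it into your current working directory. You will now have the file "_household_power_consumption.txt_", which is about 127 megabytes in size and contains all of the observations.

We can use the _read_csv()_ function to load the data and combine the first two columns into a single date-time column that we can use as an index.

```py
# load all data
dataset = read_csv('household_power_consumption.txt', sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime':[0,1]}, index_col=['datetime'])
```

Next, we can mark all [missing values](https://machinelearningmastery.com/handle-missing-timesteps-sequence-prediction-problems-python/), indicated with a '_?_' character, with a _NaN_ value, which is a float.

This allows us to work with the data as one array of floating point values rather than mixed types, which is less efficient.

```py
# mark all missing values
dataset.replace('?', nan, inplace=True)
# make dataset numeric
dataset = dataset.astype('float32')
```

We also need to fill in the missing values now that they have been marked.

A very simple approach is to copy the observation from the same time the day before. We can implement this in a function named _fill_missing()_ that takes the NumPy array of the data and copies values from exactly 24 hours earlier.

```py
# fill missing values with a value at the same time one day ago
def fill_missing(values):
    one_day = 60 * 24
    for row in range(values.shape[0]):
        for col in range(values.shape[1]):
            if isnan(values[row, col]):
                values[row, col] = values[row - one_day, col]
```

We can apply this function directly to the data within the DataFrame.

```py
# fill missing
fill_missing(dataset.values)
```

Now we can create a new column that contains the remainder of the sub-metering, using the calculation from the previous section.

```py
# add a column for the remainder of sub metering
values = dataset.values
dataset['sub_metering_4'] = (values[:,0] * 1000 / 60) - (values[:,4] + values[:,5] + values[:,6])
```

We can now save the cleaned-up version of the dataset to a new file; in this case, we simply change the file extension to .csv and save the dataset as "_household_power_consumption.csv_".

```py
# save updated dataset
dataset.to_csv('household_power_consumption.csv')
```

Tying all of this together, the complete example of loading, cleaning up, and saving the dataset is listed below.

```py
# load and clean-up data
from numpy import nan
from numpy import isnan
from pandas import read_csv
from pandas import to_numeric

# fill missing values with a value at the same time one day ago
def fill_missing(values):
    one_day = 60 * 24
    for row in range(values.shape[0]):
        for col in range(values.shape[1]):
            if isnan(values[row, col]):
                values[row, col] = values[row - one_day, col]

# load all data
dataset = read_csv('household_power_consumption.txt', sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime':[0,1]}, index_col=['datetime'])
# mark all missing values
dataset.replace('?', nan, inplace=True)
# make dataset numeric
dataset = dataset.astype('float32')
# fill missing
fill_missing(dataset.values)
# add a column for the remainder of sub metering
values = dataset.values
dataset['sub_metering_4'] = (values[:,0] * 1000 / 60) - (values[:,4] + values[:,5] + values[:,6])
# save updated dataset
dataset.to_csv('household_power_consumption.csv')
```

Running the example creates the new file '_household_power_consumption.csv_' that we can use as the starting point for our modeling project.

## Model Evaluation

In this section, we will consider how we can develop and evaluate predictive models for the household power dataset.

This section is divided into four parts; they are:

1. Problem Framing
2. Evaluation Metric
3. Train and Test Sets
4. Walk-Forward Validation

### Problem Framing

There are many ways to harness and explore the household power consumption dataset.

In this tutorial, we will use the data to explore a very specific question; that is:

> Given recent power consumption, what is the expected power consumption for the week ahead?

This requires that a predictive model forecast the total active power for each day over the next seven days.

Technically, this framing of the problem is referred to as a multi-step time series forecasting problem, given the multiple forecast steps. A model that makes use of multiple input variables may be referred to as a multivariate multi-step time series forecasting model.

A model of this type could help a household plan its expenditure. It could also help on the supply side when planning electricity demand for a specific household.

This framing of the dataset also suggests that it would be useful to downsample the per-minute observations of power consumption to daily totals. This is not required, but it makes sense given that we are interested in total power per day.

We can achieve this easily using the [resample() function](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.resample.html) on the pandas DataFrame. Calling this function with the argument '_D_' groups the loaded data, indexed by date-time, by day ([see all offset aliases](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases)). We can then calculate the sum of all observations for each day and create a new dataset of daily power consumption data for each of the eight variables.

The complete example is listed below.

```py
# resample minute data to total for each day
from pandas import read_csv
# load the new file
dataset = read_csv('household_power_consumption.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
# resample data to daily
daily_groups = dataset.resample('D')
daily_data = daily_groups.sum()
# summarize
print(daily_data.shape)
print(daily_data.head())
# save
daily_data.to_csv('household_power_consumption_days.csv')
```

Running the example creates a new daily total power consumption dataset and saves the result into a separate file named "_household_power_consumption_days.csv_".

We can use this as the dataset for fitting and evaluating predictive models for the chosen framing of the problem.

### Evaluation Metric

A forecast will consist of seven values, one for each day of the week ahead.

It is common with multi-step forecasting problems to evaluate each forecast time step separately. This is helpful for a few reasons:

* To comment on the skill at a specific lead time (e.g. +1 day vs +3 days).
* To contrast models based on their skill at different lead times (e.g. models good at +1 day vs models good at +5 days).

The units of total power are kilowatts, and it would be useful to have an error metric in the same units. Both root mean squared error (RMSE) and mean absolute error (MAE) fit this bill, although RMSE is more commonly used and will be adopted in this tutorial. Unlike MAE, RMSE punishes large forecast errors more heavily.

The performance metric for this problem will be the RMSE for each lead time from day 1 to day 7.

As a shortcut, it may be useful to summarize the performance of a model using a single score in order to aid model selection.

One possible score is the RMSE across all forecast days.

The function _evaluate_forecasts()_ below implements this behavior and returns the performance of a model based on multiple seven-day forecasts.

```py
# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col])**2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores
```

Running the function first returns the overall RMSE regardless of day, then a list of RMSE scores, one for each day.

### Train and Test Sets

We will use the first three years of data for training predictive models and the final year for evaluating them.

The data in a given dataset will be divided into standard weeks. These are weeks that begin on a Sunday and end on a Saturday.

This is a realistic and useful way of using the chosen framing of the model, where the power consumption for the week ahead can be predicted. It is also helpful for modeling, where models can be used to predict a specific day (e.g. Wednesday) or the entire sequence.

We will split the data into standard weeks, working backwards from the test dataset.

The final year of the data is 2010, and the first Sunday of 2010 was January 3rd. The data ends in mid-November 2010, and the closest final Saturday in the data is November 20th. This gives 46 weeks of test data.

The first and last rows of daily data for the test dataset are provided below for confirmation.

```py
2010-01-03,2083.4539999999984,191.61000000000055,350992.12000000034,8703.600000000033,3842.0,4920.0,10074.0,15888.233355799992
...
2010-11-20,2197.006000000004,153.76800000000028,346475.9999999998,9320.20000000002,4367.0,2947.0,11433.0,17869.76663959999
```

The daily data starts in late 2006.

The first Sunday in the dataset is December 17th, which is the second row of data.

Organizing the data into standard weeks gives 159 full standard weeks for training a predictive model.

```py
2006-12-17,3390.46,226.0059999999994,345725.32000000024,14398.59999999998,2033.0,4187.0,13341.0,36946.66673200004
...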
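2010-01-02,1309.2679999999998,199.54600000000016,352332.8399999997,5489.7999999999865,801.0,298.0,6425.0,14297.133406600002
```

The function _split_dataset()_ below splits the daily data into train and test sets and organizes each into standard weeks.

Specific row offsets are used to split the data, based on knowledge of the dataset. The split datasets are then organized into weekly data using the NumPy [split() function](https://docs.scipy.org/doc/numpy/reference/generated/numpy.split.html).

```py
# split a univariate dataset into train/test sets
def split_dataset(data):
    # split into standard weeks
    train, test = data[1:-328], data[-328:-6]
    # restructure into windows of weekly data
    train = array(split(train, len(train) // 7))
    test = array(split(test, len(test) // 7))
    return train, test
```

We can test this function by loading the daily dataset and printing the first and last rows of data from both the train and test sets to confirm they match the expectations above.

The complete code example is listed below.

```py
# split into standard weeks
from numpy import split
from numpy import array
from pandas import read_csv

# split a univariate dataset into train/test sets
def split_dataset(data):
    # split into standard weeks
    train, test = data[1:-328], data[-328:-6]
    # restructure into windows of weekly data
    train = array(split(train, len(train) // 7))
    test = array(split(test, len(test) // 7))
    return train, test

# load the new file
dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
train, test = split_dataset(dataset.values)
# validate train data
print(train.shape)
print(train[0, 0, 0], train[-1, -1, 0])
# validate test
print(test.shape)
print(test[0, 0, 0], test[-1, -1, 0])
```

Running the example shows that the train dataset does have 159 weeks of data, whereas the test dataset has 46 weeks.

We can see that the total active power for the first and last rows of the train and test datasets matches the data for the specific dates that we defined as the bounds of the standard weeks for each set.

```py
(159, 7, 8)
3390.46 1309.2679999999998
(46, 7, 8)
2083.4539999999984 2197.006000000004
```

### Walk-Forward Validation

Models will be evaluated using a scheme called [walk-forward validation](https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/).

Here, a model is required to make a one-week prediction, then the actual data for that week is made available to the model so that it can be used as the basis for making a prediction for the subsequent week. This is both realistic for how the model may be used in practice and beneficial to the models, allowing them to make use of the best available data.

We can demonstrate this below by separating the input data from the output/predicted data.

```py
Input,                      Predict
[Week1]                     Week2
[Week1 + Week2]             Week3
[Week1 + Week2 + Week3]     Week4
...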
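```

The walk-forward validation approach to evaluating predictive models on this dataset is provided below in a function named _evaluate_model()_.

The train and test datasets in standard-week format are provided to the function as arguments. An additional argument, _n_input_, defines the number of prior observations that the model will use as input in order to make a prediction.

Two new functions are called: one to build a model from the training data, called _build_model()_, and another that uses the model to make forecasts for each new standard week, called _forecast()_. These are covered in subsequent sections.

We are working with neural networks, and as such they are generally slow to train but fast to evaluate. This means that the preferred usage of the models is to build them once on historical data and to use them to forecast each step of the walk-forward validation. The models are static (i.e. not updated) during their evaluation.

This is different from other models that are faster to train, where a model may be re-fit or updated at each step of the walk-forward validation as new data becomes available. With sufficient resources, it is possible to use neural networks this way, but we will not in this tutorial.

The complete _evaluate_model()_ function is listed below.

```py
# evaluate a single model
def evaluate_model(train, test, n_input):
    # fit model
    model = build_model(train, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores
```

Once we have the evaluation for a model, we can summarize the performance.

The function below, named _summarize_scores()_, displays the performance of a model as a single line for easy comparison with other models.

```py
# summarize scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s' % (name, score, s_scores))
```

We now have all of the elements needed to begin evaluating predictive models on the dataset.

## LSTMs for Multi-Step Forecasting

Recurrent neural networks, or RNNs, are specifically designed to work with, learn from, and predict sequence data.

A recurrent neural network is a neural network where the output of the network from one time step is provided as an input in the subsequent time step. This allows the model to decide what to predict based on both the input for the current time step and direct knowledge of what was output in the prior time step.

Perhaps the most successful and widely used RNN is the long short-term memory network, or LSTM for short. It is successful because it overcomes the challenges involved in training a recurrent neural network, resulting in stable models. In addition to the recurrent connections that make use of the output from the prior time step, LSTMs have an internal memory that operates like a local variable, allowing them to accumulate state over the input sequence.

For more information about recurrent neural networks, see the post:

* [Crash Course in Recurrent Neural Networks for Deep Learning](https://machinelearningmastery.com/crash-course-recurrent-neural-networks-deep-learning/)

For more information about long short-term memory networks, see the post:

* [A Gentle Introduction to Long Short-Term Memory Networks by the Experts](https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/)

LSTMs offer a number of benefits for multi-step time series forecasting; they are:

* **Native support for sequences**. LSTMs are a type of recurrent network and as such are designed to take sequence data as input, unlike other models where lag observations must be presented as input features.
* **Multivariate inputs**. LSTMs directly support multiple parallel input sequences for multivariate inputs, unlike other models where multivariate inputs are presented in a flat structure.
* **Vector output**. Like other neural networks, LSTMs can map input data directly to an output vector that may represent multiple output time steps.

Further, specialized architectures have been developed that are specifically designed to make multi-step sequence predictions, generally referred to as sequence-to-sequence prediction, or seq2seq for short. This is useful because multi-step time series forecasting is a type of seq2seq prediction.

An example of a recurrent neural network architecture designed for seq2seq problems is the encoder-decoder LSTM.

An encoder-decoder LSTM is a model made up of two sub-models: one called the encoder, which reads the input sequence and compresses it to a fixed-length internal representation, and an output model called the decoder, which interprets the internal representation and uses it to predict the output sequence.

The encoder-decoder approach to sequence prediction has proven much more effective than outputting a vector directly and is the preferred approach.

Generally, LSTMs have been found not to be very effective at autoregression-type problems. These are problems where forecasting the next time step is a function of recent time steps.

For more on this issue, see the post:

* [On the Suitability of Long Short-Term Memory Networks for Time Series Forecasting](https://machinelearningmastery.com/suitability-long-short-term-memory-networks-time-series-forecasting/)

One-dimensional convolutional neural networks, or CNNs, have proven effective at automatically learning features from input sequences.

A popular approach has been to combine CNNs with LSTMs, where the CNN acts as an encoder that learns features from sub-sequences of the input data, which are then provided as time steps to an LSTM. This architecture is called a [CNN-LSTM](https://machinelearningmastery.com/cnn-long-short-term-memory-networks/).

For more information on this architecture, see the post:

* [CNN Long Short-Term Memory Networks](https://machinelearningmastery.com/cnn-long-short-term-memory-networks/)

A powerful variation on the CNN-LSTM architecture is the ConvLSTM, which uses the convolutional reading of input sub-sequences directly within an LSTM's units. This approach has proven very effective for time series classification and can be adapted for use in multi-step time series forecasting.

In this tutorial, we will explore a suite of LSTM architectures for multi-step time series forecasting. Specifically, we will look at how to develop the following models:

* **LSTM** model with vector output for multi-step forecasting with univariate input data.
* **Encoder-Decoder LSTM** model for multi-step forecasting with univariate input data.
* **Encoder-Decoder LSTM** model for multi-step forecasting with multivariate input data.
* **CNN-LSTM Encoder-Decoder** model for multi-step forecasting with univariate input data.
* **ConvLSTM Encoder-Decoder** model for multi-step forecasting with univariate input data.

If you are new to using LSTMs for time series forecasting, I highly recommend the post:

* [How to Develop LSTM Models for Time Series Forecasting](https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/)

The models will be developed and demonstrated on the household power prediction problem. A model is considered skillful if it performs better than a naive model, which achieves an overall RMSE of about 465 kilowatts across a seven-day forecast.

We will not focus on tuning these models to achieve optimal performance; instead, we will stop short at models that are skillful compared to a naive forecast. The chosen structures and hyperparameters were selected with a little trial and error. The scores should be taken only as an example, rather than as a study of the optimal model or configuration for the problem.

Given the stochastic nature of the models, it is [good practice](https://machinelearningmastery.com/evaluate-skill-deep-learning-models/) to evaluate a given model multiple times and report the mean performance on the test dataset. In the interest of brevity and keeping the code simple, we will instead present single runs of models in this tutorial.

We cannot know which approach will be the most effective for a given multi-step forecasting problem. It is a good idea to explore a suite of methods in order to discover what works best on your specific dataset.

## LSTM Model With Univariate Input and Vector Output

We will start by developing a simple or vanilla LSTM model that reads in a sequence of days of total daily power consumption and predicts a vector output of the next standard week of daily power consumption.

This provides the foundation for the more elaborate models developed in subsequent sections.

The number of prior days used as input defines the one-dimensional (1D) sub-sequence of data that the LSTM will read and learn to extract features from. Some ideas on the size and nature of this input include:

* All prior days, up to years' worth of data.
* The prior seven days.
* The prior two weeks.
* The prior one month.
* The prior one year.
* The prior week and the week to be predicted from one year ago.

There is no right answer; instead, each approach and more can be tested, and the performance of the model can be used to choose the nature of the input that results in the best model performance.

These choices define a few things:

* How the training data must be prepared in order to fit the model.
* How the test data must be prepared in order to evaluate the model.
* How to use the model to make predictions with a final model in the future.

A good starting point is to use the prior seven days.

An LSTM model expects data to have the shape:

```py
[samples, timesteps, features]
```

One sample comprises seven time steps with one feature, for the seven days of total daily power consumed.

The training dataset has 159 weeks of data, so the shape of the training dataset would be:

```py
[159, 7, 1]
```

This is a good start. Data in this format would use the prior standard week to predict the next standard week. One problem is that 159 instances is not a lot for training a neural network.

A way to create much more training data is to change the problem during training so that the model predicts the next seven days given the prior seven days, regardless of the standard week.

This only affects the training data; the test problem remains the same: predict the daily power consumption for the next standard week given the prior standard week.

This requires a little preparation of the training data.

The training data is provided in standard weeks with eight variables, specifically in the shape [_159, 7, 8_]. The first step is to flatten the data so that we have eight time series sequences.

```py
# flatten data
data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
```

We then need to iterate over the time steps and divide the data into overlapping windows; each iteration moves along one time step and predicts the subsequent seven days.

For example:

```py
Input, Output
[d01, d02, d03, d04, d05, d06, d07], [d08, d09, d10, d11, d12, d13, d14]
[d02, d03, d04, d05, d06, d07, d08], [d09, d10, d11, d12, d13, d14, d15]
...
```

We can do this by keeping track of start and end indexes for the inputs and outputs as we iterate across the length of the flattened data in terms of time steps.

We can also do this in a way where the number of inputs and outputs is parameterized (e.g. _n_input_, _n_out_), so that you can experiment with different values or adapt it for your own problem.

Below is a function named _to_supervised()_ that takes a list of weeks (history) and the number of time steps to use as inputs and outputs, and returns the data in the overlapping moving window format.

```py
# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end < len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)
```

When we run this function on the entire training dataset, we transform 159 samples into 1,099; specifically, the transformed dataset has the shapes _X=[1099, 7, 1]_ and _y=[1099, 7]_.

Next, we can define and fit the LSTM model on the training data.

This multi-step time series forecasting problem is an autoregression. That means it is likely best modeled where the next seven days are some function of observations at prior time steps. This, together with the relatively small amount of data, means that a small model is required.

We will develop a model with a single hidden LSTM layer with 200 units. The number of units in the hidden layer is unrelated to the number of time steps in the input sequence. The LSTM layer is followed by a fully connected layer with 100 nodes that will interpret the features learned by the LSTM layer. Finally, an output layer directly predicts a vector with seven elements, one for each day in the output sequence.

We will use the mean squared error loss function, as it is a good match for our chosen RMSE error metric. We will use the efficient [Adam implementation](https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/) of stochastic gradient descent and fit the model for 70 epochs with a batch size of 16.

The small batch size and the stochastic nature of the algorithm mean that the same model will learn a slightly different mapping of inputs to outputs each time it is trained. This means results may vary when the model is evaluated. You can try running the model multiple times and calculating an average of model performance.

The _build_model()_ function below prepares the training data, defines the model, and fits the model on the training data, returning the fit model ready for making predictions.

```py
# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 70, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model
```

Now that we know how to fit the model, we can look at how the model can be used to make a prediction.

Generally, the model expects data to have the same three-dimensional shape when making a prediction.

In this case, the expected shape of an input pattern is one sample, seven days of one feature for the daily power consumed:

```py
[1, 7, 1]
```

Data must have this shape when making predictions for the test set and when a final model is used to make predictions in the future. If you change the number of input days to 14, then the shape of the training data and the shape of new samples when making predictions must be changed accordingly to have 14 time steps. It is a modeling choice that you must carry forward when using the model.

We are using walk-forward validation to evaluate the model, as described in the previous section.

This means that we have the observations for the prior week available in order to predict the coming week. These are collected into a list of standard weeks called history.

In order to predict the next standard week, we need to retrieve the last days of observations. As with the training data, we must first flatten the history data to remove the weekly structure so that we end up with eight parallel time series.

```py
# flatten data
data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
```

Next, we need to retrieve the last seven days of daily total power consumed (feature index 0).

We parameterize this, as we did for the training data, so that the number of prior days used as input by the model can be modified in the future.

```py
# retrieve last observations for input data
input_x = data[-n_input:, 0]
```

Next, we reshape the input into the expected three-dimensional structure.

```py
# reshape into [1, n_input, 1]
input_x = input_x.reshape((1, len(input_x), 1))
```

We then make a prediction using the fit model and the input data and retrieve the vector of seven days of output.

```py
# forecast the next week
yhat = model.predict(input_x, verbose=0)
# we only want the vector forecast
yhat = yhat[0]
```

The _forecast()_ function below implements this. It takes as arguments the model fit on the training dataset, the history of data observed so far, and the number of input time steps expected by the model.

```py
# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)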
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [1, n_input, 1]
    input_x = input_x.reshape((1, len(input_x), 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat
```

That's it; we now have everything we need to make multi-step time series forecasts with an LSTM model on the univariate dataset of daily total power consumed.

We can tie all of this together. The complete example is listed below.

```py
# univariate multi-step lstm
from math import sqrt
from numpy import split
from numpy import array
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import LSTM

# split a univariate dataset into train/test sets
def split_dataset(data):
    # split into standard weeks
    train, test = data[1:-328], data[-328:-6]
    # restructure into windows of weekly data
    train = array(split(train, len(train) // 7))
    test = array(split(test, len(test) // 7))
    return train, test

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col])**2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores

# summarize scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s' % (name, score, s_scores))

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end < len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)

# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 70, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [1, n_input, 1]
    input_x = input_x.reshape((1, len(input_x), 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model(train, test, n_input):
    # fit model
    model = build_model(train, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores

# load the new file
dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
# split into train and test
train, test = split_dataset(dataset.values)
# evaluate model and get scores
n_input = 7
score, scores = evaluate_model(train, test, n_input)
# summarize scores
summarize_scores('lstm', score, scores)
# plot scores
days = ['sun', 'mon', 'tue', 'wed', 'thr', 'fri', 'sat']
pyplot.plot(days, scores, marker='o', label='lstm')
pyplot.show()
```

Running the example fits and evaluates the model, printing the overall RMSE across all seven days and the per-day RMSE for each lead time.

Your specific results may vary given the stochastic nature of the algorithm. You may want to try running the example a few times.

We can see that in this case the model is skillful compared to a naive forecast, achieving an overall RMSE of about 399 kilowatts, less than the 465 kilowatts achieved by a naive model.

```py
lstm: [399.456] 419.4, 422.1, 384.5, 395.1, 403.9, 317.7, 441.5
```

A plot of the daily RMSE is also created.

The plot shows that perhaps Tuesdays and Fridays are easier days to forecast than the others, and that perhaps Saturday, at the end of the standard week, is the hardest day to forecast.

![Line Plot of RMSE per Day for Univariate LSTM with Vector Output and 7-day Inputs](https://img.kancloud.cn/b3/b3/b3b3e7b9a5f980ca6d9d8178937ec0fe_1280x960.jpg)

Line Plot of RMSE per Day for Univariate LSTM with Vector Output and 7-day Inputs

We can increase the number of prior days used as input from 7 to 14 by changing the _n_input_ variable.

```py
# evaluate model and get scores
n_input = 14
```

Re-running the example with this change first prints a summary of the performance of the model.

Your specific results may vary; try running the example a few times.

In this case, we can see a further drop in the overall RMSE to about 370 kilowatts, suggesting that further tuning of the input size, and perhaps the number of nodes in the model, may result in better performance.

```py
lstm: [370.028] 387.4, 377.9, 334.0, 371.2, 367.1, 330.4, 415.1
```

Comparing the per-day RMSE scores, we see that some are better and some are worse than with seven-day inputs.

This may suggest a benefit in using the two different sized inputs in some way, such as an ensemble of the two approaches or perhaps a single model (e.g. a multi-headed model) that reads the training data in different ways.

![Line Plot of RMSE per Day for Univariate LSTM with Vector Output and 14-day Inputs](https://img.kancloud.cn/2b/eb/2beb1459de6997edd5fe3a4f52800314_1280x960.jpg)

Line Plot of RMSE per Day for Univariate LSTM with Vector Output and 14-day Inputs

## Encoder-Decoder LSTM Model With Univariate Input

In this section, we can update the vanilla LSTM to use an encoder-decoder model.

This means that the model will not output a vector sequence directly. Instead, the model is made up of two sub-models: the encoder, which reads and encodes the input sequence, and the decoder, which reads the encoded input sequence and makes a one-step prediction for each element in the output sequence.

The difference is subtle, as in practice both approaches do in fact predict a sequence output.

The important difference is that an LSTM model is used in the decoder, allowing it both to know what was predicted for the prior day in the sequence and to accumulate internal state while outputting the sequence.

Let's take a closer look at how this model is defined.

As before, we define an LSTM hidden layer with 200 units. This is the encoder model, which reads the input sequence and outputs a 200-element vector (one output per unit) that captures features from the input sequence. We will use 14 days of total power consumption as input.

```py
# define model
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
```

We will use a simple encoder-decoder architecture that is easy to implement in Keras and has a lot of similarity to the architecture of an LSTM autoencoder.

First, the internal representation of the input sequence is repeated multiple times, once for each time step in the output sequence. This sequence of vectors is presented to the LSTM decoder.

```py
model.add(RepeatVector(7))
```

We then define the decoder as an LSTM hidden layer with 200 units. Importantly, the decoder outputs the entire sequence, not just the output at the end of the sequence as we did with the encoder. This means that each of the 200 units outputs a value for each of the seven days, representing the basis for what to predict for each day in the output sequence.

```py
model.add(LSTM(200, activation='relu', return_sequences=True))
```

We then use a fully connected layer to interpret each time step in the output sequence before the final output layer. Importantly, the output layer predicts a single step in the output sequence, not all seven days at a time.

This means that the same layers are applied to each step in the output sequence: the same fully connected layer and output layer are used to process each time step provided by the decoder. To achieve this, we wrap the interpretation layer and the output layer in a [TimeDistributed wrapper](https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/), which allows the wrapped layers to be used for each time step from the decoder.

```py
model.add(TimeDistributed(Dense(100, activation='relu')))
model.add(TimeDistributed(Dense(1)))
```

This allows the LSTM decoder to work out the context required for each step in the output sequence, and the wrapped dense layers to interpret each time step separately while reusing the same weights to perform the interpretation. An alternative would be to flatten all of the structure created by the LSTM decoder and output the vector directly. You can try this as an extension to see how it compares.

The network therefore outputs a three-dimensional array with the same structure as the input, with the dimensions [_samples, timesteps, features_].

There is a single feature, the daily total power consumed, and there are always seven time steps. A single one-week prediction therefore has the size [_1, 7, 1_].

Therefore, when training the model, we must restructure the output data (_y_) to have the three-dimensional structure instead of the two-dimensional structure of [_samples, features_] used in the previous section.

```py
# reshape output into [samples, timesteps, features]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
```

We can tie all of this together into the updated _build_model()_ function listed below.

```py
# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 20, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(RepeatVector(n_outputs)) model.add(LSTM(200, activation='relu', return_sequences=True)) model.add(TimeDistributed(Dense(100, activation='relu'))) model.add(TimeDistributed(Dense(1))) model.compile(loss='mse', optimizer='adam') # fit network model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose) return model ``` 下面列出了編碼器 - 解碼器模型的完整示例。 ```py # univariate multi-step encoder-decoder lstm from math import sqrt from numpy import split from numpy import array from pandas import read_csv from sklearn.metrics import mean_squared_error from matplotlib import pyplot from keras.models import Sequential from keras.layers import Dense from keras.layers import Flatten from keras.layers import LSTM from keras.layers import RepeatVector from keras.layers import TimeDistributed # split a univariate dataset into train/test sets def split_dataset(data): # split into standard weeks train, test = data[1:-328], data[-328:-6] # restructure into windows of weekly data train = array(split(train, len(train)/7)) test = array(split(test, len(test)/7)) return train, test # evaluate one or more weekly forecasts against expected values def evaluate_forecasts(actual, predicted): scores = list() # calculate an RMSE score for each day for i in range(actual.shape[1]): # calculate mse mse = mean_squared_error(actual[:, i], predicted[:, i]) # calculate rmse rmse = sqrt(mse) # store scores.append(rmse) # calculate overall RMSE s = 0 for row in range(actual.shape[0]): for col in range(actual.shape[1]): s += (actual[row, col] - predicted[row, col])**2 score = sqrt(s / (actual.shape[0] * actual.shape[1])) return score, scores # summarize scores def summarize_scores(name, score, scores): s_scores = ', '.join(['%.1f' % s for s in scores]) print('%s: [%.3f] %s' % (name, score, s_scores)) # convert history into inputs and outputs def to_supervised(train, n_input, n_out=7): # flatten data data = train.reshape((train.shape[0]*train.shape[1], train.shape[2])) X, y = list(), 
list() in_start = 0 # step over the entire history one time step at a time for _ in range(len(data)): # define the end of the input sequence in_end = in_start + n_input out_end = in_end + n_out # ensure we have enough data for this instance if out_end < len(data): x_input = data[in_start:in_end, 0] x_input = x_input.reshape((len(x_input), 1)) X.append(x_input) y.append(data[in_end:out_end, 0]) # move along one time step in_start += 1 return array(X), array(y) # train the model def build_model(train, n_input): # prepare data train_x, train_y = to_supervised(train, n_input) # define parameters verbose, epochs, batch_size = 0, 20, 16 n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1] # reshape output into [samples, timesteps, features] train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1)) # define model model = Sequential() model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features))) model.add(RepeatVector(n_outputs)) model.add(LSTM(200, activation='relu', return_sequences=True)) model.add(TimeDistributed(Dense(100, activation='relu'))) model.add(TimeDistributed(Dense(1))) model.compile(loss='mse', optimizer='adam') # fit network model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose) return model # make a forecast def forecast(model, history, n_input): # flatten data data = array(history) data = data.reshape((data.shape[0]*data.shape[1], data.shape[2])) # retrieve last observations for input data input_x = data[-n_input:, 0] # reshape into [1, n_input, 1] input_x = input_x.reshape((1, len(input_x), 1)) # forecast the next week yhat = model.predict(input_x, verbose=0) # we only want the vector forecast yhat = yhat[0] return yhat # evaluate a single model def evaluate_model(train, test, n_input): # fit model model = build_model(train, n_input) # history is a list of weekly data history = [x for x in train] # walk-forward validation over each week predictions = list() for i 
in range(len(test)): # predict the week yhat_sequence = forecast(model, history, n_input) # store the predictions predictions.append(yhat_sequence) # get real observation and add to history for predicting the next week history.append(test[i, :]) # evaluate predictions days for each week predictions = array(predictions) score, scores = evaluate_forecasts(test[:, :, 0], predictions) return score, scores # load the new file dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime']) # split into train and test train, test = split_dataset(dataset.values) # evaluate model and get scores n_input = 14 score, scores = evaluate_model(train, test, n_input) # summarize scores summarize_scores('lstm', score, scores) # plot scores days = ['sun', 'mon', 'tue', 'wed', 'thr', 'fri', 'sat'] pyplot.plot(days, scores, marker='o', label='lstm') pyplot.show() ``` 運行該示例適合模型并總結測試數據集的表現。鑒于算法的隨機性，您的具體結果可能會有所不同。您可能想嘗試幾次運行該示例。我們可以看到，在這種情況下，該模型非常巧妙，總體 RMSE 得分約為 372 千瓦。 ```py lstm: [372.595] 379.5, 399.8, 339.6, 372.2, 370.9, 309.9, 424.8 ``` 還創建了每日 RMSE 的線圖，顯示了與上一節中看到的類似的錯誤模式。 ![Line Plot of RMSE per Day for Univariate Encoder-Decoder LSTM with 14-day Inputs](https://img.kancloud.cn/d9/e9/d9e922a450d7b586c7cde4ebea76ac69_1280x960.jpg) 具有 14 天輸入的單變量編碼器 - 解碼器 LSTM 每天 RMSE 的線圖 ## 具有多變量輸入的編碼器 - 解碼器 LSTM 模型在本節中，我們將更新上一節中開發的編碼器 - 解碼器 LSTM，以使用八個時間序列變量中的每一個來預測下一個標準周的每日總功耗。我們將通過將每個一維時間序列作為單獨的輸入序列提供給模型來實現此目的。 LSTM 將依次創建每個輸入序列的內部表示，其將由解碼器一起解釋。使用多變量輸入有助于那些輸出序列是來自多個不同特征的先前時間步驟的觀察的某些功能的問題，而不僅僅是（或包括）預測的特征。目前還不清楚功耗問題是否屬于這種情況，但我們仍可以探索它。首先，我們必須更新訓練數據的準備工作，以包括所有八項功能，而不僅僅是每日消耗的一項功能。它需要單行更改： ```py X.append(data[in_start:in_end, :]) ``` 下面列出了具有此更改的完整 _to_supervised（）_ 功能。 ```py # convert history into inputs and outputs def to_supervised(train, n_input, n_out=7): # flatten data data = train.reshape((train.shape[0]*train.shape[1], train.shape[2])) X, y = list(), list() in_start = 0 # step over the entire history one time step at a 
    # time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end < len(data):
            X.append(data[in_start:in_end, :])
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)
```

We must also update the function that makes forecasts with the fit model so that it uses all eight features from the prior time steps. Again, this is a small change:

```py
# retrieve last observations for input data
input_x = data[-n_input:, :]
# reshape into [1, n_input, n]
input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
```

The complete _forecast()_ function with this change is listed below:

```py
# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, :]
    # reshape into [1, n_input, n]
    input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat
```

The same model architecture and configuration are used directly, although we increase the number of training epochs from 20 to 50 given the eight-fold increase in the amount of input data.

The complete example is listed below.

```py
# multivariate multi-step encoder-decoder lstm
from math import sqrt
from numpy import split
from numpy import array
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import LSTM
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a univariate dataset into train/test sets
def split_dataset(data):
    # split into standard weeks
    train, test = data[1:-328], data[-328:-6]
    # restructure into windows of weekly data (integer division keeps the section count an int)
    train = array(split(train, len(train) // 7))
    test = array(split(test, len(test) // 7))
    return train, test

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    scores = list()
    # calculate an
    # RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col])**2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores

# summarize scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s' % (name, score, s_scores))

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end < len(data):
            X.append(data[in_start:in_end, :])
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)

# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 50, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, :]
    # reshape into [1, n_input, n]
    input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model(train, test, n_input):
    # fit model
    model = build_model(train, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores

# load the new file
dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
# split into train and test
train, test = split_dataset(dataset.values)
# evaluate model and get scores
n_input = 14
score, scores = evaluate_model(train, test, n_input)
# summarize scores
summarize_scores('lstm', score, scores)
# plot scores
days = ['sun', 'mon', 'tue', 'wed', 'thr', 'fri', 'sat']
pyplot.plot(days, scores, marker='o', label='lstm')
pyplot.show()
```

Running the example fits the model and summarizes its performance on the test dataset.

Experimentation found that this model appears less stable than the univariate case, which may be related to the differing scales of the eight input variables.

Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

We can see that in this case the model is skillful, achieving an overall RMSE of about 376 kilowatts.

```py
lstm: [376.273] 378.5, 381.5, 328.4, 388.3, 361.2, 308.0, 467.2
```

A line plot of the per-day RMSE is also created.

![Line Plot of RMSE per Day for Multivariate Encoder-Decoder LSTM with 14-day
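Inputs](https://img.kancloud.cn/f4/b6/f4b624192b4ac7d4253e174c954f108f_1280x960.jpg)

Line Plot of RMSE per Day for Multivariate Encoder-Decoder LSTM with 14-day Inputs

## CNN-LSTM Encoder-Decoder Model With Univariate Input

A convolutional neural network, or CNN, can be used as the encoder in an encoder-decoder architecture.

The CNN does not directly support sequence input; instead, a 1D CNN is capable of reading across sequence input and automatically learning the salient features. These can then be interpreted by an LSTM decoder as usual. We refer to hybrid models that use a CNN and an LSTM as [CNN-LSTM models](https://machinelearningmastery.com/cnn-long-short-term-memory-networks/), and in this case we use them together in an encoder-decoder architecture.

The CNN expects the input data to have the same 3D structure as the LSTM model, although multiple features are read as different channels, which ultimately has the same effect.

We will simplify the example and focus on the CNN-LSTM with univariate input, but it can just as easily be updated to use multivariate input, which is left as an exercise.

As before, we will use input sequences comprised of 14 days of daily total power consumption.

We will define a simple but effective CNN architecture for the encoder, comprised of two convolutional layers followed by a max pooling layer, the results of which are then flattened.

The first convolutional layer reads across the input sequence and projects the results onto feature maps. The second performs the same operation on the feature maps created by the first layer, attempting to amplify any salient features. We will use 64 feature maps per convolutional layer and read the input sequences with a kernel size of three time steps.

The max pooling layer simplifies the feature maps by keeping only the larger of each pair of adjacent values (pool_size=2), which halves their length. With a 14-step input and the default unpadded convolutions, the feature maps shrink from 14 to 12 to 10 time steps across the two convolutional layers, and to 5 after pooling. The distilled feature maps after the pooling layer are then flattened into one long vector that can be used as input to the decoding process.

```py
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
```

The decoder is the same as defined in the previous sections.

The only other change is to set the number of training epochs to 20.

The _build_model()_ function with these changes is listed below.

```py
# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 20, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y,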
        epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model
```

We are now ready to try the encoder-decoder architecture with a CNN encoder.

The complete code listing is provided below.

```py
# univariate multi-step encoder-decoder cnn-lstm
from math import sqrt
from numpy import split
from numpy import array
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import LSTM
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.layers import Conv1D
from keras.layers import MaxPooling1D

# split a univariate dataset into train/test sets
def split_dataset(data):
    # split into standard weeks
    train, test = data[1:-328], data[-328:-6]
    # restructure into windows of weekly data (integer division keeps the section count an int)
    train = array(split(train, len(train) // 7))
    test = array(split(test, len(test) // 7))
    return train, test

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col])**2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores

# summarize scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s' % (name, score, s_scores))

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end < len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)

# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 20, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [1, n_input, 1]
    input_x = input_x.reshape((1, len(input_x), 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model(train, test, n_input):
    # fit model
    model = build_model(train, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores

# load the new file
dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
# split into train and test
train, test = split_dataset(dataset.values)
# evaluate model and get scores
n_input = 14
score, scores = evaluate_model(train, test, n_input)
# summarize scores
summarize_scores('lstm', score, scores)
# plot scores
days = ['sun', 'mon', 'tue', 'wed', 'thr', 'fri', 'sat']
pyplot.plot(days, scores, marker='o', label='lstm')
pyplot.show()
```

Running the example fits the model and summarizes its performance on the test dataset.

A little experimentation showed that using two convolutional layers made the model more stable than using just a single layer.

Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

We can see that in this case the model is skillful, achieving an overall RMSE of about 372 kilowatts.

```py
lstm: [372.055] 383.8, 381.6, 339.1, 371.8, 371.8, 319.6, 427.2
```

A line plot of the per-day RMSE is also created.

![Line Plot of RMSE per Day for Univariate Encoder-Decoder CNN LSTM with 14-day Inputs](https://img.kancloud.cn/88/77/8877062d1f95e0e1084ebf8dc406c706_1280x960.jpg)

Line Plot of RMSE per Day for Univariate Encoder-Decoder CNN LSTM with 14-day Inputs

## ConvLSTM Encoder-Decoder Model With Univariate Input

A further extension of the CNN-LSTM approach is to perform the convolutions of the CNN (e.g. how the CNN reads the input sequence data) as part of the LSTM for each time step.

This combination is called a Convolutional LSTM, or ConvLSTM for short, and like the CNN-LSTM it is also used for spatio-temporal data.

Unlike an LSTM, which reads the data in directly to calculate internal state and state transitions, and unlike the CNN-LSTM, which interprets the output of CNN models, the ConvLSTM uses convolutions directly as part of reading input into the LSTM units themselves.

For more information on how the ConvLSTM equations are calculated within the LSTM unit, see the paper:

* [Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting](https://arxiv.org/abs/1506.04214v1), 2015.

The Keras library provides the [ConvLSTM2D class](https://keras.io/layers/recurrent/#convlstm2d), which supports the ConvLSTM model for 2D data. It can be configured for 1D multivariate time series forecasting.

By default, the ConvLSTM2D class expects input data to have the shape:

```py
[samples, timesteps, rows, cols, channels]
```

where each time step of data is defined as an image of (_rows * columns_) data points.

We are working with a one-dimensional sequence of total power consumption. If we assume that we use two weeks of data as input, we can interpret it as one row with 14
columns.

For the ConvLSTM, this would be a single read: that is, the LSTM would read one time step of 14 days and perform a convolution across those 14 days.

This is not ideal.

Instead, we can split the 14 days into two subsequences of seven days each. The ConvLSTM can then read across the two time steps and perform the CNN process on the seven days of data within each.

For this chosen framing of the problem, the input to the ConvLSTM2D would therefore be:

```py
[n, 2, 1, 7, 1]
```

Or:

* **Samples**: n, for the number of examples in the training dataset.
* **Time**: 2, for the two subsequences that we split the 14-day window into.
* **Rows**: 1, for the one-dimensional shape of each subsequence.
* **Columns**: 7, for the seven days in each subsequence.
* **Channels**: 1, for the single feature that we are working with as input.

You can explore other configurations, such as providing 21 days of input split into three subsequences of seven days, and/or providing all eight features or channels as input.

We can now prepare the data for the ConvLSTM2D model.

First, we must reshape the training dataset into the expected structure of [_samples, time steps, rows, cols, channels_].

```py
# reshape into subsequences [samples, time steps, rows, cols, channels]
train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
```

We can then define the encoder as a ConvLSTM hidden layer followed by a flatten layer, ready for decoding.

```py
model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
model.add(Flatten())
```

We will also parameterize the number of subsequences (_n_steps_) and the length of each subsequence (_n_length_) and pass them as arguments.

The rest of the model and training is the same. The _build_model()_ function with these changes is listed below.

```py
# train the model
def build_model(train, n_steps, n_length, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 20, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape into subsequences [samples, time steps, rows, cols, channels]
    train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model
```

The model expects five-dimensional data as input. Therefore, we must also update the preparation of a single sample in the _forecast()_ function when making a prediction.

```py
# reshape into [samples, time steps, rows, cols, channels]
input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
```

The _forecast()_ function with this change and with the parameterized subsequences is provided below.

```py
# make a forecast
def forecast(model, history, n_steps, n_length, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [samples, time steps, rows, cols, channels]
    input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat
```

We now have all of the elements needed to evaluate an encoder-decoder architecture for multi-step time series forecasting in which a ConvLSTM is used as the encoder.

The complete code example is listed below.

```py
# univariate multi-step encoder-decoder convlstm
from math import sqrt
from numpy import split
from numpy import array
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import LSTM
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.layers import ConvLSTM2D

# split a univariate dataset into train/test sets
def split_dataset(data):
    # split into standard weeks
    train, test = data[1:-328], data[-328:-6]
    # restructure into windows of weekly data (integer division keeps the section count an int)
    train = array(split(train, len(train) // 7))
    test = array(split(test, len(test) // 7))
    return train, test

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col])**2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores

# summarize scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s' % (name, score, s_scores))

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end < len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)

# train the model
def build_model(train, n_steps, n_length, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 20, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape into subsequences [samples, time steps, rows, cols, channels]
    train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

# make a forecast
def forecast(model, history, n_steps, n_length, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [samples, time steps, rows, cols, channels]
    input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model(train, test, n_steps, n_length, n_input):
    # fit model
    model = build_model(train, n_steps, n_length, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_steps, n_length, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores

# load the new file
dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
# split into train and test
train, test = split_dataset(dataset.values)
# define the number of subsequences and the length of subsequences
n_steps, n_length = 2, 7
# define the total days to use as input
n_input = n_length * n_steps
score, scores = evaluate_model(train, test, n_steps, n_length, n_input)
# summarize scores
summarize_scores('lstm', score, scores)
# plot scores
days = ['sun', 'mon', 'tue', 'wed', 'thr', 'fri', 'sat']
pyplot.plot(days, scores, marker='o', label='lstm')
pyplot.show()
```

Running the example fits the model and summarizes its performance on the test dataset.

A little experimentation showed that using two convolutional layers made the model more stable than using just a single layer.

We can see that in this case the model is skillful, achieving an overall RMSE of about 367 kilowatts.

```py
lstm: [367.929] 416.3, 379.7, 334.7, 362.3, 374.7, 284.8, 406.7
```

A line plot of the per-day RMSE is also created.

![Line Plot of RMSE per Day for Univariate Encoder-Decoder ConvLSTM with 14-day Inputs](https://img.kancloud.cn/b7/b7/b7b7712768828b8438ac1a8172241bcc_1280x960.jpg)

Line Plot of RMSE per Day for Univariate Encoder-Decoder ConvLSTM with 14-day Inputs

## Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

* **Size of Input**. Explore using more or fewer days as input to the models, such as three days, 21 days, or 30 days.
* **Model Tuning**. Tune the structure and hyperparameters of a model to further lift its average performance.
* **Data Scaling**. Explore whether data scaling, such as standardization and normalization, can improve the performance of any of the LSTM models.
* **Learning Diagnostics**. Use diagnostics such as learning curves of the training and validation loss and mean squared error to help tune the structure and hyperparameters of an LSTM model.

If you explore any of these extensions, I'd love to know.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Posts

* [4 Strategies for Multi-Step Time Series Forecasting](https://machinelearningmastery.com/multi-step-time-series-forecasting/)
* [Crash Course in Recurrent Neural Networks for Deep Learning](https://machinelearningmastery.com/crash-course-recurrent-neural-networks-deep-learning/)
* [A Gentle Introduction to Long Short-Term Memory Networks by the Experts](https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/)
* [On the Suitability of Long Short-Term Memory Networks for Time Series Forecasting](https://machinelearningmastery.com/suitability-long-short-term-memory-networks-time-series-forecasting/)
* [CNN Long Short-Term Memory Networks](https://machinelearningmastery.com/cnn-long-short-term-memory-networks/)
* [How to Develop an Encoder-Decoder Model for Sequence-to-Sequence Prediction in Keras](https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/)

### API

* [pandas.read_csv API](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)
* [pandas.DataFrame.resample API](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.resample.html)
* [Resample Offset Aliases](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases)
* [sklearn.metrics.mean_squared_error API](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html)
* [numpy.split API](https://docs.scipy.org/doc/numpy/reference/generated/numpy.split.html)

### Articles

* [Individual household electric power consumption Data Set, UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption).
* [AC power, Wikipedia](https://en.wikipedia.org/wiki/AC_power).
* [Convolutional LSTM
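Network: A Machine Learning Approach for Precipitation Nowcasting](https://arxiv.org/abs/1506.04214v1), 2015.

## Summary

In this tutorial, you discovered how to develop long short-term memory recurrent neural networks for multi-step time series forecasting of household power consumption.

Specifically, you learned:

* How to develop and evaluate univariate and multivariate encoder-decoder LSTMs for multi-step time series forecasting.
* How to develop and evaluate a CNN-LSTM encoder-decoder model for multi-step time series forecasting.
* How to develop and evaluate a ConvLSTM encoder-decoder model for multi-step time series forecasting.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.

**Note**: This post is an excerpt from the book "[Deep Learning for Time Series Forecasting](https://machinelearningmastery.com/deep-learning-for-time-series-forecasting/)". Take a look if you want more step-by-step tutorials on getting the most out of deep learning methods for time series forecasting problems.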
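
The five-dimensional input used in the ConvLSTM section can be hard to picture. The following is a minimal sketch, not part of the original tutorial, that mimics what `reshape((1, n_steps, 1, n_length, 1))` does to a single 14-day window. It uses plain Python lists so that it runs without NumPy or Keras. The helper name `to_subsequences` and the sample values are assumptions made purely for illustration.

```python
# Sketch (not from the original tutorial): split one flat input window into
# ConvLSTM2D-style subsequences using plain Python lists.

def to_subsequences(window, n_steps, n_length):
    """Return nested lists shaped [n_steps][1 row][n_length cols][1 channel]."""
    if len(window) != n_steps * n_length:
        raise ValueError('window length must equal n_steps * n_length')
    subsequences = []
    for step in range(n_steps):
        # take the next n_length consecutive days
        days = window[step * n_length:(step + 1) * n_length]
        # wrap each value as a single channel, and the days as a single row
        subsequences.append([[[value] for value in days]])
    return subsequences

# 14 hypothetical daily totals, numbered 1..14 so the ordering is easy to follow
window = list(range(1, 15))
sample = to_subsequences(window, n_steps=2, n_length=7)

# sizes of the four inner dimensions: time steps, rows, cols, channels
print(len(sample), len(sample[0]), len(sample[0][0]), len(sample[0][0][0]))
# days that fall into the first and second subsequence
print([v[0] for v in sample[0][0]])
print([v[0] for v in sample[1][0]])
```

The first subsequence holds days 1 to 7 and the second holds days 8 to 14, which is the same ordering that NumPy's default row-major reshape produces for this window.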