
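# How to Develop RNN Models for Human Activity Recognition Time Series Classification

> Original: [https://machinelearningmastery.com/how-to-develop-rnn-models-for-human-activity-recognition-time-series-classification/](https://machinelearningmastery.com/how-to-develop-rnn-models-for-human-activity-recognition-time-series-classification/)

Human activity recognition is the problem of classifying sequences of accelerometer data, recorded by specialized harnesses or smartphones, into known, well-defined movements.

Classical approaches to the problem involve hand-crafting features from the time series data based on fixed-size windows and training machine learning models, such as ensembles of decision trees. The difficulty is that this feature engineering requires strong expertise in the field.

Recently, deep learning methods such as recurrent neural networks like LSTMs, and variations that make use of one-dimensional convolutional neural networks (CNNs), have been shown to provide state-of-the-art results on challenging activity recognition tasks with little or no feature engineering, relying instead on feature learning from the raw data.

In this tutorial, you will discover three recurrent neural network architectures for modeling an activity recognition time series classification problem.

After completing this tutorial, you will know:

* How to develop a Long Short-Term Memory recurrent neural network for human activity recognition.
* How to develop a one-dimensional convolutional neural network LSTM, or CNN-LSTM, model.
* How to develop a one-dimensional convolutional LSTM, or ConvLSTM, model for the same problem.

Let's get started.

![How to Develop RNN Models for Human Activity Recognition Time Series Classification](https://img.kancloud.cn/4f/ef/4fef5cb7ea32c17d7570ee0186e17b03_640x427.jpg)

How to Develop RNN Models for Human Activity Recognition Time Series Classification. Photo by [Bonnie Moreland](https://www.flickr.com/photos/icetsarina/25033478158/), some rights reserved.

## Tutorial Overview

This tutorial is divided into four parts; they are:

1. Activity Recognition Using Smartphones Dataset
2. Develop an LSTM Network Model
3. Develop a CNN-LSTM Network Model
4. Develop a ConvLSTM Network Model

## Activity Recognition Using Smartphones Dataset

[Human activity recognition](https://en.wikipedia.org/wiki/Activity_recognition), or HAR for short, is the problem of predicting what a person is doing based on a trace of their movement recorded by sensors.

A standard human activity recognition dataset is the "Activity Recognition Using Smartphones Dataset", made available in 2012.

It was prepared and made available by Davide Anguita, et al. from the University of Genova, Italy, and is described in full in their 2013 paper "[A Public Domain Dataset for Human Activity Recognition Using Smartphones](https://upcommons.upc.edu/handle/2117/20897)". The dataset was modeled with machine learning algorithms in their 2012 paper titled "[Human Activity Recognition on Smartphones Using a Multiclass Hardware-Friendly Support Vector Machine](https://link.springer.com/chapter/10.1007/978-3-642-35395-6_30)".

The dataset can be downloaded for free from the UCI Machine Learning Repository:

* [Human Activity Recognition Using Smartphones Data Set, UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones)

The data was collected from 30 subjects aged between 19 and 48 years, each performing one of six standard activities while wearing a waist-mounted smartphone that recorded the movement data. Video was recorded of each subject performing the activities, and the movement data was labeled manually from these videos.

Below is an example video of a subject performing the activities while their movement data is being recorded.

<iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="375" src="https://www.youtube.com/embed/XOEN9W05_4A?feature=oembed" width="500"></iframe>

The six activities performed were as follows:

1. Walking
2. Walking Upstairs
3. Walking Downstairs
4. Sitting
5. Standing
6. Laying

The movement data recorded was the x, y, and z accelerometer data (linear acceleration) and gyroscopic data (angular velocity) from the smartphone, specifically a Samsung Galaxy S II. Observations were recorded at 50 Hz (i.e. 50 data points per second). Each subject performed the sequence of activities twice: once with the device on their left-hand side and once with the device on their right-hand side.

The raw data is not available. Instead, a pre-processed version of the dataset was made available. The pre-processing steps included:

* Pre-processing the accelerometer and gyroscope signals using noise filters.
* Splitting the data into fixed windows of 2.56 seconds (128 data points) with 50% overlap.
* Splitting the accelerometer data into gravitational (total) and body motion components.

Feature engineering was applied to the window data, and a copy of the data with these engineered features was made available. A number of time and frequency features commonly used in the field of human activity recognition were extracted from each window. The result was a 561-element vector of features.

The dataset was split into train (70%) and test (30%) sets by subject: 21 subjects for training and nine for testing.

Experiments with a support vector machine intended for use on a smartphone (e.g. using fixed-point arithmetic) resulted in a predictive accuracy of 89% on the test dataset, similar to the result of an unmodified SVM implementation.

The dataset is freely available and can be downloaded from the UCI Machine Learning Repository. The data is provided as a single zip file that is about 58 megabytes in size. The direct link for this download is below:

* [UCI HAR Dataset.zip](https://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip)

Download the dataset and unzip all files into a new directory named "HARDataset" in your current working directory.

## Develop an LSTM Network Model

In this section, we will develop a Long Short-Term Memory network model (LSTM) for the human activity recognition dataset.

LSTM network models are a type of recurrent neural network that is able to learn and remember over long sequences of input data. They are intended for use with data comprised of long sequences, up to 200 to 400 time steps. They may be a good fit for this problem.

The model can support multiple parallel sequences of input data, such as each axis of the accelerometer and gyroscope data. The model learns to extract features from sequences of observations and how to map the internal features to different activity types.

The benefit of using LSTMs for sequence classification is that they can learn from the raw time series data directly, and in turn do not require domain expertise to manually engineer input features. The model can learn an internal representation of the time series data and ideally achieve performance comparable to models fit on a version of the dataset with engineered features.

This section is divided into four parts; they are:

1. Load Data
2. Fit and Evaluate Model
3. Summarize Results
4. Complete Example

### Load Data

The first step is to load the raw dataset into memory.

There are three main signal types in the raw data: total acceleration, body acceleration, and body gyroscope. Each has three axes of data. This means that there are a total of nine variables for each time step.

Further, each series of data has been partitioned into overlapping windows of 2.56 seconds of data, or 128 time steps. These windows of data correspond to the windows of engineered features (rows) in the previous section.

This means that one row of data has (128 * 9), or 1,152, elements. This is slightly more than double the size of the 561-element vectors in the previous section, and it is likely that there is some redundant data.

The signals are stored in the `Inertial Signals/` directory under the train and test subdirectories. Each axis of each signal is stored in a separate file, meaning that each of the train and test datasets has nine input files to load and one output file to load. We can batch the loading of these files into groups given the consistent directory structures and file naming conventions.

The input data is in CSV format where columns are separated by whitespace. Each of these files can be loaded as a NumPy array. The _load_file()_ function below loads a dataset given the file path and returns the loaded data as a NumPy array.

```py
# load a single file as a numpy array
def load_file(filepath):
    # columns are separated by one or more whitespace characters
    dataframe = read_csv(filepath, header=None, sep=r'\s+')
    return dataframe.values
```

We can then load all data for a given group (train or test) into a single three-dimensional NumPy array, where the dimensions of the array are [_samples, time steps, features_].

To make this clearer, there are 128 time steps and nine features, where the number of samples is the number of rows in any given raw signal data file.

The _load_group()_ function below implements this behavior. The [dstack() NumPy function](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.dstack.html) allows us to stack each of the loaded 2D arrays into a single 3D array, where the variables are separated on the third dimension (features).

```py
# load a list of files into a 3D array of [samples, timesteps, features]
def load_group(filenames, prefix=''):
    loaded = list()
    for name in filenames:
        data = load_file(prefix + name)
        loaded.append(data)
    # stack group so that features are the 3rd dimension
    loaded = dstack(loaded)
    return loaded
```

We can use this function to load all input signal data for a given group, such as train or test.

The _load_dataset_group()_ function below loads all input signal data and the output data for a single group using the consistent naming conventions between the directories.

```py
# load a dataset group, such as train or test
def load_dataset_group(group, prefix=''):
    filepath = prefix + group + '/Inertial Signals/'
    # load all 9 files as a single array
    filenames = list()
    # total acceleration
    filenames += ['total_acc_x_'+group+'.txt', 'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
    # body acceleration
    filenames += ['body_acc_x_'+group+'.txt', 'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
    # body gyroscope
    filenames += ['body_gyro_x_'+group+'.txt', 'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
    # load input data
    X = load_group(filenames, filepath)
    # load class output
    y = load_file(prefix + group +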
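            '/y_'+group+'.txt')
    return X, y
```

Finally, we can load each of the train and test datasets.

The output data is defined as an integer for the class number. We must one hot encode these class integers so that the data is suitable for fitting a neural network multi-class classification model. Because the class labels run from 1 to 6, they are first shifted to start at 0 so that the encoding has exactly six columns. We can then encode them by calling the [to_categorical() Keras function](https://keras.io/utils/#to_categorical).

The _load_dataset()_ function below implements this behavior and returns the train and test X and y elements ready for fitting and evaluating the defined models.

```py
# load the dataset, returns train and test X and y elements
def load_dataset(prefix=''):
    # load all train
    trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
    print(trainX.shape, trainy.shape)
    # load all test
    testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
    print(testX.shape, testy.shape)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    print(trainX.shape, trainy.shape, testX.shape, testy.shape)
    return trainX, trainy, testX, testy
```

### Fit and Evaluate Model

Now that we have the data loaded into memory ready for modeling, we can define, fit, and evaluate an LSTM model.

We can define a function named _evaluate_model()_ that takes the train and test datasets, fits a model on the training dataset, evaluates it on the test dataset, and returns an estimate of the model's performance.

First, we must define the LSTM model using the Keras deep learning library. The model requires a three-dimensional input with [_samples, time steps, features_].

This is exactly how we have loaded the data, where one sample is one window of the time series data, each window has 128 time steps, and a time step has nine variables or features.

The output of the model will be a six-element vector containing the probability of a given window belonging to each of the six activity types.

The input and output dimensions are required when fitting the model, and we can extract them from the provided training dataset.

```py
n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
```

The model is defined as a Sequential Keras model, for simplicity.

We will define the model as having a single LSTM hidden layer. This is followed by a dropout layer intended to reduce overfitting of the model to the training data. Finally, a dense fully connected layer is used to interpret the features extracted by the LSTM hidden layer, before a final output layer is used to make predictions.

The efficient [Adam](https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/) version of stochastic gradient descent will be used to optimize the network, and the categorical cross entropy loss function will be used given that we are learning a multi-class classification problem.

The definition of the model is listed below.

```py
model = Sequential()
model.add(LSTM(100, input_shape=(n_timesteps,n_features)))
model.add(Dropout(0.5))
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```

The model is fit for a fixed number of epochs, in this case 15, and a batch size of 64 samples will be used, where 64 windows of data will be exposed to the model before the weights of the model are updated.

Once the model is fit, it is evaluated on the test dataset and the accuracy of the fit model on the test dataset is returned.

Note that when fitting an LSTM,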
it is common not to shuffle sequence data. Here we do shuffle the windows of input data during training (the default). In this problem, we are interested in harnessing the LSTM's ability to learn and extract features across the time steps within a window, not across windows.

The complete _evaluate_model()_ function is listed below.

```py
# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
    verbose, epochs, batch_size = 0, 15, 64
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    model = Sequential()
    model.add(LSTM(100, input_shape=(n_timesteps,n_features)))
    model.add(Dropout(0.5))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy
```

There is nothing special about the network structure or the chosen hyperparameters; they are just a starting point for this problem.

### Summarize Results

We cannot judge the skill of the model from a single evaluation.

The reason for this is that neural networks are stochastic, meaning that a different specific model will result when training the same model configuration on the same data. This is a feature of the network in that it gives the model its adaptive ability, but it requires a slightly more complicated evaluation of the model.

We will repeat the evaluation of the model multiple times, then summarize the performance of the model across those runs. For example, we can call _evaluate_model()_ a total of 10 times. This will result in a population of model evaluation scores that must be summarized.

```py
# repeat experiment
repeats = 10
scores = list()
for r in range(repeats):
    score = evaluate_model(trainX, trainy, testX, testy)
    score = score * 100.0
    print('>#%d: %.3f' % (r+1, score))
    scores.append(score)
```

We can summarize the sample of scores by calculating and reporting the mean and standard deviation of the performance. The mean gives the average accuracy of the model on the dataset, whereas the standard deviation describes how far the accuracy of individual runs typically falls from that mean.

The function _summarize_results()_ below summarizes the results of a run.

```py
# summarize scores
def summarize_results(scores):
    print(scores)
    m, s = mean(scores), std(scores)
    print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))
```

We can bundle up the repeated evaluation, gathering of results, and summarization of results into a main function for the experiment, called _run_experiment()_, listed below.

By default, the model is evaluated 10 times before the performance of the model is reported.

```py
# run an experiment
def run_experiment(repeats=10):
    # load data
    trainX, trainy, testX, testy = load_dataset()
    # repeat experiment
    scores = list()
    for r in range(repeats):
        score = evaluate_model(trainX, trainy, testX, testy)
        score = score * 100.0
        print('>#%d: %.3f' % (r+1, score))
        scores.append(score)
    # summarize results
    summarize_results(scores)
```

### Complete Example

Now that we have all of the pieces, we can tie them together into a worked example.

The complete code listing is provided below.

```py
# lstm model
from numpy import mean
from numpy import std
from numpy import dstack
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.utils import to_categorical

# load a single file as a numpy array
def load_file(filepath):
    # columns are separated by one or more whitespace characters
    dataframe = read_csv(filepath, header=None, sep=r'\s+')
    return dataframe.values

# load a list of files and return as a 3d numpy array
def load_group(filenames, prefix=''):
    loaded = list()
    for name in filenames:
        data = load_file(prefix + name)
        loaded.append(data)
    # stack group so that features are the 3rd dimension
    loaded = dstack(loaded)
    return loaded

# load a dataset group, such as train or test
def load_dataset_group(group, prefix=''):
    filepath = prefix + group + '/Inertial Signals/'
    # load all 9 files as a single array
    filenames = list()
    # total acceleration
    filenames += ['total_acc_x_'+group+'.txt', 'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
    # body acceleration
    filenames += ['body_acc_x_'+group+'.txt', 'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
    # body gyroscope
    filenames += ['body_gyro_x_'+group+'.txt', 'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
    # load input data
    X = load_group(filenames, filepath)
    # load class output
    y = load_file(prefix + group + '/y_'+group+'.txt')
    return X, y

# load the dataset, returns train and test X and y elements
def load_dataset(prefix=''):
    # load all train
    trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
    print(trainX.shape, trainy.shape)
    # load all test
    testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
    print(testX.shape, testy.shape)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    print(trainX.shape, trainy.shape, testX.shape, testy.shape)
    return trainX, trainy, testX, testy

# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
    verbose, epochs, batch_size = 0, 15, 64
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    model = Sequential()
    model.add(LSTM(100, input_shape=(n_timesteps,n_features)))
    model.add(Dropout(0.5))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy

# summarize scores
def summarize_results(scores):
    print(scores)
    m, s = mean(scores), std(scores)
    print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))

# run an experiment
def run_experiment(repeats=10):
    # load data
    trainX, trainy, testX, testy = load_dataset()
    # repeat experiment
    scores = list()
    for r in range(repeats):
        score = evaluate_model(trainX, trainy, testX, testy)
        score = score * 100.0
        print('>#%d: %.3f' % (r+1, score))
        scores.append(score)
    # summarize results
    summarize_results(scores)

# run the experiment
run_experiment()
```

Running the example first prints the shape of the loaded dataset, then the shape of the train and test sets and their input and output elements. This confirms the number of samples, time steps, and variables, as well as the number of classes.

Next, models are created and evaluated, and a debug message is printed for each.

Finally, the sample of scores is printed, followed by the mean and standard deviation. We can see that the model performed well, achieving a classification accuracy of about 89.7% when trained on the raw dataset, with a standard deviation of about 1.4.

This is a good result, considering that the original paper published a result of 89% for a model trained on the dataset with heavy domain-specific feature engineering, not the raw dataset.

Note: given the stochastic nature of the algorithm, your specific results may vary. If so, try running the code a few times.

```py
(7352, 128, 9) (7352, 1)
(2947, 128, 9) (2947, 1)
(7352, 128, 9) (7352, 6) (2947, 128, 9) (2947, 6)

>#1: 90.058
>#2: 85.918
>#3: 90.974
>#4: 89.515
>#5: 90.159
>#6: 91.110
>#7: 89.718
>#8: 90.295
>#9: 89.447
>#10: 90.024

[90.05768578215134, 85.91788259246692, 90.97387173396675, 89.51476077366813, 90.15948422124194, 91.10960298608755, 89.71835765184933, 90.29521547336275, 89.44689514760775, 90.02375296912113]
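Accuracy: 89.722% (+/-1.371)
```

Now that we have seen how to develop an LSTM model for time series classification, let's look at how we can develop a more sophisticated CNN-LSTM model.

## Develop a CNN-LSTM Network Model

The CNN-LSTM architecture involves using convolutional neural network (CNN) layers for feature extraction on input data, combined with LSTMs to support sequence prediction.

CNN-LSTMs were developed for visual time series prediction problems and the application of generating textual descriptions from sequences of images (e.g. videos). Specifically, the problems of:

* **Activity Recognition**: generating a textual description of an activity demonstrated in a sequence of images.
* **Image Description**: generating a textual description of a single image.
* **Video Description**: generating a textual description of a sequence of images.

You can learn more about the CNN-LSTM architecture in the post:

* [CNN Long Short-Term Memory Networks](https://machinelearningmastery.com/cnn-long-short-term-memory-networks/)

To learn more about the consequences of combining these models, see the paper:

* [Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks](https://ieeexplore.ieee.org/document/7178838/), 2015.

The CNN-LSTM model will read subsequences of the main sequence in as blocks, extract features from each block, then allow the LSTM to interpret the features extracted from each block.

One approach to implementing this model is to split each window of 128 time steps into subsequences for the CNN model to process. For example, the 128 time steps in each window can be split into four subsequences of 32 time steps.

```py
# reshape data into time steps of sub-sequences
n_steps, n_length = 4, 32
trainX = trainX.reshape((trainX.shape[0], n_steps, n_length, n_features))
testX = testX.reshape((testX.shape[0], n_steps, n_length, n_features))
```

We can then define a CNN model that expects to read in sequences with a length of 32 time steps and nine features.

The entire CNN model can be wrapped in a [TimeDistributed](https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/) layer to allow the same CNN model to read in each of the four subsequences in the window. The extracted features are then flattened and provided to the LSTM model to read, which extracts its own features before a final mapping to an activity is made.

```py
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu'), input_shape=(None,n_length,n_features)))
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu')))
model.add(TimeDistributed(Dropout(0.5)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100))
model.add(Dropout(0.5))
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
```

It is common to use two consecutive CNN layers followed by dropout and a max pooling layer, and that is the simple structure used in the CNN-LSTM model here.

The updated _evaluate_model()_ is listed below.

```py
# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
    # define model
    verbose, epochs, batch_size = 0, 25, 64
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    # reshape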
    # data into time steps of sub-sequences
    n_steps, n_length = 4, 32
    trainX = trainX.reshape((trainX.shape[0], n_steps, n_length, n_features))
    testX = testX.reshape((testX.shape[0], n_steps, n_length, n_features))
    # define model
    model = Sequential()
    model.add(TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu'), input_shape=(None,n_length,n_features)))
    model.add(TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu')))
    model.add(TimeDistributed(Dropout(0.5)))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100))
    model.add(Dropout(0.5))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy
```

We can evaluate this model as we did the straight LSTM model in the previous section.

The complete code listing is provided below.

```py
# cnn lstm model
from numpy import mean
from numpy import std
from numpy import dstack
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import LSTM
from keras.layers import TimeDistributed
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from keras.utils import to_categorical

# load a single file as a numpy array
def load_file(filepath):
    # columns are separated by one or more whitespace characters
    dataframe = read_csv(filepath, header=None, sep=r'\s+')
    return dataframe.values

# load a list of files and return as a 3d numpy array
def load_group(filenames, prefix=''):
    loaded = list()
    for name in filenames:
        data = load_file(prefix + name)
        loaded.append(data)
    # stack group so that features are the 3rd dimension
    loaded = dstack(loaded)
    return loaded

# load a dataset group, such
# as train or test
def load_dataset_group(group, prefix=''):
    filepath = prefix + group + '/Inertial Signals/'
    # load all 9 files as a single array
    filenames = list()
    # total acceleration
    filenames += ['total_acc_x_'+group+'.txt', 'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
    # body acceleration
    filenames += ['body_acc_x_'+group+'.txt', 'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
    # body gyroscope
    filenames += ['body_gyro_x_'+group+'.txt', 'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
    # load input data
    X = load_group(filenames, filepath)
    # load class output
    y = load_file(prefix + group + '/y_'+group+'.txt')
    return X, y

# load the dataset, returns train and test X and y elements
def load_dataset(prefix=''):
    # load all train
    trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
    print(trainX.shape, trainy.shape)
    # load all test
    testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
    print(testX.shape, testy.shape)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    print(trainX.shape, trainy.shape, testX.shape, testy.shape)
    return trainX, trainy, testX, testy

# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
    # define model
    verbose, epochs, batch_size = 0, 25, 64
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    # reshape data into time steps of sub-sequences
    n_steps, n_length = 4, 32
    trainX = trainX.reshape((trainX.shape[0], n_steps, n_length, n_features))
    testX = testX.reshape((testX.shape[0], n_steps, n_length, n_features))
    # define model
    model = Sequential()
    model.add(TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu'), input_shape=(None,n_length,n_features)))
    model.add(TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu')))
    model.add(TimeDistributed(Dropout(0.5)))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100))
    model.add(Dropout(0.5))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy

# summarize scores
def summarize_results(scores):
    print(scores)
    m, s = mean(scores), std(scores)
    print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))

# run an experiment
def run_experiment(repeats=10):
    # load data
    trainX, trainy, testX, testy = load_dataset()
    # repeat experiment
    scores = list()
    for r in range(repeats):
        score = evaluate_model(trainX, trainy, testX, testy)
        score = score * 100.0
        print('>#%d: %.3f' % (r+1, score))
        scores.append(score)
    # summarize results
    summarize_results(scores)

# run the experiment
run_experiment()
```

Running the example summarizes the model performance for each of the 10 runs before a final summary of the model's performance on the test set is reported.

We can see that the model achieved an accuracy of about 90.7% with a standard deviation of about 1%.

Note: given the stochastic nature of the algorithm, your specific results may vary. If so, try running the code a few times.

```py
>#1: 91.517
>#2: 91.042
>#3: 90.804
>#4: 92.263
>#5: 89.684
>#6: 88.666
>#7: 91.381
>#8: 90.804
>#9: 89.379
>#10: 91.347

[91.51679674244994, 91.04173736002714, 90.80420766881574, 92.26331862911435, 89.68442483881914, 88.66644044791313, 91.38106549032915, 90.80420766881574, 89.37902952154734, 91.34713267729894]

Accuracy: 90.689% (+/-1.051)
```

## Develop a ConvLSTM Network Model

A further extension of the CNN-LSTM idea is to perform the convolutions of the CNN (e.g. how the CNN reads the input sequence data) as part of the LSTM.

This combination is called a Convolutional LSTM, or ConvLSTM for short, and like the CNN-LSTM is also used for spatio-temporal data.

Unlike an LSTM that reads the data in directly in order to calculate internal state and state transitions, and unlike the CNN-LSTM that interprets the output from CNN models, the ConvLSTM uses convolutions directly as part of reading input into the LSTM units themselves.

For more information on how the equations for the ConvLSTM are calculated within the LSTM unit, see the paper:

* [Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting](https://arxiv.org/abs/1506.04214v1), 2015.

The Keras library provides the [ConvLSTM2D class](https://keras.io/layers/recurrent/#convlstm2d) that supports the ConvLSTM model for 2D data. It can be configured for 1D multivariate time series classification.

By default, the ConvLSTM2D class requires that input data have the shape:

```py
(samples, time, rows, cols, channels)
```
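To make this five-dimensional layout concrete, the sketch below builds a small synthetic array with the same per-window shape as the HAR data (128 time steps, nine features) and reshapes it using the four-by-32 sub-sequence split from the CNN-LSTM section. It is only an illustration under stated assumptions: the data is made up with `arange` rather than loaded from the dataset, and the names `X` and `X5` are hypothetical. The two assertions check that the reshape keeps the time steps contiguous and in order.

```python
import numpy as np

# Synthetic stand-in for loaded data (assumption): 5 windows, 128 steps, 9 features.
n_samples, n_timesteps, n_features = 5, 128, 9
X = np.arange(n_samples * n_timesteps * n_features, dtype=np.float32)
X = X.reshape((n_samples, n_timesteps, n_features))

# Split each window into 4 sub-sequences of 32 time steps and add a
# singleton "rows" axis: (samples, time, rows, cols, channels).
n_steps, n_length = 4, 32
X5 = X.reshape((n_samples, n_steps, 1, n_length, n_features))
print(X5.shape)

# The second sub-sequence of the first window holds time steps 32..63.
assert np.array_equal(X5[0, 1, 0], X[0, 32:64])

# Undoing the reshape recovers the original windows unchanged.
assert np.array_equal(X5.reshape(X.shape), X)
```

Because NumPy reshapes in row-major order by default, no values are moved or reordered; only the way the 128 time steps are indexed changes.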
In this layout, each time step of data is defined as an image of (rows * columns) data points.

In the previous section, we divided a given window of data (128 time steps) into four subsequences of 32 time steps. We can use this same subsequence approach when defining the ConvLSTM2D input, where the number of time steps is the number of subsequences in the window, the number of rows is 1 because we are working with one-dimensional data, and the number of columns is the number of time steps in the subsequence, in this case 32.

For this chosen framing of the problem, the input for the ConvLSTM2D is therefore:

* **Samples**: n, for the number of windows in the dataset.
* **Time**: 4, for the four subsequences that we split a window of 128 time steps into.
* **Rows**: 1, for the one-dimensional shape of each subsequence.
* **Columns**: 32, for the 32 time steps in an input subsequence.
* **Channels**: 9, for the nine input variables.

We can now prepare the data for the ConvLSTM2D model.

```py
n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
# reshape into subsequences (samples, time steps, rows, cols, channels)
n_steps, n_length = 4, 32
trainX = trainX.reshape((trainX.shape[0], n_steps, 1, n_length, n_features))
testX = testX.reshape((testX.shape[0], n_steps, 1, n_length, n_features))
```

The ConvLSTM2D class must be configured in terms of both the CNN and the LSTM. This includes specifying the number of filters (e.g. 64), the two-dimensional kernel size, in this case 1 row by 3 columns of the subsequence time steps, and the activation function, in this case rectified linear.

As with a CNN or LSTM model, the output must be flattened into one long vector before it can be interpreted by a dense layer.

```py
# define model
model = Sequential()
model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
```

We can then evaluate the model as we did the LSTM and CNN-LSTM models before it.

The complete example is listed below.

```py
# convlstm model
from numpy import mean
from numpy import std
from numpy import dstack
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import ConvLSTM2D
from keras.utils import to_categorical

# load a single file as a numpy array
def load_file(filepath):
    # columns are separated by one or more whitespace characters
    dataframe = read_csv(filepath, header=None, sep=r'\s+')
    return dataframe.values

# load a list of files and return as a 3d numpy array
def load_group(filenames, prefix=''):
    loaded = list()
    for name in filenames:
        data = load_file(prefix + name)
        loaded.append(data)
    # stack group so that features
    # are the 3rd dimension
    loaded = dstack(loaded)
    return loaded

# load a dataset group, such as train or test
def load_dataset_group(group, prefix=''):
    filepath = prefix + group + '/Inertial Signals/'
    # load all 9 files as a single array
    filenames = list()
    # total acceleration
    filenames += ['total_acc_x_'+group+'.txt', 'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
    # body acceleration
    filenames += ['body_acc_x_'+group+'.txt', 'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
    # body gyroscope
    filenames += ['body_gyro_x_'+group+'.txt', 'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
    # load input data
    X = load_group(filenames, filepath)
    # load class output
    y = load_file(prefix + group + '/y_'+group+'.txt')
    return X, y

# load the dataset, returns train and test X and y elements
def load_dataset(prefix=''):
    # load all train
    trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
    print(trainX.shape, trainy.shape)
    # load all test
    testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
    print(testX.shape, testy.shape)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    print(trainX.shape, trainy.shape, testX.shape, testy.shape)
    return trainX, trainy, testX, testy

# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
    # define model
    verbose, epochs, batch_size = 0, 25, 64
    n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
    # reshape into subsequences (samples, time steps, rows, cols, channels)
    n_steps, n_length = 4, 32
    trainX = trainX.reshape((trainX.shape[0], n_steps, 1, n_length, n_features))
    testX = testX.reshape((testX.shape[0], n_steps, 1, n_length, n_features))
    # define model
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Dropout(0.5))
    model.add(Flatten())
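    model.add(Dense(100, activation='relu'))
    model.add(Dense(n_outputs, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit network
    model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
    # evaluate model
    _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
    return accuracy

# summarize scores
def summarize_results(scores):
    print(scores)
    m, s = mean(scores), std(scores)
    print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))

# run an experiment
def run_experiment(repeats=10):
    # load data
    trainX, trainy, testX, testy = load_dataset()
    # repeat experiment
    scores = list()
    for r in range(repeats):
        score = evaluate_model(trainX, trainy, testX, testy)
        score = score * 100.0
        print('>#%d: %.3f' % (r+1, score))
        scores.append(score)
    # summarize results
    summarize_results(scores)

# run the experiment
run_experiment()
```

As with the prior experiments, running the model prints the performance of the model each time it is fit and evaluated. A summary of the final model performance is presented at the end of the run.

We can see that the model consistently performs well on the problem, achieving an accuracy of about 90.8%, perhaps with fewer resources than the larger CNN-LSTM model.

Note: given the stochastic nature of the algorithm, your specific results may vary. If so, try running the code a few times.

```py
>#1: 90.092
>#2: 91.619
>#3: 92.128
>#4: 90.533
>#5: 89.243
>#6: 90.940
>#7: 92.026
>#8: 91.008
>#9: 90.499
>#10: 89.922

[90.09161859518154, 91.61859518154056, 92.12758737699356, 90.53274516457415, 89.24329826942655, 90.93993892093654, 92.02578893790296, 91.00780454699695, 90.49881235154395, 89.92195453003053]

Accuracy: 90.801% (+/-0.886)
```

## Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

* **Data Preparation**. Consider exploring whether simple data scaling schemes can further lift model performance, such as normalization, standardization, and power transforms.
* **LSTM Variations**. There are variations of the LSTM architecture that may achieve better performance on this problem, such as stacked LSTMs and bidirectional LSTMs.
* **Hyperparameter Tuning**. Consider exploring tuning of model hyperparameters such as the number of units, training epochs, batch size, and more.

If you explore any of these extensions, I'd love to know.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

### Papers

* [A Public Domain Dataset for Human Activity Recognition Using Smartphones](https://upcommons.upc.edu/handle/2117/20897), 2013.
* [Human Activity Recognition on Smartphones Using a Multiclass Hardware-Friendly Support Vector Machine](https://link.springer.com/chapter/10.1007/978-3-642-35395-6_30), 2012.
* [Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks](https://ieeexplore.ieee.org/document/7178838/), 2015.
* [Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting](https://arxiv.org/abs/1506.04214v1), 2015.

### Articles
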
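* [Human Activity Recognition Using Smartphones Data Set, UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones)
* [Activity recognition, Wikipedia](https://en.wikipedia.org/wiki/Activity_recognition)
* [Activity Recognition Experiment Using Smartphone Sensors, Video](https://www.youtube.com/watch?v=XOEN9W05_4A)

## Summary

In this tutorial, you discovered three recurrent neural network architectures for modeling an activity recognition time series classification problem.

Specifically, you learned:

* How to develop a Long Short-Term Memory recurrent neural network for human activity recognition.
* How to develop a one-dimensional convolutional neural network LSTM, or CNN-LSTM, model.
* How to develop a one-dimensional convolutional LSTM, or ConvLSTM, model for the same problem.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.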