We check whether a Bidirectional LSTM can classify time-series data into two classes.
The Bidirectional LSTM is an extension of the LSTM built on the bidirectional recurrent architecture introduced by Schuster and Paliwal (Schuster, Mike, and Kuldip K. Paliwal. "Bidirectional recurrent neural networks." IEEE Transactions on Signal Processing 45.11 (1997): 2673-2681). A brief explanation of the Bidirectional LSTM and the code follow below.
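The Bidirectional wrapper runs two LSTMs over the same input sequence, one reading it left to right and one right to left, and merges their final outputs (concatenation by default, so LSTM(64) yields a 128-dim vector). A minimal sketch of the hand-built equivalent, assuming the Keras 1 functional API used elsewhere in this post and the same layer sizes as the model below:

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, merge

inp = Input(shape=(21,), dtype='int32')        # 21 timesteps, as in X_train below
emb = Embedding(100, 128, input_length=21)(inp)
fwd = LSTM(64)(emb)                            # left-to-right pass
bwd = LSTM(64, go_backwards=True)(emb)         # right-to-left pass
bi = merge([fwd, bwd], mode='concat')          # (None, 128), like Bidirectional(LSTM(64))
equiv = Model(input=inp, output=bi)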
Model definition
model = Sequential()
model.add(Embedding(100, 128, input_length=X_train.shape[1]))  # one 128-dim embedding per timestep
model.add(Bidirectional(LSTM(64)))                             # forward + backward LSTMs, concatenated to 128 dims
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))                      # single sigmoid unit for the 2-class output
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
Checking the model's accuracy
LOOP = 10
for i in range(LOOP):
    # fit: the batch size shrinks linearly from 100 down to 10 over the loop
    batch_size = int(100 * (LOOP - i) * 1.0 / LOOP)
    print('...fit', 'batch_size=', batch_size)
    model.fit(X_train, y_train, verbose=2, batch_size=batch_size, nb_epoch=4,
              validation_data=[X_test, y_test], callbacks=[losshist])
    # test: ROC AUC on the held-out set
    print('...test roc auc')
    prd = model.predict(X_test)
    print(roc_auc_score(y_test, prd))
The batch size is reduced a little at a time as training proceeds.
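For reference, the schedule in isolation: with LOOP = 10 the batch size falls linearly from 100 to 10 over the ten fit calls.

LOOP = 10
print([int(100 * (LOOP - i) * 1.0 / LOOP) for i in range(LOOP)])
# [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]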
Since this is a two-class classification task, ROC AUC is used as the evaluation metric.
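As a quick reminder of the metric (toy values, not from this experiment), roc_auc_score takes the true labels and the predicted scores and returns the probability that a randomly chosen positive example is scored above a randomly chosen negative one:

from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc_score(y_true, y_score))  # 0.75: 3 of the 4 positive/negative pairs are ranked correctly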
Code
# common
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt; plt.style.use('ggplot')
import seaborn as sns; sns.set()
import random

# sklearn
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss
from sklearn.preprocessing import PolynomialFeatures

# keras
from keras.callbacks import Callback
from keras.models import Sequential
from keras.preprocessing import sequence
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, Bidirectional

%matplotlib inline
print(tf.__version__)

# load data: last column is the label, the rest are features
data_train = pd.read_csv('./training_data.csv')
data_tornm = pd.read_csv('./test_data.csv')
X, y = np.array(data_train.iloc[:, :-1]), np.array(data_train.iloc[:, -1])

# callback that records the training loss after every batch
class LossHistory(Callback):
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))

# divide train / test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=random.randint(0, 100))

# model
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)

model = Sequential()
model.add(Embedding(100, 128, input_length=X_train.shape[1]))
model.add(Bidirectional(LSTM(64)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])

# train model
losshist = LossHistory()
print(model.summary())
model.fit(X_train, y_train, verbose=2, batch_size=50, nb_epoch=4,
          validation_data=[X_test, y_test], callbacks=[losshist])
Output
('X_train shape:', (81872, 21))
('X_test shape:', (14448, 21))
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
embedding_1 (Embedding)          (None, 21, 128)       12800       embedding_input_1[0][0]
____________________________________________________________________________________________________
bidirectional_1 (Bidirectional)  (None, 128)           98816       embedding_1[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 128)           0           bidirectional_1[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 1)             129         dropout_1[0][0]
====================================================================================================
Total params: 111745
____________________________________________________________________________________________________
None
Train on 81872 samples, validate on 14448 samples
Epoch 1/4
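A possible follow-up, not shown in the original run: plot the per-batch losses that the LossHistory callback accumulated during fit to inspect convergence.

# Sketch: visualize the per-batch training loss recorded by LossHistory.
plt.plot(losshist.losses)
plt.xlabel('batch')
plt.ylabel('training loss')
plt.show()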