
Time series analysis_python

[LSTNET] Building Power Demand Forecasting (Functional Method)

Qcoding 2023. 3. 26. 21:52

* This post revises an earlier post that used the code from the GitHub repository below as-is, rewriting it with the Keras functional API. The post itself has no great significance; the main change is that n_future, the length of the future forecast sequence, can now be changed easily.

 

https://github.com/flaviagiammarino/lstnet-tensorflow

 

GitHub - flaviagiammarino/lstnet-tensorflow: TensorFlow implementation of LSTNet model for multivariate time series forecasting.

 

** There is no prose discussion here; this post only contains the cleaned-up code.

 

### import

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

 

### data import

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

# ▶ Mount Google Drive (or connect it from the folder pane in Colab)
from google.colab import drive
drive.mount('/content/drive')

# ▶ pandas display options
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 100)

file_path = './drive/MyDrive/Project/arima/arima_model.gz'

df = pd.read_pickle(file_path)

# Select the columns whose names contain 'kW' (to be summed below)
df_powerMeter = df.loc[:, df.columns.str.contains('kW')].copy()

# Sum up demands of all power meters
df_powerMeter = df_powerMeter.sum(axis=1).rename('load')
df = pd.DataFrame(df_powerMeter)

df = df.reset_index()

# df = pd.read_csv('./building_load_minute.csv')

### Feature creation

df['Date'] = pd.to_datetime(df['Date'])
df['year'] = df['Date'].dt.year
df['month'] = df['Date'].dt.month
df['hour'] = df['Date'].dt.hour
df['weekday'] = df['Date'].dt.weekday
df['day'] = df['Date'].dt.day
df['date'] = df['Date'].dt.strftime('%Y-%m')
df['year_day'] = df['Date'].dt.date

# weekday > 4 means Saturday/Sunday
df['workingday'] = np.where(df['weekday'] > 4, 0, 1)

# hourly key used for grouping
df['day_hour'] = df['Date'].dt.strftime('%Y-%m-%d %H')

group = df.groupby(['day_hour'],as_index=False).agg(
    avg_load = ('load', 'mean')
)
group['day_hour'] = pd.to_datetime( group['day_hour'] )
group = group.set_index('day_hour')
group
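
For reference, the same hourly average can be expressed with resample instead of grouping by a formatted string key; a minimal sketch (note that resample also emits NaN rows for hours with no data, which the groupby version skips):

# Hypothetical alternative: resample on a DatetimeIndex instead of a string group key
group_alt = df.set_index('Date')[['load']].resample('H').mean()
group_alt = group_alt.rename(columns={'load': 'avg_load'})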

 

 

## scaling

### Split the full data into train / test
criteria = '2019-09-30 23'

y_train = group.loc[group.index <= criteria] 
y_test = group.loc[group.index > criteria]

y_train_idx = list(y_train.index)
y_test_idx = list(y_test.index)

y_train = np.array(y_train)
y_test = np.array(y_test)

y_train.shape , y_test.shape

# Scale the training data to [0, 1]
scaler_y = MinMaxScaler(feature_range=(0, 1))
group_scaled = scaler_y.fit_transform(y_train)
group_scaled_df = pd.DataFrame({'avg_load': group_scaled.reshape(-1,)}, index=y_train_idx)
group_scaled.shape
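
Note that the scaler is fitted on the training portion only, so no test information leaks into the scaling; the same fitted scaler is reused later for inverse_transform. A quick round-trip sanity check (a sketch, not part of the original code):

# Round-trip check: inverse_transform should recover the original values
back = scaler_y.inverse_transform(group_scaled[:3])
print(np.allclose(back, y_train[:3]))  # True, up to float precision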

### Split into model input shape => (sample, timestep, feature)

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix]['avg_load'], sequences[end_ix:out_end_ix]['avg_load']
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
n_steps_in = 12   ### look back 12 hours of the hourly-averaged data
n_steps_out = 1   ### predict 1 hour ahead
n_features = 1

X, y = split_sequences(group_scaled_df, n_steps_in, n_steps_out)

# reshape
X = X.reshape(-1, n_steps_in, n_features)
y = y.reshape(-1, n_steps_out)

print ("X.shape" , X.shape) 
print ("y.shape" , y.shape)

=> n_steps_in sets how many past points the model looks at, and n_steps_out sets how many future points it predicts. Since we forecast the load purely from its own history here, n_features is 1.
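
To make the windowing concrete, here is a toy run of split_sequences on a made-up 5-point series with a look-back of 3 and a horizon of 1:

# Toy example (values made up for illustration)
toy = pd.DataFrame({'avg_load': [10, 20, 30, 40, 50]})
toy_X, toy_y = split_sequences(toy, 3, 1)
print(toy_X)  # [[10 20 30] [20 30 40]]
print(toy_y)  # [[40] [50]]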

### Model

1) SkipGRU layer

class SkipGRU(tf.keras.layers.Layer):

    def __init__(self,
                 units,
                 p=1,
                 activation='relu',
                 return_sequences=False,
                 return_state=False,
                 **kwargs):

        '''
        Recurrent-skip layer, see Section 3.4 in the LSTNet paper.
        
        Parameters:
        __________________________________
        units: int.
            Number of hidden units of the GRU cell.
        p: int.
            Number of skipped hidden cells.
        activation: str, function.
            Activation function, see https://www.tensorflow.org/api_docs/python/tf/keras/activations.
        return_sequences: bool.
            Whether to return the last output or the full sequence.
        return_state: bool.
            Whether to return the last state in addition to the output.
        **kwargs: See https://www.tensorflow.org/api_docs/python/tf/keras/layers/GRUCell.
        '''

        if p < 1:
            raise ValueError('The number of skipped hidden cells cannot be less than 1.')

        self.units = units
        self.p = p
        self.return_sequences = return_sequences
        self.return_state = return_state
        self.timesteps = None
        self.cell = tf.keras.layers.GRUCell(units=units, activation=activation, **kwargs)

        super(SkipGRU, self).__init__()

    def build(self, input_shape):

        if self.timesteps is None:
            self.timesteps = input_shape[1]

            if self.p > self.timesteps:
                raise ValueError('The number of skipped hidden cells cannot be greater than the number of timesteps.')

    def call(self, inputs):

        '''
        Parameters:
        __________________________________
        inputs: tf.Tensor.
            Layer inputs, 3-dimensional tensor with shape (n_samples, n_lookback, filters) where n_samples is
            the batch size, n_lookback is the number of past time steps and filters is the number of channels
            of the convolutional layer.
        Returns:
        __________________________________
        outputs: tf.Tensor.
            Layer outputs, 2-dimensional tensor with shape (n_samples, units) if return_sequences == False,
            3-dimensional tensor with shape (n_samples, n_lookback, units) if return_sequences == True where
            n_samples is the batch size, n_lookback is the number of past time steps used as input and units
            is the number of hidden units of the GRU cell.
        states: tf.Tensor.
            Hidden states, 2-dimensional tensor with shape (n_samples, units) where n_samples is the batch size
            and units is the number of hidden units of the GRU cell.
        '''

        outputs = tf.TensorArray(
            element_shape=(inputs.shape[0], self.units),
            size=self.timesteps,
            dynamic_size=False,
            dtype=tf.float32,
            clear_after_read=False
        )

        states = tf.TensorArray(
            element_shape=(inputs.shape[0], self.units),
            size=self.timesteps,
            dynamic_size=False,
            dtype=tf.float32,
            clear_after_read=False
        )

        initial_states = tf.zeros(
            shape=(tf.shape(inputs)[0], self.units),
            dtype=tf.float32
        )

        for t in tf.range(self.timesteps):

            if t < self.p:
                output, state = self.cell(
                    inputs=inputs[:, t, :],
                    states=initial_states
                )

            else:
                output, state = self.cell(
                    inputs=inputs[:, t, :],
                    states=states.read(t - self.p)
                )

            outputs = outputs.write(index=t, value=output)
            states = states.write(index=t, value=state)

        outputs = tf.transpose(outputs.stack(), [1, 0, 2])
        states = tf.transpose(states.stack(), [1, 0, 2])

        if not self.return_sequences:
            outputs = outputs[:, -1, :]

        if self.return_state:
            states = states[:, -1, :]
            return outputs, states

        else:
            return outputs
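
Before wiring the layer into the model, a quick shape check on random data confirms it behaves like a sequence-to-sequence layer (a sketch; the dummy tensor and sizes are made up):

# Smoke test: batch of 4, 12 timesteps, 100 channels -> (4, 12, 50)
dummy = tf.random.normal(shape=(4, 12, 100))
skip_out = SkipGRU(units=50, p=2, return_sequences=True)(dummy)
print(skip_out.shape)  # (4, 12, 50)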

2) Kernel regularizer function

def kernel_regularizer(regularizer, regularization_factor):

    '''
    Define the kernel regularizer.
    Parameters:
    __________________________________
    regularizer: str.
        Regularizer, either 'L1', 'L2' or 'L1L2'.
    regularization_factor: float.
        Regularization factor.
    '''

    if regularizer == 'L1':
        return tf.keras.regularizers.L1(l1=regularization_factor)

    elif regularizer == 'L2':
        return tf.keras.regularizers.L2(l2=regularization_factor)

    elif regularizer == 'L1L2':
        return tf.keras.regularizers.L1L2(l1=regularization_factor, l2=regularization_factor)

    else:
        raise ValueError('Undefined regularizer {}.'.format(regularizer))
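
The helper just maps a string to a Keras regularizer object, so it can be passed directly to a Dense layer; for example (a sketch):

# e.g. an L2 regularizer with factor 0.01, as used in the model below
reg = kernel_regularizer('L2', 0.01)
dense = tf.keras.layers.Dense(units=1, kernel_regularizer=reg)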

3) LSTNet model builder function

def LSTNet_model(
                 input_shape,
                 filters=100,
                 kernel_size=3,
                 gru_units=100,
                 skip_gru_units=50,
                 skip=1,
                 lags=1,
                 dropout=0.1,
                 regularizer='L2',
                 regularization_factor=0.01,
                 n_steps_out=n_steps_out   # defaults to the global n_steps_out defined above
                 ):
    # Inputs.
    input_ = keras.layers.Input(shape=input_shape)

    # Convolutional component, see Section 3.2 in the LSTNet paper.
    c = tf.keras.layers.Conv1D(filters=filters, kernel_size=kernel_size, activation='relu')(input_)
    c = tf.keras.layers.Dropout(rate=dropout)(c)

    # Recurrent component, see Section 3.3 in the LSTNet paper.
    r = tf.keras.layers.GRU(units=gru_units, activation='relu')(c)
    r = tf.keras.layers.Dropout(rate=dropout)(r)

    # Recurrent-skip component, see Section 3.4 in the LSTNet paper.
    s = SkipGRU(units=skip_gru_units, activation='relu', return_sequences=True)(c)
    s = tf.keras.layers.Dropout(rate=dropout)(s)
    s = tf.keras.layers.Lambda(function=lambda x: x[:, -skip:, :])(s)
    s = tf.keras.layers.Reshape(target_shape=(s.shape[1] * s.shape[2],))(s)
    d = tf.keras.layers.Concatenate(axis=1)([r, s])
    d = tf.keras.layers.Dense(units=n_steps_out, kernel_regularizer=kernel_regularizer(regularizer, regularization_factor))(d)

    # Autoregressive component, see Section 3.6 in the LSTNet paper.
    l = tf.keras.layers.Flatten()(input_[:, -lags:, :])
    l = tf.keras.layers.Dense(units=n_steps_out, kernel_regularizer=kernel_regularizer(regularizer, regularization_factor))(l)

    # Outputs.
    y = tf.keras.layers.Add()([d, l])

    model = keras.models.Model(inputs=[input_], outputs=[y], name='LSTNET_model')

    return model

### Build the model

## LSTNet
model = LSTNet_model(input_shape=(X.shape[1:]))

model.summary()

 

### Train the model

epochs = 200
batch_size = 32


## early stopping // model checkpoint
import os
os.makedirs('./models/one_step', exist_ok=True)   # make sure the checkpoint directory exists

early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
model_checkpoint = keras.callbacks.ModelCheckpoint(
    filepath='./models/one_step/best_weights.h5',
    monitor='val_loss',
    save_best_only=True,
    save_weights_only=True,   # weights only: SkipGRU is a custom layer without get_config
    verbose=1
)

model.compile(loss='mse', optimizer=keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999))

history = model.fit(X, y, epochs=epochs, batch_size=batch_size, verbose=1, validation_split=0.2, shuffle=False, callbacks=[early_stopping, model_checkpoint])


## Load the best weights
model.load_weights('./models/one_step/best_weights.h5')
print("model_load!!")

#### Forecasting with the model


n_future = 72    ## forecast the next 72 hours (72 hourly steps)

y_future = []

x_pred = X[-1:, :, :]  # the last input window from the existing data, shape (1, n_steps_in, n_features)
y_pred = y[-1]         # the last known target value from the existing data

## Recursive (iterated) forecasting
for i in range(n_future):
    ## Roll the window forward: drop the oldest step of x_pred (shape (1, n_steps_in, n_features))
    ## and append y_pred reshaped to (1, n_steps_out, n_features), so y_t joins the window used
    ## to predict y_t+1
    x_pred = np.append(x_pred[:, 1:, :], y_pred.reshape(1, n_steps_out, n_features), axis=1)
    
    # generate the next forecast
    y_pred = model.predict(x_pred, verbose=0)

    # save the forecast
    y_future.append(y_pred.flatten()[0])
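
To see what the window update does, consider a toy trace with n_steps_in = 3 (values made up for illustration): if the window holds [1, 2, 3] and the latest prediction is 4, the update drops the oldest value and appends the new one, giving [2, 3, 4] as the next model input.

# Toy illustration of the rolling window update, shapes (1, 3, 1)
w = np.array([1., 2., 3.]).reshape(1, 3, 1)
nxt = np.array([4.]).reshape(1, 1, 1)
w = np.append(w[:, 1:, :], nxt, axis=1)
print(w.reshape(-1))  # [2. 3. 4.]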


# transform the forecasts back to the original scale
y_future = np.array(y_future).reshape(-1, 1)
y_future_inv = scaler_y.inverse_transform(y_future)
y_future_inv.shape

### Build the result DataFrame

### Generate new timestamps for the forecast horizon
import datetime


### With the y_train / y_test split, str_date starts from y_train_idx[-1], so the first generated
### timestamp matches y_test_idx[0]
### In real use without a validation split, just feed in the current time instead; for the raw
### 15-minute data a step of timedelta(hours=0.25) would probably be needed

date_gen = []
for i in range(n_future):
    if i ==0:
        str_date = y_train_idx[-1]
        # print(str_date)
    
    date_cal = str_date + datetime.timedelta(hours=1)
    date_gen.append(date_cal)
    str_date = date_cal
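
The same timestamps can be generated in a single call with pd.date_range; a minimal sketch, assuming hourly steps as above:

# Hypothetical one-line replacement for the loop above
date_gen_alt = list(pd.date_range(start=y_train_idx[-1] + datetime.timedelta(hours=1), periods=n_future, freq='H'))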


### Actual values
actual_label = np.concatenate([ y_train.reshape(-1), y_test.reshape(-1)[:n_future] ])   ### y_train plus the first n_future values of y_test (the forecasted span)

### Predicted values
dummy = np.full(len(y_train), np.nan)
prediction = np.concatenate([ dummy, y_future_inv.reshape(-1) ])

result_df = pd.DataFrame()
result_df.index = y_train_idx + date_gen   ### all training timestamps plus the n_future generated ones
result_df['label'] = actual_label   ## y_train + y_test[:n_future] --> train + forecasted span
result_df['pred'] = prediction      ## NaN dummies (len(y_train)) followed by the n_future forecasts


mse = np.round(mean_squared_error( y_test.reshape(-1)[:n_future], y_future_inv.reshape(-1) ))
mae = np.round(mean_absolute_error( y_test.reshape(-1)[:n_future], y_future_inv.reshape(-1) ))
rmse = np.round(np.sqrt(mean_squared_error( y_test.reshape(-1)[:n_future], y_future_inv.reshape(-1) )), 2)   # RMSE is the square root of MSE

metric_df = pd.DataFrame([mse,mae,rmse], index=['MSE', 'MAE', 'RMSE']).T
display(metric_df)

result_df

### figure

#### fig
fig, ax = plt.subplots()
fig.set_size_inches(25,7)

sns.lineplot(x=result_df.index ,y='label', data=result_df, ax=ax, label='label')
sns.lineplot(x=result_df.index ,y='pred', data=result_df, ax=ax, label='pred')
ax.grid()
ax.set(title=f'Prediction of the next {n_future} steps')
ax.tick_params(axis='x', labelrotation=45)

### View only the final forecast window

result_df.iloc[-n_future:]['label'].plot()
result_df.iloc[-n_future:]['pred'].plot()

Compared with the earlier subclassing model, only n_future needs to be changed to try a different forecast horizon, which makes it easy to check several prediction scenarios.

LSTNET_CONV.ipynb
0.46MB

 

 

**** Additional note: the settings that gave the best results

Since the load distribution differs greatly between working days and holidays, the model was also run on working-day data only.

As shown below, with a patience of 20 and a 48-hour forecast, the results were very good.

#### When grouping, extract and use only the working days
group = df.loc[df['workingday']==1].groupby(['day_hour'],as_index=False).agg(
    avg_load = ('load', 'mean')
)


### Early stopping / training settings
epochs = 200
batch_size = 32


## early stopping // model checkpoint
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)




#### n_future = 48    ## forecast the next 48 hours

 
