[DL] Custom Training Steps (Customizing the fit Method)

Learning objectives

Understand how to use a custom training step.

Understand why a custom training step is used.

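Why use a custom training step

Complex model training logic:
- The default fit method provides a general-purpose training loop, but it has to be customized when there are special requirements.
  - When two or more models must be trained at the same time, as in a GAN (Generative Adversarial Network)
  - When additional loss functions are used
Finer-grained control:
- Useful when you want tighter control over the training process and a clear picture of what happens at each stage
- This makes it easier to debug problems and to optimize training.
Dynamic learning-rate changes:
- Changing the learning rate dynamically based on conditions during training, or applying a custom learning-rate schedule
Custom loss functions and metrics:
- When you need a custom loss function or metric that the default compile options do not provide
Special data preprocessing or postprocessing:
- When input data must be preprocessed in a specific way, or predictions must be postprocessed in a specific way
Complex gradient computation and updates:
- When you want to apply custom gradient computation and variable-update logic instead of the standard optimizer update
Inserting specific logic during training:
- When you want to perform a specific action whenever a certain condition is met during training
- Useful for freezing some of the model's weights or implementing conditional state management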
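The last two reasons in the list (custom update logic and conditional behavior such as freezing weights) can be illustrated without any framework. The snippet below is only a sketch in plain Python, not Keras code: `custom_update`, `frozen`, and `clip` are hypothetical names chosen for this illustration, and the parameters are plain floats rather than tensors.

```python
def custom_update(params, grads, lr, frozen=(), clip=None):
    """Return new parameter values after one customized gradient step.

    params, grads: dicts mapping a parameter name to a float.
    frozen: names whose values must not change (hypothetical option).
    clip: if given, each gradient is limited to the range [-clip, clip].
    """
    new_params = {}
    for name, value in params.items():
        if name in frozen:
            # Conditional logic: frozen parameters are copied unchanged.
            new_params[name] = value
            continue
        grad = grads[name]
        if clip is not None:
            # Custom gradient handling: clip before applying the update.
            grad = max(-clip, min(clip, grad))
        new_params[name] = value - lr * grad
    return new_params


params = {"w": 1.0, "b": 0.5}
grads = {"w": 4.0, "b": -2.0}
updated = custom_update(params, grads, lr=0.1, frozen={"b"}, clip=1.0)
print(updated)
```

In Keras, the same kind of decision would live inside an overridden train_step, between computing the gradients and applying them.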

Reference

The original train_step

  def train_step(self, data):
    """The logic for one training step.

    This method can be overridden to support custom training logic.
    For concrete examples of how to override this method see
    [Customizing what happens in fit](https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit).
    This method is called by `Model.make_train_function`.

    This method should contain the mathematical logic for one step of training.
    This typically includes the forward pass, loss calculation, backpropagation,
    and metric updates.

    Configuration details for *how* this logic is run (e.g. `tf.function` and
    `tf.distribute.Strategy` settings), should be left to
    `Model.make_train_function`, which can also be overridden.

    Args:
      data: A nested structure of `Tensor`s.

    Returns:
      A `dict` containing values that will be passed to
      `tf.keras.callbacks.CallbackList.on_train_batch_end`. Typically, the
      values of the `Model`'s metrics are returned. Example:
      `{'loss': 0.2, 'accuracy': 0.7}`.
    """
    x, y, sample_weight = data_adapter.unpack_x_y_sample_weight(data)
    
    # Run forward pass.
    with tf.GradientTape() as tape:
      y_pred = self(x, training=True)
      loss = self.compute_loss(x, y, y_pred, sample_weight)
    self._validate_target_and_loss(y, loss)
    
    # Run backwards pass.
    self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    return self.compute_metrics(x, y, y_pred, sample_weight)
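The docstring above names four stages: forward pass, loss calculation, backpropagation, and metric updates. As a rough, framework-free sketch of those stages, the snippet below runs one step of plain gradient descent on a one-variable linear model with mean squared error. Everything in it (the dict-based `params`, the analytic gradients, the toy batch) is an assumption made for illustration and is not how Keras implements the step internally.

```python
def train_step(params, batch, lr=0.1):
    """One gradient-descent step for y_pred = w * x + b with MSE loss."""
    w, b = params["w"], params["b"]
    xs, ys = batch
    n = len(xs)

    # Forward pass and loss calculation.
    preds = [w * x + b for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n

    # Backward pass: analytic gradients of the MSE loss.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n

    # Parameter update (the role the optimizer plays in Keras).
    params["w"] = w - lr * grad_w
    params["b"] = b - lr * grad_b

    # Return a dict of logs, mirroring the shape of Keras's return value.
    return {"loss": loss}


params = {"w": 0.0, "b": 0.0}
batch = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # toy data following y = 2x
losses = [train_step(params, batch, lr=0.05)["loss"] for _ in range(200)]
print(losses[0], losses[-1], params)
```

Overriding train_step in Keras means rewriting exactly this sequence, while fit keeps handling batching, callbacks, and epochs.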

Example: a model whose learning rate changes during training

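import tensorflow as tf
from tensorflow import keras


class CustomModel(keras.Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.initial_lr = 0.001

    def compile(self, optimizer, loss, metrics=None, schedule_lr=None, **kwargs):
        # Forward the standard arguments by keyword and keep the optional schedule.
        super().compile(optimizer=optimizer, loss=loss, metrics=metrics, **kwargs)
        self.schedule_lr = schedule_lr

    def train_step(self, data):
        images, labels = data
        # Update the learning rate from the optimizer's step counter, if a schedule was given.
        if self.schedule_lr:
            self.optimizer.learning_rate = self.schedule_lr(self.optimizer.iterations)

        # Forward pass and loss calculation.
        # compiled_loss and compiled_metrics are the TF 2.x Keras API used in this post.
        with tf.GradientTape() as tape:
            predictions = self(images, training=True)
            loss = self.compiled_loss(labels, predictions, regularization_losses=self.losses)

        # Backward pass and weight update.
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))

        # Update the compiled metrics and return them as logs.
        self.compiled_metrics.update_state(labels, predictions)
        return {m.name: m.result() for m in self.metrics}


# Learning-rate schedule function
def schedule_lr(step):
    """Decrease the learning rate exponentially every `decay_steps` steps."""
    initial_lr = 0.001   # initial learning rate
    decay_steps = 1000   # number of steps between decays
    decay_rate = 0.1     # decay factor
    # `step` is an integer tensor (optimizer.iterations); cast the exponent to
    # float so that raising the float decay_rate to it is well defined.
    exponent = tf.cast(step // decay_steps, tf.float32)
    return initial_lr * tf.pow(decay_rate, exponent)


# Instantiate the model.
# get_mnist_model() is not defined in this post; it is assumed to return a Functional model.
model = get_mnist_model()
custom_model = CustomModel(inputs=model.input, outputs=model.output)

# Pass the schedule through the customized compile(). The optimizer, loss, and
# metric below are illustrative assumptions, not part of the original post.
custom_model.compile(
    optimizer=keras.optimizers.RMSprop(),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
    schedule_lr=schedule_lr,
)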

$$\text{lr} = \text{initial\_lr} \times \text{decay\_rate}^{\left\lfloor \text{step} / \text{decay\_steps} \right\rfloor}$$

Every decay_steps (1000) steps, the learning rate is multiplied by decay_rate (0.1), so it drops to one tenth of its previous value.
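To check the arithmetic, the same rule can be evaluated in plain Python for a few step values. This is a sketch that mirrors the schedule with ordinary integers instead of tensors; the keyword defaults simply repeat the values used in this post.

```python
def schedule_lr(step, initial_lr=0.001, decay_steps=1000, decay_rate=0.1):
    """Step decay: multiply the rate by decay_rate once every decay_steps steps."""
    # Integer floor division keeps the exponent constant within each 1000-step window.
    return initial_lr * decay_rate ** (step // decay_steps)


# The rate stays flat inside a window and drops at each multiple of 1000.
for step in (0, 999, 1000, 1999, 2000):
    print(step, schedule_lr(step))
```

Because of the floor division, steps 0 through 999 share one learning rate, steps 1000 through 1999 share the next, and so on.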

References

Deep Learning with Python by François Chollet (Korean edition: 케라스 창시자에게 배우는 딥러닝)
