[Python] Image Augmentation with the albumentations Library :: How to Transform Images Together with Their Bounding Box Coordinates


슈퍼짱짱 2022. 5. 6. 15:00


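The albumentations library ships ready-made image transformations, so you can transform images without implementing the operations yourself.

It is mainly used to augment images of under-represented classes when the classes are imbalanced (image augmentation),

and, even when the goal is not to increase the number of images, to deliberately add noise and similar perturbations with some probability in order to improve model performance.

Besides color adjustments, images can also be transformed by rotating or flipping them,

and a key advantage of this library is that, for object detection problems, it automatically moves the bounding box coordinates along with the image.

Let's go through several of the transforms that albumentations provides with a hands-on example.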

For reference, albumentations has an official documentation site. If you need more transforms or more detailed explanations, see https://albumentations.ai/docs/

The image used in this post is shown below.

First, import the required libraries.

import albumentations as A
from albumentations.pytorch import ToTensorV2
import cv2
import os
import numpy as np
from PIL import Image

The albumentations library imported on the first line is the key one.

Next, load the image and its bounding box coordinates.

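# img_path and label_path are assumed to point to the image and its YOLO-format label file
image = cv2.imread(img_path)  # OpenCV loads images in BGR channel order
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert to RGB; height 500, width 335

# Each label row is "class x y w h"; ndmin=2 keeps the array 2-D even for a single box
bboxes = np.loadtxt(fname=label_path, delimiter=" ", ndmin=2)
# Rotate the columns so that the class moves to the end: [x, y, w, h, class]
bboxes = np.roll(bboxes, 4, axis=1).tolist()  # [[0.641, 0.5705705705705706, 0.718, 0.8408408408408409, 6.0]]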

In the label file, each bounding box is stored with the class first, followed by x, y, w, h. The code above reorders the columns so that the class comes after x, y, w, h.
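As a minimal sketch of that reordering for a single row in plain Python: moving the first element to the end has the same effect as rolling a five-column row by 4 positions. The helper name `move_class_to_end` is hypothetical, and the sample row is built from the values in the code comment above.

```python
def move_class_to_end(row):
    """Turn [class, x, y, w, h] into [x, y, w, h, class]."""
    return row[1:] + row[:1]


# Sample label row in file order: class first, then x, y, w, h
label_row = [6.0, 0.641, 0.5705705705705706, 0.718, 0.8408408408408409]
reordered = move_class_to_end(label_row)
print(reordered)
```

`np.roll(bboxes, 4, axis=1)` performs the same rotation for every row of the array at once.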

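Drawing the bounding box on the loaded image gives the following.

For reference, the code that draws the image above is shown below.

The same function is used throughout the rest of this post to check the results.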

import matplotlib.pyplot as plt
import matplotlib.patches as patches

def myFig(img, label) :
    # Create figure and axes
    fig, ax = plt.subplots()

    # Display the image
    ax.imshow(img)

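    # YOLO boxes are (x_center, y_center, w, h), normalized to [0, 1]
    dw = img.shape[1]  # image width in pixels
    dh = img.shape[0]  # image height in pixels

    # Draw every box, not only the first one (an empty list draws nothing)
    for box in label:
        # Convert the normalized center format to a pixel top-left corner and size
        x1 = (box[0] - box[2] / 2) * dw
        y1 = (box[1] - box[3] / 2) * dh
        w = box[2] * dw
        h = box[3] * dh
        rect = patches.Rectangle((x1, y1), w, h, linewidth=1, edgecolor='r', facecolor='none')

        # Add the patch to the Axes
        ax.add_patch(rect)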

    plt.show()
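The coordinate conversion inside the plotting function can be checked on its own. The sketch below isolates it as a small helper (the name `yolo_to_pixel_xywh` is hypothetical) and applies it to the sample box and the 335 x 500 image size mentioned in this post.

```python
def yolo_to_pixel_xywh(box, img_width, img_height):
    """Convert a normalized YOLO box (x_center, y_center, w, h, ...) to pixel (x_min, y_min, w, h)."""
    x_center, y_center, w, h = box[:4]
    x_min = (x_center - w / 2) * img_width
    y_min = (y_center - h / 2) * img_height
    return x_min, y_min, w * img_width, h * img_height


# Sample box from this post, with image width 335 and height 500
sample_box = [0.641, 0.5705705705705706, 0.718, 0.8408408408408409, 6.0]
x_min, y_min, box_w, box_h = yolo_to_pixel_xywh(sample_box, 335, 500)
print(round(x_min, 2), round(y_min, 2), round(box_w, 2), round(box_h, 2))
```

These four values are what `patches.Rectangle` expects: the top-left corner followed by the width and height in pixels.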

1. LongestMaxSize

LongestMaxSize is used to resize an image.

The loaded image has a width of 335 and a height of 500.

To resize it so that the longer of the two sides matches a given size while keeping the aspect ratio, use LongestMaxSize as follows.

IMAGE_SIZE = 416
train_transforms = A.Compose(
    [
        A.LongestMaxSize(max_size=IMAGE_SIZE), 
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']

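The bounding box follows the resized image correctly. Because YOLO-format coordinates are normalized by the image size, their numeric values stay essentially the same under a uniform resize; only the pixel positions scale.

To have the bounding boxes handled in the format used for object detection here, 'yolo' is passed as the format of A.BboxParams.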
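The size that LongestMaxSize produces can be worked out by hand. The following is a rough sketch of the arithmetic only, not the library's implementation; the helper name `longest_max_size_shape` is hypothetical, and rounding to the nearest integer is an assumption.

```python
def longest_max_size_shape(height, width, max_size):
    """Return (height, width) after scaling the longer side to max_size."""
    scale = max_size / max(height, width)
    # Rounding to the nearest pixel is an assumption of this sketch
    return round(height * scale), round(width * scale)


# The image in this post: height 500, width 335, target size 416
new_h, new_w = longest_max_size_shape(500, 335, 416)
print(new_h, new_w)
```

For the image in this post this gives a height of 416 and a width of 279, which matches the shape noted in the code comments of the next section.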

2. PadIfNeeded

The image above is a tall rectangle whose height is larger than its width.

PadIfNeeded turns it into a square by adding padding.

IMAGE_SIZE = 416
train_transforms = A.Compose(
    [
        A.LongestMaxSize(max_size=IMAGE_SIZE), # (416, 279, 3)
        A.PadIfNeeded( # (416, 416, 3)
            min_height=IMAGE_SIZE,
            min_width=IMAGE_SIZE,
            border_mode=cv2.BORDER_CONSTANT, # 0
        ),
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']

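Both the width and the height are now 416, so the image is square, and the empty area was filled using the cv2.BORDER_CONSTANT border mode.

cv2.BORDER_CONSTANT is a border mode (its enum value happens to be 0) that fills the padding with a single constant color. Since no fill value is specified here, the padding is filled with 0, which is black.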
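To see why the normalized box values change after padding, here is a minimal sketch of the bookkeeping. It assumes the padding is split evenly between both sides, with any odd leftover pixel going to the right or bottom; the helper name `pad_yolo_box_centered` and the sample box are hypothetical.

```python
def pad_yolo_box_centered(box, height, width, min_height, min_width):
    """Re-normalize a YOLO box after centered padding up to (min_height, min_width)."""
    pad_w = max(min_width - width, 0)
    pad_h = max(min_height - height, 0)
    # Assumption: an odd leftover pixel goes to the right/bottom side
    left, top = pad_w // 2, pad_h // 2
    new_w, new_h = width + pad_w, height + pad_h
    x, y, w, h = box[:4]
    return [
        (x * width + left) / new_w,   # pixel center shifted by the left padding
        (y * height + top) / new_h,   # pixel center shifted by the top padding
        w * width / new_w,            # same pixel width, larger canvas
        h * height / new_h,           # same pixel height, larger canvas
    ] + list(box[4:])


# Hypothetical box centered in a 279 x 416 (width x height) image, padded to 416 x 416
padded = pad_yolo_box_centered([0.5, 0.5, 0.5, 0.5, 6.0], 416, 279, 416, 416)
print(padded)
```

The box keeps its pixel size, so its normalized width shrinks as the canvas gets wider, while the values along the unpadded axis stay the same.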

3. RandomCrop

RandomCrop cuts out a region of the given width and height at a random location and returns it.

train_transforms = A.Compose(
    [
        A.RandomCrop(width=100, height=100), # (100, 100, 3) 
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']

Since 100 was passed for both width and height, a 100 by 100 image was returned,

and the bounding box was not included in that crop.

The crop location is random, so running the same code once more produced the following result.
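Whether a box survives a crop is related to the `min_visibility=0.4` setting in `A.BboxParams`. As far as I understand it, a box is dropped when the fraction of its area remaining after the transform falls below that threshold. The sketch below illustrates the idea in pixel coordinates; the helper name `visible_fraction` and the box and crop values are hypothetical, and this is not the library's actual code.

```python
def visible_fraction(box, crop):
    """Fraction of a pixel box (x_min, y_min, x_max, y_max) that lies inside the crop window."""
    bx1, by1, bx2, by2 = box
    cx1, cy1, cx2, cy2 = crop
    inter_w = max(0.0, min(bx2, cx2) - max(bx1, cx1))
    inter_h = max(0.0, min(by2, cy2) - max(by1, cy1))
    box_area = (bx2 - bx1) * (by2 - by1)
    return inter_w * inter_h / box_area if box_area > 0 else 0.0


box = (94, 75, 335, 495)   # hypothetical large box in a 335 x 500 image
crop = (0, 0, 100, 100)    # hypothetical 100 x 100 crop at the top-left corner
fraction = visible_fraction(box, crop)
keep = fraction >= 0.4     # compare against the min_visibility threshold
print(fraction, keep)
```

With a large object and a small crop, only a tiny part of the box can fall inside the window, which is consistent with the box disappearing from the result.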

4. ColorJitter

Next is a transform that changes the colors of the image.

train_transforms = A.Compose(
    [
        A.ColorJitter(brightness=0.6, contrast=0.6, saturation=0.6, hue=0.6, p=1), 
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']

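It randomly changes the brightness, contrast, saturation, and hue.

Passing brightness=0.6 does not change the brightness by exactly 0.6. Instead, a factor is sampled from the range [max(0, 1 - brightness), 1 + brightness]. contrast and saturation work the same way, while the hue shift is sampled from [-hue, hue].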
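The sampling ranges can be sketched in a few lines. This only illustrates how a single number is turned into a range and then sampled, based on the range described above and on the documented [-hue, hue] range for hue; the helper name `jitter_range` is hypothetical.

```python
import random


def jitter_range(value, offset=1.0, clip_at_zero=True):
    """Turn a single jitter strength into the (low, high) range a factor is sampled from."""
    low, high = offset - value, offset + value
    if clip_at_zero:
        low = max(low, 0.0)
    return low, high


brightness_range = jitter_range(0.6)                             # around 1, clipped at 0
hue_range = jitter_range(0.6, offset=0.0, clip_at_zero=False)    # symmetric around 0

rng = random.Random(0)
brightness_factor = rng.uniform(*brightness_range)  # a new factor on every call
print(brightness_range, hue_range, brightness_factor)
```

A factor below 1 darkens the image and a factor above 1 brightens it, so brightness=0.6 covers anything from a noticeably darker to a noticeably brighter version.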

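This transform is also random, so it returns a different result on every run.

The last argument, p, is the probability that ColorJitter is applied at all. With p=1 it is always applied, and the closer p is to 0, the less likely the colors are to change.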
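The role of `p` can be illustrated with a toy wrapper. This is a sketch of the general idea and not how the library implements it; `maybe_apply` and the stand-in transform are hypothetical.

```python
import random


def maybe_apply(transform, value, p, rng):
    """Apply transform with probability p, otherwise return value unchanged."""
    return transform(value) if rng.random() < p else value


rng = random.Random(0)
# Count how often a stand-in transform (adding 1) fires with p=0.3 over many trials
applied = sum(maybe_apply(lambda v: v + 1, 0, 0.3, rng) for _ in range(10_000))
print(applied / 10_000)
```

Over many calls the fraction of applied transforms approaches `p`, which is why p=1 was used throughout this post to make every effect visible.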

The description of each parameter is shown below, and it can also be found on the official albumentations website mentioned above.

5. ShiftScaleRotate

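ShiftScaleRotate rotates the image. As the name suggests, it also applies a random shift and scale, both of which have nonzero default limits.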

train_transforms = A.Compose(
    [
        A.ShiftScaleRotate(rotate_limit=50, p=1, border_mode=cv2.BORDER_CONSTANT),
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']

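As before, p is the probability that ShiftScaleRotate is applied, and rotate_limit is the maximum rotation angle; the angle is drawn from the range (-rotate_limit, rotate_limit).

The transform is random, so the rotation angle differs on every run.

border_mode is the parameter that determines how the empty area created by the rotation is filled.

One caveat with ShiftScaleRotate is that the bounding box is not rotated along with the object the way the blue box below is. It remains an axis-aligned rectangle.

This can distort the bounding box coordinates, so depending on the characteristics of your images, rotation may need to be used with care.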
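The distortion comes from having to describe a rotated rectangle with an axis-aligned one. The sketch below shows the general idea by rotating the four corners and taking the rectangle that encloses them. It is only an illustration: the exact method albumentations uses may differ by version, and the helper name `rotated_box_envelope` and the sample box are hypothetical.

```python
import math


def rotated_box_envelope(x_min, y_min, x_max, y_max, angle_deg, cx, cy):
    """Rotate the four corners about (cx, cy) and return the axis-aligned box around them."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    xs, ys = [], []
    for x, y in [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]:
        dx, dy = x - cx, y - cy
        xs.append(cx + dx * cos_t - dy * sin_t)
        ys.append(cy + dx * sin_t + dy * cos_t)
    return min(xs), min(ys), max(xs), max(ys)


# A hypothetical 100 x 40 pixel box rotated by 45 degrees about its own center
env = rotated_box_envelope(0, 0, 100, 40, 45, 50, 20)
env_w, env_h = env[2] - env[0], env[3] - env[1]
print(round(env_w, 2), round(env_h, 2))
```

The enclosing rectangle covers far more area than the original box, so the label becomes looser around the object as the rotation angle grows.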

6. HorizontalFlip

HorizontalFlip mirrors the image left to right.

train_transforms = A.Compose(
    [
        A.HorizontalFlip(p=1),
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']
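For a normalized YOLO box, a horizontal flip only mirrors the x center; y, width, and height stay the same. A minimal sketch, with the hypothetical helper name `hflip_yolo_box` and the sample box from this post:

```python
def hflip_yolo_box(box):
    """Mirror a normalized YOLO box (x_center, y_center, w, h, ...) left to right."""
    x, y, w, h = box[:4]
    return [1.0 - x, y, w, h] + list(box[4:])


original = [0.641, 0.5705705705705706, 0.718, 0.8408408408408409, 6.0]
flipped = hflip_yolo_box(original)
print(flipped)
```

Comparing `transformed_bboxes` from the code above with this calculation is a quick way to confirm that the library moved the box together with the image.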

7. Blur

Blur applies a blur to the image.

train_transforms = A.Compose(
    [
        A.Blur(p=1), 
    ],
    bbox_params=A.BboxParams(format="yolo", min_visibility=0.4, label_fields=[],),
)

transformed = train_transforms(image=image, bboxes=bboxes)
transformed_image = transformed['image']
transformed_bboxes = transformed['bboxes']

Beyond the transforms covered here, there are many more, such as channel shuffle and grayscale conversion, so the official website is worth a look.

Attribution-NonCommercial-NoDerivatives
