Increasing image training data with Data Augmentation using TensorFlow + numpy

2018-11-11 machinelearning tensorflow

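Data Augmentation increases the amount of training data by adding processed copies of the existing training data. A processed sample normally gets the same label as its original, which is reasonable if, for example, a flipped or rotated image should still be recognized as the same thing. In other words, not every transformation is appropriate: the processing has to suit the dataset so that the original label still applies. Conversely, it lets the model learn that samples differing in those ways are still the same thing.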
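
As a minimal sketch of that idea, assuming a toy dataset of random arrays (the names `augment_dataset`, `images` and `labels` are hypothetical and not from this post), horizontally flipped copies can be appended while reusing the original labels:

```python
import numpy as np

def augment_dataset(images, labels):
    """Append horizontally flipped copies of each image, reusing the labels.

    images: array of shape (N, H, W, C); labels: array of shape (N,).
    """
    flipped = images[:, :, ::-1, :]  # reverse the width axis
    aug_images = np.concatenate([images, flipped], axis=0)
    aug_labels = np.concatenate([labels, labels], axis=0)
    return aug_images, aug_labels

# Toy data standing in for a real dataset (an assumption for illustration).
images = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
labels = np.array([0, 1, 1, 0])
aug_images, aug_labels = augment_dataset(images, labels)
print(aug_images.shape, aug_labels.shape)
```

Whether a flip really preserves the label depends on the dataset, so the choice of transformation here is only an example.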

This post applies transformations commonly used for Data Augmentation to the familiar Lenna image, using TensorFlow and numpy functions.

Import the required packages and load the image. The code is run in a Jupyter Notebook and uses the TensorFlow 1.x API (tf.Session).

%matplotlib inline
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
im = Image.open("lenna.png", "r")

Flipping

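Horizontal and vertical flips. The random variants (tf.image.random_flip_left_right and tf.image.random_flip_up_down) flip with probability 1/2.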

fliph = tf.image.flip_left_right(im)
flipv = tf.image.flip_up_down(im)

with tf.Session() as sess:
    results = sess.run([fliph, flipv])
plt.imshow(np.hstack(results))
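
The behavior of the random variants can be sketched in plain numpy. This is only an illustration of "flip with probability 1/2", not how TensorFlow implements it; `np_random_flip_left_right` and the tiny test array are hypothetical names introduced here.

```python
import numpy as np

def np_random_flip_left_right(image, rng):
    """Flip an (H, W, C) image horizontally with probability 1/2."""
    if rng.random() < 0.5:
        return image[:, ::-1, :]  # reverse the width axis
    return image

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
image = np.arange(2 * 3 * 1).reshape(2, 3, 1)
flipped = np_random_flip_left_right(image, rng)

# Count how often the image actually gets flipped over many draws.
n_flipped = sum(
    not np.array_equal(np_random_flip_left_right(image, rng), image)
    for _ in range(1000)
)
print(n_flipped)
```

Over many draws roughly half of the calls return a flipped image, so each epoch sees a different mix of original and mirrored samples.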

Rotating

rot90()

Rotates the image 90 degrees counterclockwise.

rot90 = tf.image.rot90(im)

with tf.Session() as sess:
    results = sess.run([rot90])
plt.imshow(np.hstack(results))
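
For comparison, numpy's `np.rot90` rotates counterclockwise as well, and like `tf.image.rot90` it takes a `k` argument for the number of quarter turns. The sketch below uses it to pick a random multiple of 90 degrees; `random_rot90` is a hypothetical helper, not something from this post.

```python
import numpy as np

def random_rot90(image, rng):
    """Rotate an (H, W, C) image counterclockwise by a random multiple of 90 degrees."""
    k = int(rng.integers(4))  # 0, 1, 2 or 3 quarter turns
    return np.rot90(image, k=k, axes=(0, 1))

# A 2x2 single-channel image makes the rotation direction easy to read.
image = np.array([[1, 2], [3, 4]])[:, :, np.newaxis]
quarter_turn = np.rot90(image, k=1, axes=(0, 1))
print(quarter_turn[:, :, 0])

rng = np.random.default_rng(0)
rotated = random_rot90(image, rng)
```

For non-square images an odd `k` swaps height and width, so a fixed input size may require a resize afterwards.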

Cropping

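Cropping the center and cropping an arbitrary region. central_crop keeps the central fraction of the image (here 0.7 of each dimension), and crop_to_bounding_box takes the top-left offset (height, width) followed by the target height and width.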

central_crop = tf.image.central_crop(im, 0.7)
crop_to_box = tf.image.crop_to_bounding_box(im, 50, 50, 100, 100)
ops = [tf.cast(tf.image.resize_images(op, (200, 200)), tf.uint8) for op in [central_crop, crop_to_box]]
with tf.Session() as sess:
    results = sess.run(ops)
plt.imshow(np.hstack(results))
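
Both crops amount to array slicing, which the following numpy sketch shows. It is an approximation under stated assumptions: the rounding in `central_crop` here is a guess and may differ from TensorFlow's by a pixel, and the 512x512 blank image is a stand-in for the real one.

```python
import numpy as np

def central_crop(image, fraction):
    """Keep the central `fraction` of an (H, W, C) image along height and width."""
    h, w = image.shape[:2]
    top = int((h - h * fraction) / 2)   # rows removed from the top and the bottom
    left = int((w - w * fraction) / 2)  # columns removed from the left and the right
    return image[top:h - top, left:w - left]

def crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width):
    """Cut out a box given its top-left corner and its size."""
    return image[offset_height:offset_height + target_height,
                 offset_width:offset_width + target_width]

image = np.zeros((512, 512, 3), dtype=np.uint8)  # placeholder image
center = central_crop(image, 0.7)
box = crop_to_bounding_box(image, 50, 50, 100, 100)
print(center.shape, box.shape)
```

The two results have different shapes, which is presumably why the snippet above resizes both to 200x200 before stacking them with np.hstack.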

Cutout

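A technique from the paper below that masks a random square region, so that the model uses the context of the whole image as much as possible rather than only one part of it.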

Terrance DeVries, Graham W. Taylor (2017) Improved Regularization of Convolutional Neural Networks with Cutout

numpy functions - sambaiz-net

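def random_cutout(image, size=100):
    # Zero out a randomly placed square region of the image.
    h, w = image.shape[:2]  # numpy images are (height, width, channels)
    mask = np.ones(image.shape, np.uint8)
    # Pick the center of the square anywhere in the image.
    y = np.random.randint(h)
    x = np.random.randint(w)
    # Clip so the square stays inside the image; near the border it gets smaller.
    y1 = np.clip(y - size // 2, 0, h)
    y2 = np.clip(y + size // 2, 0, h)
    x1 = np.clip(x - size // 2, 0, w)
    x2 = np.clip(x + size // 2, 0, w)
    mask[y1:y2, x1:x2] = 0
    return image * mask  # use the argument, not the global im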

result = random_cutout(np.array(im))
plt.imshow(result)
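
Because the center is drawn anywhere in the image and the square is clipped at the border, the masked area can be smaller than size x size. The sketch below makes that visible on a small white image. It is a variant written for illustration, not the code above: `cutout` and the explicit `rng` argument are hypothetical, and it copies the input instead of multiplying by a mask.

```python
import numpy as np

def cutout(image, size, rng):
    """Return a copy of an (H, W, C) image with a random square set to zero."""
    h, w = image.shape[:2]
    cy = int(rng.integers(h))  # center row of the square
    cx = int(rng.integers(w))  # center column of the square
    # Clip the square to the image bounds.
    y1, y2 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x1, x2 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()  # leave the input untouched
    out[y1:y2, x1:x2] = 0
    return out

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
white = np.full((64, 64, 3), 255, dtype=np.uint8)
cut = cutout(white, 16, rng)
masked_pixels = int((cut[:, :, 0] == 0).sum())
print(masked_pixels)
```

The count is at most 16 x 16 = 256 and drops when the square overlaps the border.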