10 Image Processing Steps You Can Perform with NumPy

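Image processing is a form of mathematical computation. A digital image is made up of small colored dots called pixels. Each pixel consists of three independent color components: red, green, and blue (RGB). The dominant color of a pixel is determined by the values of its RGB components.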

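This article walks through 10 image processing steps that can be performed with NumPy. More powerful image processing libraries exist, but these simple methods are a good way to get more comfortable with NumPy operations.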

First, the image is read with Pillow.

    import numpy as np
    # Use PIL to access image data
    from PIL import Image

    img = Image.open('monalisa.jpg')
    # Create array from image data
    M = np.array(img)
    # Display array from image data
    display(Image.fromarray(M))
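To make the pixel structure concrete, the sketch below builds a tiny synthetic image instead of loading a file and inspects its shape and a single pixel. The array `tiny` and its values are made up for illustration. An RGB image loaded with Pillow is typically an array of shape (height, width, 3) with dtype uint8.

```python
import numpy as np

# Hypothetical 2x2 RGB image, used only for illustration
tiny = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

height, width, channels = tiny.shape  # rows, columns, color channels
top_left = tiny[0, 0]                 # RGB values of the pixel at row 0, column 0
red_channel = tiny[:, :, 0]           # 2D array holding only the red component

print(tiny.shape, tiny.dtype)
print(top_left)
```

Indexing with two indices returns one pixel (three values), while slicing the last axis returns one color channel for the whole image.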

1. Reduce the image size

    def reduce_image_size_by_n(image, n):
        # Get the height and width of the image
        height, width, channels = image.shape
        # Reduce the height and width by a factor of n
        new_height = height // n
        new_width = width // n
        # Create a new array to store the reduced image
        downsampled_image = np.zeros((new_height, new_width, channels), dtype=image.dtype)
        # Iterate over each pixel of the reduced image
        for i in range(new_height):
            for j in range(new_width):
                # Take every n-th pixel along each axis to reduce the image
                downsampled_image[i, j] = image[n*i, n*j]
        return downsampled_image

    # Try the function using n = 2
    reduced_M = reduce_image_size_by_n(M, 2)
    display(Image.fromarray(reduced_M))
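The double loop above is easy to follow but slow on large images. As a sketch of an alternative, strided slicing picks the same pixels in one step. The function name `reduce_image_size_by_slicing` and the `demo` array are hypothetical, introduced only for this example.

```python
import numpy as np

def reduce_image_size_by_slicing(image, n):
    # Drop the remainder so the result has height // n rows and width // n columns
    height, width = image.shape[:2]
    # Keep every n-th row and every n-th column
    return image[: (height // n) * n : n, : (width // n) * n : n]

# Synthetic 5x7 RGB array standing in for a real image
demo = np.arange(5 * 7 * 3, dtype=np.uint8).reshape(5, 7, 3)
small = reduce_image_size_by_slicing(demo, 2)
print(small.shape)
```

Slicing returns a view of the original array, so call `.copy()` on the result if it needs to be modified independently.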

2. Horizontal flip

    def flip_image(image):
        # Take all rows (:) and reverse the order of the columns (::-1)
        flipped = image[:, ::-1]
        return flipped

    # Try the function using the reduced image
    display(Image.fromarray(flip_image(reduced_M)))

3. Vertical flip (180° rotation)

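    def rotate_image(image, n):
        # Rotate the image with np.rot90; n is the number of 90-degree turns
        # (axes=(1, 0) makes each turn clockwise)
        rotated_img = Image.fromarray(np.rot90(image, k=n, axes=(1, 0)))
        return rotated_img

    # Rotate the image twice (n=2), i.e. by 180 degrees.
    # This turns the image upside down and also mirrors it left to right.
    display(rotate_image(reduced_M, 2))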
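A 180-degree rotation turns the image upside down but also mirrors it left to right. If only a vertical flip is wanted, reversing the row order is enough. The sketch below shows this under the assumption of a hypothetical helper named `flip_image_vertical`; the `demo` array is synthetic.

```python
import numpy as np

def flip_image_vertical(image):
    # Reverse the row order (::-1 on axis 0); columns and channels are untouched
    return image[::-1]

# Synthetic 2x3 RGB array standing in for a real image
demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
flipped = flip_image_vertical(demo)

# For comparison: a 180-degree rotation reverses rows and columns
rotated_180 = np.rot90(demo, k=2, axes=(1, 0))
```

Here `flipped` matches `np.flipud(demo)`, while `rotated_180` matches `demo[::-1, ::-1]`.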

4. Crop the image

    def crop_image(image, crop_ratio, zoom_ratio):
        # Compute the bounds of the focused part from crop_ratio and zoom_ratio
        top = image.shape[0] // crop_ratio
        bottom = zoom_ratio * image.shape[0] // crop_ratio
        left = image.shape[1] // crop_ratio
        right = zoom_ratio * image.shape[1] // crop_ratio
        # Extract the focused part using array slicing
        focused_part = image[top:bottom, left:right]
        return focused_part

    display(Image.fromarray(crop_image(reduced_M, 4, 2)))

5. RGB channels

    def RGB_image(image, image_color):
        if image_color == 'R':
            # Make a copy of the image for the color channel
            img_R = image.copy()
            # Set the other color channels to zero. Red is the first channel [0]
            img_R[:, :, (1, 2)] = 0
            return img_R
        elif image_color == 'G':
            img_G = image.copy()
            # Set the other color channels to zero. Green is the second channel [1]
            img_G[:, :, (0, 2)] = 0
            return img_G
        elif image_color == 'B':
            img_B = image.copy()
            # Set the other color channels to zero. Blue is the third channel [2]
            img_B[:, :, (0, 1)] = 0
            return img_B

View the red channel

    M_red = Image.fromarray(RGB_image(reduced_M, 'R'))
    display(M_red)

Green

    M_green = Image.fromarray(RGB_image(reduced_M, 'G'))
    display(M_green)

Blue

    M_blue = Image.fromarray(RGB_image(reduced_M, 'B'))
    display(M_blue)

6. Apply a filter

This example uses a sepia (brown) tone. The transformation matrix can be changed to suit different needs.

    def apply_sepia(image):
        # Sepia transformation matrix
        sepia_matrix = np.array([[0.393, 0.769, 0.189],
                                 [0.349, 0.686, 0.168],
                                 [0.272, 0.534, 0.131]])
        # Apply the sepia transformation using matrix multiplication
        sepia_img = image.dot(sepia_matrix.T)
        # Ensure values are within the valid range [0, 255]
        sepia_img = np.clip(sepia_img, 0, 255)
        return sepia_img.astype(np.uint8)

    # Apply the sepia effect
    M_sepia = Image.fromarray(apply_sepia(reduced_M))
    display(M_sepia)

7. Grayscale

Put simply, grayscale conversion combines the three RGB channels into a single channel of gray intensity values.

    def grayscale(image):
        # Convert the RGB image to grayscale using a weighted average
        grayscale_img = np.dot(image[..., :3], [0.2989, 0.5870, 0.1140])
        # Ensure values are within the valid range [0, 255]
        grayscale_img = np.clip(grayscale_img, 0, 255)
        # Convert to the uint8 data type
        grayscale_img = grayscale_img.astype(np.uint8)
        return grayscale_img

    # Convert the image to grayscale
    M_gray = grayscale(reduced_M)
    display(Image.fromarray(M_gray))

8. Pixelation

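A pixelated image is built from individual blocks of color. As the name suggests, pixelation divides the image into regions, replaces each region with a single block of color (here, the average color of the region), and builds the picture from those blocks. The effect resembles rasterizing at a very low resolution, except that the input here is already a raster image rather than a vector graphic.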

    def pixelate_image(image, block_size):
        # Determine the number of blocks in each dimension
        num_blocks_y = image.shape[0] // block_size
        num_blocks_x = image.shape[1] // block_size
        # Calculate the average color for each block
        block_means = np.zeros((num_blocks_y, num_blocks_x, 3), dtype=np.uint8)
        for y in range(num_blocks_y):
            for x in range(num_blocks_x):
                block = image[y * block_size: (y + 1) * block_size,
                              x * block_size: (x + 1) * block_size]
                block_mean = np.mean(block, axis=(0, 1))
                block_means[y, x] = block_mean.astype(np.uint8)
        # Upsample the block means back to (approximately) the original size;
        # edge pixels that do not fill a whole block are dropped
        pixelated_image = np.repeat(np.repeat(block_means, block_size, axis=0), block_size, axis=1)
        return pixelated_image

    # Set the block size for pixelation (adjust as needed)
    block_size = 10
    # Pixelate the image
    M_pixelated = Image.fromarray(pixelate_image(reduced_M, block_size))
    display(M_pixelated)
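The block averaging can also be written without Python loops by reshaping the image so that each block gets its own pair of axes. This is a sketch under the assumption that the input has shape (height, width, channels); the name `pixelate_image_vectorized` and the random `demo` array are hypothetical.

```python
import numpy as np

def pixelate_image_vectorized(image, block_size):
    num_blocks_y = image.shape[0] // block_size
    num_blocks_x = image.shape[1] // block_size
    channels = image.shape[2]
    # Crop so both dimensions are exact multiples of block_size
    cropped = image[: num_blocks_y * block_size, : num_blocks_x * block_size]
    # Split rows and columns into (block index, offset inside block)
    blocks = cropped.reshape(num_blocks_y, block_size, num_blocks_x, block_size, channels)
    # Average over the two offset axes, then truncate to uint8
    block_means = blocks.mean(axis=(1, 3)).astype(np.uint8)
    # Repeat each block mean so it fills its block again
    return np.repeat(np.repeat(block_means, block_size, axis=0), block_size, axis=1)

# Synthetic 7x9 RGB array standing in for a real image
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(7, 9, 3), dtype=np.uint8)
pixelated = pixelate_image_vectorized(demo, 3)
print(pixelated.shape)
```

As with the loop version, rows and columns that do not fill a complete block are cropped away, so the output can be slightly smaller than the input.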

Put simply, the result is Minecraft-style graphics.

9. Binarization

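Binarization is the process of thresholding numeric feature values and converting them into Boolean values. Put simply, a threshold is set; values above the threshold become true, and values at or below the threshold become false.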

    def binarize_image(image, threshold):
        # Set pixel values greater than the threshold to 255 and the rest to 0
        binarized = ((image > threshold) * 255).astype(np.uint8)
        return binarized

    # Set the threshold
    threshold = 68
    M_binarized = Image.fromarray(binarize_image(reduced_M, threshold))
    display(M_binarized)
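Applied to an RGB array, the comparison above thresholds each color channel separately, so the result can contain up to eight colors rather than only black and white. If a strictly two-tone image is wanted, one option is to collapse the image to a single intensity channel first. The sketch below assumes the same luminance weights as the grayscale step; the name `binarize_grayscale` and the `demo` array are hypothetical.

```python
import numpy as np

def binarize_grayscale(image, threshold):
    # Collapse RGB to one intensity value per pixel (weights assumed from the grayscale step)
    gray = np.dot(image[..., :3], [0.2989, 0.5870, 0.1140])
    # Pixels brighter than the threshold become white (255), the rest black (0)
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Synthetic 1x2 image: one dark pixel and one bright pixel
demo = np.array([[[10, 10, 10], [200, 200, 200]]], dtype=np.uint8)
mask = binarize_grayscale(demo, 68)
print(mask.shape)
```

The result is a 2D array, which `Image.fromarray` interprets as a single-channel grayscale image.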

10. Image blending

The simplest way to combine two images is to sum their pixels, weighting each image by its level of visibility, as shown below.

    # Import the second image and resize it to match the first (PIL expects (width, height))
    img_2 = np.array(Image.open('Eiffel.jpg').resize(reduced_M.shape[1::-1]))

    def blend_image(image1, image2, visibility_1, visibility_2):
        # Blend the images by weighting each one by its visibility ratio
        blended = (image1 * visibility_1 + image2 * visibility_2).astype(np.uint8)
        return blended

    modified_image = Image.fromarray(blend_image(reduced_M, img_2, 0.7, 0.3))
    display(modified_image)
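With weights such as 0.7 and 0.3 that sum to 1, the weighted sum stays within [0, 255]. If the weights sum to more than 1, the result can exceed 255 and no longer fits in uint8. A possible safeguard, sketched here under the assumption of a hypothetical helper named `blend_image_clipped`, is to clip before converting back; the arrays `a` and `b` are synthetic.

```python
import numpy as np

def blend_image_clipped(image1, image2, visibility_1, visibility_2):
    # Do the weighted sum in floating point so intermediate values are not limited to uint8
    blended = image1.astype(np.float64) * visibility_1 + image2.astype(np.float64) * visibility_2
    # Clamp to the valid range before converting back to uint8
    return np.clip(blended, 0, 255).astype(np.uint8)

# Two synthetic flat-colored images of the same shape
a = np.full((2, 2, 3), 200, dtype=np.uint8)
b = np.full((2, 2, 3), 100, dtype=np.uint8)

mixed = blend_image_clipped(a, b, 0.7, 0.3)      # weights sum to 1
saturated = blend_image_clipped(a, b, 1.0, 1.0)  # sum exceeds 255 and is clipped
```

Both images must have the same shape, which is why the second image is resized to match the first before blending.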

Summary

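Image manipulation is essentially the process of performing array operations on an image. The simple operations shown here are meant to build familiarity with NumPy. For more advanced work, use specialized libraries such as OpenCV or Pillow.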
