[Image Processing] How to Speed Up OpenCV Face Detection

Author: CH.Tseng (曾成訓)

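Haar Cascade is an object detection method provided by OpenCV. It is simple to use: load a resource file known as an XML classifier and you can immediately detect objects in an image, such as faces.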

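Face detection affects playback speed

Compared with deep-learning approaches, Haar Cascade is fast. On resource-constrained embedded systems or single-board computers, however, running it through Python is still far from ideal. The figures below compare playback speed for videos of different resolutions, "plain playback" versus "playback with face detection":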

  • Plain video playback

(Image source: CH.Tseng)

  • Video playback + face detection

(Image source: CH.Tseng)

(Image source: CH.Tseng)
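FPS figures like the ones compared above can be collected with a small counter. The sketch below restates the idea behind the `fps_count` function in the appendix scripts (recompute the average every 10 seconds) without global variables. The class name `FpsCounter` and the injectable `clock` argument are assumptions added for illustration, not part of the original code.

```python
import time


class FpsCounter:
    """Average FPS, recomputed once every `window` seconds."""

    def __init__(self, window=10.0, clock=time.time):
        self.window = window
        self.clock = clock            # injectable so the counter can be tested
        self.last_time = clock()
        self.last_frames = 0
        self.fps = 0.0

    def update(self, total_frames):
        """Call once per frame with the running frame count; returns the latest FPS."""
        now = self.clock()
        elapsed = now - self.last_time
        if elapsed > self.window:
            # Frames shown since the last measurement divided by the time they took.
            self.fps = (total_frames - self.last_frames) / elapsed
            self.last_time = now
            self.last_frames = total_frames
        return self.fps
```

In a playback loop this would be called as `counter.update(ii)` right before drawing the FPS text. Until the first window has elapsed the counter reports 0.0, which matches the behavior of the appendix scripts.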

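As the numbers show, even Haar Cascade, the fastest of these face detection approaches, still has a large impact on system performance. To reduce the load Haar Cascade puts on the system and make it run more smoothly, you can apply the techniques below (the steps that follow are incremental modifications of the Appendix B program).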

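Detect on a smaller image

The larger the image, the longer Haar Cascade takes to process it. Before running face detection, you can therefore shrink the frame proportionally, detect faces on the smaller copy, and then scale the resulting (x, y, w, h) bounding boxes back by the same ratio. The displayed video keeps its original size, but the speed improves considerably. The test data is shown below: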

(Image source: CH.Tseng)

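When using this approach, watch how shrinking the image affects detection quality. If necessary, adjust parameters such as scaleFactor, minNeighbors, and minSize to maintain detection performance.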
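The coordinate bookkeeping described above can be isolated in two small helpers. This is a minimal sketch, not the author's code: the names `scale_boxes` and `scaled_min_size` are hypothetical, and the sketch rounds to the nearest pixel where Appendix C truncates with `int()`.

```python
def scale_boxes(boxes, resize_ratio):
    """Map (x, y, w, h) boxes found on a resized frame back to the original frame."""
    return [tuple(int(round(v / resize_ratio)) for v in box) for box in boxes]


def scaled_min_size(min_size, resize_ratio):
    """Shrink minSize by the same ratio so faces are not filtered out after resizing."""
    return tuple(max(1, int(v * resize_ratio)) for v in min_size)


# A box detected at 0.75 scale, expressed in original-frame pixels.
print(scale_boxes([(30, 60, 90, 120)], 0.75))   # [(40, 80, 120, 160)]
print(scaled_min_size((30, 30), 0.75))          # (22, 22)
```

Scaling minSize along with the frame is one way to act on the parameter advice above: a face that is 30 pixels wide in the original frame is only about 22 pixels wide at a ratio of 0.75, so an unscaled minSize of (30, 30) would reject it.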

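Detect only every few frames

Video typically runs at around 30 frames per second, so detecting faces does not require checking every single frame. Running detection once every few frames is enough.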

(Image source: CH.Tseng)

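After switching to detecting faces only once per interval of frames, the speed improves considerably again. This introduces a problem, though: because detection is intermittent, the face rectangles flicker, and the video no longer looks smooth. To fix this we add one rule: "when a frame is not run through face detection, keep the previous frame's rectangles on screen until new rectangles are produced." This avoids the flicker without affecting the detection rate.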

(Image source: CH.Tseng)
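The "keep the last rectangles until new ones arrive" rule can be sketched as a thin wrapper around any detection function. The class name `IntervalDetector` is an assumption for illustration; in the appendix scripts the same logic is written inline in the main loop with the `interval` and `faces` variables.

```python
class IntervalDetector:
    """Run `detect_fn` once every `interval` frames and reuse the result in between."""

    def __init__(self, detect_fn, interval=5):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.detect_fn = detect_fn
        self.interval = interval
        self.frame_index = 0
        self.last_boxes = []    # boxes from the most recent detection

    def update(self, frame):
        """Call once per frame; returns the boxes to draw on this frame."""
        if self.frame_index % self.interval == 0:
            self.last_boxes = self.detect_fn(frame)
        self.frame_index += 1
        # On skipped frames the previous boxes are returned, which avoids flicker.
        return self.last_boxes
```

With `detector = IntervalDetector(getFaces, interval=5)`, the display loop would draw whatever `detector.update(img)` returns on every frame, while the cascade itself only runs on frames 0, 5, 10, and so on.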

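Also note that when the face position differs greatly between consecutive frames (that is, the face is moving very fast), the rectangle cannot keep up. You can compare the difference between consecutive frames to decide whether to run detection right away instead of waiting for the next interval (for example with scikit-image's compare_ssim, which newer releases provide as skimage.metrics.structural_similarity).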
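As a rough illustration of such a frame-difference check, the sketch below uses a plain mean absolute pixel difference with NumPy rather than SSIM, because it needs no extra dependency. This is a swapped-in, cruder measure, not the compare_ssim approach named above. The function name `frame_changed` and the default threshold of 12.0 are assumptions that would need tuning on real footage.

```python
import numpy as np


def frame_changed(prev_gray, gray, threshold=12.0):
    """Return True when two grayscale frames differ enough to warrant re-detection."""
    if prev_gray is None or prev_gray.shape != gray.shape:
        return True   # nothing comparable yet, so detect
    # Widen to int16 first so that uint8 subtraction cannot wrap around.
    diff = np.abs(prev_gray.astype(np.int16) - gray.astype(np.int16))
    return float(diff.mean()) > threshold
```

In the interval loop, the detection condition could then become something like `ii % interval == 0 or frame_changed(prev_gray, gray)`, with `prev_gray` updated after each frame. Comparing downscaled grayscale frames keeps the check itself cheap.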

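Parallel processing with multiprocessing

Most current CPUs are multi-core, and the Raspberry Pi is no exception, with four cores. If the program performs face detection in the main loop, it only uses one of those cores and leaves the rest idle, which is a waste. In the experiment below, we hand the face detection work to the other CPU cores, receive the results asynchronously, and then draw them on the frame.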

(Image source: CH.Tseng)

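The Appendix F example uses Python's multiprocessing module to run face detection in parallel. The higher the video resolution, the more pronounced the gain from parallel processing.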

(Image source: CH.Tseng)
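The asynchronous hand-off described in this section can be sketched with `multiprocessing.Pool.apply_async`. This is a minimal sketch under stated assumptions and not the Appendix F program: the names `make_pool` and `AsyncDetector` are hypothetical, and the sketch keeps only one detection in flight at a time, so it uses a single extra core.

```python
import multiprocessing


def make_pool(workers=1):
    """Create the worker processes that will run the detection function."""
    return multiprocessing.Pool(processes=workers)


class AsyncDetector:
    """Run detection in a pool and keep the latest finished result."""

    def __init__(self, detect_fn, pool):
        # detect_fn must be a picklable, module-level function for a process pool.
        self.detect_fn = detect_fn
        self.pool = pool
        self.pending = None     # AsyncResult of the detection still running
        self.last_boxes = []    # boxes from the most recently finished detection

    def update(self, frame):
        """Call once per frame; returns immediately with the newest available boxes."""
        if self.pending is not None and self.pending.ready():
            self.last_boxes = self.pending.get()
            self.pending = None
        if self.pending is None:
            # Hand the current frame to a worker without waiting for the answer.
            self.pending = self.pool.apply_async(self.detect_fn, (frame,))
        return self.last_boxes
```

A display loop would create `detector = AsyncDetector(getFaces, make_pool())` once and draw `detector.update(img)` on every frame. The script body should sit under an `if __name__ == "__main__":` guard on platforms that start workers by spawning. Each submitted frame is pickled and copied to the worker, so sending the downscaled frame keeps that overhead lower.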

Code appendix

A. Direct playback:

import cv2
import time

videoFile = "540p.mp4"
full_screen = True
win_name = "FRAME"

#FPS
fps = 0
start = time.time()
last_time = time.time()
last_frames = 0

def exit_app():
    camera.release()

def fps_count(total_frames):
    global last_time, last_frames, fps

    timenow = time.time()
    # Recompute the average FPS once every 10 seconds
    if(timenow - last_time)>10:
        fps = (total_frames - last_frames) / (timenow - last_time)
        #print("FPS: {0}".format(fps))

        last_time = timenow
        last_frames = total_frames

    return fps

camera = cv2.VideoCapture(videoFile)
if(full_screen is True):
    cv2.namedWindow(win_name, cv2.WND_PROP_FULLSCREEN)
    cv2.setWindowProperty(win_name, cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)

grabbed = True
ii = 0
while(camera.isOpened()):
    (grabbed, img) = camera.read()
    if not grabbed:  # end of video or read error: stop before using an empty frame
        exit_app()
        break

    ii += 1
    img = cv2.putText(img, "FPS:{}".format(round(fps_count(ii),1)), (30, 60), cv2.FONT_HERSHEY_SIMPLEX, 2.0, (0,0,255), 3, cv2.LINE_AA)
    cv2.imshow(win_name, img)
    key = cv2.waitKey(1)
    if(key==113):  # 113 is the 'q' key
        exit_app()

B. Face detection:

import cv2
import imutils
import time

videoFile = "540p.mp4"
face_cascade = cv2.CascadeClassifier('xml/lbpcascade_frontalface.xml')
cascade_scale = 1.1
cascade_neighbors = 4
minFaceSize = (30,30)

full_screen = True
win_name = "FRAME"

#FPS
fps = 0
start = time.time()
last_time = time.time()
last_frames = 0

def exit_app():
    camera.release()

def fps_count(total_frames):
    global last_time, last_frames, fps

    timenow = time.time()
    if(timenow - last_time)>10:
        fps  = (total_frames - last_frames) / (timenow - last_time)
        #print("FPS: {0}".format(fps))

        last_time  = timenow
        last_frames = total_frames

    return fps

def getFaces(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor= cascade_scale,
        minNeighbors=cascade_neighbors,
        minSize=minFaceSize,
        flags=cv2.CASCADE_SCALE_IMAGE
    )

    bboxes = []
    for (x,y,w,h) in faces:
        if(w>minFaceSize[0] and h>minFaceSize[1]):
            bboxes.append((x, y, w, h))

    return bboxes

camera = cv2.VideoCapture(videoFile)
width = int(camera.get(cv2.CAP_PROP_FRAME_WIDTH))   # float
height = int(camera.get(cv2.CAP_PROP_FRAME_HEIGHT)) # float
if(full_screen is True):
    cv2.namedWindow(win_name, cv2.WND_PROP_FULLSCREEN)        # Create a named window
    cv2.setWindowProperty(win_name, cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)


grabbed = True
ii = 0
while(camera.isOpened()):
    (grabbed, img) = camera.read()
    if not grabbed:  # end of video or read error: stop before using an empty frame
        exit_app()
        break
    faces = getFaces(img)
    for (x,y,w,h) in faces:
        cv2.rectangle( img,(x,y),(x+w,y+h),(0,255,0),2)

    ii += 1
    img = cv2.putText(img, "FPS:{}".format(round(fps_count(ii),1)), (30, 60), cv2.FONT_HERSHEY_SIMPLEX, 2.0,  (0,0,255), 3, cv2.LINE_AA)
    cv2.imshow(win_name, img)
    key = cv2.waitKey(1)
    if(key==113):
        exit_app()

C. Downscaled detection:

import cv2
import imutils
import time

videoFile = "540p.mp4"
face_cascade = cv2.CascadeClassifier('xml/lbpcascade_frontalface.xml')
cascade_scale = 1.1
cascade_neighbors = 4
minFaceSize = (30,30)

full_screen = False
win_name = "FRAME"

#FPS
fps = 0
start = time.time()
last_time = time.time()
last_frames = 0

#--->
resize_ratio = 0.75

def exit_app():
    camera.release()

def fps_count(total_frames):
    global last_time, last_frames, fps

    timenow = time.time()
    if(timenow - last_time)>10:
        fps  = (total_frames - last_frames) / (timenow - last_time)
        #print("FPS: {0}".format(fps))

        last_time  = timenow
        last_frames = total_frames

    return fps

def getFaces(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor= cascade_scale,
        minNeighbors=cascade_neighbors,
        minSize=minFaceSize,
        flags=cv2.CASCADE_SCALE_IMAGE
    )

    bboxes = []
    for (x,y,w,h) in faces:
        if(w>minFaceSize[0] and h>minFaceSize[1]):
            bboxes.append((x, y, w, h))

    return bboxes

camera = cv2.VideoCapture(videoFile)
#camera.set(cv2.CAP_PROP_FRAME_WIDTH, video_size[0])
#camera.set(cv2.CAP_PROP_FRAME_HEIGHT, video_size[1])
width = int(camera.get(cv2.CAP_PROP_FRAME_WIDTH))   # float
height = int(camera.get(cv2.CAP_PROP_FRAME_HEIGHT)) # float
if(full_screen is True):
    cv2.namedWindow(win_name, cv2.WND_PROP_FULLSCREEN)        # Create a named window
    cv2.setWindowProperty(win_name, cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)


grabbed = True
ii = 0
while(camera.isOpened()):
    (grabbed, img) = camera.read()
    if not grabbed:  # end of video or read error: stop before using an empty frame
        exit_app()
        break
    img_org = img.copy()
    img = cv2.resize(img, (int(img.shape[1]*resize_ratio), int(img.shape[0]*resize_ratio)) )
    faces = getFaces(img)
    for (x,y,w,h) in faces:
        x = int(x / resize_ratio)
        y = int(y / resize_ratio)
        w = int(w / resize_ratio)
        h = int(h / resize_ratio)
        print(x,y,w,h)
        cv2.rectangle( img_org,(x,y),(x+w,y+h),(0,255,0),2)

    ii += 1
    img = cv2.putText(img_org, "FPS:{}".format(round(fps_count(ii),1)), (30, 60), cv2.FONT_HERSHEY_SIMPLEX, 2.0,  (0,0,255), 3, cv2.LINE_AA)
    cv2.imshow(win_name, img_org)
    key = cv2.waitKey(1)
    if(key==113):
        exit_app()

D. Interval detection:

import cv2
import imutils
import time

videoFile = "540p.mp4"
face_cascade = cv2.CascadeClassifier('xml/lbpcascade_frontalface.xml')
cascade_scale = 1.1
cascade_neighbors = 4
minFaceSize = (30,30)

full_screen = False
win_name = "FRAME"
interval = 5

#FPS
fps = 0
start = time.time()
last_time = time.time()
last_frames = 0

#__________________
resize_ratio = 0.75

def exit_app():
    camera.release()

def fps_count(total_frames):
    global last_time, last_frames, fps

    timenow = time.time()
    if(timenow - last_time)>10:
        fps  = (total_frames - last_frames) / (timenow - last_time)
        last_time  = timenow
        last_frames = total_frames

    return fps

def getFaces(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor= cascade_scale,
        minNeighbors=cascade_neighbors,
        minSize=minFaceSize,
        flags=cv2.CASCADE_SCALE_IMAGE
    )

    bboxes = []
    for (x,y,w,h) in faces:
        if(w>minFaceSize[0] and h>minFaceSize[1]):
            bboxes.append((x, y, w, h))

    return bboxes

camera = cv2.VideoCapture(videoFile)
width = int(camera.get(cv2.CAP_PROP_FRAME_WIDTH))   # float
height = int(camera.get(cv2.CAP_PROP_FRAME_HEIGHT)) # float
if(full_screen is True):
    cv2.namedWindow(win_name, cv2.WND_PROP_FULLSCREEN)        # Create a named window
    cv2.setWindowProperty(win_name, cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)


grabbed = True
ii = 0
while(camera.isOpened()):
    (grabbed, img) = camera.read()
    if not grabbed:  # end of video or read error: stop before using an empty frame
        exit_app()
        break
    img_org = img.copy()
    if(ii % interval == 0):
        img = cv2.resize(img, (int(img.shape[1]*resize_ratio), int(img.shape[0]*resize_ratio)) )
        faces = getFaces(img)

    for (x,y,w,h) in faces:
        x = int(x / resize_ratio)
        y = int(y / resize_ratio)
        w = int(w / resize_ratio)
        h = int(h / resize_ratio)
        cv2.rectangle( img_org,(x,y),(x+w,y+h),(0,255,0),2)

    ii += 1
    img = cv2.putText(img_org, "FPS:{}".format(round(fps_count(ii),1)), (30, 60), cv2.FONT_HERSHEY_SIMPLEX, 2.0,  (0,0,255), 3, cv2.LINE_AA)
    cv2.imshow(win_name, img_org)
    key = cv2.waitKey(1)
    if(key==113):
        exit_app()

E. Interval detection (continuous):

import cv2
import imutils
import time

videoFile = "540p.mp4"
face_cascade = cv2.CascadeClassifier('xml/lbpcascade_frontalface.xml')
cascade_scale = 1.1
cascade_neighbors = 4
minFaceSize = (30,30)

full_screen = False
win_name = "FRAME"

interval = 5
resize_ratio = 0.75

#FPS
fps = 0
start = time.time()
last_time = time.time()
last_frames = 0

def exit_app():
    camera.release()

def fps_count(total_frames):
    global last_time, last_frames, fps

    timenow = time.time()
    if(timenow - last_time)>10:
        fps  = (total_frames - last_frames) / (timenow - last_time)
        last_time  = timenow
        last_frames = total_frames

    return fps

def getFaces(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor= cascade_scale,
        minNeighbors=cascade_neighbors,
        minSize=minFaceSize,
        flags=cv2.CASCADE_SCALE_IMAGE
    )

    bboxes = []
    for (x,y,w,h) in faces:
        if(w>minFaceSize[0] and h>minFaceSize[1]):
            bboxes.append((x, y, w, h))

    return bboxes

camera = cv2.VideoCapture(videoFile)
width = int(camera.get(cv2.CAP_PROP_FRAME_WIDTH))   # float
height = int(camera.get(cv2.CAP_PROP_FRAME_HEIGHT)) # float
if(full_screen is True):
    cv2.namedWindow(win_name, cv2.WND_PROP_FULLSCREEN)        # Create a named window
    cv2.setWindowProperty(win_name, cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)


grabbed = True
ii = 0
while(camera.isOpened()):
    (grabbed, img) = camera.read()
    if not grabbed:  # end of video or read error: stop before using an empty frame
        exit_app()
        break
    img_org = img.copy()
    if(ii % interval == 0):
        img = cv2.resize(img, (int(img.shape[1]*resize_ratio), int(img.shape[0]*resize_ratio)) )
        faces = getFaces(img)

    for (x,y,w,h) in faces:
        x = int(x / resize_ratio)
        y = int(y / resize_ratio)
        w = int(w / resize_ratio)
        h = int(h / resize_ratio)
        cv2.rectangle( img_org,(x,y),(x+w,y+h),(0,255,0),2)

    ii += 1
    img = cv2.putText(img_org, "FPS:{}".format(round(fps_count(ii),1)), (30, 60), cv2.FONT_HERSHEY_SIMPLEX, 2.0,  (0,0,255), 3, cv2.LINE_AA)
    cv2.imshow(win_name, img_org)
    key = cv2.waitKey(1)
    if(key==113):
        exit_app()

(Reprinted with the author's permission from the CH.TSENG blog, original link; editor in charge: 賴佩萱)
