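Abstract: This post builds on YOLOv5 license plate recognition and adds character segmentation and recognition. It covers two segmentation methods, the projection method and the contour method, and two recognition approaches, CNN and LSTM.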

YOLOv5 License Plate Recognition in Practice: Character Segmentation and Recognition

YOLOv5 license plate recognition source code:

www.hedaoapp.com/goods/goods…

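Main text:

5.1 Character Segmentation

In practice, a detected plate is only useful once its characters have been read, so the plate image first has to be split into characters. Two approaches are described here: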

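  1. Projection method: compute projection histograms of the binarized plate image along the horizontal and vertical directions and use them to locate character boundaries. A simple implementation that crops the plate to the band where the projection is strong:
import cv2
import numpy as np
def projection_segmentation(plate_image, direction='horizontal'):
    assert direction in ['horizontal', 'vertical'], 'Invalid direction'
    gray_image = cv2.cvtColor(plate_image, cv2.COLOR_BGR2GRAY)
    # Inverted adaptive threshold: dark pixels become 255, so dark characters on a lighter background are assumed
    binary_image = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
    # Horizontal projection sums each row; vertical projection sums each column
    axis = 1 if direction == 'horizontal' else 0
    histogram = np.sum(binary_image, axis=axis)
    threshold = np.max(histogram) * 0.5
    peaks = np.where(histogram > threshold)[0]
    if peaks.size == 0:
        # Blank image: nothing to crop
        return plate_image
    # +1 because the slice end is exclusive
    start, end = peaks[0], peaks[-1] + 1
    if direction == 'horizontal':
        return plate_image[start:end, :]
    return plate_image[:, start:end]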
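  2. Contour method: detect contours in the binarized plate image, then select characters by the position and shape of each contour's bounding box. A simple implementation:
import cv2
def contour_segmentation(plate_image):
    gray_image = cv2.cvtColor(plate_image, cv2.COLOR_BGR2GRAY)
    binary_image = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
    # [-2] selects the contour list under both OpenCV 3 (three return values) and OpenCV 4 (two)
    contours = cv2.findContours(binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    boxes = []
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        aspect_ratio = float(w) / h
        # Keep tall, narrow boxes; the height limits are in pixels and depend on the size of the plate crop
        if 0.2 < aspect_ratio < 1.0 and 20 < h < 80:
            boxes.append((x, y, w, h))
    # Contours come back in no guaranteed reading order, so sort the boxes left to right
    boxes.sort(key=lambda box: box[0])
    return [plate_image[y:y + h, x:x + w] for x, y, w, h in boxes]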
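
The projection function above crops the plate to the band where the projection is strong, but it does not split that band into individual characters. One way to do so, sketched here under the assumption that the input is already a binary image with characters as non-zero pixels, is to look for runs of columns that contain ink. The helper name `split_by_column_projection` and the `min_width` parameter are illustrative, not part of the original code.

```python
import numpy as np

def split_by_column_projection(binary_image, min_width=2):
    """Return (start, end) column ranges for each run of columns containing ink."""
    # Vertical projection: one sum per column
    column_sums = binary_image.sum(axis=0)
    has_ink = column_sums > 0
    segments = []
    start = None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                      # a character begins
        elif not ink and start is not None:
            if col - start >= min_width:     # ignore specks narrower than min_width
                segments.append((start, col))
            start = None                     # back in a gap between characters
    # A character touching the right edge has no closing gap
    if start is not None and len(has_ink) - start >= min_width:
        segments.append((start, len(has_ink)))
    return segments

# Synthetic 8x20 "plate" with three white blobs standing in for characters
demo = np.zeros((8, 20), dtype=np.uint8)
demo[:, 2:5] = 255
demo[:, 8:12] = 255
demo[:, 15:19] = 255
segments = split_by_column_projection(demo)
print(segments)  # [(2, 5), (8, 12), (15, 19)]
```

Each `(start, end)` pair can then be used to slice the plate image, for example `plate[:, start:end]`. Touching characters would merge into one segment with this simple rule, so real plates may need a small non-zero threshold instead of `> 0`.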

5.2 Character Recognition

After segmentation, each character has to be recognized. Two approaches are:

  1. CNN: classify each character with a convolutional neural network (CNN). You can use a pretrained model such as LeNet or VGG, or define a small custom CNN. A simple CNN implementation:
import torch.nn as nn
import torch.nn.functional as F
class SimpleCNN(nn.Module):
    # Expects grayscale input of shape (N, 1, 32, 64); the two 2x2 poolings reduce 32x64 to 8x16
    def __init__(self, num_classes):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.fc1 = nn.Linear(64 * 8 * 16, 128)
        self.fc2 = nn.Linear(128, num_classes)
    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x)))
        # Flatten per sample; fc1 raises a shape error if the input was not 32x64
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
num_classes = 36  # set according to your character set
model = SimpleCNN(num_classes)
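  2. LSTM: add a long short-term memory (LSTM) layer on top of the CNN to capture the sequential dependencies between the characters on a plate. A simple CNN + LSTM implementation:
import torch.nn as nn
class CNN_LSTM(nn.Module):
    # Reuses SimpleCNN (defined above) as a per-character feature extractor
    def __init__(self, num_classes, hidden_size=128):
        super().__init__()
        self.cnn = SimpleCNN(128)  # 128-dimensional feature vector per character
        self.lstm = nn.LSTM(128, hidden_size, num_layers=1, batch_first=True)
        # LSTM outputs are bounded in (-1, 1), so a linear layer maps them to unbounded class scores
        self.fc = nn.Linear(hidden_size, num_classes)
    def forward(self, x):
        # x: (batch, seq_len, channels, height, width), one image per character
        batch_size, seq_len, c, h, w = x.size()
        x = x.reshape(batch_size * seq_len, c, h, w)
        x = self.cnn(x)
        x = x.reshape(batch_size, seq_len, -1)
        x, _ = self.lstm(x)
        return self.fc(x)  # (batch, seq_len, num_classes)
num_classes = 36  # set according to your character set
model = CNN_LSTM(num_classes)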
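
The `64 * 8 * 16` input size of `fc1` in SimpleCNN fixes the image size the network accepts. The sketch below walks through that arithmetic with the standard output-size formula for convolution and pooling layers; the helper names `conv_output_size` and `flattened_features` are illustrative.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel_size) // stride + 1

def flattened_features(height, width, out_channels=64):
    """Number of values entering fc1 for a given input size, following SimpleCNN's layers."""
    for _ in range(2):  # two conv + pool stages
        # 3x3 convolution with stride 1 and padding 1 keeps the size
        height = conv_output_size(height, kernel_size=3, stride=1, padding=1)
        width = conv_output_size(width, kernel_size=3, stride=1, padding=1)
        # 2x2 max pooling with stride 2 halves the size
        height = conv_output_size(height, kernel_size=2, stride=2)
        width = conv_output_size(width, kernel_size=2, stride=2)
    return out_channels * height * width

features_32x64 = flattened_features(32, 64)
features_32x32 = flattened_features(32, 32)
print(features_32x64)  # 8192, equal to 64 * 8 * 16
print(features_32x32)  # 4096, equal to 64 * 8 * 8
```

So the network as written matches inputs that are 32 pixels high and 64 pixels wide. For 32x32 character images, `fc1` would need `64 * 8 * 8` input features instead.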

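Training a character recognition model requires a dataset with a large number of character images and their labels. You can use a public character recognition dataset or build your own. Once training is finished, the model can be used to recognize the characters on a plate.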
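
A minimal sketch of such a training loop is shown below. It is not the author's training code: the small `nn.Sequential` model is a stand-in for any classifier that maps `(N, 1, 32, 64)` images to `num_classes` scores, random tensors stand in for a real labelled dataset, and the optimizer, learning rate and step count are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 36
# Stand-in classifier: (N, 1, 32, 64) -> (N, num_classes)
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                      # 32x64 -> 16x32
    nn.Flatten(),
    nn.Linear(8 * 16 * 32, num_classes),
)

# Random data in place of real character images and labels
images = torch.randn(16, 1, 32, 64)
labels = torch.randint(0, num_classes, (16,))

criterion = nn.CrossEntropyLoss()         # expects raw logits and integer class labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
model.train()
for step in range(5):
    optimizer.zero_grad()
    logits = model(images)
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Inference: pick the highest-scoring class for each character
model.eval()
with torch.no_grad():
    predicted = model(images).argmax(dim=1)
```

With a real dataset, the fixed `images` and `labels` tensors would be replaced by batches from a `torch.utils.data.DataLoader`, and a held-out validation set would be used to decide when to stop training.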

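5.3 Preprocessing and Post-processing

To improve recognition accuracy, the character images can be preprocessed before recognition and the results post-processed afterwards.

Preprocessing: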

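  1. Binarization: convert the character image to a binary image to reduce the influence of background noise. OpenCV's adaptiveThreshold function performs adaptive-threshold binarization.
import cv2
def binarize(char_image):
    # Accept both BGR and already-grayscale input
    gray_image = cv2.cvtColor(char_image, cv2.COLOR_BGR2GRAY) if char_image.ndim == 3 else char_image
    # Block size 11, constant 2; THRESH_BINARY_INV maps dark pixels to 255
    binary_image = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
    return binary_image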
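  2. Normalization: resize every character image to one fixed size so it can be fed to the neural network. OpenCV's resize function does this.
import cv2
def normalize(char_image, target_size=(64, 32)):
    # cv2.resize takes (width, height); 64x32 gives the 32x64 (H x W) input implied by SimpleCNN's fc1 (64 * 8 * 16)
    resized_image = cv2.resize(char_image, target_size, interpolation=cv2.INTER_AREA)
    return resized_image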
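
After binarization and resizing, each character is still a 2-D uint8 array, while a PyTorch CNN expects a float batch of shape `(N, 1, H, W)`. The sketch below shows one way to bridge that gap, assuming the characters have already been resized to a common size (32 x 64 here, matching the `64 * 8 * 16` in SimpleCNN). The function name `to_network_input` is hypothetical.

```python
import numpy as np

def to_network_input(binary_chars):
    """Stack equally sized uint8 images into a float32 (N, 1, H, W) array scaled to [0, 1]."""
    batch = np.stack(binary_chars).astype(np.float32) / 255.0
    # Insert the channel axis expected by nn.Conv2d(1, ...)
    return batch[:, np.newaxis, :, :]

# Two synthetic "characters": one all white, one all black
chars = [
    np.full((32, 64), 255, dtype=np.uint8),
    np.zeros((32, 64), dtype=np.uint8),
]
batch = to_network_input(chars)
print(batch.shape, batch.dtype)  # (2, 1, 32, 64) float32
```

The resulting array can be handed to PyTorch with `torch.from_numpy(batch)`.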

Post-processing:

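  1. Confidence threshold: take the most likely class for each character and keep only the predictions whose confidence exceeds a threshold.
import torch
def filter_by_confidence(predictions, confidence_threshold=0.5):
    # predictions: (N, num_classes) tensor, assumed to hold raw logits; softmax turns them into probabilities
    probabilities = torch.softmax(predictions.detach(), dim=-1)
    top_confidences, top_indices = probabilities.max(dim=-1)
    # Boolean mask keeps the class index of every sufficiently confident character
    filtered_indices = top_indices[top_confidences > confidence_threshold]
    return filtered_indices.cpu().numpy()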
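  2. NMS: apply non-maximum suppression (NMS) to remove duplicate detections of the same character. This applies when each character comes with a bounding box and a score, for example from a detector.
import torchvision.ops
def nms(predictions, iou_threshold=0.5):
    # predictions: (N, 5+) float tensor, each row [x1, y1, x2, y2, score, ...]
    boxes, scores = predictions[:, :4], predictions[:, 4]
    # Returns the indices of the kept boxes, sorted by decreasing score
    indices = torchvision.ops.nms(boxes, scores, iou_threshold)
    return predictions[indices]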
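
To make clear what NMS does with overlapping boxes, here is a NumPy sketch of the usual greedy procedure: keep the highest-scoring box, drop every remaining box whose IoU with it exceeds the threshold, and repeat. It is an illustration of the technique, not the torchvision implementation; the function names and the sample boxes are made up for the example.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in [x1, y1, x2, y2] format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get zero intersection
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(boxes[best], boxes[rest])
        # Keep only boxes that do not overlap the chosen one too much
        order = rest[overlaps <= iou_threshold]
    return keep

# Boxes 0 and 1 cover almost the same character; box 2 is a separate character
boxes = np.array([[0, 0, 10, 20], [1, 0, 11, 20], [30, 0, 40, 20]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

Boxes 0 and 1 share an intersection of 180 and a union of 220, giving an IoU of about 0.82, so the lower-scoring box 1 is removed.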

Together, these preprocessing and post-processing steps can further improve the accuracy and robustness of character recognition.
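
As a final step, the predicted class indices have to be turned back into text. The sketch below assumes a hypothetical label order of digits followed by uppercase letters (36 classes, matching `num_classes = 36`); a real model's label order depends on how its dataset was built. Unlike the confidence filter above, which drops uncertain characters, this variant keeps a placeholder so that character positions are preserved.

```python
import numpy as np

# Hypothetical label order: index 0-9 are digits, 10-35 are A-Z
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def decode(logits, confidence_threshold=0.5, unknown="?"):
    """Map (N, 36) logits to a string, using `unknown` for low-confidence positions."""
    probs = softmax(logits)
    indices = probs.argmax(axis=1)
    confidences = probs.max(axis=1)
    return "".join(
        ALPHABET[i] if c > confidence_threshold else unknown
        for i, c in zip(indices, confidences)
    )

logits = np.zeros((3, 36))
logits[0, 10] = 10.0  # strongly favours class 10, "A"
logits[1, 1] = 10.0   # strongly favours class 1, "1"
# Row 2 stays uniform, so every class has probability 1/36
text = decode(logits)
print(text)  # A1?
```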

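Summary:

Building on the previous posts, this post added preprocessing and post-processing for character segmentation and recognition: binarization, normalization, confidence-threshold filtering, and non-maximum suppression. These steps help improve the performance of plate character recognition and make a license plate recognition system more reliable in real-world use. I hope this tutorial helps you implement license plate recognition in your own projects. If you have any questions or suggestions, please share them in the comments.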