This article walks through an example of using image processing to improve OCR results, with detailed steps and source code.

Background

Text recognition often runs into difficulties, for example a non-uniform background, noise, or partially missing characters.

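We want to recognize the black text in the image (12-14), but the background is complex and contains other interference. Running Tesseract directly on the image (code below) returns an empty result.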

# -*- coding:utf-8 -*-
import pytesseract
from PIL import Image

# Open the image
image = Image.open('0.png')

# Run OCR; lang defaults to English
text = pytesseract.image_to_string(image)

# Print the recognized text
print(text)

Running OCR directly on an image this complex is likely to fail. Can image processing segment or highlight the part we need before recognition? This article uses that idea as a worked example.

**Detailed implementation steps**

【1】Otsu binarization

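import cv2

# Load the image and convert it to grayscale
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Inverse Otsu threshold: the dark text becomes white foreground
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)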
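
To make the thresholding step less of a black box, here is a minimal NumPy sketch of the idea behind Otsu's method: try every threshold and keep the one that maximizes the between-class variance. The function name `otsu_threshold` and the synthetic image are assumptions made for illustration. This is not OpenCV's implementation, and the exact threshold it returns may differ from `cv2.threshold` when several thresholds are equally good.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    levels = np.arange(256)
    cum_count = np.cumsum(hist)           # number of pixels with value <= t
    cum_sum = np.cumsum(hist * levels)    # intensity sum of those pixels
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = cum_count[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = cum_sum[t] / w0
        mu1 = (cum_sum[-1] - cum_sum[t]) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Synthetic image (hypothetical): dark "text" (30) on a light background (200)
gray = np.full((20, 20), 200, dtype=np.uint8)
gray[5:15, 8:12] = 30

t = otsu_threshold(gray)
# Inverse binarization: dark text becomes 255, background becomes 0
thresh = np.where(gray > t, 0, 255).astype(np.uint8)
```

The inverse mapping matters for the later steps: the distance transform and contour search both treat non-zero pixels as foreground, so the text has to be white.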

【2】Distance transform + normalization

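# Distance from each foreground pixel to the nearest background pixel
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
# Normalize to [0, 1], then rescale to 8-bit for display and thresholding
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)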
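
Why a distance transform helps can be seen on a toy image. The sketch below is a brute-force NumPy version (the function `distance_transform` and the test image are hypothetical, and OpenCV uses a faster approximate algorithm, so values can differ slightly). It shows that pixels deep inside a thick stroke get large values, while thin clutter stays close to zero.

```python
import numpy as np

def distance_transform(binary):
    """Brute-force Euclidean distance from each non-zero pixel to the nearest zero pixel."""
    fg = np.argwhere(binary > 0)
    bg = np.argwhere(binary == 0)
    dist = np.zeros(binary.shape, dtype=np.float64)
    if len(fg) == 0 or len(bg) == 0:
        return dist
    # Pairwise coordinate differences, shape (n_fg, n_bg, 2)
    diff = fg[:, None, :] - bg[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    dist[fg[:, 0], fg[:, 1]] = nearest
    return dist

binary = np.zeros((15, 30), dtype=np.uint8)
binary[3:12, 3:12] = 255   # thick 9x9 blob, stands in for a character stroke
binary[7, 16:28] = 255     # 1-pixel-high line, stands in for background clutter

dist = distance_transform(binary)
# Background is 0, so dividing by the maximum equals min-max normalization
dist_u8 = (dist / dist.max() * 255).astype(np.uint8)
```

After normalization the center of the thick blob is at full brightness and the thin line is dim, which is what lets a second Otsu threshold separate the two.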

【3】Otsu binarization of the distance transform result

_,dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)

【4】Morphological opening to filter out noise

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)
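
Opening is an erosion followed by a dilation with the same kernel. The NumPy sketch below illustrates this on a toy image. It assumes a square k x k kernel and zero padding at the border, whereas the code above uses a 7 x 7 elliptical kernel and OpenCV's own border handling, so treat it as an illustration of the principle rather than a drop-in replacement. The helper names are hypothetical.

```python
import numpy as np

def erode(img, k):
    """Binary erosion, k x k square kernel: keep a pixel only if its whole neighborhood is foreground."""
    r = k // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="constant", constant_values=0)
    out = np.ones((h, w), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + h, dx:dx + w] > 0
    return out.astype(np.uint8) * 255

def dilate(img, k):
    """Binary dilation, k x k square kernel: set a pixel if any neighbor is foreground."""
    r = k // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="constant", constant_values=0)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w] > 0
    return out.astype(np.uint8) * 255

def morph_open(img, k):
    """Opening = erosion followed by dilation."""
    return dilate(erode(img, k), k)

img = np.zeros((20, 20), dtype=np.uint8)
img[4:12, 4:12] = 255     # 8x8 blob, large enough to survive
img[15:17, 15:17] = 255   # 2x2 speck, smaller than the kernel

opened = morph_open(img, 3)
```

The speck disappears because erosion removes it completely and dilation has nothing left to grow back, while the blob is restored to its original size.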

【5】Contour filtering to find the text regions

black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)

cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# loop over the contours
for c in cnts:
    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    # keep only contours large enough to be characters
    if w >= 35 and h >= 100:
        chars.append(c)

cv2.drawContours(black_img, chars, -1, (0, 255, 0), 2)
cv2.imshow("chars", black_img)
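
The size filter can be tried in isolation on plain bounding boxes. In this sketch the function name `keep_large_boxes` and the sample boxes are made up for illustration. Only the thresholds (width at least 35, height at least 100) come from the code above.

```python
def keep_large_boxes(boxes, min_w=35, min_h=100):
    """Return the indices of boxes (x, y, w, h) that are at least min_w wide and min_h tall."""
    return [i for i, (x, y, w, h) in enumerate(boxes)
            if w >= min_w and h >= min_h]

# Hypothetical bounding boxes, as cv2.boundingRect would report them
boxes = [
    (10, 20, 40, 120),    # character-sized: kept
    (60, 25, 36, 110),    # character-sized: kept
    (5, 5, 8, 8),         # small speck: dropped
    (0, 150, 300, 12),    # wide but short line: dropped
    (120, 30, 20, 130),   # tall but too narrow: dropped
]

kept = keep_large_boxes(boxes)
print(kept)
```

Both conditions must hold, so long thin lines are rejected even though one of their dimensions is large. The pixel thresholds are tied to this particular image and would likely need adjusting for images of a different resolution.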

【6】Compute the convex hull of the contours to obtain a mask of the text region

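# stack the points of all kept contours and compute their convex hull
chars = np.vstack([chars[i] for i in range(0, len(chars))])
hull = cv2.convexHull(chars)

# allocate memory for the convex hull mask, draw the convex hull on
# the image, and then enlarge it via a dilation
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)

# take the bitwise AND of the opening image and the mask to reveal just
# the characters in the image
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("final", final)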
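
A convex hull is the smallest convex polygon that encloses a set of points, so a single hull over all character contours yields one region covering the whole text. The pure-Python sketch below uses Andrew's monotone chain algorithm, which is a swapped-in technique for illustration and not necessarily what `cv2.convexHull` does internally. The two rectangles standing in for character contours are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: return the hull vertices, starting at the leftmost point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make the chain turn the wrong way
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]

# Corner points of two hypothetical character contours
char_a = [(10, 10), (40, 10), (40, 110), (10, 110)]
char_b = [(60, 20), (90, 20), (90, 120), (60, 120)]

hull = convex_hull(char_a + char_b)
print(hull)
```

Two of the eight corners fall inside the hull and are discarded. Filling this polygon and dilating it slightly gives a mask that keeps the characters and the gap between them while excluding everything outside.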

【7】Tesseract text recognition


text = pytesseract.image_to_string(final)

# Print the recognized text
print(text)
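
If the cleaned image still gives noisy output, Tesseract's options can narrow the search. The sketch below is an optional extension and not part of the original pipeline: it assumes the text is a single line made of digits and hyphens (as in "12-14"), and the helper names `build_tesseract_config` and `recognize` are hypothetical. `--psm` and `tessedit_char_whitelist` are standard Tesseract options, passed through the `config` argument of `pytesseract.image_to_string`.

```python
def build_tesseract_config(psm=7, whitelist="0123456789-"):
    """Build a Tesseract option string: page segmentation mode plus an optional character whitelist."""
    config = f"--psm {psm}"
    if whitelist:
        config += f" -c tessedit_char_whitelist={whitelist}"
    return config

def recognize(image, config):
    """Run Tesseract on an image (NumPy array or PIL image) with the given options."""
    # Imported lazily so the config helper works without Tesseract installed
    import pytesseract
    return pytesseract.image_to_string(image, config=config).strip()

# psm 7 asks Tesseract to treat the image as a single text line (assumption about the input)
config = build_tesseract_config()
print(config)
```

With the pipeline above, the call would be `recognize(final, config)`. Whether the whitelist helps depends on the Tesseract version and language model in use, so it is worth comparing the result against the plain call.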

【8】Complete code:

# WeChat official account: OpenCV與AI深度學習
import cv2
import numpy as np
import imutils
import pytesseract

# load the image and convert it to grayscale
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# inverse Otsu threshold so the dark text becomes white foreground
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)

# distance transform, normalized to [0, 1] and rescaled to 8-bit
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)

# threshold the distance transform using Otsu's method
_, dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)

# morphological opening to remove small noise
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)

# find the external contours in the opened image
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# loop over the contours
for c in cnts:
    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    # keep only contours large enough to be characters
    if w >= 35 and h >= 100:
        chars.append(c)

cv2.drawContours(black_img, chars, -1, (0, 255, 0), 2)
cv2.imshow("chars", black_img)

# stack the points of all kept contours and compute their convex hull
chars = np.vstack([chars[i] for i in range(0, len(chars))])
hull = cv2.convexHull(chars)

# allocate memory for the convex hull mask, draw the convex hull on
# the image, and then enlarge it via a dilation
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)

# take the bitwise AND of the opening image and the mask to reveal just
# the characters in the image
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("final", final)

# run OCR on the cleaned-up image
text = pytesseract.image_to_string(final)

# print the recognized text
print(text)

cv2.waitKey()
cv2.destroyAllWindows()

**References**

(1)https://pyimagesearch.com/2021/11/22/improving-ocr-results-with-basic-image-processing/

(2)https://stackoverflow.com/questions/33881175/remove-background-noise-from-image-to-make-text-more-clear-for-ocr
