The Hough transform is a way to detect straight lines in an image, but as far as I can tell, the library offers no way to specify something like "extract only lines thicker than a given width".
If anyone knows a line-detection method or library that takes line thickness into account, I would appreciate hearing about it.
The languages and libraries currently in use are as follows:
[Supplement]
The target image is a black-and-white image containing a mixture of characters and other objects, and the goal is to extract only the characters.
The non-character objects are very troublesome, because some are hard to distinguish from characters, such as the letter "o" in English text versus a circle "○" in a figure.
The one distinguishing characteristic I could find, and what led me to this question, is that the characters are drawn with thicker lines than everything else.
I intend to leave the character extraction itself to tesseract-ocr, but the target images are awkward:
·the character positions vary
·vertical and horizontal writing are mixed
·font sizes are mixed
so running OCR on them as-is does not produce proper results.
Therefore, before running OCR, I would like to do the following preprocessing:
①Extract only the objects that appear to be characters from the image
②Line the extracted objects up by size (horizontally, like a sentence)
In this answer, I will call a pixel belonging to a character a "foreground pixel".
The "straight line" that the Hough transform can detect is simply a linear arrangement of foreground pixels.
The questioner's "characters drawn with thick lines", on the other hand, may look like thick lines to a human, but in terms of how their pixels are arranged in the image, they do not lie along a single straight line.
I therefore do not think the Hough transform is a good fit for this problem.
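To make this concrete, here is a minimal, self-contained Hough accumulator in numpy (an illustrative sketch, not the questioner's library; the angle resolution is arbitrary): every foreground pixel votes for all discretized (θ, ρ) lines through it, and the peak vote count is the size of the largest collinear set of foreground pixels, regardless of stroke thickness.

```python
import numpy as np

def hough_peak_votes(binary, n_theta=180):
    """Minimal Hough transform sketch: each foreground pixel votes for every
    discretized (theta, rho) line passing through it; the peak vote count is
    the length of the largest collinear run of foreground pixels."""
    ys, xs = np.nonzero(binary)
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    diag = int(np.ceil(np.hypot(*binary.shape)))  # max possible |rho|
    acc = np.zeros((n_theta, 2 * diag + 1), dtype=int)
    for t_i, t in enumerate(thetas):
        # rho = x*cos(theta) + y*sin(theta), shifted so indices are non-negative
        rhos = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        np.add.at(acc, (t_i, rhos), 1)
    return acc.max()
```

A 10-pixel straight segment produces a peak of exactly 10 votes, while a solid 3×3 blob of 9 pixels peaks below 9: the transform measures collinearity, not thickness.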
Since I don't know what the target image looks like, I can't say anything definite, but couldn't "densely crowded foreground pixels" serve as a characteristic of the character regions instead?
According to the comments, the images contain relatively many kanji, so if the non-character figures are drawn with sparse, thin lines, the local density of foreground pixels should be usable to separate them.
A concrete example of such processing: scan the entire image with a window of appropriate size, as in template matching, count the foreground pixels inside the window, and if that count is at least, say, 20% of the window area, treat the location as a character region.
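The window scan just described can be sketched in a few lines of numpy (assuming a binary array with foreground pixels equal to 1; the window size and the 20% threshold are illustrative):

```python
import numpy as np

def text_region_mask(binary, w=16, h=16, ratio=0.2):
    """Naive window scan: mark every window position whose foreground
    pixel count is at least `ratio` of the window area."""
    H, W = binary.shape
    mask = np.zeros((H - h + 1, W - w + 1), dtype=bool)
    thresh = ratio * w * h
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # count foreground pixels inside the w*h window at (x, y)
            if binary[y:y + h, x:x + w].sum() >= thresh:
                mask[y, x] = True
    return mask
```

This is the literal, unoptimized form of the idea; the cost issue it raises is discussed next.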
As noted in the comments, the processing shown above tends to be heavy.
Scanning an entire image of width and height W*H with a w*h window causes roughly (W*H)*(w*h) pixel references.
Common ways to speed up such a window scan are to abort the count early or to shrink the image beforehand.
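Of the two speedups, shrinking is easy to sketch in numpy: block-averaging the binary image makes each output pixel the local foreground density directly, so the subsequent density test runs on an image that is smaller by roughly factor² (a sketch under the same binary-array assumption; the shrink factor is illustrative).

```python
import numpy as np

def shrink_density(binary, factor=4):
    """Shrink by averaging factor*factor blocks; each output pixel then
    holds the local foreground density, so a simple threshold on the
    shrunken image replaces a full-resolution window scan."""
    H, W = binary.shape
    H2, W2 = H - H % factor, W - W % factor  # crop to a multiple of factor
    blocks = binary[:H2, :W2].reshape(H2 // factor, factor, W2 // factor, factor)
    return blocks.mean(axis=(1, 3))
```

For example, `shrink_density(img) >= 0.2` gives a coarse character-region mask using the same 20% rule as above.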