Recognize text from picture

mikael

This script recognizes text in a picture taken with the camera or picked from the photo library. I'm sharing it because iOS 13 has made this so easy, and Apple Shortcuts does not support it (yet, I bet).

Adjust the languages on the first line.

language_preference = ['fi','en','se']

import photos, ui, dialogs
import io
from objc_util import *

load_framework('Vision')
VNRecognizeTextRequest = ObjCClass('VNRecognizeTextRequest')
VNImageRequestHandler = ObjCClass('VNImageRequestHandler')

def pil2ui(pil_image):
    buffer = io.BytesIO()
    pil_image.save(buffer, format='PNG')
    return ui.Image.from_data(buffer.getvalue())

selection = dialogs.alert('Get pic', button1='Camera', button2='Photos')

ui_image = None

if selection == 1:
    pil_image = photos.capture_image()
    if pil_image is not None:
        ui_image = pil2ui(pil_image)
elif selection == 2:
    # pick_asset() gives None if the picker is cancelled
    asset = photos.pick_asset()
    if asset is not None:
        ui_image = asset.get_ui_image()

if ui_image is not None:
    print('Recognizing...\n')

    req = VNRecognizeTextRequest.alloc().init().autorelease()
    req.setRecognitionLanguages_(language_preference)
    handler = VNImageRequestHandler.alloc().initWithData_options_(ui_image.to_png(), None).autorelease()

    success = handler.performRequests_error_([req], None)
    if success:
        for result in req.results():
            print(result.text())
    else:
        print('Problem recognizing anything')

pavlinb

I think VNRecognizeText works only for English.

cvp

It works perfectly in French, thanks to @mikael

mikael

@pavlinb, works perfectly for Finnish, too.

But the version above is slower than it needs to be, due to an unnecessary roundtrip to ui.Image. Here’s a faster version:

language_preference = ['fi','en','se']

import photos, dialogs
import io
from objc_util import *

load_framework('Vision')
VNRecognizeTextRequest = ObjCClass('VNRecognizeTextRequest')
VNImageRequestHandler = ObjCClass('VNImageRequestHandler')

selection = dialogs.alert('Get pic', button1='Camera', button2='Photos')

pil_image = None

if selection == 1:
    pil_image = photos.capture_image()
elif selection == 2:
    # pick_asset() gives None if the picker is cancelled
    asset = photos.pick_asset()
    if asset is not None:
        pil_image = asset.get_image()

if pil_image is not None:
    print('Recognizing...\n')
    
    buffer = io.BytesIO()
    pil_image.save(buffer, format='PNG')
    image_data = buffer.getvalue()

    req = VNRecognizeTextRequest.alloc().init().autorelease()
    req.setRecognitionLanguages_(language_preference)
    handler = VNImageRequestHandler.alloc().initWithData_options_(image_data, None).autorelease()

    success = handler.performRequests_error_([req], None)
    if success:
        for result in req.results():
            print(result.text())
    else:
        print('Problem recognizing anything')

mikael

@pavlinb, ah, but you were right. This does not recognize the Scandinavian letters ä and ö, substituting a and o for them. @cvp, are you getting é, ô and all the others?

cvp

@mikael
é yes
à no
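
To check this kind of result more systematically, here is a minimal sketch in plain Python (no Vision involved). It compares a known text with what was recognized and reports whether they differ only by missing accents. The helper names `strip_diacritics` and `lost_diacritics` are made up for this illustration, and the "recognized" string is typed in by hand, not actual recognizer output.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks removed, e.g. 'ä' -> 'a'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    # Drop the combining marks (Unicode category 'Mn') and keep everything else.
    return ''.join(ch for ch in decomposed if unicodedata.category(ch) != 'Mn')


def lost_diacritics(expected, recognized):
    """True if the two strings differ only in their accents."""
    same_letters = strip_diacritics(expected) == strip_diacritics(recognized)
    # Compare in NFC so that composed and decomposed forms count as equal.
    identical = (unicodedata.normalize('NFC', expected)
                 == unicodedata.normalize('NFC', recognized))
    return same_letters and not identical


# Hand-typed example: é and ê survive, à is flattened to a.
print(lost_diacritics('tête-à-tête', 'tête-a-tête'))
```

A True result means the letters were read correctly and only the accents were dropped, which is the pattern described above.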

mikael

@cvp, I checked: usesLanguageCorrection is true and recognitionLevel is set to ”accurate” by default, so no help there.

pavlinb

Doesn’t work for Cyrillic (Bulgarian).

mikael

Eh.

revision = VNRecognizeTextRequest.currentRevision()
supported = VNRecognizeTextRequest.supportedRecognitionLanguagesForTextRecognitionLevel_revision_error_(0, revision, None)

Returns ”en-US”.
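
To see what that means for the language_preference list, here is a small sketch in plain Python. It assumes the supported languages have already been converted to ordinary Python strings, and it matches codes on the primary subtag only ('en' matches 'en-US'), which is my assumption and not documented Vision behavior. Both helper names are hypothetical.

```python
def primary_subtag(code):
    """'en-US' -> 'en'; 'fi' -> 'fi'."""
    return code.replace('_', '-').split('-')[0].lower()


def split_by_support(preferred, supported):
    """Split preferred codes into (usable, ignored) lists.

    A preferred code counts as usable when its primary subtag matches
    the primary subtag of any supported code.
    """
    supported_primary = {primary_subtag(code) for code in supported}
    usable, ignored = [], []
    for code in preferred:
        if primary_subtag(code) in supported_primary:
            usable.append(code)
        else:
            ignored.append(code)
    return usable, ignored


usable, ignored = split_by_support(['fi', 'en', 'se'], ['en-US'])
print(usable, ignored)  # ['en'] ['fi', 'se']
```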

cvp

@mikael I had also seen that but it supports French, thus ...buggy?

pavlinb

BTW, I'm impressed by the accuracy (for Latin-based texts).

JonB

@mikael @cvp have you tried setting the customWords attribute of the request? Or turning off usesLanguageCorrection? (Since the language is en-US, you DON'T want language correction when trying to detect other languages!)

I gather they are looking for words you'd find in an English dictionary. So perhaps façade or tête-à-tête might be recognized, while other examples wouldn't?

cvp

@JonB I didn't try but we are not alone with this problem, see here.

I've tried unknown language codes like xx and yy in setRecognitionLanguages_, and the result is the same. It seems that characters are recognized the same way whatever languages are set.
My last test on a French text was entirely correct:

Asie-Pacifique
La mission économique belge en Chine cible de
cyberattaques massives

cvp

@JonB said:

usesLanguageCorrection

Tried with False: à still recognized as a

Edit : even with

    req.setCustomWords_(['à'])

sodoku

Are there any code examples of how to recognize text on iOS 12.4.3 for the iPad mini 2? It would be cool to add that to my sudoku game app.

mikael

@sodoku, yes, but it seems a bit more involved. Check this thread where @cvp does all kinds of magic.

ccc

def pil2ui(pil_image):
    buffer = io.BytesIO()
    pil_image.save(buffer, format='PNG')
    return ui.Image.from_data(buffer.getvalue())

leaks the memory held by buffer, which has been shown to crash Pythonista when multiple images are processed. A better approach is to use a context manager to force the .close().

def pil2ui(pil_image):
    with io.BytesIO() as buffer:
        pil_image.save(buffer, format='PNG')
        return ui.Image.from_data(buffer.getvalue())
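
The same pattern can be checked outside Pythonista. The sketch below uses only the standard library; `fake_save` is a made-up stand-in for `pil_image.save(buffer, format='PNG')`, and `to_bytes` is a hypothetical helper. It shows that the bytes returned from inside the `with` block stay valid, while the buffer itself is closed on the way out.

```python
import io


def to_bytes(write_into):
    """Call write_into(buffer) and return the bytes that were written."""
    with io.BytesIO() as buffer:
        write_into(buffer)
        # getvalue() returns a separate bytes object, so the data
        # survives the close() that happens when the with block exits.
        return buffer.getvalue()


seen = []  # keeps a reference to the buffer so we can inspect it afterwards


def fake_save(buffer):
    # Stand-in for pil_image.save(buffer, format='PNG').
    seen.append(buffer)
    buffer.write(b'\x89PNG fake image data')


data = to_bytes(fake_save)
print(len(data), seen[0].closed)
```

After the call, `seen[0].closed` is True, and calling `seen[0].getvalue()` would raise a ValueError because the buffer has been released.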

mikael

Revisiting this.

Regardless of language restrictions, I have found the simple and reliable ability to pick text from paper to be useful for me almost weekly - URLs, email addresses, reservation codes, laptop serial numbers, etc.

With regular use, I noticed that the original script had some issues:

1. Difficult to find and open when quickly needed.
2. Slow to get from the picked photo to recognized text.
3. Results are a pain to copy from the Console, as it likes to jump around just as you’ve selected the text to copy.

Point #1 was fixed with a simple Apple Shortcuts shortcut to make the script easy to run.

Point #3 was resolved by presenting the recognized text in a TableView, with tap to copy.

Point #2 took a bit more doing.

The Pythonista photos module wants to return PIL images, and that results in two very slow conversions: first the module converts the UIImage to PIL, and then I converted that back to a PNG image for recognition. I found some @cvp code in this thread and replaced the photos module with objc_util pickers, which return PNG data almost directly.

And hey presto! Not just faster recognition, but instantaneous - and with much better quality than with the only contender app I could find (Prizmo Go).

Updated script here.

cvp

@mikael Thanks for your great 🎁 for New Year

cvp

Question for a specialist of this forum.
In @mikael's last post, I see my username as @cvp in black text rather than as a clickable blue link, although I received the notification "Mikael mentioned you...".
How is that possible?