Welcome!
This is the community forum for my apps Pythonista and Editorial.
For individual support questions, you can also send an email. If you have a very short question or just want to say hello — I'm @olemoritz on Twitter.
Recognize text from picture
-
This version:

    def pil2ui(pil_image):
        buffer = io.BytesIO()
        pil_image.save(buffer, format='PNG')
        return ui.Image.from_data(buffer.getvalue())

leaks the buffer, which has been proven to crash Pythonista when multiple images are processed. A better approach is to use a context manager to force the .close():

    def pil2ui(pil_image):
        with io.BytesIO() as buffer:
            pil_image.save(buffer, format='PNG')
            return ui.Image.from_data(buffer.getvalue())
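A minimal sketch of what the context manager does, using only the standard library so it also runs outside Pythonista. The `encode` helper is hypothetical and stands in for the `pil_image.save(...)` call. The point is that `getvalue()` has to be called inside the `with` block, because the buffer is closed on exit.

```python
import io

def encode(payload: bytes) -> bytes:
    # Hypothetical stand-in for pil_image.save(buffer, format='PNG').
    with io.BytesIO() as buffer:
        buffer.write(payload)
        data = buffer.getvalue()  # read while the buffer is still open
    # Leaving the with block has already called buffer.close().
    assert buffer.closed
    return data

print(len(encode(b'fake image bytes')))
```

Calling `getvalue()` on the buffer after the `with` block would raise a `ValueError`, which is why the fixed `pil2ui` returns from inside the block.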
-
Revisiting this.
Regardless of language restrictions, I have found the simple and reliable ability to pick text from paper to be useful for me almost weekly - URLs, email addresses, reservation codes, laptop serial numbers, etc.
With regular use, I noticed that the original script had some issues:
- Difficult to find and open when quickly needed.
- Slow to get from the picked photo to recognized text.
- Results are a pain to copy from the Console as it likes to jump around just as you’ve selected the text to copy.
Point #1 was fixed with a simple Apple Shortcuts shortcut to make the script easy to run.
Point #3 was resolved by presenting the recognized text in a TableView, with tap to copy.
Point #2 took a bit more doing.
The Pythonista `photos` module wants to return PIL images, and that results in two very slow conversions: first the module converts the UIImage to PIL, and then I converted that back to a PNG image for recognition. I found some @cvp code in this thread and replaced the `photos` module with `objc_util` pickers, which return PNG data almost directly.
And hey presto! Not just faster recognition, but instantaneous, and with much better quality than with the only contender app I could find (Prizmo Go).
Updated script here.
-
@mikael Thanks for your great 🎁 for New Year
-
Happy last day of the decade to everyone who shares my calendar!
I finessed the script a bit with the ability to select, copy or share multiple items, and nicer icons.
@cvp, noted and wondered about the lack of the link for your handle, no idea why.
-
Does this work with the new iPadOS?
-
@sodoku, do you mean if the latest versions have included robust support for non-English characters? Not to my knowledge.
-
So, good news: I got an iPhone 11 and I’m testing this on it for sudoku. Are the pictures it takes saved anywhere? I’m just curious whether I have to delete them, because after I use it the pictures don’t show up in my Photos app.
-
Also, what is the updated code posted by Mikael on GitHub used for? Is it the same as the one posted here? It’s so much longer than the code posted on the forum. Is it a better version than the one on the forum?
-
@sodoku, the code on GitHub is more of a tool, and much faster than the version at the beginning of this thread. For your purposes, you probably just want pieces of it.
It supports taking a picture normally and then selecting it from the photo library when you use the tool, or just snapping a quick "disposable" in-tool image which is not saved.
-
So there are a few edits in this thread, and I don’t know how to piece them together to get the best edited version of this?
-
@sodoku, the one on GitHub is the latest version.
-
I will test it for sudoku in the console. Well, I will try to convert it for use in the console with the sudoku solver. If I need help, I’ll post a message.
-
I need help adapting this script to read the numbers from a picture of a sudoku and insert the starting numbers into a console script. It does not work well for recognizing ones and sevens.
Example of sudoku solver
In my version I want the board to start as all zeros; when you take a picture, it adds the numbers and then solves. This combines the two programs (sudoku solver and OCR text recognition).
```python
board = [
    [5, 8, 4, 1, 0, 0, 0, 0, 0],
    [0, 0, 6, 8, 0, 0, 5, 1, 0],
    [0, 0, 0, 0, 5, 4, 7, 0, 6],
    [0, 5, 3, 0, 1, 0, 0, 6, 7],
    [0, 0, 0, 0, 2, 0, 0, 0, 0],
    [4, 6, 0, 0, 9, 0, 8, 3, 0],
    [7, 0, 8, 5, 4, 0, 0, 0, 0],
    [0, 2, 9, 0, 0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0, 1, 3, 7, 9]
]

def solve(bo):
    find = find_empty(bo)
    if not find:
        return True
    else:
        row, col = find
    for i in range(1, 10):
        if valid(bo, i, (row, col)):
            bo[row][col] = i
            if solve(bo):
                return True
            bo[row][col] = 0
    return False

def valid(bo, num, pos):
    # check row
    for i in range(len(bo[0])):
        if bo[pos[0]][i] == num and pos[1] != i:
            return False
    # check column
    for i in range(len(bo[0])):
        if bo[i][pos[1]] == num and pos[0] != i:
            return False
    # check quadrant
    box_x = pos[1] // 3
    box_y = pos[0] // 3
    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if bo[i][j] == num and (i, j) != pos:
                return False
    return True

def print_board(bo):
    for i in range(len(bo)):
        if i % 3 == 0 and i != 0:
            print('------+-------+------')
        for j in range(len(bo[0])):
            if j % 3 == 0 and j != 0:
                print('|', end=' ')
            if j == 8:
                print(bo[i][j])
            else:
                print(str(bo[i][j]) + ' ', end='')

def find_empty(bo):
    for i in range(len(bo)):
        for j in range(len(bo[0])):
            if bo[i][j] == 0:
                return (i, j)  # row, col
    return None

print_board(board)
solve(board)
print('=====================')
print_board(board)
```

```python
language_preference = ['fi','en','se']

import photos, ui, dialogs
import io
from objc_util import *

load_framework('Vision')
VNRecognizeTextRequest = ObjCClass('VNRecognizeTextRequest')
VNImageRequestHandler = ObjCClass('VNImageRequestHandler')

def pil2ui(pil_image):
    buffer = io.BytesIO()
    pil_image.save(buffer, format='PNG')
    return ui.Image.from_data(buffer.getvalue())

selection = dialogs.alert('Get pic', button1='Camera', button2='Photos')

ui_image = None

if selection == 1:
    pil_image = photos.capture_image()
    if pil_image is not None:
        ui_image = pil2ui(pil_image)
elif selection == 2:
    ui_image = photos.pick_asset().get_ui_image()

if ui_image is not None:
    print('Recognizing...\n')

    req = VNRecognizeTextRequest.alloc().init().autorelease()
    req.setRecognitionLanguages_(language_preference)
    handler = VNImageRequestHandler.alloc().initWithData_options_(ui_image.to_png(), None).autorelease()

    success = handler.performRequests_error_([req], None)
    if success:
        for result in req.results():
            print(result.text())
    else:
        print('Problem recognizing anything')
```
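A rough sketch of the glue between the two scripts, assuming the recognition step can be made to yield `(row, col, text)` tuples, which the script as posted does not do yet. The names `empty_board` and `fill_board` and the sample input are made up for illustration.

```python
def empty_board():
    # 9x9 grid of zeros; 0 marks an empty cell, as in the solver.
    return [[0] * 9 for _ in range(9)]

def fill_board(board, cells):
    """Write recognized digits into the board.

    cells is an iterable of (row, col, text) tuples, where text is whatever
    the recognizer returned for that cell. Anything that is not a single
    digit 1-9 is skipped, leaving the cell empty.
    """
    for row, col, text in cells:
        text = text.strip()
        if len(text) == 1 and text in '123456789':
            board[row][col] = int(text)
    return board

# Example input, made up for illustration.
recognized = [(0, 0, '5'), (0, 1, '8'), (1, 2, ' 6 '), (4, 4, '?')]
board = fill_board(empty_board(), recognized)
print(board[0][:3], board[1][2], board[4][4])
```

The resulting `board` has the same shape as the hard-coded one in the solver, so it could be passed to `solve(board)` directly. Skipping unreadable cells keeps a bad recognition from writing garbage, but a misread digit (a 1 read as a 7, say) would still go through.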
-
@sodoku, I tried something similar as well a while ago, first recognizing rectangles and then trying to recognize the numbers, but I hit the same issue of very poor recognition of the numbers. I wonder if we would need a number-specific recognizer for that.
-
Hi, since the sudoku is a square made of many squares, I think it is more robust to slice the cells evenly so that each small image contains only one number.
Of course, the downside is that you call the recognition service once per image, and you go from 1 image to 81, which can get expensive.
But it would be more robust.
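The even slicing could be sketched like this, assuming the photo has already been cropped and straightened so that the grid fills the whole image. The `cell_boxes` helper and its `margin` parameter are hypothetical.

```python
def cell_boxes(width, height, margin=0.0):
    """Return 81 (left, top, right, bottom) boxes, row by row.

    margin is the fraction of each cell trimmed from every side, to keep
    the grid lines out of the crop.
    """
    cell_w = width / 9
    cell_h = height / 9
    boxes = []
    for row in range(9):
        for col in range(9):
            left = col * cell_w + margin * cell_w
            top = row * cell_h + margin * cell_h
            right = (col + 1) * cell_w - margin * cell_w
            bottom = (row + 1) * cell_h - margin * cell_h
            boxes.append((round(left), round(top), round(right), round(bottom)))
    return boxes

boxes = cell_boxes(900, 900, margin=0.1)
print(len(boxes), boxes[0], boxes[80])
```

Each box is in the (left, upper, right, lower) order that PIL's `Image.crop` takes, so every cell could be cropped out and recognized on its own.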
Best regards, Tommy
-
@Spitfire, thanks. I did try all kinds of approaches, finally resorting to manual cropping, and it still was not reliable enough.
-
@mikael Could you share some pictures of sudoku where recognition fails?
-
Multiple sample Sudoku puzzles would help to achieve a robust solution.