omz:forum

    This is the community forum for my apps Pythonista and Editorial.

    Recognize text from picture

    • mikael
      mikael last edited by

      Happy last day of the decade to everyone who shares my calendar!

      I finessed the script a bit with the ability to select, copy or share multiple items, and nicer icons.

      @cvp, noted and wondered about the lack of the link for your handle, no idea why.

      • sodoku
        sodoku last edited by

        Does this work with the new iPadOS?

        • mikael
          mikael @sodoku last edited by

          @sodoku, do you mean if the latest versions have included robust support for non-English characters? Not to my knowledge.

          • sodoku
            sodoku last edited by

            Good news: I got an iPhone 11 and I’m testing this on it for Sudoku. Are the pictures it takes saved anywhere? I’m just curious whether I have to delete them, because after I use it the pictures don’t show up in my Photos app.

            • sodoku
              sodoku last edited by

              Also, what is the updated code that Mikael posted on GitHub used for? Is it the same as the code posted here? It is so much longer than the code on the forum. Is it a better version?

              • mikael
                mikael @sodoku last edited by

                @sodoku, the code on GitHub is more of a tool, and much faster than the version at the beginning of this thread. For your purposes, you probably just want pieces of it.

                It supports taking a picture normally and then selecting it from the photo library when you use the tool, or just snapping a quick "disposable" in-tool image which is not saved.

                  • sodoku
                    sodoku last edited by

                    There are a few edits in this thread, and I don’t know how to piece them together to get the best version of this.

                    • mikael
                      mikael @sodoku last edited by mikael

                      @sodoku, the one on GitHub is the latest version.

                      • sodoku
                        sodoku last edited by

                        I will test it for Sudoku in the console. That is, I will try to convert it for use in the console with the Sudoku solver. If I need help, I’ll post a message.

                        • sodoku
                          sodoku last edited by

                          I need help adapting this script to read the starting numbers from a picture of a Sudoku and insert them into a console script. It does not work well for recognizing ones and sevens.

                          Example of a Sudoku solver:

                          In my version I want the board to start as all zeros; when you take a picture, it adds the numbers and then solves. This combines the two programs (the Sudoku solver and the OCR text recognition).

                          board = [
                              [5, 8, 4, 1, 0, 0, 0, 0, 0],
                              [0, 0, 6, 8, 0, 0, 5, 1, 0],
                              [0, 0, 0, 0, 5, 4, 7, 0, 6],
                              [0, 5, 3, 0, 1, 0, 0, 6, 7],
                              [0, 0, 0, 0, 2, 0, 0, 0, 0],
                              [4, 6, 0, 0, 9, 0, 8, 3, 0],
                              [7, 0, 8, 5, 4, 0, 0, 0, 0],
                              [0, 2, 9, 0, 0, 3, 4, 0, 0],
                              [0, 0, 0, 0, 0, 1, 3, 7, 9],
                          ]


                          def solve(bo):
                              find = find_empty(bo)
                              if not find:
                                  return True
                              else:
                                  row, col = find

                              for i in range(1, 10):
                                  if valid(bo, i, (row, col)):
                                      bo[row][col] = i

                                      if solve(bo):
                                          return True

                                      bo[row][col] = 0

                              return False


                          def valid(bo, num, pos):
                              # check row
                              for i in range(len(bo[0])):
                                  if bo[pos[0]][i] == num and pos[1] != i:
                                      return False
                              # check column
                              for i in range(len(bo[0])):
                                  if bo[i][pos[1]] == num and pos[0] != i:
                                      return False
                              # check quadrant
                              box_x = pos[1] // 3
                              box_y = pos[0] // 3

                              for i in range(box_y * 3, box_y * 3 + 3):
                                  for j in range(box_x * 3, box_x * 3 + 3):
                                      if bo[i][j] == num and (i, j) != pos:
                                          return False

                              return True


                          def print_board(bo):
                              for i in range(len(bo)):
                                  if i % 3 == 0 and i != 0:
                                      print('------+-------+------')

                                  for j in range(len(bo[0])):
                                      if j % 3 == 0 and j != 0:
                                          print('|', end=' ')
                                      if j == 8:
                                          print(bo[i][j])
                                      else:
                                          print(str(bo[i][j]) + ' ', end='')


                          def find_empty(bo):
                              for i in range(len(bo)):
                                  for j in range(len(bo[0])):
                                      if bo[i][j] == 0:
                                          return (i, j)  # row, col
                              return None


                          print_board(board)
                          solve(board)
                          print('=====================')
                          print_board(board)
                          
                          language_preference = ['fi','en','se']
                          
                          import photos, ui, dialogs
                          import io
                          from objc_util import *
                          
                          load_framework('Vision')
                          VNRecognizeTextRequest = ObjCClass('VNRecognizeTextRequest')
                          VNImageRequestHandler = ObjCClass('VNImageRequestHandler')
                          
                          def pil2ui(pil_image):
                             buffer = io.BytesIO()
                             pil_image.save(buffer, format='PNG')
                             return ui.Image.from_data(buffer.getvalue())
                          
                          selection = dialogs.alert('Get pic', button1='Camera', button2='Photos')
                          
                          ui_image = None
                          
                          if selection == 1:
                             pil_image = photos.capture_image()
                             if pil_image is not None:
                                 ui_image = pil2ui(pil_image)
                          elif selection == 2:
                             ui_image = photos.pick_asset().get_ui_image()
                          
                          if ui_image is not None:
                             print('Recognizing...\n')
                          
                             req = VNRecognizeTextRequest.alloc().init().autorelease()
                             req.setRecognitionLanguages_(language_preference)
                             handler = VNImageRequestHandler.alloc().initWithData_options_(ui_image.to_png(), None).autorelease()
                          
                             success = handler.performRequests_error_([req], None)
                             if success:
                                 for result in req.results():
                                     print(result.text())
                             else:
                                 print('Problem recognizing anything')
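
One possible way to join the two scripts, as a rough sketch only: start from an all-zero board and write in whatever digits the recognizer returned. The input format (a list of `(row, col, text)` tuples) and the helper name `board_from_cells` are assumptions made for illustration; the recognition script above does not produce cell positions by itself. Anything that is not a single digit from 1 to 9 is left as 0.

```python
def board_from_cells(cells):
    """Build a 9x9 board from (row, col, text) tuples; unknown cells stay 0."""
    board = [[0] * 9 for _ in range(9)]
    for row, col, text in cells:
        text = text.strip()
        # Accept only a single digit 1-9; anything else is treated as empty.
        if len(text) == 1 and text in '123456789':
            board[row][col] = int(text)
    return board


# Hypothetical recognizer output: two usable cells and one misread ('S').
example = board_from_cells([(0, 0, '5'), (0, 1, ' 8'), (4, 4, 'S')])
print(example[0][:3])  # first three cells of the top row
```

The resulting board could then be passed to `solve()` and `print_board()` from the solver in place of the hard-coded `board`.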
                          • mikael
                            mikael @sodoku last edited by

                            @sodoku, I tried something similar as well a while ago, first recognizing rectangles and then trying to recognize the numbers, but I hit the same issue of very poor recognition of the numbers. I wonder if we would need a number-specific recognizer for that.

                            • Spitfire
                              Spitfire last edited by

                              Hi, since a Sudoku is a square made of many squares, I think it is more robust to slice the cells evenly so that each small image contains only one number.

                              The downside, of course, is that you call the recognition service once per image, and you go from 1 image to 81, which can get expensive.

                              But it would work more robustly.
                              Best regards, Tommy
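
The even slicing described above could look roughly like this. It is a sketch that assumes the image has already been cropped and straightened so that the grid fills it exactly; the function name `cell_boxes` and the `margin` parameter are made up for illustration. It only computes crop rectangles in the `(left, upper, right, lower)` form that PIL's `Image.crop` accepts.

```python
def cell_boxes(width, height, margin=0.0):
    """Return 81 (left, upper, right, lower) crop boxes in row-major order.

    margin is the fraction of each cell trimmed from every side, so that
    grid lines are less likely to end up inside the crop.
    """
    cell_w = width / 9
    cell_h = height / 9
    boxes = []
    for row in range(9):
        for col in range(9):
            left = col * cell_w + margin * cell_w
            upper = row * cell_h + margin * cell_h
            right = (col + 1) * cell_w - margin * cell_w
            lower = (row + 1) * cell_h - margin * cell_h
            boxes.append((round(left), round(upper), round(right), round(lower)))
    return boxes


boxes = cell_boxes(900, 900, margin=0.1)
print(len(boxes), boxes[0], boxes[80])
```

Each box could be passed to `pil_image.crop(box)` and the small image recognized on its own, at the cost of 81 recognition calls per puzzle.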

                              • mikael
                                mikael @Spitfire last edited by

                                @Spitfire, thanks. I did try all kinds of approaches, finally resorting to manual cropping, and it still was not reliable enough.

                                • pavlinb
                                  pavlinb @mikael last edited by

                                  @mikael Could you share some picture of sudoku, where recognition fails?

                                  • ccc
                                    ccc last edited by

                                    Multiple sample Sudoku puzzles would help to achieve a robust solution.

                                    • sodoku
                                      sodoku last edited by sodoku

                                      I have a few questions about the very first text recognition code posted in this thread.

                                      The video I am referring to is https://developer.apple.com/videos/play/wwdc2019/234

                                      Question 1
                                      How do you change the recognition level from fast to accurate? Here is example code from the Apple website (I am not sure whether it is written in Swift or Objective-C):

                                      myTextRecognitionRequest.recognitionLevel = VNRequestTextRecognitionLevel.accurate
                                      

                                      Another example from the Apple video for setting the recognition level:

                                       request.recognitionLevel = .fast 
                                      

                                      Question 2
                                      How do you ensure that numbers don't get mistaken for letters when the language correction is turned off, for example to avoid reading the number 5 as an S, or 1 as an I? An example of this from the video:

                                      extension Character {

                                          func GetSimilarCharacterIfNotIn(allowedChars: String) -> Character {
                                              let conversionTable = [
                                                  "s": "5",
                                                  "S": "5",
                                                  "i": "1",
                                                  "I": "1",
                                              ]
                                      

                                      Question 3
                                      Do you know how to set up the special-words detection feature mentioned in the video?

                                      • westjensontexas
                                        westjensontexas last edited by westjensontexas

                                        I gather they are looking for words you'd find in an English dictionary. So perhaps façade or tête-à-tête might be recognized, while other examples wouldn't?

                                        • JonB
                                          JonB @sodoku last edited by

                                          @sodoku
                                          See https://developer.apple.com/documentation/vision/vnrequesttextrecognitionlevel/fast

                                          Try req.recognitionLevel=1 for fast, or 0 for accurate.

                                          Re fixing characters... I gather you might set req.usesLanguageCorrection=False (or maybe 0), then make your own replacement map and use str.translate.
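
A minimal sketch of the replacement-map idea with `str.translate`, in plain Python. The four s/S/i/I pairs come from the table quoted earlier in the thread; the other pairs, the name `DIGIT_FIXES`, and the helper `to_digits` are assumptions for illustration.

```python
# Letters that are plausibly misread digits. The extra pairs beyond
# s/S/i/I are illustrative guesses, not taken from the Apple sample.
DIGIT_FIXES = str.maketrans({
    's': '5', 'S': '5',
    'i': '1', 'I': '1', 'l': '1',
    'o': '0', 'O': '0',
    'B': '8',
})


def to_digits(text):
    """Map look-alike letters to digits, then keep only the digits 0-9."""
    fixed = text.translate(DIGIT_FIXES)
    return ''.join(ch for ch in fixed if ch in '0123456789')


print(to_digits('S8I 4'))  # prints 5814
```

For a Sudoku the result could be filtered further, since only 1 to 9 are valid cell values.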

                                          Custom words are handled by
                                          req.customWords = ['customword1', 'etc']

                                          See the Apple docs for VNRecognizeTextRequest.
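
If plain attribute assignment turns out not to reach the Objective-C object, the same settings can be written as explicit setter calls. This is a sketch under the assumption that `objc_util` exposes the setters of these properties as `setRecognitionLevel_`, `setUsesLanguageCorrection_` and `setCustomWords_`, following its usual selector naming. The helper name `configure_text_request` is made up here, and the function only needs an object that has those methods.

```python
ACCURATE, FAST = 0, 1  # values of VNRequestTextRecognitionLevel


def configure_text_request(req, level=ACCURATE, language_correction=False,
                           custom_words=()):
    """Apply recognition settings through Objective-C style setter calls."""
    req.setRecognitionLevel_(level)
    req.setUsesLanguageCorrection_(language_correction)
    if custom_words:
        req.setCustomWords_(list(custom_words))
    return req


# In Pythonista (not runnable elsewhere), usage might look like:
# req = VNRecognizeTextRequest.alloc().init().autorelease()
# configure_text_request(req, level=ACCURATE, custom_words=['customword1'])
```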

                                          • sodoku
                                            sodoku last edited by sodoku

                                            I've seen the Apple documentation for the Vision framework; I just don't know how to convert it to Python.

                                            Question 1
                                            What about setting the minimum text height? How do you translate either of these declarations to Python?
                                            @property(readwrite, nonatomic, assign) float minimumTextHeight; (Objective-C)
                                            var minimumTextHeight: Float { get set } (Swift)
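
As a rule of thumb, a read-write Objective-C property has a setter selector made of `set`, the capitalized property name, and a colon, and `objc_util` writes the colon as an underscore. The small helper below just spells out that naming rule; `setter_name` is a made-up name for illustration, not part of `objc_util`.

```python
def setter_name(prop):
    """Return the objc_util method name for an Objective-C property setter."""
    # Setter selector: 'set' + capitalized property name + ':'.
    # objc_util spells the trailing colon as an underscore.
    return 'set' + prop[0].upper() + prop[1:] + '_'


print(setter_name('minimumTextHeight'))  # prints setMinimumTextHeight_
```

So, assuming `req` is the request object from the script, the call would be `req.setMinimumTextHeight_(0.1)`. The value is a fraction of the image height, and 0.1 is only an example.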

                                            Question 2
                                            I am also interested in learning how to recognize the individual boxes of a Sudoku puzzle to extract the numbers. Is there a way to do that, possibly with VNRecognizedTextObservation (described in the docs as "A request that detects and recognizes regions of text in an image"), or with the bounding box technique shown in the video https://developer.apple.com/videos/play/wwdc2019/234 ? Also, can you use multiple bounding boxes to recognize text from a Sudoku card?
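
One way to use bounding boxes without detecting the grid lines, sketched under some assumptions: the photo is cropped so the puzzle fills it, and each recognized digit comes with a Vision-style normalized bounding box (values from 0 to 1, origin in the lower-left corner). The function name `cell_for_box` is made up; in Pythonista the four numbers would presumably come from the `boundingBox()` of each result, which is not shown here.

```python
def cell_for_box(x, y, width, height):
    """Map a normalized bounding box to a (row, col) Sudoku cell.

    x, y are the box origin (lower-left corner); all values are in 0..1,
    with y measured from the bottom of the image as Vision does.
    """
    center_x = x + width / 2
    center_y = y + height / 2
    col = min(max(int(center_x * 9), 0), 8)
    # Flip y so that row 0 is the top row of the puzzle.
    row = min(max(int((1 - center_y) * 9), 0), 8)
    return row, col


print(cell_for_box(0.02, 0.91, 0.06, 0.07))  # a box near the top-left corner
```

The `(row, col)` pair can then be used to place the recognized digit on the board.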

                                            This is Mikael's code that I am trying to extend, but I don't know how to convert the code shown in the Apple documentation into Python.

                                            language_preference = ['fi','en','se']
                                            
                                            import photos, ui, dialogs
                                            import io
                                            from objc_util import *
                                            
                                            load_framework('Vision')
                                            VNRecognizeTextRequest = ObjCClass('VNRecognizeTextRequest')
                                            VNImageRequestHandler = ObjCClass('VNImageRequestHandler')
                                            
                                            def pil2ui(pil_image):
                                                buffer = io.BytesIO()
                                                pil_image.save(buffer, format='PNG')
                                                return ui.Image.from_data(buffer.getvalue())
                                            
                                            selection = dialogs.alert('Get pic', button1='Camera', button2='Photos')
                                            
                                            ui_image = None
                                            
                                            if selection == 1:
                                                pil_image = photos.capture_image()
                                                if pil_image is not None:
                                                    ui_image = pil2ui(pil_image)
                                            elif selection == 2:
                                                ui_image = photos.pick_asset().get_ui_image()
                                            
                                            if ui_image is not None:
                                                print('Recognizing...\n')
                                            
                                                req = VNRecognizeTextRequest.alloc().init().autorelease()
                                                req.recognitionLevel=1
                                                req.setRecognitionLanguages_(language_preference)
                                                handler = VNImageRequestHandler.alloc().initWithData_options_(ui_image.to_png(), None).autorelease()
                                            
                                                success = handler.performRequests_error_([req], None)
                                                if success:
                                                    for result in req.results():
                                                        print(result.text())
                                                else:
                                                    print('Problem recognizing anything')
                                            