I have a few wacky ideas on this.
One option: a TextView inside of a ScrollView.
A textview_did_change_selection or textview_should_change delegate method would use ui.measure_string (which, as I recall, has to be called inside a drawing context) to figure out the width of the text before the cursor, and then adjust the scrollview's content_offset to keep the cursor on screen.
My initial thought was to compute this on the fly, which might be very slow since you'd be calling measure_string on every keystroke, though I've been surprised at the speed of the built-in drawing functions. Also, since you're using a monospace font, you could just compute the character width once at startup (or compute the average character width for a string containing the typical alphanumeric, uppercase, and punctuation characters, so that even proportional fonts would be approximately correct).
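Here is a rough sketch of the "measure once" idea. The helper name average_char_width and the SAMPLE string are made up for illustration; the measuring function is passed in, so the helper itself doesn't depend on the ui module.

```python
import string

# Typical characters: upper/lower case letters, digits, punctuation, space.
SAMPLE = string.ascii_letters + string.digits + string.punctuation + ' '


def average_char_width(measure, sample=SAMPLE):
    """Return the average width of one character in `sample`.

    `measure` is any callable that takes a string and returns its
    rendered width in points.
    """
    return measure(sample) / float(len(sample))
```

In Pythonista I'd expect the call to look something like `average_char_width(lambda s: ui.measure_string(s, font=('Courier', 14))[0])`, assuming measure_string returns a (width, height) tuple and that font is the one your TextView uses. I haven't tested that part.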
Then, your textview_should_change simply grabs selected_range[1], multiplies by the character width, and then adjusts the offset to leave a certain amount of gap.
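Something along these lines is what I have in mind for the offset math. This is only a sketch: offset_for_cursor, CursorFollower and the gap default are hypothetical names and values, and it assumes a single line of text whose cursor position is simply index times character width.

```python
def offset_for_cursor(cursor_index, char_width, visible_width, current_x, gap=20.0):
    """Return a horizontal content offset that keeps the cursor visible."""
    cursor_x = cursor_index * char_width
    # Cursor too close to (or past) the right edge: scroll right.
    if cursor_x > current_x + visible_width - gap:
        return max(0.0, cursor_x - visible_width + gap)
    # Cursor too close to (or past) the left edge: scroll left.
    if cursor_x < current_x + gap:
        return max(0.0, cursor_x - gap)
    # Cursor is comfortably on screen: leave the offset alone.
    return current_x


class CursorFollower(object):
    """Hypothetical TextView delegate that scrolls an enclosing ScrollView."""

    def __init__(self, scrollview, char_width, gap=20.0):
        self.scrollview = scrollview
        self.char_width = char_width
        self.gap = gap

    def textview_did_change_selection(self, textview):
        x, y = self.scrollview.content_offset
        new_x = offset_for_cursor(textview.selected_range[1], self.char_width,
                                  self.scrollview.width, x, self.gap)
        if new_x != x:
            self.scrollview.content_offset = (new_x, y)
```

You'd set it up with something like `textview.delegate = CursorFollower(scrollview, char_width)`. I used textview_did_change_selection here rather than textview_should_change, on the assumption that selected_range is already up to date by the time it fires.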
You could instead use an HTML input field controlled with JavaScript inside a WebView. This has the advantage of being very customizable, but the disadvantage of being JavaScript, and debugging JavaScript within Pythonista is akin to trying to program using only a typewriter.
I've long considered trying to create a fully custom ui.View that acts like a textfield. touch_began would focus a hidden textview to accept keystrokes, and in particular backspaces. This would require custom timers to implement gestures on top of the touch events (long tap to bring up a copy/paste dialog, double tap to select words, dragging selection boundaries, etc). The deal breaker would probably be that as soon as you touch anywhere, the keyboard might try to disappear, thinking input has ended.