re (Regular Expression) module "caret" character not working?
-
In the code below, the action should return all words that begin with "the" as a list. But it returns a blank list.
<pre>
# Extracts words from input text and outputs it as a list
import re
import editor
import workflow

params = workflow.get_parameters()
sentence = workflow.get_input()

list = []
expression = '^the'
pattern = re.compile(expression)
matches = re.findall(pattern,sentence)

for word in matches:
    list.append(word)

workflow.set_output('\n'.join(list))
</pre>
This code works with other special characters, but this isn't working. Any tips?
-
The caret matches only at the start of a string, i.e. if "the" is the first word in the sentence.
The expression you are looking for probably looks like expression='\bthe'
\b matches but does not consume a word boundary.
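To illustrate the caret behaviour described above, here is a minimal sketch in plain Python. It does not use Editorial's workflow module, and the sample strings are made up for the example. It also shows the re.MULTILINE flag, which the thread does not mention but which makes the caret match at the start of every line.

```python
import re

text = "the cat saw them\nthen ran to the theater"

# Without flags, ^ anchors only at the very start of the string.
start_only = re.findall('^the', text)

# With re.MULTILINE, ^ also anchors right after each newline.
each_line = re.findall('^the', text, re.MULTILINE)

# A sentence that does not begin with "the" gives an empty list,
# which is the "blank list" seen in the question.
no_match = re.findall('^the', "see the cat")

print(start_only)  # ['the']
print(each_line)   # ['the', 'the']
print(no_match)    # []
```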
-
Remember that you can also simplify things by using list comprehensions:
<pre>
# Instead of this:
list = []
for word in matches:
    list.append(word)
workflow.set_output('\n'.join(list))

# You can just write this:
workflow.set_output('\n'.join([word for word in matches]))
</pre>
-
@ccc The list comprehension seems redundant here; '\n'.join(matches) would do the same, as far as I can see.
Btw, it's not a good idea to use list as a variable name. You'll run into problems when you try to use the built-in function list().
-
I am still not getting a match on the word "them" nor on an email that begins with "the."
<pre>
# Extracts words from input text and outputs it as a list
import re
import editor
import workflow

params = workflow.get_parameters()
sentence = workflow.get_input()

match_list = []
expression = '\bthe'
pattern = re.compile(expression)
matches = re.findall(pattern,sentence)

for word in matches:
    match_list.append(word)

workflow.set_output('\n'.join(match_list))
</pre>
-
You need to escape the backslash in the pattern or use a raw string, i.e. use either '\\bthe' or r'\bthe'.
-
Thanks ole. That did help.
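-
For anyone finding this thread later, here is a rough sketch of why the raw string matters, again in plain Python rather than inside an Editorial workflow. The sample sentence and email address are invented for the example, and the trailing \w* is an addition that is not part of the pattern discussed above. It is there only to return the whole word instead of just "the".

```python
import re

sentence = "the cat emailed them at the.cat@example.com, then bathed"

# In a normal string literal, '\b' is a single backspace character (\x08),
# while r'\b' keeps the backslash and the letter b as two characters.
print(len('\b'), len(r'\b'))  # 1 2

# Searches for a backspace followed by "the", so nothing is found.
broken = re.findall('\bthe', sentence)

# Word boundary followed by "the".
fixed = re.findall(r'\bthe', sentence)

# Same, but also captures the rest of each word.
words = re.findall(r'\bthe\w*', sentence)

print(broken)  # []
print(fixed)   # ['the', 'the', 'the', 'the']
print(words)   # ['the', 'them', 'the', 'then']
```

Note that "bathed" is not matched, because the "the" inside it does not start at a word boundary.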