Using continue with yield
-
Sorry, this is not a Pythonista question, it's a Python Question. I have looked this up, but I don't get it.
The code below just walks through the directory structure. I copied it from Stack Overflow, though I added the yield because I wanted to return a generator.
What I am having difficulty with is filtering. I want to introduce an ignore-dir list as well as define which file types (ext) I want returned. The filter conditions are not a problem. I am just not sure what to do if I want to ignore a file based on either its ext or its dir.
My simple idea was to use continue if it failed my filter test otherwise yield.
From what I can ascertain, this does not work: the generator terminates, i.e. no more values. I am not even sure it's possible to have conditional tests inside a generator to skip items. I know I could return some flag, but that's ugly.
The code without filtering:

    def allfiles(self):
        for path, subdirs, files in os.walk(self.root_dir):
            for filename in files:
                f = os.path.join(path, filename)
                yield f
-
There's no need to do any filtering in allfiles, you can do that afterwards with a list comprehension (or a normal for loop):

    # os.path.splitext returns the extension including its leading dot
    print([filename for filename in self.allfiles() if os.path.splitext(filename)[1] == ".py"])

    for filename in self.allfiles():
        if os.path.dirname(filename) == os.path.expanduser("~/Documents"):
            print(filename)
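If the filtered result should itself stay lazy, the same after-the-fact filtering can be written as a generator expression instead of a list comprehension. This is only a minimal sketch: allfiles is rewritten here as a standalone function so the snippet runs on its own, and py_files is a hypothetical helper name, not something from the original code.

```python
import os

def allfiles(root_dir):
    """Standalone stand-in for the allfiles method: yield every file path below root_dir."""
    for path, subdirs, files in os.walk(root_dir):
        for filename in files:
            yield os.path.join(path, filename)

def py_files(root_dir):
    """Hypothetical helper: lazily keep only the paths whose extension is '.py'."""
    # The parentheses make this a generator expression, so nothing is
    # walked or filtered until the caller starts iterating.
    return (f for f in allfiles(root_dir) if os.path.splitext(f)[1] == ".py")
```

Iterating over py_files(some_dir) then walks the tree on demand, exactly as iterating over allfiles would.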
-
@dgelessus, yes, I realised that and I do that at the moment. It just seemed redundant to me, that's all. Also, if other methods are calling allfiles, or what would become something like _allfiles with a filter param, the intention gets more explicit, as per import this.
Maybe it's not possible to skip over items in a generator, I really don't know.
-
I'm not sure if I understand the question correctly, but you can just not use a yield statement for items in your generator that you want to skip, i.e. something like:

    if some_condition:
        yield item

Below is a complete allfiles function that accepts two regular expressions for filtering the results: one that is matched against subdirectory names (matches are skipped), and one that is matched against file extensions (matches are included). The example iterates over all py/txt/md files in ~/Documents, except for files in site-packages or .Trash.

    import os
    import re

    def allfiles(root_dir, skip_dirs_re=None, file_ext_re=None):
        for path, subdirs, files in os.walk(root_dir):
            if skip_dirs_re:
                # Prune in place so os.walk does not descend into skipped directories.
                new_subdirs = []
                for subdir in subdirs:
                    if not re.match(skip_dirs_re, subdir):
                        new_subdirs.append(subdir)
                subdirs[:] = new_subdirs
            for filename in files:
                # Extension without the leading dot, e.g. 'py'.
                ext = os.path.splitext(filename)[1][1:]
                if (file_ext_re is None) or (re.match(file_ext_re, ext, re.IGNORECASE)):
                    full_path = os.path.join(path, filename)
                    yield full_path

    # Note: re.match only anchors at the start, so 'py' also matches 'pyc'.
    skip_dirs = '\\.Trash|site-packages'
    exts = 'py|md|txt'
    root_dir = os.path.expanduser('~/Documents')
    for file_path in allfiles(root_dir, skip_dirs, exts):
        print(file_path)
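
The slice assignment subdirs[:] = ... is the part that makes directory skipping work: os.walk (in its default top-down mode) decides where to descend by looking at the same list object it handed out, so the list has to be changed in place. Here is a stripped-down sketch of just that idea, assuming plain directory names rather than regular expressions; walk_pruned and skip_names are hypothetical names used only for illustration.

```python
import os

def walk_pruned(root_dir, skip_names):
    """Yield file paths under root_dir without descending into directories named in skip_names."""
    for path, subdirs, files in os.walk(root_dir):  # topdown=True is the default
        # Slice assignment mutates the list os.walk is holding. Rebinding with
        # `subdirs = [...]` would create a new list and os.walk would still
        # descend into every directory.
        subdirs[:] = [d for d in subdirs if d not in skip_names]
        for filename in files:
            yield os.path.join(path, filename)
```

Because pruned directories are never entered, files nested anywhere below them are skipped too, without any per-file path checks.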
-
@omz, OK, thanks. You answered my question and gave me a nice allfiles 😜
My main thing was not understanding whether I could skip files or not with the generator. It sort of makes sense you can't. I just thought Python might do some under-the-hood tricks if it sees a continue in a generator. But it appears not.
Anyway, thanks guys. I know it was not a Pythonista question. I just had trouble tracking down the answer.
-
@Phuket2
Just think of using print instead of yield. Anything that gets printed would be yielded. If you wanted to skip printing an item, you would use

    if some_condition:
        print(item)

or alternatively

    if some_skip_condition:
        continue
    print(item)

Now replace print with yield and you have a generator. I don't think there are any restrictions on the type of control structures you use in a generator, only that your logic must be correct in the first place. If you had tried something like that and it didn't work, it wouldn't have worked if you used print instead of yield!
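
To make that concrete, here is a minimal sketch of a generator that uses continue to skip items. It works on a plain list of file names so it runs without touching the file system; wanted_files and wanted_exts are hypothetical names chosen for this example.

```python
import os

def wanted_files(filenames, wanted_exts):
    """Yield only the names whose (lower-cased) extension is in wanted_exts."""
    for filename in filenames:
        ext = os.path.splitext(filename)[1].lower()
        if ext not in wanted_exts:
            # continue only moves on to the next loop iteration;
            # the generator itself keeps running.
            continue
        yield filename

names = ["a.py", "b.jpg", "c.PY", "d.txt"]
print(list(wanted_files(names, {".py", ".txt"})))  # ['a.py', 'c.PY', 'd.txt']
```

The generator only finishes when the for loop runs out of items (or the function returns), so skipping b.jpg with continue does not cut the results short.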