So I need to pull some text out of some HTML pages. The parts I want are marked like so:
'<!- TERM1_[ ->Html data I want in here<!- ]_TERM1 ->'
There are 4 or 5 different terms in different areas of the page. I'm very new to all of this, so I'm not quite seeing the best way to pull the text out. Since the terms aren't valid HTML tags, it looks like BS4 won't help? Should I use a regular expression and capture the part in between as a group?
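To make the regex idea concrete, here is a rough sketch of what I have in mind, not something tested against my real pages. It assumes the markers look like the example above (the pattern allows one or more dashes, in case they are really `<!--` comments), that a term's markers are never nested inside themselves, and that I know the term names up front. `extract_terms` and the sample string are made-up names for illustration.

```python
import re

def extract_terms(html, terms):
    """Return {term: [every chunk found between that term's markers]}."""
    results = {}
    for term in terms:
        name = re.escape(term)  # guard against regex metacharacters in a term name
        pattern = re.compile(
            r"<!-+\s*" + name + r"_\[\s*-+>"    # opening marker, e.g. <!- TERM1_[ ->
            r"(.*?)"                            # the wanted content, non-greedy
            r"<!-+\s*\]_" + name + r"\s*-+>",   # closing marker, e.g. <!- ]_TERM1 ->
            re.DOTALL,                          # let the content span several lines
        )
        results[term] = pattern.findall(html)
    return results

# Hypothetical sample page, only for trying the pattern out.
sample = (
    "<p>intro</p>"
    "<!- TERM1_[ ->Html data I want in here<!- ]_TERM1 ->"
    "<div><!- TERM2_[ -><b>more</b>\ndata<!- ]_TERM2 -></div>"
)
found = extract_terms(sample, ["TERM1", "TERM2"])
print(found)
```

The non-greedy `(.*?)` is there so a match stops at the first closing marker instead of running on to a later one, and `re.DOTALL` is there because the marked parts may contain line breaks.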
Any guidance or suggestions would be very helpful. Thanks.