I have a data file that I set to UTF-8, which puts the inverted comma character (“) in its output.
How do I detect and remove the “ character from a string? Trying to define it as “”” (i.e. three “ in a row) is overridden by the editor, which turns it into four “ in a row. The editor will not treat the “ character as an ordinary character.
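One way to sidestep the editor problem is to avoid typing the character at all and refer to it by its Unicode escape instead. Below is a minimal sketch, assuming the character in question is the left double quotation mark U+201C (with U+201D as its closing counterpart). The helper name `strip_curly_quotes` and the sample string are made up for illustration.

```python
# Hypothetical helper: the name and the choice of characters are assumptions.
LEFT_QUOTE = "\u201c"   # left curly double quote
RIGHT_QUOTE = "\u201d"  # right curly double quote


def strip_curly_quotes(text: str) -> str:
    """Return text with left and right curly double quotes removed."""
    return text.replace(LEFT_QUOTE, "").replace(RIGHT_QUOTE, "")


sample = "He said \u201chello\u201d"   # illustrative data only
has_quote = LEFT_QUOTE in sample       # detection
cleaned = strip_curly_quotes(sample)   # removal
print(has_quote, cleaned)
```

Writing the escape `"\u201c"` keeps the source file free of the literal character, so the editor has nothing to auto-complete.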
#The "with" statement overflows into the next line due to this narrow comment box
with urllib.request.urlopen("https://www.asx.com.au/asx/statistics/todayAnns.do") as response:
#Print the entire html string so I know what is in it
#The output of this print statement starts:
#Separately print the first 5 characters in the html string
#The output of this is, including spaces between items:
#b'\r' b'\n' b'\r' b'\n' b'\r'
#Print the first 5 characters in the string
#The output of this is:
I downloaded a webpage successfully, and its content looks correct. However, when I print slices of it, the characters are different.
Specifically, when I print the first character, the output shows the first character plus 3 more characters and an apostrophe on the end. So one character becomes 5 characters.
When I print longer slices of the webpage, the number of characters is also greater than expected, and the apostrophe is always added on the end.
What is happening?
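A likely explanation, sketched under the assumption that the page was read with `response.read()`: that call returns a `bytes` object rather than a `str`, and printing `bytes` shows its representation, i.e. a `b` prefix, an opening apostrophe, escape sequences such as `\r`, and a closing apostrophe. Decoding to `str` first makes slices print as ordinary text. The sample data and the `utf-8` encoding below are assumptions for illustration, not the actual page content.

```python
# Stand-in for what response.read() returns (assumed sample, not the real page).
raw = b"\r\n\r\n\r\n<html>"

first_byte = raw[0:1]
shown = repr(first_byte)      # the text print() displays for a bytes slice
print(shown, len(shown))      # b, apostrophe, backslash, r, apostrophe

text = raw.decode("utf-8")    # assumes the page is UTF-8 encoded
first_char = text[0:1]
print(len(first_char))        # one carriage-return character
```

The underlying data is still one byte per slice element; only the printed representation is longer.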