rsstube/scripts/download_page.py

#!/usr/bin/python3

import pycurl
from io import BytesIO
from utils import notify,debug,error

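# args should be a dictionary of arguments ("user_agent": string, "header": list of header strings)
# returns the page decoded as UTF-8 text, plus the HTTP response code if return_http_code is True
# returns None if the transfer fails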
def download (platform, url, args, verbosity, return_http_code=False, follow_location=True):
	page_bytes = BytesIO()
	c = pycurl.Curl()

	c.setopt(c.URL, url)
	c.setopt(c.WRITEDATA, page_bytes)
	c.setopt(c.FOLLOWLOCATION, follow_location)

	# TODO: handle possible arguments
	if "user_agent" in args:
		c.setopt(pycurl.USERAGENT, args["user_agent"])
	if "header" in args:
		c.setopt(pycurl.HTTPHEADER, args["header"])
	notify ("Downloading " + url + "...", verbosity, platform)
	try:
		c.perform()
	except pycurl.error as e:
		error (str(e), verbosity, platform)
		# release the curl handle on the failure path too
		c.close()
		return None
	response_code = c.getinfo(c.RESPONSE_CODE)
	c.close()
	debug (url + " downloaded!", verbosity, platform)
	debug ("Response code: " + str(response_code), verbosity, platform)
	# decode once, then return it with or without the response code
	page = page_bytes.getvalue().decode('utf8')
	if return_http_code:
		return page,response_code
	return page
Initial code push. This version of rsstube works but is not complete. 2021-07-22 20:00:00 -04:00: shebang, `pycurl` and `BytesIO` imports, header comments, curl setup (`URL`, `WRITEDATA`), TODO comment, `notify` call, response code handling, `debug` calls, return statements
Improve error handling a little. 2021-07-24 20:00:00 -04:00: `from utils import notify,debug,error`, the try/except around `c.perform()`
Support more Player FM links. 2021-11-07 19:00:00 -05:00: the `download` signature, `c.setopt(c.FOLLOWLOCATION, follow_location)`
Add unbreak option for dealing with Cloudflare. 2021-12-29 19:00:00 -05:00: the `user_agent` and `header` handling