Handle redirects from generic site to specific site #6

Open
opened 2022-10-21 15:13:57 -04:00 by 0x80 · 0 comments
Owner

For example, https://sadboyzpod.com/itunes redirects (by way of itunes.apple.com) to https://podcasts.apple.com/us/podcast/sad-boyz/id1296625412. rsstube knows how to get the feed from podcasts.apple.com, but it doesn't recognize https://sadboyzpod.com/itunes as a podcasts.apple.com link (because it isn't one).

It would be nice if rsstube could figure this out. In particular, rather than just telling pycurl to FOLLOWLOCATION, rsstube could receive the redirect itself and then re-run on the new URL. That would handle this situation:

  1. call rsstube https://sadboyzpod.com/itunes
  2. download https://sadboyzpod.com/itunes (301, location: https://itunes.apple.com/us/podcast/sad-boyz/id1296625412)
  3. call rsstube https://itunes.apple.com/us/podcast/sad-boyz/id1296625412
  4. download https://itunes.apple.com/us/podcast/sad-boyz/id1296625412 (301, location: https://podcasts.apple.com/us/podcast/sad-boyz/id1296625412)
  5. call rsstube https://podcasts.apple.com/us/podcast/sad-boyz/id1296625412
  6. get feed

Simply doing this every time a page is downloaded could cause its own issues. TODO: figure out how and when to apply this rule.
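
A minimal sketch of the "receive the redirect, then re-run" idea, not rsstube's actual code. Everything named here is hypothetical: `KNOWN_HOSTS` and `is_known_site` stand in for rsstube's real site detection, and `fetch` is an injected callable that downloads a URL *without* following redirects (in rsstube this would presumably be a pycurl request with FOLLOWLOCATION turned off) and returns the status code and the `Location` header. One possible answer to "how and when" is assumed: only chase redirects while the URL is not yet recognized, and stop after a fixed number of hops or when a loop is detected.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin, urlparse

# Hypothetical stand-in for rsstube's site detection rules.
KNOWN_HOSTS = {"podcasts.apple.com"}

# HTTP status codes that carry a Location header to follow.
REDIRECT_CODES = {301, 302, 303, 307, 308}

# fetch(url) -> (status_code, location_header_or_None).
# It must NOT follow redirects on its own.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def is_known_site(url: str) -> bool:
    """Return True if the URL's host is one we have a specific handler for."""
    return urlparse(url).hostname in KNOWN_HOSTS


def resolve(url: str, fetch: Fetch, max_hops: int = 5) -> str:
    """Follow redirects one hop at a time, re-checking the URL after each hop.

    Stops as soon as the URL is recognized, the response is not a redirect,
    or max_hops redirects have been followed. Raises on a redirect loop.
    """
    seen = {url}
    for _ in range(max_hops):
        if is_known_site(url):
            return url  # a site-specific handler can take over from here
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url  # ordinary page; fall back to generic handling
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        if url in seen:
            raise RuntimeError(f"redirect loop detected at {url}")
        seen.add(url)
    return url  # hop limit reached; hand back the last URL seen
```

With a `fetch` that reproduces the two 301 responses in the steps above, `resolve("https://sadboyzpod.com/itunes", fetch)` makes two requests and returns the podcasts.apple.com URL without downloading that final page, leaving it to the site-specific handler.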

(I don't anticipate this change making it into the Python version of rsstube, but it's an issue that should be documented somewhere.)

Reference: 0x80/rsstube#6