MCPcopy
hub / github.com/scrapy/scrapy / should_follow

Method should_follow

scrapy/downloadermiddlewares/offsite.py:67–71  ·  view source on GitHub ↗
(self, request: Request, spider: Spider)

Source from the content-addressed store, hash-verified

65 raise IgnoreRequest
66
67 def should_follow(self, request: Request, spider: Spider) -> bool:
68 regex = self.host_regex
69 # hostname can be None for wrong urls (like javascript links)
70 host = urlparse_cached(request).hostname or ""
71 return bool(regex.search(host))
72
73 def get_host_regex(self, spider: Spider) -> re.Pattern[str]:
74 """Override this method to implement a different offsite policy"""

Callers 1

process_requestMethod · 0.95

Calls 1

urlparse_cachedFunction · 0.90

Tested by

no test coverage detected