

class CrawlerRunner(CrawlerRunnerBase):
    """
    This is a convenient helper class that keeps track of, manages and runs
    crawlers inside an already set up :mod:`~twisted.internet.reactor`.

    The CrawlerRunner object must be instantiated with a
    :class:`~scrapy.settings.Settings` object.

    This class shouldn't be needed (since Scrapy is responsible for using it
    accordingly) unless writing scripts that manually handle the crawling
    process. See :ref:`run-from-script` for an example.

    This class provides Deferred-based APIs. Use :class:`AsyncCrawlerRunner`
    for modern coroutine APIs.
    """

    def __init__(self, settings: dict[str, Any] | Settings | None = None):
        super().__init__(settings)
        if not self.settings.getbool("TWISTED_REACTOR_ENABLED"):
            raise RuntimeError(
                f"{type(self).__name__} doesn't support TWISTED_REACTOR_ENABLED=False."
            )
        self._active: set[Deferred[None]] = set()

    def crawl(
        self,
        crawler_or_spidercls: type[Spider] | str | Crawler,
        *args: Any,
        **kwargs: Any,
    ) -> Deferred[None]:
        """
        Run a crawler with the provided arguments.

        It will call the given Crawler's :meth:`~Crawler.crawl` method, while
        keeping track of it so it can be stopped later.

        If ``crawler_or_spidercls`` isn't a :class:`~scrapy.crawler.Crawler`
        instance, this method will try to create one using this parameter as
        the spider class given to it.

        Returns a deferred that is fired when the crawling is finished.

        :param crawler_or_spidercls: already created crawler, or a spider class
            or spider's name inside the project to create it
        :type crawler_or_spidercls: :class:`~scrapy.crawler.Crawler` instance,
            :class:`~scrapy.spiders.Spider` subclass or string

        :param args: arguments to initialize the spider

        :param kwargs: keyword arguments to initialize the spider
        """
        if isinstance(crawler_or_spidercls, Spider):
            raise ValueError(
                "The crawler_or_spidercls argument cannot be a spider object, "
                "it must be a spider class (or a Crawler object)"
            )
        crawler = self.create_crawler(crawler_or_spidercls)
        return self._crawl(crawler, *args, **kwargs)
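
The guard at the top of ``crawl`` and the role of the ``_active`` set can be illustrated without Scrapy or Twisted. The following is a stdlib-only analogy under stated assumptions, not Scrapy's implementation: ``Spider``, ``QuotesSpider`` and ``SketchRunner`` are hypothetical stand-ins, asyncio tasks take the place of Deferreds, and how ``_crawl`` actually maintains ``_active`` is not shown in this listing.

```python
import asyncio


class Spider:
    """Hypothetical stand-in for a spider base class (not scrapy.Spider)."""

    name = "base"

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class QuotesSpider(Spider):
    name = "quotes"


class SketchRunner:
    """Toy runner: validates its argument and tracks in-flight crawls."""

    def __init__(self):
        # In-flight tasks, playing the role that _active appears to play.
        self.active = set()
        self.finished = []

    def crawl(self, spidercls, **kwargs):
        # Same idea as the guard above: reject an instance, accept a class.
        if isinstance(spidercls, Spider):
            raise ValueError("pass the spider class, not a spider object")
        task = asyncio.create_task(self._run(spidercls, **kwargs))
        self.active.add(task)
        # Forget the task once it completes.
        task.add_done_callback(self.active.discard)
        return task

    async def _run(self, spidercls, **kwargs):
        # The runner, not the caller, builds the spider instance.
        spider = spidercls(**kwargs)
        await asyncio.sleep(0)  # placeholder for the actual crawl
        self.finished.append((spider.name, spider.kwargs))


async def main():
    runner = SketchRunner()
    task = runner.crawl(QuotesSpider, category="books")
    in_flight = len(runner.active)  # counted before the task has run
    await task
    return runner, in_flight


if __name__ == "__main__":
    runner, in_flight = asyncio.run(main())
    print(in_flight, runner.finished)
```

Passing the class together with its constructor arguments lets the runner decide when and how the spider is instantiated, which is why a ready-made spider object is rejected up front.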