class Selector(_ParselSelector, object_ref):
    """
    An instance of :class:`Selector` is a wrapper over a response to select
    certain parts of its content.

    ``response`` is an :class:`~scrapy.http.HtmlResponse` or an
    :class:`~scrapy.http.XmlResponse` object that will be used for selecting
    and extracting data.

    ``text`` is a unicode string or utf-8 encoded text for cases when a
    ``response`` isn't available. Passing both ``text`` and ``response``
    raises :exc:`ValueError`.

    ``type`` defines the selector type; it can be ``"html"``, ``"xml"``,
    ``"json"`` or ``None`` (default).

    If ``type`` is ``None``, the selector automatically chooses the best type
    based on ``response`` type (see below), or defaults to ``"html"`` in case it
    is used together with ``text``.

    If ``type`` is ``None`` and a ``response`` is passed, the selector type is
    inferred from the response type as follows:

    * ``"html"`` for :class:`~scrapy.http.HtmlResponse` type
    * ``"xml"`` for :class:`~scrapy.http.XmlResponse` type
    * ``"json"`` for :class:`~scrapy.http.TextResponse` type
    * ``"html"`` for anything else

    Otherwise, if ``type`` is set, the selector type will be forced and no
    detection will occur.
    """

    __slots__ = ["response"]
    selectorlist_cls = SelectorList

    def __init__(
        self,
        response: TextResponse | None = None,
        text: str | None = None,
        type: str | None = None,  # noqa: A002
        root: Any | None = _NOT_SET,
        **kwargs: Any,
    ):
        # ``response`` and ``text`` are mutually exclusive inputs.
        if response is not None and text is not None:
            raise ValueError(
                f"{self.__class__.__name__}.__init__() received both response and text"
            )

        # Resolve the selector type from the explicit ``type`` or the response.
        st = _st(response, type)

        # Wrap raw text in a response so both inputs share one code path.
        if text is not None:
            response = _response_from_text(text, st)

        if response is not None:
            text = response.text
            # Use the response's base URL unless the caller supplied one.
            kwargs.setdefault("base_url", get_base_url(response))

        self.response = response
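
The argument rules described in the docstring above can be illustrated without Scrapy installed. The following is a minimal stand-alone sketch, not Scrapy's implementation: `StubTextResponse`, `StubHtmlResponse`, `StubXmlResponse`, and `resolve_selector_type` are hypothetical names introduced here, and the detection order simply follows the docstring's bullet list. The real `_st` helper may behave differently.

```python
from typing import Optional


class StubTextResponse:
    """Hypothetical stand-in for scrapy.http.TextResponse."""


class StubHtmlResponse(StubTextResponse):
    """Hypothetical stand-in for scrapy.http.HtmlResponse."""


class StubXmlResponse(StubTextResponse):
    """Hypothetical stand-in for scrapy.http.XmlResponse."""


def resolve_selector_type(
    response: Optional[object] = None,
    text: Optional[str] = None,
    type: Optional[str] = None,  # noqa: A002
) -> str:
    """Return a selector type following the rules listed in the docstring."""
    # Same guard as the constructor: the two inputs are mutually exclusive.
    if response is not None and text is not None:
        raise ValueError("received both response and text")
    # An explicit type is forced and no detection occurs.
    if type is not None:
        return type
    # Without a response (for example, text only), default to "html".
    if response is None:
        return "html"
    # Subclasses must be checked before the StubTextResponse base class.
    if isinstance(response, StubHtmlResponse):
        return "html"
    if isinstance(response, StubXmlResponse):
        return "xml"
    if isinstance(response, StubTextResponse):
        return "json"
    # Anything else falls back to "html".
    return "html"
```

In this sketch, `resolve_selector_type(response=StubXmlResponse())` yields `"xml"`, while passing `type="xml"` alongside an HTML stub skips detection and also yields `"xml"`.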