hub / github.com/scrapy/scrapy / Selector

Class Selector

scrapy/selector/unified.py:39–101

An instance of :class:`Selector` is a wrapper over a response, used to select certain parts of its content. ``response`` is an :class:`~scrapy.http.HtmlResponse` or an :class:`~scrapy.http.XmlResponse` object that will be used for selecting and extracting data.


class Selector(_ParselSelector, object_ref):
    """
    An instance of :class:`Selector` is a wrapper over response to select
    certain parts of its content.

    ``response`` is an :class:`~scrapy.http.HtmlResponse` or an
    :class:`~scrapy.http.XmlResponse` object that will be used for selecting
    and extracting data.

    ``text`` is a unicode string or utf-8 encoded text for cases when a
    ``response`` isn't available. Passing both ``text`` and ``response``
    raises :exc:`ValueError`.

    ``type`` defines the selector type; it can be ``"html"``, ``"xml"``,
    ``"json"`` or ``None`` (default).

    If ``type`` is ``None``, the selector automatically chooses the best type
    based on ``response`` type (see below), or defaults to ``"html"`` in case it
    is used together with ``text``.

    If ``type`` is ``None`` and a ``response`` is passed, the selector type is
    inferred from the response type as follows:

    * ``"html"`` for :class:`~scrapy.http.HtmlResponse` type
    * ``"xml"`` for :class:`~scrapy.http.XmlResponse` type
    * ``"json"`` for :class:`~scrapy.http.TextResponse` type
    * ``"html"`` for anything else

    Otherwise, if ``type`` is set, the selector type will be forced and no
    detection will occur.
    """

    __slots__ = ["response"]
    selectorlist_cls = SelectorList

    def __init__(
        self,
        response: TextResponse | None = None,
        text: str | None = None,
        type: str | None = None,  # noqa: A002
        root: Any | None = _NOT_SET,
        **kwargs: Any,
    ):
        if response is not None and text is not None:
            raise ValueError(
                f"{self.__class__.__name__}.__init__() received both response and text"
            )

        st = _st(response, type)

        if text is not None:
            response = _response_from_text(text, st)

        if response is not None:
            text = response.text
            kwargs.setdefault("base_url", get_base_url(response))

        self.response = response

Callers 15

xmliter · Function · 0.90
xmliter_lxml · Function · 0.90
_parse · Method · 0.90
selector · Method · 0.90
test_simple_selection · Method · 0.90
test_root_base_url · Method · 0.90
test_flavor_detection · Method · 0.90
test_weakref_slots · Method · 0.90

Calls

no outgoing calls