hub / github.com/urllib3/urllib3 / parse_url

Function parse_url

src/urllib3/util/url.py:367–469

Given a URL, return a parsed :class:`.Url` namedtuple. Incomplete URLs are parsed on a best-effort basis, and fields not provided will be None. This parser is RFC 3986 and RFC 6874 compliant. The parser logic and helper functions are based heavily on work done in the ``rfc3986`` module.

(url: str)
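
A short usage sketch, assuming urllib3 is installed and that `parse_url` is importable from `urllib3.util` as the docstring example suggests. It reads individual fields from the returned `Url` namedtuple instead of printing the whole tuple; the variable names are illustrative only.

```python
from urllib3.util import parse_url

# A full URL: scheme, host and path are filled in, the port is absent.
url = parse_url("http://google.com/mail/")
print(url.scheme, url.host, url.port, url.path)

# No scheme: the input is still treated as host:port, and the port is an int.
bare = parse_url("google.com:80")
print(bare.scheme, bare.host, bare.port)

# A relative reference: only path and query are populated.
relative = parse_url("/foo?bar")
print(relative.path, relative.query)

# Empty input yields a Url whose fields are all None.
empty = parse_url("")
print(empty.host, empty.path)
```

Fields that the input does not provide come back as None, so callers should check for None before using them.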

Source from the content-addressed store, hash-verified

def parse_url(url: str) -> Url:
    """
    Given a url, return a parsed :class:`.Url` namedtuple. Best-effort is
    performed to parse incomplete urls. Fields not provided will be None.
    This parser is RFC 3986 and RFC 6874 compliant.

    The parser logic and helper functions are based heavily on
    work done in the ``rfc3986`` module.

    :param str url: URL to parse into a :class:`.Url` namedtuple.

    Partly backwards-compatible with :mod:`urllib.parse`.

    Example:

    .. code-block:: python

        import urllib3

        print(urllib3.util.parse_url('http://google.com/mail/'))
        # Url(scheme='http', host='google.com', port=None, path='/mail/', ...)

        print(urllib3.util.parse_url('google.com:80'))
        # Url(scheme=None, host='google.com', port=80, path=None, ...)

        print(urllib3.util.parse_url('/foo?bar'))
        # Url(scheme=None, host=None, port=None, path='/foo', query='bar', ...)
    """
    if not url:
        # Empty
        return Url()

    source_url = url
    if not _SCHEME_RE.search(url):
        url = "//" + url

    scheme: str | None
    authority: str | None
    auth: str | None
    host: str | None
    port: str | None
    port_int: int | None
    path: str | None
    query: str | None
    fragment: str | None

    try:
        scheme, authority, path, query, fragment = _URI_RE.match(url).groups()  # type: ignore[union-attr]
        normalize_uri = scheme is None or scheme.lower() in _NORMALIZABLE_SCHEMES

        if scheme:
            scheme = scheme.lower()

        if authority:
            auth, _, host_port = authority.rpartition("@")
            auth = auth or None
            host, port = _HOST_PORT_RE.match(host_port).groups()  # type: ignore[union-attr]
            if auth and normalize_uri:
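
The splitting steps visible in the listing (prefix `//` when no scheme is found, match the five URI components, then split the authority on the last `@`) can be sketched with the standard library alone. This is a simplified illustration, not urllib3's implementation: `SCHEME_RE`, `URI_RE`, `HOST_PORT_RE` and `split_url` are hypothetical stand-ins for the private patterns, which are not shown in the listing, and the sketch skips normalization, validation and the integer conversion of the port.

```python
import re

# Hypothetical, simplified stand-ins for the private patterns used by parse_url.
SCHEME_RE = re.compile(r"^(?:[a-zA-Z][a-zA-Z0-9+\-]*:|/)")
URI_RE = re.compile(
    r"^(?:([a-zA-Z][a-zA-Z0-9+.\-]*):)?"  # scheme
    r"(?://([^\\/?#]*))?"                 # authority
    r"([^?#]*)"                           # path
    r"(?:\?([^#]*))?"                     # query
    r"(?:#(.*))?$",                       # fragment
    re.DOTALL,
)
HOST_PORT_RE = re.compile(r"^(.*?)(?::([0-9]*))?$", re.DOTALL)


def split_url(url):
    """Return (scheme, auth, host, port, path, query, fragment) as strings or None."""
    # Without a scheme or a leading slash, treat the input as an authority.
    if not SCHEME_RE.search(url):
        url = "//" + url

    scheme, authority, path, query, fragment = URI_RE.match(url).groups()
    if scheme:
        scheme = scheme.lower()

    auth = host = port = None
    if authority:
        # rpartition splits on the LAST "@", so userinfo may itself contain "@".
        auth, _, host_port = authority.rpartition("@")
        auth = auth or None
        host, port = HOST_PORT_RE.match(host_port).groups()

    # Empty strings are reported as None; the port stays a string in this sketch.
    return scheme, auth, host, port or None, path or None, query, fragment


print(split_url("http://user:pw@example.com:8080/a?b=c#d"))
print(split_url("google.com:80"))
print(split_url("/foo?bar"))
```

Using `rpartition` rather than `partition` means everything before the final `@` is treated as userinfo, which is why the host is still found when the credentials themselves contain an `@`.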

Calls 5

Url · Class · 0.85
_encode_invalid_chars · Function · 0.85
LocationParseError · Class · 0.85
_normalize_host · Function · 0.70