hub / github.com/scrapy/scrapy / gunzip

Function gunzip

scrapy/utils/gz.py:14–36

Gunzip the given data and return as much data as possible. This is resilient to CRC checksum errors.

(data: bytes, *, max_size: int = 0)

Source from the content-addressed store, hash-verified

def gunzip(data: bytes, *, max_size: int = 0) -> bytes:
    """Gunzip the given data and return as much data as possible.

    This is resilient to CRC checksum errors.
    """
    f = GzipFile(fileobj=BytesIO(data))
    output_stream = BytesIO()
    chunk = b"."
    decompressed_size = 0
    while chunk:
        try:
            chunk = f.read1(_CHUNK_SIZE)
        except (OSError, EOFError, struct.error):
            # complete only if there is some data, otherwise re-raise
            # see issue 87 about catching struct.error
            # some pages are quite small so output_stream is empty
            if output_stream.getbuffer().nbytes > 0:
                break
            raise
        decompressed_size += len(chunk)
        _check_max_size(decompressed_size, max_size)
        output_stream.write(chunk)
    return output_stream.getvalue()
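
The chunked read loop can be tried outside Scrapy with a minimal standard-library sketch. The names `tolerant_gunzip` and `CHUNK_SIZE` are hypothetical: the real `_CHUNK_SIZE` value and the body of `_check_max_size` are not shown on this page, so the size check is left out, and `struct.error` is not caught here. The sketch feeds in a stream with a corrupted CRC and a stream cut in half.

```python
import gzip
from io import BytesIO

# Hypothetical chunk size; the real _CHUNK_SIZE is not shown on this page.
CHUNK_SIZE = 8192


def tolerant_gunzip(data: bytes) -> bytes:
    """Standalone sketch of the read loop, without the max_size check."""
    f = gzip.GzipFile(fileobj=BytesIO(data))
    output = BytesIO()
    chunk = b"."
    while chunk:
        try:
            chunk = f.read1(CHUNK_SIZE)
        except (OSError, EOFError):
            # Keep what was decompressed so far; re-raise if nothing was.
            if output.getbuffer().nbytes > 0:
                break
            raise
        output.write(chunk)
    return output.getvalue()


payload = b"".join(b"line %d\n" % i for i in range(5000))
compressed = gzip.compress(payload)

# Flip one byte of the CRC32 field (the gzip trailer is CRC32 + ISIZE,
# 4 bytes each, so the CRC starts 8 bytes from the end).
bad_crc = compressed[:-8] + bytes([compressed[-8] ^ 0xFF]) + compressed[-7:]
recovered = tolerant_gunzip(bad_crc)

# Cut the stream in half: only a prefix of the payload can be recovered.
partial = tolerant_gunzip(compressed[: len(compressed) // 2])
```

In this sketch the CRC failure only surfaces after the whole payload has been written to the buffer, so `recovered` equals `payload`, while `partial` holds a leading portion of it. Input that is not gzip at all produces no output before the error, so the exception propagates to the caller.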

Callers 8

_get_sitemap_body · Method · 0.90
_decode · Method · 0.90
test_gunzip_basic · Function · 0.90
test_gunzip_truncated · Function · 0.90
test_gunzip_illegal_eof · Function · 0.90

Calls 2

_check_max_size · Function · 0.85
write · Method · 0.45