Spawn ripgrep against *repo_root*; return ``(matches, truncated, error)``. Each entry in *matches* is a scrubbed line of the form ``[<repo_id>] <repo-relative-path>:<line>:<content>``; lines that don't match the expected ``<abs-path>:<line>:<content>`` shape are dropped.
(
pattern: str,
repo_id: str,
repo_root: Path,
include: str | None,
max_count: int,
)
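The scrubbing rule in the summary above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `scrub_line` and its regex are hypothetical and do not come from the original module, and the sketch assumes POSIX-style absolute paths in rg's output.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical pattern (an assumption, not the module's own):
# shortest possible path, then ":<digits>:", then the rest as content.
_LINE_RE = re.compile(r"^(?P<path>.+?):(?P<line>\d+):(?P<content>.*)$")


def scrub_line(raw: str, repo_id: str, repo_root: Path) -> Optional[str]:
    """Rewrite ``<abs-path>:<line>:<content>``; return None to drop the line."""
    m = _LINE_RE.match(raw)
    if m is None:
        # Not a match row (e.g. a binary-file note or a warning): drop it.
        return None
    try:
        rel = Path(m["path"]).relative_to(repo_root)
    except ValueError:
        # Path is not under repo_root, so it cannot be rewritten safely.
        return None
    return f"[{repo_id}] {rel.as_posix()}:{m['line']}:{m['content']}"
```

Returning `None` for anything that cannot be rewritten keeps the "absolute paths never reach the caller" guarantee simple: the caller only has to filter out `None` values.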
def _run_rg(
    pattern: str,
    repo_id: str,
    repo_root: Path,
    include: str | None,
    max_count: int,
) -> tuple[list[str], bool, str | None]:
    """Spawn ripgrep against *repo_root*; return ``(matches, truncated, error)``.

    Each entry in *matches* is a scrubbed line of the form
    ``[<repo_id>] <repo-relative-path>:<line>:<content>``; lines that
    don't match the expected ``<abs-path>:<line>:<content>`` shape are
    dropped. The docstring promise is that absolute indexing-host
    paths never reach the caller, so anything we can't safely rewrite
    (binary-file notes, permission warnings, malformed lines) is
    filtered out rather than echoed verbatim.

    *max_count* is enforced **per repo**: rg's own ``--max-count`` is
    a per-file cap, so a repo with N matching files would otherwise
    return up to ``N × max_count`` lines. We pass it through as a
    safety bound on per-file work and then truncate the joined result
    to ``max_count`` here. *truncated* is True when more matches
    existed than were returned.
    """
    cmd = [
        "rg",
        "--no-heading",
        "--line-number",
        # rg's ``--max-count`` is a per-file cap. The +1 ensures that a
        # single file with more than max_count matches yields enough
        # rows to trip the later ``len(scrubbed) > max_count`` overflow
        # check, which then slices the joined output back to max_count.
        "--max-count",
        str(max_count + 1),
    ]
    if include:
        cmd.extend(["--glob", include])
    # `--` separator: defends against patterns that begin with `-`
    # being interpreted as an rg flag.
    cmd.append("--")
    cmd.append(pattern)
    cmd.append(str(repo_root))

    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=_RG_TIMEOUT_SECONDS,
        )
    except subprocess.TimeoutExpired:
        return [], False, f"ripgrep timed out after {_RG_TIMEOUT_SECONDS}s."

    # rg exit code 1 means "no matches"; that's a normal outcome,
    # not an error. Anything else (2 = invalid args/regex, 130 =
    # signal, etc.) is a real failure worth surfacing.
    if proc.returncode == 1:
        return [], False, None
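The per-repo cap described in the docstring, together with the `len(scrubbed) > max_count` overflow check mentioned in the `--max-count` comment, could look roughly like the sketch below. The rest of the function is not shown here, so the helper name `cap_matches` is hypothetical and the logic is an assumption about how the check might be written.

```python
def cap_matches(scrubbed: list[str], max_count: int) -> tuple[list[str], bool]:
    """Truncate the joined, scrubbed rows to max_count for the whole repo.

    Hypothetical helper: because rg is invoked with ``--max-count`` set to
    ``max_count + 1``, even a single file with too many matches yields at
    least one extra row, which is what makes the overflow detectable.
    """
    truncated = len(scrubbed) > max_count
    return scrubbed[:max_count], truncated
```

For example, with `max_count=3`, a single file containing ten matches yields four rows from rg; `cap_matches` returns the first three and reports `truncated=True`. Without the `+1`, the same file would yield exactly three rows and the overflow would go unnoticed.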