Function get_close_matches

github.com/python/cpython · Lib/difflib.py:667–715

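Use SequenceMatcher to return a list of the best "good enough" matches. word is a sequence for which close matches are desired (typically a string). possibilities is a list of sequences against which to match word (typically a list of strings). Optional arg n (default 3) is the maximum number of close matches to return; n must be > 0. Optional arg cutoff (default 0.6) is a float in [0, 1]; possibilities that don't score at least that similar to word are ignored.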

(word, possibilities, n=3, cutoff=0.6)
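As a quick illustration of how the two optional arguments shape the result, the sketch below reuses the "appel" example from the docstring. The n=1 and cutoff=0.9 values are picked here purely for demonstration and are not part of the original page.

```python
from difflib import get_close_matches

candidates = ["ape", "apple", "peach", "puppy"]

# Defaults: up to 3 matches scoring at least 0.6, most similar first.
default_matches = get_close_matches("appel", candidates)

# n caps how many matches come back.
top_one = get_close_matches("appel", candidates, n=1)

# A stricter cutoff discards weaker candidates.
strict = get_close_matches("appel", candidates, cutoff=0.9)

print(default_matches)  # ['apple', 'ape']
print(top_one)          # ['apple']
print(strict)           # []

# Out-of-range arguments are rejected before any matching happens.
try:
    get_close_matches("appel", candidates, n=0)
except ValueError as exc:
    print(exc)  # n must be > 0: 0
```

With cutoff=0.9 the result is empty because "apple" scores 0.8 against "appel" and "ape" scores 0.75.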

Source

def get_close_matches(word, possibilities, n=3, cutoff=0.6):
    """Use SequenceMatcher to return list of the best "good enough" matches.

    word is a sequence for which close matches are desired (typically a
    string).

    possibilities is a list of sequences against which to match word
    (typically a list of strings).

    Optional arg n (default 3) is the maximum number of close matches to
    return. n must be > 0.

    Optional arg cutoff (default 0.6) is a float in [0, 1]. Possibilities
    that don't score at least that similar to word are ignored.

    The best (no more than n) matches among the possibilities are returned
    in a list, sorted by similarity score, most similar first.

    >>> get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
    ['apple', 'ape']
    >>> import keyword as _keyword
    >>> get_close_matches("wheel", _keyword.kwlist)
    ['while']
    >>> get_close_matches("Apple", _keyword.kwlist)
    []
    >>> get_close_matches("accept", _keyword.kwlist)
    ['except']
    """

    if not n > 0:
        raise ValueError("n must be > 0: %r" % (n,))
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError("cutoff must be in [0.0, 1.0]: %r" % (cutoff,))
    result = []
    s = SequenceMatcher()
    s.set_seq2(word)
    for x in possibilities:
        s.set_seq1(x)
        if s.real_quick_ratio() < cutoff or s.quick_ratio() < cutoff:
            continue

        ratio = s.ratio()
        if ratio >= cutoff:
            result.append((ratio, x))

    # Move the best scorers to head of list
    result = _nlargest(n, result)
    # Strip scores for the best n matches
    return [x for score, x in result]
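The loop in the listing above checks real_quick_ratio() and quick_ratio() before calling ratio(). The difflib documentation describes both as upper bounds on ratio(), so a candidate that fails a cheap check cannot pass the exact one. The sketch below prints the three scores side by side; similarity_bounds is a hypothetical helper name introduced only for this illustration.

```python
from difflib import SequenceMatcher


def similarity_bounds(word, candidate):
    """Return (real_quick_ratio, quick_ratio, ratio) for candidate vs. word.

    Hypothetical helper for illustration; not part of difflib.
    """
    s = SequenceMatcher()
    s.set_seq2(word)       # same argument order as get_close_matches
    s.set_seq1(candidate)
    return s.real_quick_ratio(), s.quick_ratio(), s.ratio()


for candidate in ["ape", "apple", "peach", "puppy"]:
    rqr, qr, r = similarity_bounds("appel", candidate)
    # Each cheaper score bounds the next one from above.
    assert rqr >= qr >= r
    print(f"{candidate:6} {rqr:.3f} {qr:.3f} {r:.3f}")
```

Setting word as the second sequence once and swapping only the first sequence inside the loop follows the documented SequenceMatcher advice: detailed information is cached for the second sequence, so comparing one sequence against many is cheaper this way.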

Callers

nothing calls this directly

Calls 7

set_seq2 · Method · 0.95
set_seq1 · Method · 0.95
real_quick_ratio · Method · 0.95
quick_ratio · Method · 0.95
ratio · Method · 0.95
SequenceMatcher · Class · 0.85
append · Method · 0.45
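The call list does not show the _nlargest helper used for the final ranking. In difflib.py that name is an alias for heapq.nlargest. A minimal sketch of the ranking step follows, using illustrative (score, candidate) pairs rather than scores computed by SequenceMatcher; the "maple" entry and its score are made up for the example.

```python
from heapq import nlargest

# (score, candidate) pairs as the loop would collect them; the values
# here are illustrative, not computed by SequenceMatcher.
scored = [(0.75, "ape"), (0.8, "apple"), (0.62, "maple")]

# Tuples compare by score first, so the highest-scoring pairs come out first.
best = nlargest(2, scored)

# Drop the scores, keeping only the candidates.
names = [x for score, x in best]
print(names)  # ['apple', 'ape']
```

Because whole tuples are compared, candidates with equal scores are ordered by comparing the candidates themselves.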

Tested by

no test coverage detected
