Difference reports whether two lists of lengths nx and ny are equal given the definition of equality provided as f. This function returns an edit-script, which is a sequence of operations needed to convert one list into the other. The following invariants for the edit-script are maintained: - eq ==
(nx, ny int, f EqualFunc)
| 136 | // favors performance over optimality. The exact output is not guaranteed to |
| 137 | // be stable and may change over time. |
| 138 | func Difference(nx, ny int, f EqualFunc) (es EditScript) { |
| 139 | // This algorithm is based on traversing what is known as an "edit-graph". |
| 140 | // See Figure 1 from "An O(ND) Difference Algorithm and Its Variations" |
| 141 | // by Eugene W. Myers. Since D can be as large as N itself, this is |
| 142 | // effectively O(N^2). Unlike the algorithm from that paper, we are not |
| 143 | // interested in the optimal path, but at least some "decent" path. |
| 144 | // |
| 145 | // For example, let X and Y be lists of symbols: |
| 146 | // X = [A B C A B B A] |
| 147 | // Y = [C B A B A C] |
| 148 | // |
| 149 | // The edit-graph can be drawn as the following: |
| 150 | // A B C A B B A |
| 151 | // ┌─────────────┐ |
| 152 | // C │_|_|\|_|_|_|_│ 0 |
| 153 | // B │_|\|_|_|\|\|_│ 1 |
| 154 | // A │\|_|_|\|_|_|\│ 2 |
| 155 | // B │_|\|_|_|\|\|_│ 3 |
| 156 | // A │\|_|_|\|_|_|\│ 4 |
| 157 | // C │ | |\| | | | │ 5 |
| 158 | // └─────────────┘ 6 |
| 159 | // 0 1 2 3 4 5 6 7 |
| 160 | // |
| 161 | // List X is written along the horizontal axis, while list Y is written |
| 162 | // along the vertical axis. At any point on this grid, if the symbol in |
| 163 | // list X matches the corresponding symbol in list Y, then a '\' is drawn. |
| 164 | // The goal of any minimal edit-script algorithm is to find a path from the |
| 165 | // top-left corner to the bottom-right corner, while traveling through the |
| 166 | // fewest horizontal or vertical edges. |
| 167 | // A horizontal edge is equivalent to inserting a symbol from list X. |
| 168 | // A vertical edge is equivalent to inserting a symbol from list Y. |
| 169 | // A diagonal edge is equivalent to a matching symbol between both X and Y. |
| 170 | |
| 171 | // Invariants: |
| 172 | // - 0 ≤ fwdPath.X ≤ (fwdFrontier.X, revFrontier.X) ≤ revPath.X ≤ nx |
| 173 | // - 0 ≤ fwdPath.Y ≤ (fwdFrontier.Y, revFrontier.Y) ≤ revPath.Y ≤ ny |
| 174 | // |
| 175 | // In general: |
| 176 | // - fwdFrontier.X < revFrontier.X |
| 177 | // - fwdFrontier.Y < revFrontier.Y |
| 178 | // |
| 179 | // Unless, it is time for the algorithm to terminate. |
| 180 | fwdPath := path{+1, point{0, 0}, make(EditScript, 0, (nx+ny)/2)} |
| 181 | revPath := path{-1, point{nx, ny}, make(EditScript, 0)} |
| 182 | fwdFrontier := fwdPath.point // Forward search frontier |
| 183 | revFrontier := revPath.point // Reverse search frontier |
| 184 | |
| 185 | // Search budget bounds the cost of searching for better paths. |
| 186 | // The longest sequence of non-matching symbols that can be tolerated is |
| 187 | // approximately the square-root of the search budget. |
| 188 | searchBudget := 4 * (nx + ny) // O(n) |
| 189 | |
| 190 | // Running the tests with the "cmp_debug" build tag prints a visualization |
| 191 | // of the algorithm running in real-time. This is educational for |
| 192 | // understanding how the algorithm works. See debug_enable.go. |
| 193 | f = debug.Begin(nx, ny, f, &fwdPath.es, &revPath.es) |
| 194 | |
| 195 | // The algorithm below is a greedy, meet-in-the-middle algorithm for |