Commit
1 year ago
gh-37549: Faster chromatic symmetric function computation

### Description

Computation of the chromatic symmetric function is reimplemented to be much faster. The algorithm traverses a binary tree whose leaves represent subsets of edges. The induced set partition on vertices is updated incrementally using a disjoint-set forest, so the set partition does not need to be recomputed from scratch for each subset. This approach also lets us prune a branch of the binary tree whenever the next edge of the graph would introduce a cycle into the subset, since the terms produced by the two subtrees cancel in this case. The resulting speedup is dramatic for graphs with many cycles.

I set up the following performance tests with

```
sage: from sage.misc.randstate import random
```

and set the random seed before each individual command with

```
sage: set_random_seed(5)
```

Here are some examples of the performance before this change:

```
sage: %timeit graphs.RandomTree(5,seed=random()).chromatic_symmetric_function()
1.17 ms ± 13.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
sage: %timeit graphs.RandomTree(12,seed=random()).chromatic_symmetric_function()
128 ms ± 5.66 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
sage: %timeit graphs.RandomTree(19,seed=random()).chromatic_symmetric_function()
21.7 s ± 728 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
sage: %timeit graphs.RandomGNM(20,17,seed=random()).chromatic_symmetric_function()
11 s ± 94 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
sage: %timeit graphs.RandomGNM(8,11,seed=random()).chromatic_symmetric_function()
112 ms ± 2.44 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
sage: %timeit graphs.RandomGNM(7,17,seed=random()).chromatic_symmetric_function()
7.36 s ± 86.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

And after:

```
sage: %timeit graphs.RandomTree(5,seed=random()).chromatic_symmetric_function()
620 µs ± 11.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
sage: %timeit graphs.RandomTree(12,seed=random()).chromatic_symmetric_function()
25.9 ms ± 1.06 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
sage: %timeit graphs.RandomTree(19,seed=random()).chromatic_symmetric_function()
3.84 s ± 34.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
sage: %timeit graphs.RandomGNM(20,17,seed=random()).chromatic_symmetric_function()
1.34 s ± 290 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
sage: %timeit graphs.RandomGNM(8,11,seed=random()).chromatic_symmetric_function()
8.33 ms ± 208 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
sage: %timeit graphs.RandomGNM(7,17,seed=random()).chromatic_symmetric_function()
27 ms ± 365 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

The first four tests represent some of the least favorable scenarios for the new implementation, since the graphs contain zero or few cycles. Nevertheless, the improvement ranges from significant to astronomical across the board.

The running time of the old implementation should grow roughly as 2^(number of edges), which is reflected in the tests. We should therefore expect a graph with 30 edges to take the better part of a day on my computer (scaling the 7.36 s measured for 17 edges by 2^13 gives roughly 17 hours). As a final show of force, the new algorithm can handle some such graphs faster than the old one can handle 17 edges:

```
sage: %timeit graphs.RandomGNM(10,30,seed=random()).chromatic_symmetric_function()
5.49 s ± 315 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

While there is a cost to code readability, this change can make the difference between a computation being feasible or not.

I opted not to balance the disjoint-set forest, which would add even more verbosity. The following test constructs the forest in the least and most favorable ways on a decently sized graph, and the difference is essentially within the margin of error.

```
sage: G = Graph({18 : list(range(18))})
sage: H = Graph({0 : list(range(1,19))})
sage: %timeit G.chromatic_symmetric_function()
3.74 s ± 44.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
sage: %timeit H.chromatic_symmetric_function()
3.68 s ± 84.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

### :memo: Checklist

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [ ] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation accordingly.

URL: https://github.com/sagemath/sage/pull/37549
Reported by: Henry Ehrhard
Reviewer(s): Frédéric Chapoton, Henry Ehrhard, Martin Rubey, Travis Scrimshaw
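The pruned edge-subset traversal outlined in the description can be sketched in plain Python. This is only an illustration under stated assumptions, not the code merged into Sage: the function name `chromatic_power_sum`, the vertex labelling `0..n-1`, and the dictionary output format are all hypothetical. It relies on the standard expansion of the chromatic symmetric function in the power-sum basis, X_G = Σ_{S ⊆ E} (-1)^{|S|} p_{λ(S)}, where λ(S) lists the component sizes of the subgraph with edge set S. The disjoint-set forest is left unbalanced and without path compression, which makes each union trivial to undo when backtracking.

```python
from collections import Counter


def chromatic_power_sum(n, edges):
    """Return {partition: coefficient} for X_G in the power-sum basis.

    Hypothetical helper: vertices are 0..n-1 and edges is a list of pairs.
    """
    parent = list(range(n))  # unbalanced disjoint-set forest, no path compression
    size = [1] * n           # component size, only meaningful at a root
    coeffs = Counter()

    def find(v):
        # Walk up to the root of v's component.
        while parent[v] != v:
            v = parent[v]
        return v

    def visit(i, sign):
        if i == len(edges):
            # Leaf: record the component sizes of the chosen edge subset.
            shape = tuple(sorted(
                (size[v] for v in range(n) if parent[v] == v), reverse=True))
            coeffs[shape] += sign
            return
        ru, rv = find(edges[i][0]), find(edges[i][1])
        if ru == rv:
            # The edge would close a cycle: including or excluding it gives
            # the same partition with opposite signs, so both subtrees cancel.
            return
        visit(i + 1, sign)       # subtree that excludes edge i
        parent[ru] = rv          # union: attach one root under the other
        size[rv] += size[ru]
        visit(i + 1, -sign)      # subtree that includes edge i
        size[rv] -= size[ru]     # undo the union before backtracking
        parent[ru] = ru

    visit(0, 1)
    return {shape: c for shape, c in coeffs.items() if c}


# Triangle: p_{1,1,1} - 3 p_{2,1} + 2 p_{3}
print(chromatic_power_sum(3, [(0, 1), (1, 2), (0, 2)]))
```

Skipping path compression is a deliberate choice in this sketch: every union changes exactly one parent pointer and one size entry, so it can be reverted in constant time when the traversal returns from the "include" subtree.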