swift
[Mem2Reg] Canonicalize promoted lifetimes.
#67526
Merged

[Mem2Reg] Canonicalize promoted lifetimes. #67526

nate-chandler
nate-chandler2 years ago

Canonicalize the lifetimes of values which were stored to the promoted alloc_stack and of the newly created phis.

nate-chandler nate-chandler force pushed from ce39bda9 to 02549d32 2 years ago
nate-chandler nate-chandler force pushed from 02549d32 to 7750e161 2 years ago
nate-chandler nate-chandler force pushed from 7750e161 to d428cad5 2 years ago
nate-chandler
nate-chandler2 years ago

@swift-ci please benchmark

nate-chandler
nate-chandler2 years ago

@swift-ci Please Apple Silicon benchmark

nate-chandler
nate-chandler2 years ago

@swift-ci please test

nate-chandler
nate-chandler2 years ago

@swift-ci please test source compatibility

nate-chandler
nate-chandler2 years ago

@swift-ci please test windows platform

nate-chandler [Mem2Reg] NFC: Improved function name.
e321fc4d
nate-chandler [Mem2Reg] NFC: Fixed comments.
49676140
nate-chandler [Mem2Reg] NFC: Removed spurious typealias.
c9e0bb5a
nate-chandler [Mem2Reg] NFC: Lifted livePhiBlocks higher.
7fb9bcad
nate-chandler
nate-chandler2 years ago
------- Performance (arm64): -O -------

REGRESSION                        OLD      NEW      DELTA    RATIO
ArrayAppendGenericStructs         866.0    997.5    +15.2%   **0.87x (?)**

IMPROVEMENT                       OLD      NEW      DELTA    RATIO
StringWithCString2                0.001    0.0      -50.0%   **2.00x (?)**
SIMDReduce.Int32x16.Initializer   13.409   11.135   -17.0%   **1.20x (?)**
MapReduceLazyCollectionShort      21.83    19.731   -9.6%    **1.11x (?)**

------- Code size: -O -------

REGRESSION                                OLD     NEW     DELTA   RATIO
RangeReplaceableCollectionPlusDefault.o   5835    6067    +4.0%   **0.96x**

IMPROVEMENT                               OLD     NEW     DELTA   RATIO
DictOfArraysToArrayOfDicts.o              18270   17874   -2.2%   **1.02x**

------- Performance (arm64): -Osize -------


------- Code size: -Osize -------

IMPROVEMENT                    OLD     NEW     DELTA   RATIO
DictOfArraysToArrayOfDicts.o   15329   15177   -1.0%   **1.01x**

------- Performance (arm64): -Onone -------

REGRESSION                  OLD       NEW      DELTA    RATIO
ArrayAppendToGeneric        226.774   351.29   +54.9%   **0.65x (?)**
StringWithCString2          0.004     0.005    +20.0%   **0.83x (?)**

IMPROVEMENT                 OLD       NEW      DELTA    RATIO
ArrayAppendGenericStructs   872.308   700.0    -19.8%   **1.25x (?)**

------- Code size: -swiftlibs -------
nate-chandler
nate-chandler2 years ago
------- Performance (x86_64): -O -------

REGRESSION                     OLD      NEW       DELTA    RATIO
MapReduceAnyCollection         88.0     103.444   +17.5%   **0.85x**
MapReduceClass2                10.916   12.667    +16.0%   **0.86x (?)**

IMPROVEMENT                    OLD      NEW       DELTA    RATIO
MapReduceLazyCollectionShort   32.667   26.152    -19.9%   **1.25x**
FlattenListLoop                1626.0   1386.0    -14.8%   **1.17x (?)**
MapReduceLazySequence          11.427   9.852     -13.8%   **1.16x**
FlattenListFlatMap             4568.0   4024.0    -11.9%   **1.14x (?)**

------- Code size: -O -------

REGRESSION                                OLD     NEW     DELTA   RATIO
RangeReplaceableCollectionPlusDefault.o   7207    7447    +3.3%   **0.97x**

IMPROVEMENT                               OLD     NEW     DELTA   RATIO
DictOfArraysToArrayOfDicts.o              21871   21391   -2.2%   **1.02x**

------- Performance (x86_64): -Osize -------

REGRESSION                       OLD       NEW       DELTA    RATIO
MapReduceLazyCollectionShort     43.545    67.5      +55.0%   **0.65x**
MapReduceLazySequence            44.115    65.944    +49.5%   **0.67x**
Dictionary4                      158.75    222.286   +40.0%   **0.71x**
Dictionary4OfObjects             192.625   254.333   +32.0%   **0.76x (?)**
StringFromLongWholeSubstring     2.827     3.042     +7.6%    **0.93x (?)**

IMPROVEMENT                      OLD       NEW       DELTA    RATIO
ObjectiveCBridgeStubFromNSDate   3918.0    2680.0    -31.6%   **1.46x (?)**

------- Code size: -Osize -------

IMPROVEMENT           OLD    NEW    DELTA   RATIO
StringRemoveDupes.o   3369   3330   -1.2%   **1.01x**

------- Performance (x86_64): -Onone -------

REGRESSION                  OLD       NEW        DELTA    RATIO
ArrayAppendGenericStructs   946.667   1316.667   +39.1%   **0.72x (?)**

------- Code size: -swiftlibs -------
nate-chandler nate-chandler requested a review from atrick atrick 2 years ago
nate-chandler nate-chandler requested a review from meg-gupta meg-gupta 2 years ago
meg-gupta
meg-gupta commented on 2023-07-28
Conversation is marked as resolved
Show resolved
lib/SILOptimizer/Transforms/SILMem2Reg.cpp
meg-gupta2 years ago

Just curious, is this the first place where we actually canonicalize borrows ? I believe its turned off in the copy propagation pass ?

nate-chandler2 years ago

Though we could turn it on now #65687 thanks to some recent fixes, ( #65686 , #65458 ), we shouldn't because there are some regressions when doing so. I am planning to look into those regressions and get it enabled before too long.

Actually, though, guaranteed lifetime canonicalization is already enabled for guaranteed function arguments in CopyPropagation and SILCombine, which is all that is being handled here for now.

At some point, it might be nice to have a function that could be called from here or SILCombine which deals with the necessary post-processing.

meg-gupta2 years ago

thanks!

lib/SILOptimizer/Transforms/SILMem2Reg.cpp
2158}
2159
2160void MemoryToRegisters::canonicalizeValueLifetimes(
2161
StackList<SILValue> &owned, StackList<SILValue> &guaranteed) {
meg-gupta2 years ago (edited 2 years ago)

What is the difference between calling canonicalize here v/s running a round of copy propagation ? I thought we canonicalize a value in a pass for cleanup or if it enables additional optimization, like in silcombine ?

nate-chandler2 years ago

Yeah, adding a run of CopyPropagation was the first solution that came to mind after noticing that there were values whose copy-extended lifetime could newly be canonicalized after Mem2Reg but weren't before OME runs. This alternative came out of discussion with @atrick.

It might be reasonable to have as a principle that a pass which introduces a new value ought to be responsible for canonicalizing it's lifetime. That could be used to justify canonicalizing the lifetimes of the new phis.

For the stored values, we'd have to say something like "passes that see that they've created opportunities to canonicalize copy-extended lifetimes can do so". It seems kind of a shame to toss in a pass to the pipeline to rediscover opportunities for canonicalization that were known about already.

If we think about the CanonicalizeOSSALifetime and CanonicalizeBorrowScope as utilities that to be called from anywhere rather than as a part of the CopyPropagation pass, this perspective might become more appealing.

meg-gupta2 years ago (edited 2 years ago)

Yeah, you are right it feels wasteful to throw away newly discovered opportunities. But this also seems to make passes complex and might make bisection slightly harder. I am conflicted/ambivalent about this.

A flag to disable the canonicalization might address bisection however.

nate-chandler2 years ago

Good call!

atrick2 years ago

Yeah, you are right it feels wasteful to throw away newly discovered opportunities. But this also seems to make passes complex and might make bisection slightly harder.

These are all important points. In this case, we've already established elsewhere that we need a very simple and efficient OSSA canonicalization utility that runs on specific values after a pass that exposes copies.

We may have discussed before that there are two kinds of borrow canonicalization: (1) a trivial utility that runs on SILFunction argument and basically does that same thing as owned canonicalization and (2) a fairly involved analysis and rewriting of borrow scopes. We should make it more clear in the API that only the first form should be called as a cleanup utility and that the second form should only run during the dedicated CopyPropagation pass.

meg-gupta
meg-gupta approved these changes on 2023-07-28
meg-gupta2 years ago

LGTM! A flag to disable canonicalization would be helpful - but that can be a follow up.

atrick
atrick approved these changes on 2023-07-28
atrick2 years ago

Looks reasonable to me. I think we talked about separating the simple CanonicalizeBorrowScope utility from all the other copy propagation stuff, eventually...

lib/SILOptimizer/Transforms/SILMem2Reg.cpp
atrick2 years ago

I think this warrants a brief explanation. Like...
Record stored values because promoting the store exposes a copy of the stored value. After promotion, OSSA canonicalization runs on these value to cleanup unnecessary copies.

nate-chandler2 years ago

Done.

lib/SILOptimizer/Transforms/SILMem2Reg.cpp
atrick2 years ago

This is slightly mysterious. You may want to explain that, each new phi generated by Mem2Reg of alloc_stack will be the only new phi in a block at the end of the block's arguments list.

nate-chandler2 years ago

Done.

nate-chandler [Mem2Reg] Canonicalize stored values.
52ce100e
nate-chandler [Mem2Reg] Canonicalize new phis.
b563c4ea
nate-chandler nate-chandler force pushed from d428cad5 to b563c4ea 2 years ago
nate-chandler
nate-chandler2 years ago👍 1

@meg-gupta Added the flag sil-mem2reg-disable-lifetime-canonicalization.

nate-chandler
nate-chandler2 years ago

@swift-ci please smoke test and merge

nate-chandler nate-chandler marked this pull request as ready for review 2 years ago
swift-ci swift-ci merged 1cca22ba into main 2 years ago
nate-chandler nate-chandler deleted the mem2reg/canonicalize-values branch 2 years ago

Login to write a write a comment.

Login via GitHub

Reviewers
Assignees
No one assigned
Labels
Milestone