PR #67526 [Mem2Reg] Canonicalize promoted lifetimes.

nate-chandler2 years ago

Canonicalize the lifetimes of values which were stored to the promoted alloc_stack and of the newly created phis.

nate-chandler force pushed from ce39bda9 to 02549d32 2 years ago

nate-chandler force pushed from 02549d32 to 7750e161 2 years ago

nate-chandler force pushed from 7750e161 to d428cad5 2 years ago

nate-chandler2 years ago

@swift-ci please benchmark

nate-chandler2 years ago

@swift-ci Please Apple Silicon benchmark

nate-chandler2 years ago

@swift-ci please test

nate-chandler2 years ago

@swift-ci please test source compatibility

nate-chandler2 years ago

@swift-ci please test windows platform

[Mem2Reg] NFC: Improved function name.

e321fc4d

[Mem2Reg] NFC: Fixed comments.

49676140

[Mem2Reg] NFC: Removed spurious typealias.

c9e0bb5a

[Mem2Reg] NFC: Lifted livePhiBlocks higher.

7fb9bcad

nate-chandler2 years ago

------- Performance (arm64): -O -------

REGRESSION                        OLD      NEW      DELTA    RATIO
ArrayAppendGenericStructs         866.0    997.5    +15.2%   **0.87x (?)**

IMPROVEMENT                       OLD      NEW      DELTA    RATIO
StringWithCString2                0.001    0.0      -50.0%   **2.00x (?)**
SIMDReduce.Int32x16.Initializer   13.409   11.135   -17.0%   **1.20x (?)**
MapReduceLazyCollectionShort      21.83    19.731   -9.6%    **1.11x (?)**

------- Code size: -O -------

REGRESSION                                OLD     NEW     DELTA   RATIO
RangeReplaceableCollectionPlusDefault.o   5835    6067    +4.0%   **0.96x**

IMPROVEMENT                               OLD     NEW     DELTA   RATIO
DictOfArraysToArrayOfDicts.o              18270   17874   -2.2%   **1.02x**

------- Performance (arm64): -Osize -------


------- Code size: -Osize -------

IMPROVEMENT                    OLD     NEW     DELTA   RATIO
DictOfArraysToArrayOfDicts.o   15329   15177   -1.0%   **1.01x**

------- Performance (arm64): -Onone -------

REGRESSION                  OLD       NEW      DELTA    RATIO
ArrayAppendToGeneric        226.774   351.29   +54.9%   **0.65x (?)**
StringWithCString2          0.004     0.005    +20.0%   **0.83x (?)**

IMPROVEMENT                 OLD       NEW      DELTA    RATIO
ArrayAppendGenericStructs   872.308   700.0    -19.8%   **1.25x (?)**

------- Code size: -swiftlibs -------

nate-chandler2 years ago

------- Performance (x86_64): -O -------

REGRESSION                     OLD      NEW       DELTA    RATIO
MapReduceAnyCollection         88.0     103.444   +17.5%   **0.85x**
MapReduceClass2                10.916   12.667    +16.0%   **0.86x (?)**

IMPROVEMENT                    OLD      NEW       DELTA    RATIO
MapReduceLazyCollectionShort   32.667   26.152    -19.9%   **1.25x**
FlattenListLoop                1626.0   1386.0    -14.8%   **1.17x (?)**
MapReduceLazySequence          11.427   9.852     -13.8%   **1.16x**
FlattenListFlatMap             4568.0   4024.0    -11.9%   **1.14x (?)**

------- Code size: -O -------

REGRESSION                                OLD     NEW     DELTA   RATIO
RangeReplaceableCollectionPlusDefault.o   7207    7447    +3.3%   **0.97x**

IMPROVEMENT                               OLD     NEW     DELTA   RATIO
DictOfArraysToArrayOfDicts.o              21871   21391   -2.2%   **1.02x**

------- Performance (x86_64): -Osize -------

REGRESSION                       OLD       NEW       DELTA    RATIO
MapReduceLazyCollectionShort     43.545    67.5      +55.0%   **0.65x**
MapReduceLazySequence            44.115    65.944    +49.5%   **0.67x**
Dictionary4                      158.75    222.286   +40.0%   **0.71x**
Dictionary4OfObjects             192.625   254.333   +32.0%   **0.76x (?)**
StringFromLongWholeSubstring     2.827     3.042     +7.6%    **0.93x (?)**

IMPROVEMENT                      OLD       NEW       DELTA    RATIO
ObjectiveCBridgeStubFromNSDate   3918.0    2680.0    -31.6%   **1.46x (?)**

------- Code size: -Osize -------

IMPROVEMENT           OLD    NEW    DELTA   RATIO
StringRemoveDupes.o   3369   3330   -1.2%   **1.01x**

------- Performance (x86_64): -Onone -------

REGRESSION                  OLD       NEW        DELTA    RATIO
ArrayAppendGenericStructs   946.667   1316.667   +39.1%   **0.72x (?)**

------- Code size: -swiftlibs -------

nate-chandler requested a review from

atrick 2 years ago

nate-chandler requested a review from

meg-gupta 2 years ago

meg-gupta commented on 2023-07-28

Conversation is marked as resolved

Show resolved

lib/SILOptimizer/Transforms/SILMem2Reg.cpp

	2158		}
	2159
	2160		void MemoryToRegisters::canonicalizeValueLifetimes(
	2161		StackList<SILValue> &owned, StackList<SILValue> &guaranteed) {

meg-gupta2 years ago (edited 2 years ago)

What is the difference between calling canonicalize here v/s running a round of copy propagation ? I thought we canonicalize a value in a pass for cleanup or if it enables additional optimization, like in silcombine ?

nate-chandler2 years ago

Yeah, adding a run of CopyPropagation was the first solution that came to mind after noticing that there were values whose copy-extended lifetime could newly be canonicalized after Mem2Reg but weren't before OME runs. This alternative came out of discussion with @atrick.

It might be reasonable to have as a principle that a pass which introduces a new value ought to be responsible for canonicalizing it's lifetime. That could be used to justify canonicalizing the lifetimes of the new phis.

For the stored values, we'd have to say something like "passes that see that they've created opportunities to canonicalize copy-extended lifetimes can do so". It seems kind of a shame to toss in a pass to the pipeline to rediscover opportunities for canonicalization that were known about already.

If we think about the CanonicalizeOSSALifetime and CanonicalizeBorrowScope as utilities that to be called from anywhere rather than as a part of the CopyPropagation pass, this perspective might become more appealing.

meg-gupta2 years ago (edited 2 years ago)

Yeah, you are right it feels wasteful to throw away newly discovered opportunities. But this also seems to make passes complex and might make bisection slightly harder. I am conflicted/ambivalent about this.

A flag to disable the canonicalization might address bisection however.

nate-chandler2 years ago

Good call!

atrick2 years ago

Yeah, you are right it feels wasteful to throw away newly discovered opportunities. But this also seems to make passes complex and might make bisection slightly harder.

These are all important points. In this case, we've already established elsewhere that we need a very simple and efficient OSSA canonicalization utility that runs on specific values after a pass that exposes copies.

We may have discussed before that there are two kinds of borrow canonicalization: (1) a trivial utility that runs on SILFunction argument and basically does that same thing as owned canonicalization and (2) a fairly involved analysis and rewriting of borrow scopes. We should make it more clear in the API that only the first form should be called as a cleanup utility and that the second form should only run during the dedicated CopyPropagation pass.

meg-gupta approved these changes on 2023-07-28

meg-gupta2 years ago

LGTM! A flag to disable canonicalization would be helpful - but that can be a follow up.

atrick approved these changes on 2023-07-28

atrick2 years ago

Looks reasonable to me. I think we talked about separating the simple CanonicalizeBorrowScope utility from all the other copy propagation stuff, eventually...

lib/SILOptimizer/Transforms/SILMem2Reg.cpp

atrick2 years ago

I think this warrants a brief explanation. Like...
Record stored values because promoting the store exposes a copy of the stored value. After promotion, OSSA canonicalization runs on these value to cleanup unnecessary copies.

nate-chandler2 years ago

Done.

lib/SILOptimizer/Transforms/SILMem2Reg.cpp

atrick2 years ago

This is slightly mysterious. You may want to explain that, each new phi generated by Mem2Reg of alloc_stack will be the only new phi in a block at the end of the block's arguments list.

nate-chandler2 years ago

Done.

[Mem2Reg] Canonicalize stored values.

52ce100e

[Mem2Reg] Canonicalize new phis.

b563c4ea

nate-chandler force pushed from d428cad5 to b563c4ea 2 years ago

nate-chandler2 years ago👍 1

@meg-gupta Added the flag sil-mem2reg-disable-lifetime-canonicalization.

nate-chandler2 years ago

@swift-ci please smoke test and merge

nate-chandler marked this pull request as ready for review 2 years ago

swift-ci merged 1cca22ba into main 2 years ago

nate-chandler deleted the mem2reg/canonicalize-values branch 2 years ago

swift
[Mem2Reg] Canonicalize promoted lifetimes.
#67526

Merged

[Mem2Reg] Canonicalize promoted lifetimes. #67526

swift [Mem2Reg] Canonicalize promoted lifetimes. #67526 Merged

[Mem2Reg] Canonicalize promoted lifetimes. #67526

swift
[Mem2Reg] Canonicalize promoted lifetimes.
#67526

Merged