Implement AccessControl, Minimal Group Management and a lot of fixes (#892)
* Add ACE1.2 python test to verify something
* Implement ACL Logic
* Implement AccessControl Server for Legacy
In the case of legacy the implementation just contains the logic of the cluster and is wired in the InteractionServer (see next commit) into the read/write/invoke calls.
Validation is handled in custom validators and event sending in custom setters. The getters and setters (because required by legacy api) need to handle itself the fabric scoped loading and saving. The new methods for this are also added in one of the next commits.
* Add LegacyInteractionManager
Because logic is different in legacy and new code paths I created a LegacyInteractionManager that implements these legacy specifics. here the ACL Manager is wired in.
* Streamline legacy Fabricscoped Get/Set
Cut out the logic into extra methods that can also be used outside of AttributeServers. In fact this kind of storage will die with the Legacy API somewhen :-)
* Add ACL infos to Attribute/Event/CommandServers
The relevant acls for this particular attribute/event/command was not available in the Servers, so we add it
* Delay change-events in Legacy AttributeServers
In order to implement ACL in the required way I added a flag to delay firing the change events and increasing the version in legacy AttributeServers. This is used in InteractionServer automatically for all list types and prevents firing change events too early.
* Always ignore fabricIndex on write/commands
see https://github.com/project-chip/connectedhomeip/issues/33475
* Move Events.get into EventServer
To be more in sync with Attributes. LegacyIntegrationManager already uses this.
* Implement AccessControlServer new API
This commit implements the access control server for the new API. I followed the idea that the access control server also is the central place for ACL checks, so the AccessControlManager is owned by him and then looked up and used.
$Changing events are used for validation and $Change events are used for event sending of final values. I added a delay mechanism for ACL updates to become effective which is controlled by the InteractionServer in case of write interactions.
We need to discuss how to change that correctly into sub transactions or such so that we can remove this delay logic again.
* Wire Access control into InteractionServer new API
This commit wires the ACL checks into the Transactional Interaction server.
We basically add event handling and inject relevant data into the OnlineContext.
For events and commands the acl check is done here directly using the ActionContext because the behavior system did not triggered that so far. I left todos here to clean this up later.
For write interactions we delay acl updates in general - this is just a hack and might break with group cluster list updates or such, so we need to think about how to fix that. TODO is in the code.
* Wire Access control into Contexts
This commit adds acl handling in the contexts of new API.
We cache acl privileges from a check in location to prevent too many checks for the same location.
I changed the naming here and semantics to keep the access determination in one place rather then splitted in several places, so the required privilege is directly handed over.
Additionally, the cluster gets injected into the location.
* Sync AccessLevels
syncing access levels to specification to make sure to use the correct values also here
* Add backward compatible acl entry in old api
... like in new API already done in ACL cluster implementation.
* Enhance datatypes
This commit enhances the datatype classes on the following things:
* All datatypes that are MEIs are now also validated for their allowed ranges and values according to specs
* by default we validate as soon as we "build" such a value to make sure that invalid usages are tackled directly on test time ideally :-)
* This validation is skipped on tlv decoding because several requirements from matter require such semantic checks later in the process.
* places where we "misused" these types are adjusted
* ACL cluster needs an adjustment because we need to validate some types later in the process.
* Get and store CATs from session
The ACL determination relies also on CATs as subjects and here the session CATs gets prio over the NOC cats of the stored NOC. So we also need to store the CATs in resumption record.
* No associated fabric needs to throw Unsupported access error
* No Fabric building with invalid caseAdminSubject
Add proper error handling as per specs
* Enhance Op Creds cluster
* add error handling for new Error
* add ACL default entry
* Persist also writable and fabric scoped data in new API
The legacy API was persisting also all writable and fabric scoped data. This commit does the same for the new API, else no ACL list value would ever be persisted :-)
In fact the matter specification is unclear here and tells that "any non-volatile data" must be persisted ... or other where applicable (kind of). So writable is obvious ...
* Move access to deviceType up to EndpointInterface
ACL needs to check also the devicetype of an endpoint, so move it up there. it was just in lower levels accessible.
* Split up attribute data handling
Because we should not collect/build the full list content in one step for write interactions but process "chunk wise" we adjust the logic to allow that.
* Allow fabric event filtering on Events
This was missing so far
* Make some types better defined
To simply code I enhanced some typings and added asserting methods to remove some !== undefined cases from the code. In fact this is used as soon as we have process the incoming paths and splitted them into the known list. here all data are definedf and not optional like in the original datatypes.
* Enhance read/write/invoke methods
The methods used to read/write/invoke used by derived classes needed enhancements on data level. Also readEvent was added
* Enhance readRequest handling
* Adjust WriteRequest handling
* Enhance Write request by some checks
Even if we do not implement "morechunkedMessages" for write right now we add some checks
* Enhance Subscriptions
to consider events and acl stuff correctly
* Prevent Subscription retries when server closes
I stumbled over this case in tests. SO just make sure when interactionServer closes we also do not care about pending subscription tries
* Return correct error on multi-invokes
Because protocol wise multi invoked where never supported Matter 1.3 adds ruling for it, so just return expected error for now till we implemented it correctly.
* Deepcopy current value adding elements
Make sure to not screw up things
* Add a minimalistic GroupKeyCluster implementation
Some ACL related tests we need to pass also do stuff using group key cluster commands we do not had implemented.
So this adds a minimalistic Group key management implementation that just supports the IPK and not other groups.
But it already starts to wire some things into fabric where it also need to be later.
* Also expode matterdevice isClosing info
* Extend chip tests
We add ACE, ACL and IDM tests to chip testing.
* ACE_1_6 is relying on a group cluster which supports groups, so excluded for now
* IDM yaml tests are currently all manual tests, so no point to add them
* Adjust tests to all new requirements
* adjusts values now validated on datatype level
* adds ACL support
* acl reset test is removed from Integrationtest because chip tests are more detailed
* Fix ListManager list clearing
When a list was cleared (aka length set to 0) then this was not implemented correctly.
PS: While adjusting the tests I also found other cases that do not work. Added tests but commented them out because I was not really able to find a way to solve it - seems to have something to do with the managed values.
* Handle bigints too when patching types
Because NodeIds and such are bigints
* Allow decoding of TlvLists with protocol specific tags
... the ACL extensions have them
* Enhance types for bigint and Mutable
Also enhance Immutable typing helper and add a mutable type and fucntion (even if unused currently)
* Fix deepCopy and add tests
* Release locks also in Precommit errors
When an error happened in Precommit the locks were not released. I also added testing here.
BTW: When I remove the async on the precommit test method the error gets somehow still thrown/rejected but not catched at all by the test. thats strange because precommit is "MaybePromise" and so could be sync?
* Linter and test patch
Test needs to be patched because it returns an invalid devicetype id
* test fixes and Legacy API tweaks
The way we use AttributeServers to set an ACL entry locally was not working as intended, so we adjust that
* Adjust testcase
* Changelog and python test config fix
* We still might have an issue here for local ACL changes ... left TODOs for now
* Add new tests
* Trx need to pas StatusResponseErrors
... else we loose informations from validation
* Adjust nullable lists/strings handling on decode
Seems "nullable and empty lists/strings" was not that clear ... in fact "have a length" did not mean" a length >0" but "have a length semantically as datatype" :-) So adjust that
* More deconstruction
* Error happened again ... fix works again
Sorry .. I do not fully understand why and wen this happens and whats the solution. In fact when it happens then when we fail to add a value because of an error this happens as a followup error. Take it out and the latest added chip tests produce it when a validation error happens
* Add one more test
* Adjust tests to passing through error
* Enhance some validation exceptions
* Adjust tests to new texts
* Adjust tests to new texts
* handle ByteArrays correctly too
* Tweak comment
* [execute-chiptests-long] Address review feedback
---------
Co-authored-by: lauckhart <greg@lauckhart.com>