docs: Add kernel registry architecture documentation

- Document KernelRegistry, KernelRegistryManager, and CustomRegistry classes
- Explain how ExecutionProviders register kernels via GetKernelRegistry()
- Clarify CustomRegistry's dual role: user-facing API and internal implementation
- Document priority ordering of registries (Custom > ExecutionProvider)
- Document OpSchema limitations at DLL boundaries
- Explain why OpSchema cannot safely cross DLL boundaries
- Document DirectML's COM-based ABI workaround for schema registration
- Provide integration flow diagram and best practices
- Include practical guidance for same-DLL vs cross-DLL scenarios