drill
cf51aa71 - DRILL-7115: Improve Hive schema show tables performance

Commit
5 years ago
DRILL-7115: Improve Hive schema show tables performance

1. To make SHOW TABLES for Hive schemas much faster, the additional Drill feature of showing only accessible tables when storage-based authorization is enabled was sacrificed. The behaviour now matches Hive/Beeline: all tables are shown regardless of accessibility. For details about previous SHOW TABLES results, see the description of DRILL-540.
2. Implemented a faster getTableNamesAndTypes() method in HiveDatabaseSchema and removed bulk-related code.
3. Deprecated bulk-related options and removed bulk code from AbstractSchema and DrillHiveMetastoreClient.
4. For 8000 Hive tables the query returned in 1.8 seconds; for a combination of 4000 tables and 8000 views it returned in 2.3 seconds. Note that after the first query the table names are cached, so subsequent queries complete in less than 1 second.
5. Refactored WorkspaceSchemaFactory's getTableNamesAndTypes() method to reuse the existing getViews() method.
6. Refactored DrillHiveMetastoreClient. Classes were unnested and enclosed within the client package with restricted visibility. The cache value type was also updated to avoid unnecessary back-and-forth List-to-Set conversions. Client creation methods were moved to a separate class, so the new package exposes only the factory and the client class.

closes #1706
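Items 4 and 6 describe caching table names and storing them as a Set so that no List-to-Set conversion is needed on each lookup. The snippet below is only a rough sketch of that idea, not the code from this commit: the class name TableNameCacheSketch, the loader function (standing in for a metastore call) and loadCount() are all hypothetical, and the real DrillHiveMetastoreClient may be structured quite differently.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch of a per-database table name cache; not Drill's actual client. */
class TableNameCacheSketch {
  // Assumption: stands in for an expensive metastore call that returns a List of names.
  private final Function<String, List<String>> loader;
  // Cache values are Sets, so the List is converted once at load time and never again.
  private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();
  // Counts how many times the loader was actually invoked.
  private final AtomicInteger loads = new AtomicInteger();

  TableNameCacheSketch(Function<String, List<String>> loader) {
    this.loader = loader;
  }

  /** Returns the cached table names, invoking the loader only on the first request per database. */
  Set<String> getTableNames(String dbName) {
    return cache.computeIfAbsent(dbName, db -> {
      loads.incrementAndGet();
      return Collections.unmodifiableSet(new LinkedHashSet<>(loader.apply(db)));
    });
  }

  int loadCount() {
    return loads.get();
  }
}
```

With this shape, the first getTableNames("default") call pays the loading cost and later calls for the same database return the same cached Set, which is the effect item 4 reports for repeated queries.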