llvm-project
f098aa3b - [lldb][Darwin] debugserver expedite new binary info, lldb use (#192754)

12 days ago
When lldb stops at the "new binaries loaded" internal breakpoint, it must read the list of addresses of the new binaries out of process memory, then send a jGetLoadedDynamicLibrariesInfos packet to debugserver to get the filepath, uuid, and addresses of where all the segments are loaded in memory.

It's possible for debugserver to find the "new binaries loaded" function address in the inferior itself, recognize when it has stopped at a breakpoint there, and expedite some/all of the information lldb is going to ask for in the stop info packet that we send to lldb. This makes big improvements to a stop event where a large batch of binaries is loaded, but it focuses even more on the single-binary-loaded `dlopen()` use case, which can be quite expensive when many binaries are loaded one by one.

This PR reduces the packet traffic for new binary load notifications by:

1. When debugserver sees a thread that has hit a breakpoint, and the pc matches the new-binaries-loaded function address, it reads the list of binaries that have been newly added and includes them in the stop info packet (or the jThreadsInfo packet) in the `added-binaries` key. The value is a list (array) of binary addresses.

2. If the number of binaries is small (today: one), debugserver may collect the full information that jGetLoadedDynamicLibrariesInfos would send back about it, and also expedite that in the stop info packet (or jThreadsInfo) in the `detailed-binaries-info` key. This is a JSON string, and the stop info packet is a semicolon-separated series of key-values, so it must be asciihex encoded, just like the `jstopinfo` key. In the jThreadsInfo packet, the JSON for the binary information is included in the response as-is as the value-dictionary.

3. If the remote stub doesn't provide these new keys, lldb will use the same process as before.
However, in DynamicLoaderMacOS::NotifyBreakpointHit I was reading the load addresses out of memory individually, with each binary having a 24-byte entry. lldb's memory cache meant we read 512 bytes per 8-byte read, but when 1000 binaries were being loaded at process launch time, that was 24,000 bytes of VM that we would read in 512-byte batches. This patch changes that to read the entire VM range that we will be accessing in one large memory read (as large as the remote gdb RSP stub will support), dramatically reducing packet traffic in that case.

4. debugserver needs to read the "new binaries loaded" function pointer out of the "dyld_all_image_infos" structure in the inferior, and it is a signed function pointer on arm64e processes, so debugserver needs to strip off the signing bits before comparing the pc. I hoisted the strip function out of DNBArchImplArm64 into DNBFixAddress(), and the only complicated bit here is in DNBProcessAddrSize(), when an arm64e debugserver is debugging an arm64_32 process on a watch. It's not a common combination (mostly we will have arm64e debugservers debugging arm64 processes, or arm64_32 debugservers debugging arm64_32 processes), but it is supported.

5. A very minor enhancement: I have debugserver now include a new key, `sizeof_mh_and_loadcmds`, in the full binary information that jGetLoadedDynamicLibrariesInfos returns. When lldb needs to read a binary out of memory, it needs to read the Mach-O header & load commands, and it doesn't know the full size of that, so we end up doing one read of the Mach-O header, then another of the header + load commands. I'm not using this information in lldb yet, but I would like to, to improve that.

At an implementation detail level, ProcessGDBRemote collects these two new pieces of data from the stop packet / jThreadsInfo, and passes them to the method that creates a new ThreadGDBRemote. I added two methods to the Thread base class to retrieve the information.
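
As a rough illustration of why `detailed-binaries-info` is asciihex encoded in the stop packet, here is a minimal Python sketch. It is not debugserver or lldb code. The helper names, the simplified `T05...` packet layout, and the JSON contents are assumptions made for the example; only the key names `detailed-binaries-info` and `sizeof_mh_and_loadcmds` come from this change.

```python
import json


def asciihex_encode(text: str) -> str:
    """Encode text as two hex digits per byte, so it contains no ':' or ';'."""
    return text.encode("utf-8").hex()


def asciihex_decode(hex_text: str) -> str:
    """Reverse asciihex_encode()."""
    return bytes.fromhex(hex_text).decode("utf-8")


def build_stop_reply(signal: int, pairs: dict) -> str:
    """Build a simplified 'T' stop reply: T<signal> followed by key:value; pairs."""
    body = "".join(f"{key}:{value};" for key, value in pairs.items())
    return f"T{signal:02x}{body}"


def parse_stop_reply(packet: str) -> dict:
    """Split a simplified 'T' stop reply back into its key/value pairs."""
    if not packet.startswith("T"):
        raise ValueError("not a T stop reply")
    pairs = {}
    for field in packet[3:].split(";"):  # skip 'T' and the two signal digits
        if field:
            key, _, value = field.partition(":")
            pairs[key] = value
    return pairs


def expedited_detail(pairs: dict):
    """Return the decoded detail dictionary, or None if the stub did not send it."""
    value = pairs.get("detailed-binaries-info")
    if value is None:
        return None  # older stub: caller falls back to asking for the information
    return json.loads(asciihex_decode(value))


# Hypothetical binary description; the field values are made up for this example.
detail = {
    "images": [
        {
            "load_address": 4295000064,
            "pathname": "/usr/lib/libexample.dylib",
            "sizeof_mh_and_loadcmds": 4096,
        }
    ]
}

packet = build_stop_reply(
    5,
    {
        "thread": "1a2b",
        "detailed-binaries-info": asciihex_encode(json.dumps(detail)),
    },
)
parsed = parse_stop_reply(packet)
decoded = expedited_detail(parsed)
```

Raw JSON would put `:` (and potentially `;`) inside a value, which the `key:value;` framing cannot represent unambiguously. The hex form uses only `0-9a-f`, so the packet can be split on `;` safely. A packet without the key yields `None`, which corresponds to the fallback path for older stubs.
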
DynamicLoaderMacOS will try to read the data from the thread that hit the "new binaries loaded" breakpoint, and if the number of entries matches the number expected by the register value, it uses them. Otherwise it falls back to fetching them the traditional way. On an old debugserver that doesn't support these new expedited fields, DynamicLoaderMacOS will get back a zero-length list of binary addresses and a null StructuredData dictionary for the detailed image information, and behave as it always does. I tested this patch both with and without the debugserver changes.

Testing is clearly the big question mark here: I added none. While writing these patches, I had some bugs and the lldb testsuite on macOS was very good at finding them, simply with our normal process launching and dlopen'ing in our existing API tests. I could imagine a test that would capture the packet log and try to ensure that the expedited information is being used by lldb and we are not re-fetching the information, though.

rdar://175033129

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Felipe de Azevedo Piovezan <piovezan.fpi@gmail.com>