Optimize ctags calls
It's expensive to call ctags once per source file identified in the
DWARF data. This patch:

- Calls ctags on a bunch of source files at a time. The '-L -' argument
  avoids the risk of a huge command line hitting OS limits. Similarly,
  we read a line at a time from the stdout pipe to avoid holding a huge
  amount of output in memory at once.

- Recovers the file paths from the lines output by ctags. (It always
  prints them in '-x' mode, even for a single input file.) This exposes
  us to spurious differences between the pathnames from ctags and
  objdump (e.g. a redundant '/./'), so we fix them up with
  os.path.relpath(). (This inaccuracy happens elsewhere in
  intermediate_layer.py, and will be addressed in a future patch.)

- Similarly, changes FunctionLineNumbers to handle a bunch of functions
  at once. Even with the cache of ctags results, repeated calls to
  get_line_number() can be expensive. For example, simply checking that
  the files are already cached takes linear time on each call and can
  dominate the run time for some workloads. This patch reduces the
  number of calls to two: one for functions with source information in
  the DWARF, one for functions without.

- Renames get_line_number() to get_definition_map() to reflect these
  changes.

- Renames 'workspace' to 'local_workspace' in FunctionLineNumbers for
  consistency with the rest of the program. This is cosmetic, and has
  nothing to do with the rest of the patch.

LIMITATIONS:

- Parsing the file name from ctags output will break for a path with
  embedded whitespace. This risk is present in the original (because of
  shell=True), so it is assumed to be acceptable here. Consider ctags
  JSON output if it becomes a problem.

- The error message for ctags failure appears to lose specificity
  because it no longer outputs each file name. The likely intent was to
  report problems with particular missing, corrupt or unreadable files.
  In practice, Exuberant ctags returns success in all these cases, even
  for a single file on the command line. The only way I found to
  provoke a nonzero exit code was to pass a bad command line argument.
  The point is that ctags will either fail for every file or for none
  of them, so testing each exit code is not effective. If file-specific
  errors need to be reported, consider capturing and logging lines from
  the stderr pipe.

Change-Id: I89c80a8ac181d6bc01351c0f4922941829a1ed5c
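
For readers without the diff at hand, the batching approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: the names parse_xref_line(), build_definition_map() and run_ctags_batch() are hypothetical, and it assumes the usual 'ctags -x' column order of name, kind, line number, file, source text. As noted under LIMITATIONS, a path with embedded whitespace would be parsed incorrectly.

```python
import os
import subprocess
import threading


def parse_xref_line(line, root):
    """Split one 'ctags -x' line into (relative_path, symbol, line_number).

    Assumed column order: name, kind, line number, file, source text.
    Returns None for lines that do not fit that shape.
    """
    fields = line.split(None, 4)
    if len(fields) < 4 or not fields[2].isdigit():
        return None
    name, _kind, line_no, path = fields[:4]
    # Join onto root first so relative paths are resolved against the
    # directory ctags ran in; relpath() then removes noise such as '/./'.
    return os.path.relpath(os.path.join(root, path), root), name, int(line_no)


def build_definition_map(xref_lines, root):
    """Consume lines one at a time and return {path: {symbol: line_number}}."""
    definitions = {}
    for line in xref_lines:
        parsed = parse_xref_line(line, root)
        if parsed is not None:
            path, name, line_no = parsed
            definitions.setdefault(path, {})[name] = line_no
    return definitions


def run_ctags_batch(source_files, root):
    """Run ctags once for all files, passing the file list on stdin ('-L -')."""
    proc = subprocess.Popen(
        ["ctags", "-x", "-L", "-"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        cwd=root, universal_newlines=True)

    def feed():
        # Write the file list from a separate thread so that reading
        # stdout below cannot deadlock against a full stdin pipe.
        try:
            with proc.stdin:
                for path in source_files:
                    proc.stdin.write(path + "\n")
        except BrokenPipeError:
            pass  # ctags exited early; the exit code check reports it

    writer = threading.Thread(target=feed)
    writer.start()
    definitions = build_definition_map(proc.stdout, root)
    writer.join()
    if proc.wait() != 0:
        # ctags fails for every file or none, so one check covers the batch.
        raise RuntimeError("ctags failed with exit code %d" % proc.returncode)
    return definitions
```

The parsing is kept separate from the subprocess call so that it can be exercised on canned lines without ctags installed; run_ctags_batch() only adds the pipe handling around it.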