Missing Source Files from Database #39
Comments
Hi @creative16 - I work on our C/C++ support, and I am absolutely interested in finding and fixing extractor crashes. I'll build a Chromium/Windows database to try and reproduce them and track them down. Thanks! |
A quick update: we've made a change to address this, so the next version (2.2.5) should fix the problem for you. I expect that release will happen in approximately 2 weeks' time. Thanks again for letting us know. |
Thanks for looking into it promptly! I'll look forward to the release. |
The Chromium build on Linux also misses many files, including the aforementioned src/services/network/p2p/socket_manager.cc. It might be a regression, as building with 2.1.0 actually results in a larger database with more files included. |
I noticed 2.2.5 came out and rebuilt Chromium on Linux. Compared to my builds from 2.1.0-2.1.4 with 73k files, the 2.2.4 and 2.2.5 databases return about 70k files. Running comm between them flagged up some very important files, so I'll be moving back to 2.1.4 for now. I have attached the filelists (from File f select f.getRelativePath()) here @nickrolfe. Please let me know if there is anything else in the log files (4 GB, sorry) to look out for in helping to troubleshoot. |
One thing that occurred to me was that my old, bigger databases were created without setting --threads. CodeQL could be causing extra crashes in the extractor due to this; I'll rebuild on 2.1.4 and 2.2.5 with/without --threads and see how it goes. Really hoping that this isn't the root cause though, especially if it leads to longer compile times overall. |
I now have 4 builds for 2.1.4 and 2.2.5 with/without --threads, all of which still have around 70k files. The only two variables left that I can think of which differ from my bigger 73k builds are different versions of github/codeql in my codeql-home, and varying power/clock speed configurations of my CPU (the bigger builds were done on CPUs with fewer cores and lower clock speeds). Does the version of codeql affect the CLI/extractor? I did notice that identical (same codeql-cli, same github/codeql, same codebase) builds can result in slightly different database contents, but I am at my wits' end right now. It is looking like instability is the likely cause, but I haven't quite pinned down exactly why. |
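The file-list comparison described in this comment (two exported path lists, one path per line, compared with comm) can be approximated with a short Python sketch. It is only an illustration: the function names `load_paths` and `compare_filelists` are hypothetical, and the file names in the usage note are assumptions, not the attachments from this thread.

```python
def load_paths(lines):
    """Return the set of non-empty, whitespace-stripped paths from an iterable of lines."""
    return {line.strip() for line in lines if line.strip()}


def compare_filelists(old_lines, new_lines):
    """Return (paths only in the old list, paths only in the new list), each sorted."""
    old, new = load_paths(old_lines), load_paths(new_lines)
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    # Tiny in-memory example; real lists would be read from the exported files.
    old_list = ["v8/src/a.cc", "v8/src/b.cc"]
    new_list = ["v8/src/b.cc", "v8/src/c.cc"]
    missing, added = compare_filelists(old_list, new_list)
    print("missing from new database:", missing)
    print("only in new database:", added)
```

To compare two exported lists on disk, the same function can be fed open file objects, for example `compare_filelists(open("old.txt"), open("new.txt"))`, where both file names are placeholders. Unlike comm, this does not require the inputs to be sorted.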
Hi @sad-dev - sorry to hear that the 2.2.5 release did not fix things for you, and thanks for letting us know. I will build a Chromium snapshot locally. Hopefully I can reproduce your problem and I will track down the cause of the missing files. Usually it's because of either a segfault or catastrophic error in the extractor. In general, build-system parallelism shouldn't affect this, unless the crashes are related to out-of-memory errors. But the more likely cause is that the extractor is choking on some construct in a header file included by all the files that didn't get extracted. I'll let you know what I find.
Yes, each new release of the CodeQL CLI includes the latest extractor changes. |
Thank you for taking a look! With regard to the version of codeql, I was referring to the repos for the queries rather than the CLI (i.e. those found by codeql resolve languages). I deleted them and was still able to build with the CLI, so that answers my question, I guess :) |
@nickrolfe I made three builds of Chromium with codeql-cli 2.2.5 on three machines with --threads=0 passed to codeql database create. The specs and results are below: 6C/12T, 32 GB RAM - 74.6k files. From the looks of it, more threads are leading to stability issues in the extractor, possibly due to a larger 'ripple effect'. |
Hi @sad-dev. The Chromium build I'm doing here (also using CodeQL 2.2.5) on an old Linux laptop still hasn't quite finished, but so far I see only 5 files (all in Skia) that failed to extract. And since you are seeing different behaviours on different machines, I'd like to see if we can find what's going wrong with the extractor from your Assuming that the failed extractions are caused by the extractor segfaulting or otherwise hitting a catastrophic error, do you think you could take a look at places where If the log file is too big to open in a text editor, you could do something like:
|
The greps (with and without the -A12) for "CodeQL C++ extractor" are attached for my 6-, 8-, and 16-core builds, as well as the filelists. extractor_6c.txt |
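For logs too large to open in an editor, the kind of search mentioned in this comment (lines containing "CodeQL C++ extractor", optionally with 12 lines of trailing context as with grep -A12) can also be done by streaming the file. The following is a minimal sketch, not the command used in the thread; the function name `grep_with_context` and the log path in the usage note are hypothetical.

```python
def grep_with_context(lines, needle, after=12):
    """Yield every line containing `needle`, plus the `after` lines following it.

    Lines are processed one at a time, so a multi-gigabyte log never has to
    fit in memory. A new match inside a context window extends the window.
    """
    remaining = 0
    for line in lines:
        if needle in line:
            remaining = after + 1  # the matching line plus its trailing context
        if remaining > 0:
            yield line.rstrip("\n")
            remaining -= 1


if __name__ == "__main__":
    sample = ["compile a.cc", "CodeQL C++ extractor: Backtrace", "frame 0", "frame 1", "compile b.cc"]
    for hit in grep_with_context(sample, "CodeQL C++ extractor", after=2):
        print(hit)
```

Against a real log, the input would be an open file, for example `grep_with_context(open("build-tracer.log", errors="replace"), "CodeQL C++ extractor")`, where the path is a placeholder. Unlike grep, the sketch prints no `--` separator between non-adjacent groups.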
Thanks for sending those. We (the CodeQL team) will need to fix those crashes, but they don't explain why you're missing so many files. Can you confirm that you're building Chromium from scratch? CodeQL will only extract source files that it observes being compiled, so if the build system thinks The next thing I would do is pick a file that's missing, say The other possibility is that extraction was successful, but something went wrong during database finalization. Once my local build finishes, I can investigate whether anything goes wrong there for me, but it might be worth checking the |
I edited the previous post to update with filelists and extractor crashes. Also can confirm that all three were built with autoninja -C out/Default chrome with an empty (rm -rf out) output folder. Picking v8/src/wasm/wasm-objects.cc at random from comm f6.txt f16.txt, 1.) The build-tracer.log files are slightly broken up (even on 6c, it appears that multiple threads write to it at the same time) but they all show : only my 6core and 8core logs contain lines of the form 2.) the database-create-*log does not mention v8/src/wasm/wasm-objects.cc on 6c and 8c, but the 16c log has the line: grep "manifest could not be read" on the logs throws up 118, 321, and 769 errors on the 6c, 8c, and 16c builds respectively. Some quick sed commands show that they all correspond to missing files in the respective databases. Further observations: |
Thanks, that's really useful. My guess is that, on the 16c machine, you have a file at That's consistent with the extractor terminating prematurely, but I don't yet know why that would happen, and without any kind of backtrace. One guess is that it's the OOM killer. However, I now see that it's also happening on my machine, so I have a good chance of tracking this down. I'll let you know how I get on, but thanks again for your help and patience! |
That is normal - the log entries in steps a to c derive from the name of the file being compiled. Step d corresponds to copying any observed source or header file to the database archive. I also note the extractor running out of memory would be consistent with your observation that more threads means more failures. I will try to confirm if that's what's happening and, if so, what steps we can take to reduce the memory usage. |
Instead of reducing the memory usage (or if it proves unfeasible to do so / fails to completely fix extractor crash problems), is it possible to rerun the extractor, either immediately upon failure or just before finalizing? One obvious thing to look out for is that build files might be temporary, thus necessitating an archival if the extractor is to be rerun at the end. |
Hello @nickrolfe, just an update after trying version 2.2.5 to build Chromium on Windows. Overall, it looks like a definite improvement over 2.2.4. The table below provides some summarised numbers from my build:
Grepping for "manifest could not be read" from There is, however, no significant difference when looking for "CodeQL C++ extractor: Backtrace" in Finally, as compared to version 2.2.4, the extractor for version 2.2.5 only crashed 7 times due to access violations as compared to hundreds (or thousands) of times for version 2.2.4. To get a rough idea of how many files are missing, I'm wondering if you or @sad-dev have any idea how to get a full list of files that will be built by ninja? |
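One possible way to approximate a list of the files ninja knows how to compile is to read a JSON compilation database, which Ninja can emit through its `compdb` tool (the exact invocation depends on the Ninja version). The sketch below assumes such a database already exists; the function name `sources_from_compdb` and the file name in the usage note are hypothetical. A compilation database can cover every target in the build directory, so it may list more files than building the `chrome` target alone would compile.

```python
import json
import os


def sources_from_compdb(entries):
    """Return the sorted, de-duplicated source paths named in compilation-database entries.

    Each entry is a dict with at least "directory" and "file" keys, as in the
    JSON compilation database format. Relative "file" values are resolved
    against the entry's "directory".
    """
    paths = set()
    for entry in entries:
        path = entry["file"]
        if not os.path.isabs(path):
            path = os.path.join(entry["directory"], path)
        paths.add(os.path.normpath(path))
    return sorted(paths)


if __name__ == "__main__":
    # Inline example entries; a real database would be loaded with json.load().
    example = json.loads(
        '[{"directory": "/src/out/Default", "file": "../../v8/src/a.cc", "command": "clang++ -c a.cc"}]'
    )
    for source in sources_from_compdb(example):
        print(source)
```

With a database on disk, the entries would come from something like `json.load(open("compile_commands.json"))`, where the file name is an assumption. The resulting list can then be compared against the paths exported from the CodeQL database to estimate how many files are missing.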
I monitored my peak memory usage on a smaller codebase that uses ninja: v8.
-j = 34 (default with ninja), no codeql: 10 GB
-j = 34, codeql --threads=0: maxes out on my 32 GB system, and throws about 150 manifest errors as compared to 0 otherwise. About 5% fewer files in the database.
We are thus looking at a roughly 5x multiplier on peak memory usage that codeql imposes when compiling v8, and threads used for importing doesn't seem to affect that. Given Chromium's higher complexity and much larger (about 30x) size, I wouldn't be surprised if larger multipliers of 10x or so are observed on the memory requirements. |
After comparing the list of files in the database, and a list of possible files I believe should have been compiled, I think the database is still missing about 1000 source files (i.e. I did a grep for some of the missing files in
|
Catastrophic errors will cause that source file to be missing from the database; regular errors will not. I've created an internal issue to investigate why we don't find Regarding the excessive memory consumption, we've identified one of the major causes. In general, memory consumption should only be affected by the complexity of the one source file being extracted, but there is one feature that causes memory usage to scale with the overall size of the entire project. For the vast majority of projects, this is never a problem, but with Chromium and its tens of thousands of source files, it is. Of course we want to reliably build databases of Chromium and other massive C++ projects, so we're looking for a solution. |
Hello, I've tried a Chromium build on Windows using version 2.2.6. The database had about 1000 more source files than the one I created using 2.2.5, which is great. Thanks for the improvements! The catastrophic error regarding the missing
I checked a few of the source files where this error popped up and indeed they were not included in the database. Would this be a cause for the missing files as well? |
Hi @creative16, thanks for letting us know. Yes, the stack overflow exception would also cause the files to be missing from the database. That was the same underlying issue for the problem reported at the start of this thread, but since then we increased the extractor's stack size while we investigate a longer-term fix for the excessive stack usage. If you're seeing that in 2.2.6, then even the increased stack size is not sufficient. I haven't observed this problem on my own Windows machine while building V8 databases, but I'll do another build of Chromium to try and reproduce the exception. |
When trying to build a database for Chromium on Windows, I found that a significant number of source files were missing from the database. One specific example would be src/services/network/p2p/socket_manager.cc. I'm not sure if it's relevant, but during the build process, the extractor crashed many times, in what appears to be an access violation.
I'm wondering if this is something the developers would be willing to look into?