Metabug: Improving C-level coverage #94808

Open
83 of 225 tasks
mdboom opened this issue Jul 13, 2022 · 17 comments
Labels: easy, type-bug

Comments

@mdboom
Contributor

mdboom commented Jul 13, 2022

This bug is going to be used to track work in other bugs to improve the C-level coverage of the CPython test suite.

There is a set of baseline coverage results on main that can be used to find coverage gaps.

The plan, discussed on discuss.python.org, is as follows:

  • Read through the coverage report and record any notable gaps in the checklist below. The goal is not 100% coverage, and each area of improvement will probably require some judgement calls. For example, covering all cases where memory exhaustion can occur is probably not worth the effort. On the other hand, detailed coverage in the eval loop may be worth the effort.
  • When someone has "read through" a particular source file and created subitems for any interesting gaps, they should check it off on the list below and add links to any issues created.
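
As a rough aid for the "read through the coverage report" step, here is a minimal sketch that lists functions with zero recorded calls per source file. It assumes the coverage data is available as an lcov-style tracefile (`SF:`, `FNDA:`, `end_of_record` lines), which this issue does not state. The helper name `uncovered_functions` and the sample data are made up for illustration.

```python
from collections import defaultdict


def uncovered_functions(tracefile_text):
    """Map each source file to the functions that were never executed.

    Assumes lcov's tracefile layout: an ``SF:<path>`` line opens a record,
    ``FNDA:<hits>,<name>`` gives a function's execution count, and
    ``end_of_record`` closes the record.
    """
    gaps = defaultdict(list)
    current = None
    for raw in tracefile_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current = line[3:]
        elif line.startswith("FNDA:") and current is not None:
            hits, name = line[5:].split(",", 1)
            if int(hits) == 0:
                gaps[current].append(name)
        elif line == "end_of_record":
            current = None
    return dict(gaps)


# Invented sample data, not taken from the real baseline report.
SAMPLE = """\
SF:Objects/abstract.c
FN:10,PySequence_Repeat
FN:40,PySequence_Size
FNDA:0,PySequence_Repeat
FNDA:12,PySequence_Size
end_of_record
SF:Objects/boolobject.c
FNDA:3,bool_repr
end_of_record
"""

print(uncovered_functions(SAMPLE))
```

A list like this only finds functions that are never called. Partially covered branches (for example, an uncovered `switch` case) still need a manual read of the report.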

Related work:

There is related work to publish coverage results from CPython on a regular basis, but this issue is concerned with using those results to actually reduce our gaps in coverage.

List of source files:

  • Include/internal/pycore_asdl.h
  • Include/internal/pycore_bitutils.h
  • Include/internal/pycore_call.h
  • Include/internal/pycore_code.h
  • Include/internal/pycore_frame.h
  • Include/internal/pycore_moduleobject.h
  • Include/internal/pycore_object.h
  • Include/internal/pycore_pymath.h
  • Include/internal/pycore_pymem.h
  • Include/internal/pycore_pystate.h
  • Include/object.h
  • Include/pydtrace.h
  • Objects/abstract.c
    • Buffer related functions: PyBuffer_FromContiguous, PyObject_CopyData, PyBuffer_FillContiguousStrides
    • PyNumber_Check doesn't test complex
    • PySequence_Repeat and PySequence_InPlaceRepeat have no coverage
    • PySequence_SetItem with a negative index is untested
    • PySequence_SetSlice and PySequence_DelSlice are untested
    • PyMapping_HasKey and PyMapping_HasKeyString are untested
  • Objects/accu.c
  • Objects/boolobject.c
  • Objects/bytearrayobject.c
  • Objects/bytes_methods.c
  • Objects/bytesobject.c
  • Objects/call.c
    • PyEval_CallObjectWithKeywords has no coverage
    • _PyObject_CallMethodId_SizeT has no coverage
  • Objects/capsule.c
  • Objects/cellobject.c
  • Objects/classobject.c
  • Objects/codeobject.c
  • Objects/complexobject.c
  • Objects/descrobject.c
  • Objects/dictobject.c
  • Objects/enumobject.c
  • Objects/exceptions.c
  • Objects/fileobject.c
    • PyFile_FromFd has no coverage
    • PyFile_GetLine over bytes input has no coverage
  • Objects/floatobject.c
  • Objects/frameobject.c
  • Objects/funcobject.c
  • Objects/genericaliasobject.c
  • Objects/genobject.c
    • gen_new_with_qualname and the API functions PyGen_NewWithQualName and PyGen_New have no coverage.
    • PyCoro_New has no coverage
    • PyAsyncGen_New has no coverage
    • async_gen_athrow_send has poor coverage
  • Objects/interpreteridobject.c
  • Objects/iterobject.c
  • Objects/listobject.c
  • Objects/longobject.c
    • _PyLong_Size_t_Converter has no coverage
    • long_format_binary doesn't test outputting to UCS2 or UCS4
    • int_bit_length_impl and int_bit_count_impl don't cover the case where the expression overflows
  • Objects/memoryobject.c
  • Objects/methodobject.c
  • Objects/moduleobject.c
    • PyModule_GetFilename has no coverage
  • Objects/namespaceobject.c
  • Objects/object.c
    • PyObject_Print has no coverage
    • PyObject_Bytes does not test the case where there is a __bytes__
    • gh-94808: Add test coverage for PyObject_HasAttrString #96627
    • PyObject_SetAttrString doesn't test when object has a tp_setattr
    • PyObject_GetAttrString doesn't test when object has a tp_getattr
    • _PyObject_LookupAttr doesn't test when object has a tp_getattr
  • Objects/obmalloc.c
  • Objects/odictobject.c
  • Objects/picklebufobject.c
    • PyPickleBuffer_FromObject and PyPickleBuffer_Release have no coverage
  • Objects/rangeobject.c
  • Objects/setobject.c
  • Objects/sliceobject.c
    • PySlice_GetIndices/PySlice_GetIndicesEx have no coverage
  • Objects/stringlib/codecs.h
  • Objects/stringlib/count.h
  • Objects/stringlib/ctype.h
  • Objects/stringlib/eq.h
  • Objects/stringlib/fastsearch.h
  • Objects/stringlib/find.h
  • Objects/stringlib/find_max_char.h
  • Objects/stringlib/join.h
  • Objects/stringlib/localeutil.h
  • Objects/stringlib/partition.h
  • Objects/stringlib/replace.h
  • Objects/stringlib/split.h
  • Objects/stringlib/transmogrify.h
  • Objects/stringlib/undef.h
  • Objects/stringlib/unicode_format.h
  • Objects/structseq.c
  • Objects/tupleobject.c
  • Objects/typeobject.c
    • wrap_sq_setitem has no coverage
  • Objects/unicodectype.c
  • Objects/unicodeobject.c
    • xmlcharrefreplace doesn't test for codepoints < 100 (This seems almost impossible to occur).
    • resize_inplace has no coverage
    • unicode_kind_name when !PyUnicode_IS_COMPACT isn't covered -- low priority used by consistency check only
    • unicode_write_cstr doesn't test writing into UCS2 or UCS4
    • gh-94808: Cover %p in PyUnicode_FromFormat #96677
    • PyUnicode_AsDecodedObject, PyUnicode_AsDecodedUnicode, PyUnicode_AsEncodedObject, PyUnicode_AsEncodedUnicode have no coverage
    • _Py_DecodeUTF8Ex and _Py_EncodeUTF8Ex have no coverage for error == surrogateescape
    • PyUnicode_BuildEncodingMap doesn't handle the need_dict case
    • ucs1lib_find_slice and ucs1lib_rfind_slice aren't covered.
    • PyUnicode_Count has no coverage
    • gh-94808: Cover str.rsplit for UCS1, UCS2 or UCS4 #98228
    • PyUnicode_CompareWithASCIIString has no coverage for comparing with UCS2 or UCS4
    • _PyUnicode_EqualToASCIIId has no coverage
  • Objects/unicodetype_db.h
  • Objects/unionobject.c
  • Objects/weakrefobject.c
  • Parser/action_helpers.c
    • _PyPegen_set_expr_context doesn't cover "starred kind"
    • _PyPegen_get_expr_name switch statement coverage is non-exhaustive
  • Parser/myreadline.c (N/A Windows-only)
  • Parser/parser.c
  • Parser/peg_api.c
  • Parser/pegen.c
  • Parser/pegen.h
  • Parser/pegen_errors.c
  • Parser/string_parser.c
  • Parser/tokenizer.c
  • Python/Python-ast.c Generated code
  • Python/Python-tokenize.c
  • Python/_warnings.c
    • show_warning doesn't cover the case where there is a sourceline.
    • PyErr_WarnExplicit has no coverage
  • Python/asdl.c
  • Python/ast.c
    • ensure_literal_* functions aren't covered
    • validate_pattern_match_value doesn't cover all elements of switch
  • Python/ast_opt.c
    • check_complexity doesn't cover the frozenset case
    • ast_foldbody isn't covered
  • Python/ast_unparse.c
  • Python/bltinmodule.c
  • Python/bootstrap_hash.c
  • Python/ceval.c
  • Python/ceval_gil.h
  • Python/codecs.c
  • Python/compile.c
    • write_instr is not handling the case where ilen > 2. It might be that those are never seen in practice...? If so, feel free to close this bug.
    • check_ann_subscr doesn't have any coverage for slice or tuple kinds.
    • optimize_basic_block has some opcodes that aren't covered in the JUMP_IF_FALSE_OR_POP and the JUMP_IF_TRUE_OR_POP cases.
  • Python/condvar.h
  • Python/context.c
    • PyContext_Copy, PyContext_Enter, PyContext_Exit have no coverage
  • Python/deepfreeze/deepfreeze.c
  • Python/dtoa.c
  • Python/dup2.c
  • Python/dynamic_annotations.c
  • Python/errors.c
  • Python/fileutils.c
    • is_valid_wide_char doesn't test error branches
    • encode_ascii/decode_ascii have no coverage (probably very low priority -- comment says only for platforms with a broken mbstowcs (FreeBSD, OpenIndiana))
    • _Py_stat has no coverage
  • Python/formatter_unicode.c
  • Python/frame.c
  • Python/frozenmain.c
  • Python/future.c
  • Python/getargs.c
  • Python/getopt.c
  • Python/hamt.c
  • Python/hashtable.c
  • Python/import.c
  • Python/importdl.c
  • Python/initconfig.c
  • Python/marshal.c
  • Python/modsupport.c
  • Python/mysnprintf.c
  • Python/mystrtoul.c
  • Python/pathconfig.c
  • Python/preconfig.c
  • Python/pyarena.c
  • Python/pyfpe.c
  • Python/pyhash.c
  • Python/pylifecycle.c
  • Python/pystate.c
  • Python/pystrcmp.c
  • Python/pystrhex.c
  • Python/pystrtod.c
  • Python/pythonrun.c
  • Python/pytime.c
  • Python/specialize.c
  • Python/structmember.c
  • Python/suggestions.c
  • Python/symtable.c
  • Python/sysmodule.c
  • Python/thread.c
  • Python/traceback.c
    • tracebacks with angle-bracketed filenames: [coverage] Missing test for tracebacks with angle bracketed filename #95259
    • tb_printinternal with depth > limit
    • _PyTraceBack_Print_Indented with overflowing tracebacklimit
    • No coverage for _Py_DumpDecimal, _Py_DumpHexadecimal, _Py_DumpASCII, dump_frame, dump_traceback, _Py_DumpTraceback, write_thread_id, _Py_DumpTracebackThreads -- possibly they have tests which are disabled under some circumstances.
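
To give a feel for what closing one of the Objects/abstract.c gaps above might look like, here is a minimal sketch that exercises `PySequence_Repeat`, `PySequence_SetItem` with a negative index, and `PyNumber_Check` on a complex number. It uses `ctypes.pythonapi` as a stand-in so it runs on a stock CPython build; the actual tests in the CPython suite may use a different mechanism, and the wrapper names `repeat`, `set_item` and `is_number` are hypothetical.

```python
import ctypes

api = ctypes.pythonapi  # CPython only; calls are made with the GIL held

# PySequence_Repeat(o, count) returns a new reference, or NULL on error.
api.PySequence_Repeat.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PySequence_Repeat.restype = ctypes.py_object

# PySequence_SetItem(o, i, v) returns 0 on success and -1 on error.
api.PySequence_SetItem.argtypes = [
    ctypes.py_object, ctypes.c_ssize_t, ctypes.py_object,
]
api.PySequence_SetItem.restype = ctypes.c_int

# PyNumber_Check(o) returns 1 if o provides numeric protocols, else 0.
api.PyNumber_Check.argtypes = [ctypes.py_object]
api.PyNumber_Check.restype = ctypes.c_int


def repeat(seq, count):
    """Hypothetical wrapper around PySequence_Repeat."""
    return api.PySequence_Repeat(seq, count)


def set_item(seq, index, value):
    """Hypothetical wrapper around PySequence_SetItem."""
    return api.PySequence_SetItem(seq, index, value)


def is_number(obj):
    """Hypothetical wrapper around PyNumber_Check."""
    return api.PyNumber_Check(obj)


# Repeating a sequence through the C API matches the Python-level result.
assert repeat([1, 2], 3) == [1, 2] * 3

# A negative index is resolved relative to the end of the sequence.
items = [10, 20, 30]
assert set_item(items, -1, 99) == 0
assert items == [10, 20, 99]

# complex counts as a number; str does not.
assert is_number(1 + 2j) == 1
assert is_number("text") == 0
```

Going through `ctypes` keeps the sketch self-contained, at the cost of bypassing whatever harness the real test suite uses for C API tests.
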
@mdboom mdboom added the type-bug label Jul 13, 2022
@ezio-melotti
Member

ezio-melotti commented Jul 13, 2022

@mdboom: I think a better approach would be to create sub-checklists or a single issue per file with another checklist for the paths that need coverage, instead of opening dozens of issues for each individual path.

Even having an issue per file will result in 548 new issues though, so I'm not sure if we want to do that preemptively for each file. I'd say it's better to keep the checklist with the files and paths here, and then directly create PRs for each (or multiple) paths.

Another option is to create a project, and handle it there. You can add a new custom field to specify the file, and create draft issues for each path without creating actual issues here (let me know if you need more help with that).

@mdboom
Contributor Author

mdboom commented Jul 13, 2022

I think a better approach would be to create sub-checklists or a single issue per file with another checklist for the paths that need coverage, instead of opening dozens of issues for each individual path.

I thought about that, but most of the uncovered areas will have independent fixes, and bug-per-area sets up that work to happen.

I'm starting with files that have seen a lot of changes lately, so we're seeing quite a few issues in them. I suspect most files will not be that way -- most will probably have no issues, and we can just check the box here and not create a flood of issues where they aren't needed.

Even having an issue per file will result in 548 new issues though, so I'm not sure if we want to do that automatically for each file. I'd say it's better to keep this checklist here, and create PRs.

Agreed.

Another option is to create a project, and handle it there (let me know if you need help with that).

I'm happy to use a project instead if you'd prefer. IIUC, it's pretty easy to move the existing issues already created into it. I know I can't create a project, but once it's created, I don't know what my limited permissions will allow me to do.

@ezio-melotti
Member

ezio-melotti commented Jul 13, 2022

Maybe it would be better to start by trimming down the list to remove files that have no issues, and see how many files are left first.

The remaining files and their paths could be listed here, and if someone starts working on them and wants to discuss the approach they could create the issues lazily or directly create PRs that refer to this meta issue if the fix is straightforward enough.

I don't think having a lot of almost empty issues (like #94817) will help this effort.

@mdboom
Contributor Author

mdboom commented Jul 13, 2022

Maybe it would be better to start by trimming down the list to remove files that have no issues, and see how many files are left first.

I programmatically removed files that have 100% coverage, and then also (manually) removed anything platform-specific and all of Modules (which can be handled easily in separate waves later -- they aren't a priority now). This gets us down to 136 tasks from 500+, which is a lot more manageable.
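
For anyone wanting to repeat that trimming later, a minimal sketch of the filter could look like the following. It assumes per-file line counts are already available as a mapping of path to `(covered, total)`. The function name `files_to_review`, the hand-maintained `platform_specific` set (standing in for the manual step), and the numbers are all made up for illustration; this is not the script that was actually used.

```python
from pathlib import PurePosixPath


def files_to_review(coverage, platform_specific=frozenset()):
    """Return the paths that still need a manual read-through.

    ``coverage`` maps a source path to ``(covered_lines, total_lines)``.
    Fully covered files, everything under ``Modules/``, and any path in
    ``platform_specific`` are dropped.
    """
    keep = []
    for path, (covered, total) in sorted(coverage.items()):
        if total == 0 or covered == total:
            continue  # nothing left to cover in this file
        if PurePosixPath(path).parts[0] == "Modules":
            continue  # deferred to a later wave
        if path in platform_specific:
            continue  # removed by hand in the original pass
        keep.append(path)
    return keep


# Invented numbers, for illustration only.
sample = {
    "Objects/abstract.c": (850, 1000),
    "Objects/boolobject.c": (40, 40),
    "Modules/_json.c": (300, 500),
    "PC/winreg.c": (0, 700),
    "Python/compile.c": (4000, 4200),
}

print(files_to_review(sample, platform_specific={"PC/winreg.c"}))
```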

The remaining files and their paths could be listed here, and if someone starts working on them and wants to discuss the approach they could create the issues lazily or directly create PRs that refer to this meta issue if the fix is straightforward enough.

My thinking was this checklist was to say "someone has read through the file and identified all of the potential issues". Some of the issues will have simple tests that can be written, some are dead code, some might reveal bugs, but we need a place to have those discussions and deal with them individually (not here, ideally). For really simple ones, if it's ok to just reference this bug, that's fine by me, but we still need a way to keep track of which files have been vetted to track progress.

I don't think having a lot of almost empty issues (like #94817) will help this effort.

Maybe not in general. The ones I've filed so far are directly tied to the faster CPython work, and the ability to move with more confidence there. So there are motivated people who want to close these bugs. But I empathize with the concern of just creating bugs for the sake of creating them.

@ezio-melotti
Member

ezio-melotti commented Jul 14, 2022

This gets us down to 136 tasks from 500+, which is a lot more manageable.

This is much better, thanks for doing this!

My thinking was this checklist was to say "someone has read through the file and identified all of the potential issues".

My idea was to do something like:

  • Python/compile.c
    • write_instr is not handling the case where ilen > 2. It might be that those are never seen in practice...? If so, feel free to close this bug.
    • check_ann_subscr doesn't have any coverage for slice or tuple kinds.
    • optimize_basic_block has some opcodes that aren't covered in the JUMP_IF_FALSE_OR_POP and the JUMP_IF_TRUE_OR_POP cases.
  • Python/condvar.h
  • Python/context.c
  • ...

Once people start working on these items, there are different options. They can:

  1. create a single PR to fix all 3, and add it to the checklist (maybe next to the filename)
  2. create multiple PRs to fix them individually, and add them to the checklist (as sub-items)
  3. create a single issue for the file, with its own checklist, in which to discuss all 3, add it to the checklist, and then create PRs linked to the issue
  4. create multiple issues for each of the problems in case they need to be discussed individually (what you were doing)

The first two options work for straightforward cases, the third works if the problems are similar, and the last for more complex cases that require discussions. You can also do this incrementally, i.e. start with a list like the one above, then convert some to PRs and some to issues as needed.

In addition, if you hover with the mouse over a checklist item, a ⨀ icon will appear to the right which will allow you to quickly convert the item to an actual issue if/when you decide to open a discussion about a specific file or problem.

So there are motivated people who want to close these bugs.

That's great to hear, I just wanted to avoid creating a bunch of issues that might end up sitting there indefinitely :)

@mdboom
Contributor Author

mdboom commented Jul 14, 2022

Thanks. The revised plan you suggested should work just fine. I didn't know about the feature to automatically open an issue from a checklist item.

@brandtbucher
Member

brandtbucher commented Jul 14, 2022

@pablogsal, should we be backporting these coverage-improving tests to 3.11? I think we should be consistent in our handling of all of them.

I don't think we usually backport test-only PRs, but perhaps these are sort of a special case.

@pablogsal
Member

pablogsal commented Jul 15, 2022

Test-only PRs are generally ok to backport 👍

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 15, 2022
…gits is provided (pythonGH-94860)

(cherry picked from commit 625ba9b)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 15, 2022
`bool_new` had no coverage.

Automerge-Triggered-By: GH:brandtbucher
(cherry picked from commit df4d53a)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
miss-islington added a commit that referenced this issue Jul 15, 2022
…igits is provided (GH-94860) (GH-94882)

(cherry picked from commit 625ba9b)


Co-authored-by: Michael Droettboom <mdboom@gmail.com>

Automerge-Triggered-By: GH:brandtbucher
sobolevn added a commit to sobolevn/cpython that referenced this issue Oct 12, 2022
sobolevn added a commit to sobolevn/cpython that referenced this issue Oct 13, 2022
JelleZijlstra pushed a commit to JelleZijlstra/cpython that referenced this issue Oct 15, 2022
(cherry picked from commit f01b56c)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 15, 2022
…8228)

(cherry picked from commit b7dd2ca)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
JelleZijlstra added a commit that referenced this issue Oct 15, 2022
(cherry picked from commit f01b56c)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
sobolevn added a commit to sobolevn/cpython that referenced this issue Oct 15, 2022
JelleZijlstra pushed a commit that referenced this issue Oct 16, 2022
JelleZijlstra pushed a commit to JelleZijlstra/cpython that referenced this issue Oct 16, 2022
…bals`, `PyFunction_GetModule` (pythonGH-98158).

(cherry picked from commit 7b48d02)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
carljm added a commit to carljm/cpython that referenced this issue Oct 17, 2022
* main: (31 commits)
  pythongh-95913: Move subinterpreter exper removal to 3.11 WhatsNew (pythonGH-98345)
  pythongh-95914: Add What's New item describing PEP 670 changes (python#98315)
  Remove unused arrange_output_buffer function from zlibmodule.c. (pythonGH-98358)
  pythongh-98174: Handle EPROTOTYPE under macOS in test_sendfile_fallback_close_peer_in_the_middle_of_receiving (python#98316)
  pythonGH-98327: Reduce scope of catch_warnings() in _make_subprocess_transport (python#98333)
  pythongh-93691: Compiler's code-gen passes location around instead of holding it on the global compiler state (pythonGH-98001)
  pythongh-97669: Create Tools/build/ directory (python#97963)
  pythongh-95534: Improve gzip reading speed by 10% (python#97664)
  pythongh-95913: Forward-port int/str security change to 3.11 What's New in main (python#98344)
  pythonGH-91415: Mention alphabetical sort ordering in the Sorting HOWTO (pythonGH-98336)
  pythongh-97930: Merge with importlib_resources 5.9 (pythonGH-97929)
  pythongh-85525: Remove extra row in doc (python#98337)
  pythongh-85299: Add note warning about entry point guard for asyncio example (python#93457)
  pythongh-97527: IDLE - fix buggy macosx patch (python#98313)
  pythongh-98307: Add docstring and documentation for SysLogHandler.createSocket (pythonGH-98319)
  pythongh-94808: Cover `PyFunction_GetCode`, `PyFunction_GetGlobals`, `PyFunction_GetModule` (python#98158)
  pythonGH-94597: Deprecate child watcher getters and setters (python#98215)
  pythongh-98254: Include stdlib module names in error messages for NameErrors (python#98255)
  Improve speed. Reduce auxiliary memory to 16.6% of the main array. (pythonGH-98294)
  [doc] Update logging cookbook with an example of custom handling of levels. (pythonGH-98290)
  ...
JelleZijlstra added a commit that referenced this issue Oct 19, 2022
#98317)

[3.11] gh-94808: Cover `PyFunction_GetCode`, `PyFunction_GetGlobals`, `PyFunction_GetModule` (GH-98158).
(cherry picked from commit 7b48d02)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
sobolevn added a commit to sobolevn/cpython that referenced this issue Oct 19, 2022
JelleZijlstra pushed a commit that referenced this issue Oct 20, 2022
…#98291)

gh-94808: Cover `str.rsplit` for UCS1, UCS2 or UCS4 (GH-98228)
(cherry picked from commit b7dd2ca)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
sobolevn added a commit to sobolevn/cpython that referenced this issue Oct 20, 2022
sobolevn added a commit to sobolevn/cpython that referenced this issue Oct 22, 2022
@sobolevn
Member

sobolevn commented Oct 22, 2022

Question from #98545

Do we care to cover deprecated API?

@gvanrossum
Member

gvanrossum commented Oct 22, 2022

No, just like we don’t care to fix bugs in deprecated code (security excepted).

@sobolevn
Member

sobolevn commented Oct 22, 2022

Thank you, #98545 is now closed.

9 participants