Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C API] Convert a few stdlib extensions to the limited C API (PEP 384) #85283

Open
vstinner opened this issue Jun 25, 2020 · 36 comments
Open

[C API] Convert a few stdlib extensions to the limited C API (PEP 384) #85283

vstinner opened this issue Jun 25, 2020 · 36 comments
Labels
extension-modules C modules in the Modules dir topic-C-API

Comments

@vstinner
Copy link
Member

vstinner commented Jun 25, 2020

BPO 41111
Nosy @freddrake, @smontanaro, @rhettinger, @vstinner, @tiran, @encukou, @pmp-p, @ericsnowcurrently, @corona10, @shihai1991
PRs
  • bpo-41111: Move the Py_LIMITED_API macro of xxlimited module from setup.py to xxlimited.c. #25115
  • bpo-41111: xxlimited.c defines Py_LIMITED_API #25151
  • bpo-41111: Build limited C API without Py_TRACE_REFS macro. #25180
  • gh-85283: Extending Argument Clinic to support to use the limited C API. #26080
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-06-25.08:54:20.867>
    labels = ['extension-modules', 'expert-C-API', '3.11']
    title = '[C API] Convert a few stdlib extensions to the limited C API (PEP 384)'
    updated_at = <Date 2022-02-17.16:21:09.570>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-02-17.16:21:09.570>
    actor = 'eric.snow'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules', 'C API']
    creation = <Date 2020-06-25.08:54:20.867>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41111
    keywords = ['patch']
    message_count = 27.0
    messages = ['372307', '372309', '372310', '372311', '372315', '372316', '372321', '372324', '372325', '372327', '372332', '372386', '376506', '376507', '380415', '389925', '389959', '390068', '390159', '390161', '390179', '390326', '390333', '390389', '393541', '393635', '393640']
    nosy_count = 11.0
    nosy_names = ['fdrake', 'skip.montanaro', 'rhettinger', 'gustavo', 'vstinner', 'christian.heimes', 'petr.viktorin', 'pmpp', 'eric.snow', 'corona10', 'shihai1991']
    pr_nums = ['25115', '25151', '25180', '26080']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue41111'
    versions = ['Python 3.11']

    Linked PRs

    @vstinner
    Copy link
    Member Author

    Python stdlib has around 139 extension modules. Not all of them *need* to use internal C API. I'm interested to try to convert a bunch of them to the limited C API, as part of my work on the PEP-620. IMHO "Eating your own dog food" is a good practice to ensure that the limited C API is usable.

    Currently, the stdlib has exactly one extension module using the limited C API: the xxlimited module built with Py_LIMITED_API macro defined as 0x03050000 in setup.py. By the way, maybe Py_LIMITED_API should be defined in xxlimited.c, rather than in setup.py.

    xxlimited.c is not built if Python is built in debug mode. I'm not sure why.

    The main limitation to use the limited C API for stdlib is Argument Clinic which attempts to always emit the most efficient code, like using the METH_FASTCALL calling convention and use private functions like _PyArg_CheckPositional() or "static _PyArg_Parser _parser".

    Argument Clinic could be modified to have an option to only use C API of the limited C API. Cython is working on a similar option (restraint emitted code to the limited C API).

    I already tried to convert stdlib extensions to the limited C API in bpo-39573. I found other issues:

    • PyTypeObject is opaque and so it's not possible to implement a deallocator function (tp_dealloc) which calls tp_free like: Py_TYPE(self)->tp_free((PyObject*)self);
    • _Py_IDENTIFIER() is not part of the limited C API

    https://bugs.python.org/issue39573#msg361514

    @vstinner vstinner added 3.10 only security fixes extension-modules C modules in the Modules dir labels Jun 25, 2020
    @vstinner
    Copy link
    Member Author

    See also bpo-27499 "PY_SSIZE_T_CLEAN conflicts with Py_LIMITED_API".

    @vstinner
    Copy link
    Member Author

    Other missing features of the limited C API:

    @vstinner
    Copy link
    Member Author

    PyType_FromSpec() doesn't support 11 slots:

    • tp_dict
    • tp_mro
    • tp_cache
    • tp_subclasses
    • tp_weaklist
    • tp_vectorcall
    • tp_weaklistoffset (see PyMemberDef)
    • tp_dictoffset (see PyMemberDef)
    • tp_vectorcall_offset (see PyMemberDef)
    • bf_getbuffer
    • bf_releasebuffer

    https://docs.python.org/dev/c-api/type.html#c.PyType_Slot

    But it is possible to define tp_weaklistoffset, tp_dictoffset and tp_vectorcall_offset via Py_tp_members slot:
    https://docs.python.org/dev/c-api/structures.html#pymemberdef-offsets

    • "__dictoffset__" => tp_dictoffset
    • "__weaklistoffset__" => tp_weaklistoffset
    • "__vectorcalloffset__" => tp_vectorcall_offset

    Maybe we can do add new members for the 8 missing slots, especially bf_getbuffer and bf_releasebuffer?

    commit f7c4e23 of bpo-40724 added Py_bf_getbuffer and Py_bf_releasebuffer slots to the C API, but these slots are still not available in the limited C API: see bpo-10181.

    @vstinner
    Copy link
    Member Author

    See also bpo-39123: "PyThread_xxx() not available when using limited API".

    @vstinner
    Copy link
    Member Author

    See also bpo-23903: "Generate PC/python3.def by scraping headers". Steve Dower suggests to convert macros to functions. I'm not sure which functions are still implemented as macro in the limited C API.

    @vstinner
    Copy link
    Member Author

    See also bpo-16731: "xxlimited/xxmodule docstrings ambiguous".

    @vstinner
    Copy link
    Member Author

    See also bpo-41073: "[C API] PyType_GetSlot() should accept static types".

    @vstinner
    Copy link
    Member Author

    See also ./Tools/scripts/abitype.py script and bpo-10943: "abitype: Need better support to port C extension modules to the stable C API".

    @vstinner
    Copy link
    Member Author

    See also bpo-40077: "Convert static types to PyType_FromSpec()".

    @vstinner
    Copy link
    Member Author

    See also bpo-29086: "Document C API that is not part of the limited API".

    @rhettinger
    Copy link
    Contributor

    While this might make for an interesting validation experiment, I don't think the results should be checked in.

    Code churn is to be avoided when there is no visible user benefit. It risks destabilizing code, introducing errors, and making the code less recognizable to the other developers who created and maintained it.

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 7, 2020

    Some related changes in modules: posix, _hashopenssl, _ast (Python-ast.c), zlib, _struct, tkinter, _curses_panel.

    commit b396663
    Author: Eddie Elizondo <eelizondo@fb.com>
    Date: Tue Nov 5 07:16:14 2019 -0800

    bpo-35381 Remove all static state from posixmodule (GH-15892)
    
    
    
    After bpo-9665, this moves the remaining types in posixmodule to be heap-allocated to make it compatible with PEP-384 as well as modifying all the type accessors to fully make the type opaque.
    
    The original PR that got messed up a rebase: https://github.com/python/cpython/pull/10854. All the issues in that commit have now been addressed since https://github.com/python/cpython/pull/11661 got committed.
    
    This change also removes any state from the data segment and onto the module state itself.
    
    
    https://bugs.python.org/issue35381
    
    
    
    Automerge-Triggered-By: @encukou
    

    commit df69e75
    Author: Christian Heimes <christian@python.org>
    Date: Wed Sep 25 23:03:30 2019 +0200

    bpo-38142: Updated _hashopenssl.c to be PEP-384 compliant (bpo-16071)
    
    * Updated _hashopenssl.c to be PEP-384 compliant
    * Remove refleak test from test_hashlib. The updated type no longer accepts random arguments to __init__.
    

    commit ac46eb4
    Author: Dino Viehland <dinoviehland@fb.com>
    Date: Wed Sep 11 10:16:34 2019 -0700

    bpo-38113: Update the Python-ast.c generator to PEP-384 (gh-15957)
    
    Summary: This mostly migrates Python-ast.c to PEP-384 and removes all statics from the whole file. This modifies the generator itself that generates the Python-ast.c. It leaves in the usage of _PyObject_LookupAttr even though it's not fully PEP-384 compatible (this could always be shimmed in by anyone who needs it).
    

    commit a1ffad0
    Author: Dino Viehland <dinoviehland@gmail.com>
    Date: Tue Sep 10 11:27:03 2019 +0100

    bpo-38074: Make zlib extension module PEP-384 compatible (GH-15792)
    
    Updated zlibmodule.c to be PEP-384 compliant.
    

    commit 4f384af
    Author: Dino Viehland <dinoviehland@gmail.com>
    Date: Tue Sep 10 11:18:37 2019 +0100

    bpo-38076: Make struct module PEP-384 compatible (bpo-15805)
    
    * PEP-384 _struct
    
    * More PEP-384 fixes for _struct
    
    Summary: Add a couple of more fixes for `_struct` that were previously missed such as removing `tp_*` accessors and using `PyBytesWriter` instead of calling `PyBytes_FromStringAndSize` with `NULL`. Also added a test to confirm that `iter_unpack` type is still uninstantiable.
    
    * 📜🤖 Added by blurb_it.
    

    commit d2217a8
    Author: Andrew Svetlov <andrew.svetlov@gmail.com>
    Date: Tue Oct 30 22:49:16 2012 +0200

    Issue bpo-15721: apply PEP-384 Refactoring to tkinter module.
    

    commit bc07cb8
    Author: Martin v. Löwis <martin@v.loewis.de>
    Date: Thu Jun 14 16:01:23 2012 +0200

    Issue bpo-14936: curses_panel was converted to PEP-3121 and PEP-384 API.
    Patch by Robin Schreiber.
    

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 7, 2020

    See also bpo-40077: "Convert static types to PyType_FromSpec()".

    In bpo-1635741, many extension modules have been ported to the multi-phase initilization API (PEP-489), and many static types have been converted to heap types.

    @encukou
    Copy link
    Member

    encukou commented Nov 5, 2020

    Maybe we can do add new members for the 8 missing slots

    See: https://bugs.python.org/issue41618?#msg380414

    @vstinner vstinner changed the title Convert a few stdlib extensions to the limited C API Convert a few stdlib extensions to the limited C API (PEP 384) Nov 19, 2020
    @vstinner vstinner changed the title Convert a few stdlib extensions to the limited C API (PEP 384) [C API] Convert a few stdlib extensions to the limited C API (PEP 384) Nov 19, 2020
    @shihai1991
    Copy link
    Member

    By the way, maybe Py_LIMITED_API should be defined in xxlimited.c, rather than in setup.py.

    +1. Defining Py_LIMITED_API in xxlimited.c is more direct than in setup.py. so I created the PR-25115.

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 1, 2021

    See also bpo-43688: [C API] Support the limited C API in debug mode (Py_INCREF and Py_DECREF).

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 2, 2021

    New changeset 240bcf8 by Victor Stinner in branch 'master':
    bpo-41111: xxlimited.c defines Py_LIMITED_API (GH-25151)
    240bcf8

    @smontanaro
    Copy link
    Contributor

    The latest commit seems to break the build if configured --with-trace-refs.

    @smontanaro
    Copy link
    Contributor

    I should revise that comment. The xxlimited and xxlimited_35 modules fail to build. That seems suboptimal, but perhaps is to be expected. Perhaps it would be better that compiling them not be attempted with configuring --with-trace-refs?

    @shihai1991
    Copy link
    Member

    Perhaps it would be better that compiling them not be attempted with configuring --with-trace-refs?

    +1. The limited C API can't be build under Py_TRACE_REFS now.

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 6, 2021

    New changeset 5787ba4 by Hai Shi in branch 'master':
    bpo-41111: Don't build xxlimited with Py_TRACE_REFS macro (GH-25180)
    5787ba4

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 6, 2021

    Skip Montanaro: "The latest commit seems to break the build if configured --with-trace-refs."

    Oops, I forgot about this special build mode. Thanks for the reminder Skip, and thanks for the fix Hai ;-)

    Skip: By the way, I'm curious, why do you use --with-trace-refs?

    @smontanaro
    Copy link
    Contributor

    Skip: By the way, I'm curious, why do you use --with-trace-refs?

    I'm still horsing around with register opcodes and got in the habit of
    building with pydebug and trace refs enabled.

    @shihai1991
    Copy link
    Member

    The main limitation to use the limited C API for stdlib is Argument Clinic which attempts to always emit the most efficient code, like using the METH_FASTCALL calling convention and use private functions like _PyArg_CheckPositional() or "static _PyArg_Parser _parser".

    PR-26080 adds the feature that Argument Clinic to support to use the Limited C API.
    METH_FASTCALL is the part of the stable ABI in PR-23009.
    Do we need convert METH_VARARGS and METH_KEYWORDS as the part of the stable ABI too(something like METH_FASTCALL)?

    vstinner added a commit to vstinner/cpython that referenced this issue Aug 28, 2023
    The _stat C extension is now built with the limited C API.
    vstinner added a commit to vstinner/cpython that referenced this issue Aug 28, 2023
    This function can be used to convert the syslog extension to the
    limited C API. The PyInterpreterState_Main() is not available in the
    limited C API.
    vstinner added a commit to vstinner/cpython that referenced this issue Aug 28, 2023
    This function can be used to convert the syslog extension to the
    limited C API. The PyInterpreterState_Main() is not available in the
    limited C API.
    @vstinner
    Copy link
    Member Author

    To start the work on this issue, I propose by converting the _stat extension since the only change that it requires is to define the Py_LIMITED_API macro: PR #108573. I hesitate between documenting this change between the "C API" section and the "Build" section. Moreover, I'm not sure if it should be documented as a "Build Changes" in What's New in Python 3.13.

    @vstinner
    Copy link
    Member Author

    One issue is that the TraceRefs build is incompatible with the limited C API 😬

    @erlend-aasland
    Copy link
    Contributor

    To start the work on this issue, I propose by converting the _stat extension since the only change that it requires is to define the Py_LIMITED_API macro: PR #108573. I hesitate between documenting this change between the "C API" section and the "Build" section. Moreover, I'm not sure if it should be documented as a "Build Changes" in What's New in Python 3.13.

    I think Build is a better fit than C API, though I agree it is not optimal.

    @AlexWaygood AlexWaygood removed the 3.11 bug and security fixes label Aug 29, 2023
    vstinner added a commit to vstinner/cpython that referenced this issue Aug 29, 2023
    The _stat C extension is now built with the limited C API.
    @vstinner
    Copy link
    Member Author

    One issue is that the TraceRefs build is incompatible with the limited C API 😬

    I created issue #108634 for that.

    vstinner added a commit to vstinner/cpython that referenced this issue Aug 31, 2023
    The _stat C extension is now built with the limited C API.
    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 1, 2023

    Extensions which look interesting to be converted to the limited C API:

    • _bisect: use PyObject_CallOneArg(), _PyObject_CallMethod() which can be replaced with PyObject_CallMethodOneArg(). Argument Clinic uses _Py_convert_optional_to_ssize_t(), it is going to be fixed.
    • fcntl: need PySys_Audit(), Argument Clinic uses private _PyLong_FileDescriptor_Converter()
    • syslog: need PySys_Audit(), Py_XSETREF() can be replaced with Py_XDECREF(), PyUnicode_AsUTF8() should be replaced with PyUnicode_AsUTF8String() (replace S_ident_o global with S_ident_utf8).
    • md5: replace _PyType_GetModuleState() with PyType_GetModuleState(). _Py_strhex() can be rewritten.
    • pwd: need PyMem_RawRealloc(), posixmodule.h should not exclude the limited C API.
    • grp: need PyMem_RawRealloc(), replace PyStructSequence_SET_ITEM() with PyStructSequence_SetItem(), posixmodule.h should not exclude the limited C API.
    • resource: need PySys_Audit(), replace PyStructSequence_SET_ITEM() with PyStructSequence_SetItem().

    md5: replace _Py_strhex():

        const char *hexdigits = "0123456789abcdef";
        char digest_hex[MD5_DIGESTSIZE * 2];
        char *str = digest_hex;
        for (size_t i=0; i < MD5_DIGESTSIZE; i++) {
            unsigned char byte = digest[i];
            *str++ = hexdigits[byte >> 4];
            *str++ = hexdigits[byte & 0x0f];
        }
        return PyUnicode_FromStringAndSize(digest_hex, sizeof(digest_hex));

    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raise an exception if the event argument is NULL or
    if the "N" format is used in the *format* argument.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raise an exception if the event argument is NULL or
    if the "N" format is used in the *format* argument.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 5, 2023
    This function was added in Python 3.8 by the PEP 578 "Python Runtime
    Audit Hooks". It is needed to convert some stdlib extensions to the
    limited C API, like fcntl, resource and syslog.
    
    Move also non-limited "PerfMap" C API from Include/sysmodule.h to
    Include/cpython/sysmodule.h.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 6, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 18, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 19, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner added a commit to vstinner/cpython that referenced this issue Sep 29, 2023
    PySys_Audit() now raises an exception if the event argument is NULL
    or if the "N" format is used in the *format* argument.
    
    Add tests on PySys_AuditTuple() and on new PySys_Audit() error code
    paths.
    vstinner pushed a commit to vstinner/cpython that referenced this issue Sep 30, 2023
    Add PyMem_RawMalloc(), PyMem_RawCalloc(), PyMem_RawRealloc() and
    PyMem_RawFree() to the limited C API.
    
    These functions were added by Python 3.4 and are needed to port
    stdlib extensions to the limited C API, like grp and pwd.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir topic-C-API
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants