[C API] Removed private _PyArg_Parser API has no replacement in Python 3.13 #112136

Open
vstinner opened this issue Nov 16, 2023 · 4 comments

@vstinner
Member

Copy of @tacaswell's message:

aio-libs/multidict is also using _PyArg_Parser and friends (e.g. https://github.com/aio-libs/multidict/blob/18d981284b9e97b11a4c0cc1e2ad57a21c82f323/multidict/_multidict.c#L452-L456 , but there are many more uses)

The _PyArg_Parser API is used by Argument Clinic to generate efficient code parsing function arguments.

Since Python 3.12, when targeting Python internals, Argument Clinic initializes _PyArg_Parser.kwtuple with singleton objects using the _Py_SINGLETON(name) API. This API cannot be used outside Python.

Private functions with a _PyArg_Parser argument:

  • _PyArg_ParseStackAndKeywords()
  • _PyArg_ParseTupleAndKeywordsFast()
  • _PyArg_UnpackKeywords()
  • _PyArg_UnpackKeywordsWithVararg()

cc @serhiy-storchaka

@vstinner
Member Author

A code search on _PyArg_Parser in PyPI top 8,000 projects (2023-11-15) found 5 projects:

  • multidict (6.0.4)
  • multiprocess (0.70.15)
  • mypy (1.7.0)
  • orjson (3.9.10)
  • pickle5 (0.0.12)

@nickelpro

nickelpro commented Nov 16, 2023

These APIs are effectively required for using METH_FASTCALL | METH_KEYWORDS which is a documented part of the API.

Notably METH_FASTCALL is part of the Stable ABI and requires the not-mentioned-here-but-related _PyArg_ParseStack().

"Private, but documented, and partially considered part of the stable ABI" is not the same thing as "Private". Removing these functions leaves no API to interact with these documented call mechanisms. Unless you intend to remove METH_FASTCALL and its derivatives from the documentation and the stable ABI, something must replace these private functions.

Anecdotally, we make extensive use of _PyArg_ParseStack() and _PyArg_ParseStackAndKeywords() internally at NYU in high-traffic CPython methods. I don't think a code search of PyPI is a responsible way to gauge the existence of C API consumers.

@vstinner
Member Author

What I'm trying to do here is to list projects using these APIs, see how they use them, and work out which public API is needed.

Anecdotally, we make extensive use of _PyArg_ParseStack() and _PyArg_ParseStackAndKeywords() internally at NYU in high-traffic CPython methods.

Do you have some examples? Which calling convention do you use? METH_FASTCALL, METH_FASTCALL | METH_KEYWORDS, and/or something else?

"Private, but documented, and partially considered part of the stable ABI" is not the same thing as "Private". Removing these functions leaves no API to interact with these documented call mechanisms.

This situation is unfortunate. METH_FASTCALL was created as a private API but apparently, it was adopted outside Python before it was standardized by PEP 590 – Vectorcall: a fast calling protocol for CPython.

As the author of METH_FASTCALL, I would like to add a clean, public, documented and tested API to make these calling conventions fully usable :-)

  • METH_FASTCALL
  • METH_FASTCALL | METH_KEYWORDS
  • METH_METHOD | METH_FASTCALL | METH_KEYWORDS

In terms of functions, so far the following functions were mentioned:

  • _PyArg_ParseStack()
  • _PyArg_ParseStackAndKeywords() -- uses _PyArg_Parser
  • _PyArg_ParseTupleAndKeywordsFast() -- uses _PyArg_Parser
  • _PyArg_UnpackKeywords() -- uses _PyArg_Parser
  • _PyArg_UnpackKeywordsWithVararg() -- uses _PyArg_Parser

Is the number of arguments always a signed Py_ssize_t type? Or do we have to care about PyVectorcall_NARGS() which uses an unsigned size_t type? If I understood correctly, unsigned size_t is only used to call a function. But inside a function implementation, the calling convention is to pass a number of arguments as a signed Py_ssize_t.

@nickelpro

nickelpro commented Nov 16, 2023

Do you have some examples? Which calling convention do you use? METH_FASTCALL, METH_FASTCALL | METH_KEYWORDS, and/or something else?

I don't think we do anything fancy, basically the same straightforward code you would see pop out of argument clinic:

// Tiny METH_FASTCALL example function
static PyObject *print_func(PyObject *self,
    PyObject *const *args, Py_ssize_t nargs) {
  const char *str;
  if(!_PyArg_ParseStack(args, nargs, "s", &str))
    return NULL;
  puts(str);
  Py_RETURN_NONE;
}

or

// Tiny METH_FASTCALL | METH_KEYWORDS example function
static PyObject* prefix_print(PyObject* self, PyObject* const* args,
    Py_ssize_t nargs, PyObject* kwnames) {
  static const char* _keywords[] = {"", "prefix", NULL};
  static _PyArg_Parser _parser = {.keywords = _keywords,
      .format = "s|s:prefix_print"};

  const char* str;
  const char* prefix = NULL;
  if(!_PyArg_ParseStackAndKeywords(args, nargs, kwnames, &_parser, &str,
         &prefix))
    return NULL;

  if(prefix)
    printf("%s", prefix);
  puts(str);

  Py_RETURN_NONE;
}

for METH_FASTCALL and METH_FASTCALL | METH_KEYWORDS respectively. I imagine that if we converted the latter usage to vectorcall, it would be identical except for replacing nargs with PyVectorcall_NARGS(nargsf). So vectorcall vs METH_FASTCALL isn't really the problem here, unless I'm completely missing some public API for parsing vectorcall arguments.
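
For readers less familiar with the layout these parsers consume: with METH_FASTCALL | METH_KEYWORDS (and with vectorcall), the positional values occupy args[0] through args[nargs - 1], the keyword values follow at args[nargs + i], and kwnames holds the matching names in the same order (or is NULL when no keywords were passed). The following is only a plain-C model of that layout, with C strings standing in for PyObject * values so that it compiles without Python.h. The name model_find_kwarg and its parameters are hypothetical and are not part of the CPython API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plain-C model of the METH_FASTCALL | METH_KEYWORDS argument layout.
 * Strings stand in for PyObject * values; nothing here is CPython API.
 *
 *   args[0 .. nargs-1]            positional values
 *   args[nargs .. nargs+nkw-1]    keyword values, in the order of kwnames
 */
static const char *model_find_kwarg(const char *const *args, ptrdiff_t nargs,
                                    const char *const *kwnames, ptrdiff_t nkw,
                                    const char *name) {
  assert(nargs >= 0 && nkw >= 0);
  for (ptrdiff_t i = 0; i < nkw; i++) {
    if (strcmp(kwnames[i], name) == 0)
      return args[nargs + i]; /* value paired with kwnames[i] */
  }
  return NULL; /* keyword was not passed */
}
```

In this model, a call such as prefix_print("hello", prefix="> ") would arrive as args = {"hello", "> "}, nargs = 1 and kwnames = {"prefix"}. A real implementation would compare Python string objects instead of C strings and would also have to report unknown or duplicated keywords, which is the work the private parser functions currently do.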

A quick Ctrl-F shows no one using kwtuple yet, but I think kwtuple is basically a good feature (and we use pre-initialized global string objects in some other places), so I would be sad to see that capability go away as well.

I would like to add a clean public, documented and tested API to make these calling convention fully usable

Ya 100%. I understand I'm just some random and you're the core dev here, just wanted to raise the objection that breaking this functionality without having that replacement ready doesn't seem like a win to me. The APIs are already internal, there's no stability contract, but if they exist anyway in an unbroken state, what's the harm in leaving them accessible until a replacement is available?

As a practical matter if these disappeared we would be replicating the declarations in our own headers for the Python versions where no solutions existed, and that seems like a pointless maintenance burden to push out into the world.

Is the number of arguments always a signed Py_ssize_t type? Or do we have to care about PyVectorcall_NARGS() which uses an unsigned size_t type?

PyVectorcall_NARGS() accepts a size_t but returns a Py_ssize_t, so from the point of view of the parser, nargs is always signed.
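
To illustrate the signedness point, here is a minimal sketch of the conversion, assuming the vectorcall flag occupies the top bit of a size_t, as PY_VECTORCALL_ARGUMENTS_OFFSET does in CPython's headers. MODEL_ARGUMENTS_OFFSET and model_nargs are hypothetical stand-ins for PY_VECTORCALL_ARGUMENTS_OFFSET and PyVectorcall_NARGS(), and ptrdiff_t stands in for Py_ssize_t, so the block compiles without Python.h.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C model of how the vectorcall flag bit is stripped from nargsf.
 * The names below are stand-ins for illustration, not CPython API. */
#define MODEL_ARGUMENTS_OFFSET ((size_t)1 << (8 * sizeof(size_t) - 1))

static ptrdiff_t model_nargs(size_t nargsf) {
  /* Clear the flag bit. What remains is the positional argument count,
   * which always fits in the signed type handed to the parser. */
  return (ptrdiff_t)(nargsf & ~MODEL_ARGUMENTS_OFFSET);
}
```

With the flag cleared, the unsigned nargsf only exists at the call boundary. Code inside the function body, including any argument parser, only ever sees the signed count.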
