gh-148171: Convert CALL_BUILTIN_FAST to leave inputs on the stack for refcount elimination in JIT#148172
Fidget-Spinner wants to merge 9 commits into python:main from
Results are pretty good. 7% faster on call microbenchmark:
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Rather than adding an extra stack item for all calls everywhere, I think it would make more sense to keep within stack bounds for this one uop. The callable is almost always going to be immortal, and For example: means we can optimize away the decrefs on or, for function calls where we know
Python/optimizer.c
        assert(next->op.code == STORE_FAST);
        operand = next->op.arg;
    }
    else if (uop == _POP_TOP_OPARG) {
We should be special casing as few instructions as possible in the front-end.
This can probably be better handled in the optimizer, as we might want to leave a single _POP_TOP_OPARG rather than a long series of _POP_TOPs.
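A minimal sketch of what handling this in the optimizer might look like, assuming `_POP_TOP_OPARG` pops `oparg` items. The `Uop` struct, the `UOP_*` constants, the function name, and the `max_expand` threshold are all hypothetical and are not CPython's real trace representation. The pass expands short `_POP_TOP_OPARG` uops into individual `_POP_TOP`s and leaves long ones as a single uop:

```c
#include <stddef.h>

/* Hypothetical, simplified uop; not CPython's _PyUOpInstruction. */
enum { UOP_OTHER, UOP_POP_TOP, UOP_POP_TOP_OPARG };

typedef struct {
    int opcode;
    int oparg;
} Uop;

/*
 * Copy `in` to `out`, expanding each UOP_POP_TOP_OPARG whose oparg is in
 * 1..max_expand into that many UOP_POP_TOP uops. Larger pops are left as a
 * single UOP_POP_TOP_OPARG. Returns the number of uops written, or 0 if
 * `out` is too small (the caller would then keep the original trace).
 */
size_t expand_pop_top_oparg(const Uop *in, size_t len,
                            Uop *out, size_t cap, int max_expand)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        int expand = in[i].opcode == UOP_POP_TOP_OPARG
                     && in[i].oparg > 0
                     && in[i].oparg <= max_expand;
        size_t need = expand ? (size_t)in[i].oparg : 1;
        if (n + need > cap) {
            return 0;
        }
        if (expand) {
            for (int k = 0; k < in[i].oparg; k++) {
                out[n].opcode = UOP_POP_TOP;
                out[n].oparg = 0;
                n++;
            }
        }
        else {
            out[n++] = in[i];
        }
    }
    return n;
}
```

With this shape the front-end emits `_POP_TOP_OPARG` unchanged, and the decision to expand or keep it lives in one optimizer pass.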
@markshannon I chose to swap out callable, as it's the most convenient location, and I suspect
    // To maintain a consistent view of the stack in the GC, callable's stack location
    // must be overwritten before we close it.
    _PyStackRef *callable_ptr = args - 2;
    PyStackRef_XSETREF(*callable_ptr, PyStackRef_NULL);
This is a bit strange. Does the code generator force you to do this?
Would this work?
PyObject *res_o = _Py_BuiltinCallFast_StackRef(...);
if (res_o == NULL) {
    ERROR_NO_POP();
}
s = self_or_null;
DEAD(self_or_null);
DEAD(args);
res = PyStackRef_FromPyObjectSteal(res_o);
PyStackRef_CLOSE(callable);
If I do that, the code generator shrinks the "real" stack to nothing, sets the stack pointer to before callable, and only then calls the close.
This is incorrect: self_or_null and args must still be visible from the stack while callable is being closed, because on the free-threaded (FT) build the stack is a GC root.
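The ordering problem can be illustrated with a toy model. This is only a sketch: `Obj`, `Stack`, and every function below are hypothetical stand-ins, not CPython's stack refs or its GC. The "scan" simply counts the non-NULL slots below the stack pointer at the moment the callable is closed, which is when arbitrary code (including a collection) could run.

```c
#include <stddef.h>

typedef struct { int id; } Obj;      /* stand-in for a Python object */

typedef struct {
    Obj *slots[8];
    size_t sp;                       /* the scan only sees slots[0..sp) */
} Stack;

/* Toy root scan: count objects currently visible from the stack. */
size_t visible_roots(const Stack *s)
{
    size_t n = 0;
    for (size_t i = 0; i < s->sp; i++) {
        if (s->slots[i] != NULL) {
            n++;
        }
    }
    return n;
}

/* Toy close: report what a scan triggered during the close would see. */
size_t close_and_scan(Obj *o, const Stack *s)
{
    (void)o;
    return visible_roots(s);
}

/* Stack layout assumed here: [callable, self_or_null, arg0, arg1]. */

/* Ordering A: shrink the stack first, then close callable. */
size_t close_after_shrink(Stack *s)
{
    Obj *callable = s->slots[s->sp - 4];
    s->sp -= 4;                      /* self_or_null and args vanish */
    return close_and_scan(callable, s);
}

/* Ordering B: overwrite callable's slot in place, then close it. */
size_t close_in_place(Stack *s)
{
    Obj **callable_ptr = &s->slots[s->sp - 4];
    Obj *callable = *callable_ptr;
    *callable_ptr = NULL;            /* the other slots stay visible */
    return close_and_scan(callable, s);
}
```

In ordering A the scan sees none of the inputs; in ordering B it still sees `self_or_null` and both arguments, which is the property the in-place overwrite of the callable slot is meant to preserve.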