Python Source Code Analysis: Where Does self Go When You Monkey-Patch a Class Method to eval()?
When trying to get a shell in Python, one trick is to hijack a method of a class and replace it with eval(). At first glance, everything seems fine. On closer examination, though, calling a method normally passes the instance as the first argument, roughly func(self, parameter). So why isn’t self passed as the first argument to eval(), and why is there no error?
To truly understand the situation, let’s take a look at the Python 3.7.8 source code. Configuring the environment for debugging Python source code is similar to debugging PHP source code, so you can refer to relevant articles on setting up a PHP source code debugging environment.
Pro tip: The “Doc” folder in the Python source code contains official documentation (some .rst files). If you don’t know the purpose of a function in the source code, you can search and view it in the Doc folder. In CLion, you can use Ctrl+Shift+F to perform a full search, covering the entire project and even library functions’ source code.
Example
Let’s take a look at the following code. We know that the definition of eval() is eval(expression[, globals[, locals]]).
If we assign eval to __eq__, then by that reasoning executing a == "bb" should call eval(a, "bb"): the expression would be self, and the globals would be "bb". Such a call raises a TypeError, so execution could not continue.
In reality, however, after a.__class__.__eq__ = eval the comparison raises no error and the string on the right-hand side is evaluated normally. Why?
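The contrast between the two expectations can be reproduced with a short script. This is a minimal sketch: the class name A and the string "1 + 1" are chosen only for illustration.

```python
class A:
    pass

a = A()

# What the naive reading predicts: self as the expression, "bb" as globals.
try:
    eval(a, "bb")
    naive_error = None
except TypeError as exc:
    naive_error = exc

# What actually happens once eval sits in __eq__: only the right-hand
# operand reaches eval(), so the string is evaluated as an expression.
a.__class__.__eq__ = eval
result = (a == "1 + 1")

print(type(naive_error).__name__, result)  # TypeError 2
```

The direct call fails as predicted, while the comparison quietly evaluates the right-hand string, which is the behaviour the rest of the analysis explains.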
Analysis
0x01 builtin_eval
In CPython, eval() is a built-in function (its type is builtin_function_or_method). Its C implementation is builtin_eval (see Python/bltinmodule.c).
So, let’s set a breakpoint on this function and see the call stack.
When evaluating a == 'bb', the == comparison triggers the do_richcompare function, and op=2 (Py_EQ) indicates that == is being performed.
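To make the op value concrete, here is a small sketch of the opcode numbering and the special-method name that goes with each opcode. The constants mirror Include/object.h and the name_op table in Objects/typeobject.c; Recorder and make_hook are hypothetical helpers introduced only for this demonstration.

```python
# Rich-comparison opcodes, numbered as in CPython's Include/object.h.
Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT, Py_GE = range(6)

# Special-method name looked up for each opcode (mirrors name_op).
NAME_OP = ["__lt__", "__le__", "__eq__", "__ne__", "__gt__", "__ge__"]


class Recorder:
    """Hypothetical class that records which special method was invoked."""
    calls = []


def make_hook(name):
    # Build a comparison method that logs its own name.
    def hook(self, other):
        Recorder.calls.append(name)
        return True
    return hook


for name in NAME_OP:
    setattr(Recorder, name, make_hook(name))

r = Recorder()
r == "bb"  # a single == comparison

print(Py_EQ, NAME_OP[Py_EQ], Recorder.calls)  # 2 __eq__ ['__eq__']
```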
In CPython's design, a type object has many “slots”: function pointers, such as the one behind __str__, that a class can override. __eq__ is backed by one of them (tp_richcompare).
https://docs.python.org/3.8/c-api/typeobj.html?highlight=slots
Since a.__class__.__eq__ = eval, we can think of eval as having been placed in the slot corresponding to __eq__; this is why the comparison enters slot_tp_richcompare.
If __eq__ had not been replaced, Python would carry out the default comparison during richcompare.
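Because the slot belongs to the type, the assignment has to be made on the class (a.__class__), not on the instance. The following sketch, with an illustrative class A, compares the default behaviour, an instance-level assignment, and a class-level assignment.

```python
class A:
    pass

a = A()

# No custom __eq__: the default comparison falls back to identity.
default_result = (a == "1 + 1")

# Assigning on the instance does not touch the type's slot,
# so the == operator ignores it.
a.__eq__ = eval
instance_result = (a == "1 + 1")

# Assigning on the class updates the slot, and eval now runs.
A.__eq__ = eval
class_result = (a == "1 + 1")

print(default_result, instance_result, class_result)  # False False 2
```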
lookup_maybe_method retrieves the eval stored under __eq__, and call_unbound then invokes it.
But notice that self is still passed into call_unbound. So, where is self being discarded?
Inside call_unbound itself: because unbound=0, the function is called with only the remaining arguments, and self is dropped.
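The branch on unbound can be modelled in a few lines of Python. This is only a rough model of the C function, not its actual code: the Python-level signature and the list-based args parameter are assumptions made for the sketch.

```python
def call_unbound(unbound, func, self, args):
    """Rough Python model of call_unbound() in Objects/typeobject.c."""
    if unbound:
        # unbound=1: prepend self, as for a plain Python function.
        return func(self, *args)
    # unbound=0: self is simply not forwarded.
    return func(*args)


class A:
    pass

a = A()

# unbound=0 is the eval case: only "1 + 1" reaches eval().
dropped = call_unbound(0, eval, a, ["1 + 1"])

# Forcing unbound=1 reproduces the error one would naively expect.
try:
    call_unbound(1, eval, a, ["1 + 1"])
    prepended = None
except TypeError as exc:
    prepended = exc

print(dropped, type(prepended).__name__)  # 2 TypeError
```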
Now that we know where self is being discarded, let’s dig deeper and find out how unbound=0 is set.
Let’s continue reading:
0x02 unbound
By following the code, we reach _PyObject_FastCallDict(), which calls _PyCFunction_FastCallDict(); this CFunction is indeed the eval we are looking for.
Then, we enter the execution of builtin_eval().
So, how does unbound=0 come about? Let’s see what lookup_maybe_method does.
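The relevant macro is PyFunction_Check(op), which in Include/funcobject.h expands to (Py_TYPE(op) == &PyFunction_Type); Py_TYPE(op) in turn simply reads the object's ob_type field.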
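PyFunction_Type is a single global PyTypeObject (a struct, not an array) defined in Objects/funcobject.c, and &PyFunction_Type is its address. Its initializer begins with PyVarObject_HEAD_INIT(&PyType_Type, 0), a macro that fills in the object header (reference count, type pointer, and size); the "function" string that follows is the type's name (tp_name).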
In other words, PyFunction_Check returns 1 only when the object's ob_type is the "function" type (PyFunction_Type). The ob_type of eval is builtin_function_or_method, so the check returns 0.
This can be verified with a simple test: when __eq__ is an ordinary Python function, its ob_type is function, and unbound is set to 1.
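The observable effect of that test can be sketched from the Python side, without a debugger. Here my_eq is a hypothetical replacement function, and type(obj) is types.FunctionType is used as the Python-level counterpart of PyFunction_Check.

```python
import types


def my_eq(self, other):
    # Hypothetical replacement for __eq__; reports what it received.
    return (type(self).__name__, other)


class A:
    pass

A.__eq__ = my_eq
a = A()

# Python-level counterpart of PyFunction_Check.
my_eq_is_function = type(my_eq) is types.FunctionType
eval_is_function = type(eval) is types.FunctionType

# With a plain function in the slot, self is passed along.
received = (a == "bb")

print(my_eq_is_function, eval_is_function, type(eval).__name__, received)
# True False builtin_function_or_method ('A', 'bb')
```

With a plain function in __eq__, both the instance and the right-hand operand arrive, which is the unbound=1 path; eval fails the type check and takes the other path.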
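Next, lookup_maybe_method checks whether the type of the object it found defines __get__ (tp_descr_get). The type of eval, builtin_function_or_method, does not, so the descrgetfunc is NULL and no binding to self takes place. lookup_maybe_method then finishes and returns eval as is, having set unbound = 0.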
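Putting the pieces together, the lookup logic described above can be approximated in Python. This is a sketch of the 3.7 behaviour under stated assumptions: the MRO walk stands in for _PyType_LookupId, the function returns a (callable, unbound) pair instead of using an output parameter, and newer CPython versions may implement the check differently.

```python
import types


def lookup_maybe_method(self, name):
    """Rough Python model of lookup_maybe_method(); returns (func, unbound)."""
    tp = type(self)
    for klass in tp.__mro__:  # stand-in for _PyType_LookupId
        if name in vars(klass):
            res = vars(klass)[name]
            break
    else:
        return None, 0

    if type(res) is types.FunctionType:  # PyFunction_Check(res)
        return res, 1  # caller must prepend self

    descr_get = getattr(type(res), "__get__", None)  # tp_descr_get
    if descr_get is None:
        return res, 0  # nothing to bind: self is never attached
    return descr_get(res, self, tp), 0  # bound through the descriptor


class A:
    pass


def py_eq(self, other):
    return True


A.__eq__ = eval
func, unbound_for_eval = lookup_maybe_method(A(), "__eq__")

A.__eq__ = py_eq
_, unbound_for_function = lookup_maybe_method(A(), "__eq__")

print(func is eval, unbound_for_eval, unbound_for_function,
      hasattr(type(eval), "__get__"))  # True 0 1 False
```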
Conclusion
In studying web security, many language tricks might seem ordinary. However, it’s crucial to understand their underlying principles. Gaining insight into the rationale behind these tricks is often more rewarding than merely memorizing them.
When reading the source code, it’s often helpful to refer to the official documentation. This assists in understanding the design philosophy, gives an overview of the architecture, and facilitates subsequent analysis.