Stack overflows in Python seem to kill the demo apps

I set up a test doing a recursive function call to trigger a stack overflow.

cnt = 0
def MyFunction(message, x, y):
    global cnt
    cnt = cnt + 1
    MyFunction(message + str(cnt), x + 1, y + 1)
    
print("Starting overflow test")
MyFunction("0", 0, 0)

I tried this in the AlterNet Studio demos for both Python and IronPython, and in both cases the app crashed. Is it possible to detect and recover from a stack overflow in user code? I can see how that would probably be impossible for C# or VB.NET scripting, but it seems like the Python engines might be able to contain it.
Thanks,
Nathan

Hi Nathan,

I don’t think that you can catch and handle the StackOverflow exception once it’s thrown.

However, since Python/IronPython lets you set a callback that runs on every line or function call (this is what our ScriptDebugger does), it should be possible to detect that a StackOverflow exception is about to occur.

Here’s the code I used to test it. It’s a private method inside ScriptDebugger, so it might not be easy to replicate, but the idea is to calculate the stack depth on every call (which will have a negative impact on script performance). If this works for you, we can add an event like OnTraceCallback so that you can handle it in a similar way.

private TracebackDelegate TraceCallback(dynamic frame, dynamic @event, object arg)
{
    // Count the frames on the Python stack by following the f_back links.
    int GetFrameCount(dynamic current)
    {
        int result = 0;
        while (current != null)
        {
            result++;
            current = current.f_back;
        }

        return result;
    }

    // Abort the script before the native stack overflows.
    if (@event == TraceEvents.Call && GetFrameCount(frame) > 100)
        throw new Exception("Detected stack overflow");

    using (((PythonScriptHost)ScriptRun.ScriptHost).Engine.Gil)
        return TraceCallbackCore(new TraceBackFrame(frame), (string)@event, arg);
}
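For readers who want to try the idea outside ScriptDebugger, here is a minimal sketch of the same depth check in plain CPython using sys.settrace. It is not the AlterNet implementation; the names MAX_DEPTH, StackDepthExceeded, depth_guard, and run_guarded are made up for illustration, and the threshold of 100 simply mirrors the sample above.

```python
import sys

MAX_DEPTH = 100  # assumed threshold, mirroring the C# sample above


class StackDepthExceeded(Exception):
    """Raised by the trace hook when the call stack gets too deep."""


def frame_count(frame):
    """Count the frames on the current stack by following f_back links."""
    count = 0
    while frame is not None:
        count += 1
        frame = frame.f_back
    return count


def depth_guard(frame, event, arg):
    # The global trace function receives a "call" event for each new frame.
    if event == "call" and frame_count(frame) > MAX_DEPTH:
        raise StackDepthExceeded("Detected runaway recursion")
    return None  # no per-line tracing for this frame


def recurse(n):
    return recurse(n + 1)


def run_guarded(func, *args):
    """Run func with the depth guard installed and report how it ended."""
    previous = sys.gettrace()
    sys.settrace(depth_guard)
    try:
        func(*args)
        return "finished"
    except StackDepthExceeded as exc:
        return f"stopped: {exc}"
    finally:
        sys.settrace(previous)


print(run_guarded(recurse, 0))
```

The exception raised inside the trace hook propagates into the traced script, so the host can catch it like any other error instead of losing the process.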

Kind regards,
Dmitry

Interesting. Python must be managing its own stack, so you would think it would have a way to handle this. Is the overhead due to the mere existence of the callback, or to the stack check itself? If the callback itself is not too slow, perhaps the stack could be checked on a counter, for example every 2000 statements. A stack overflow is not going to happen quickly, so that should be enough to catch it in the act. Ideally, we would be able to configure the frequency of the checks and the maximum allowed stack depth.
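A rough sketch of the counter idea in plain CPython, assuming a sys.settrace hook: the stack is walked only on every Nth event, and both the interval and the depth limit are configurable. Note that this version counts function calls rather than statements, which is a simplification of the suggestion above. RecursionGuard, DepthLimitExceeded, max_depth, and check_every are hypothetical names, not part of AlterNet Studio.

```python
import sys


class DepthLimitExceeded(Exception):
    """Raised when a sampled check finds the stack too deep."""


class RecursionGuard:
    """Trace hook that walks the stack only on every `check_every`-th call."""

    def __init__(self, max_depth=200, check_every=25):
        self.max_depth = max_depth
        self.check_every = check_every
        self.calls_seen = 0
        self.checks_done = 0

    def __call__(self, frame, event, arg):
        if event != "call":
            return None
        self.calls_seen += 1
        if self.calls_seen % self.check_every == 0:
            self.checks_done += 1
            depth = 0
            while frame is not None:  # the expensive part, done rarely
                depth += 1
                frame = frame.f_back
            if depth > self.max_depth:
                raise DepthLimitExceeded(f"depth {depth} > {self.max_depth}")
        return None  # skip per-line tracing


def recurse(n):
    return recurse(n + 1)


guard = RecursionGuard(max_depth=200, check_every=25)
previous = sys.gettrace()
sys.settrace(guard)
try:
    recurse(0)
    outcome = "finished"
except DepthLimitExceeded:
    outcome = "stopped"
finally:
    sys.settrace(previous)

print(outcome, guard.calls_seen, guard.checks_done)
```

One caveat with sampling: a tight recursive function adds a frame every few statements, so the check interval probably needs to stay well below the gap between the allowed depth and the point where the stack actually overflows.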

The only other solution I can think of would be running Python in another process, but that would be very difficult and would probably add even more overhead.
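For comparison, this is roughly what process isolation looks like in plain CPython with the standard subprocess module. It is only a sketch of the general technique, not something the AlterNet Studio demos do; run_isolated and RUNAWAY are hypothetical names, and a real host would also need a way to exchange data with the child.

```python
import subprocess
import sys


def run_isolated(source, timeout=30):
    """Run Python source in a child interpreter and report how it ended."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "returncode": None, "stderr": "timed out"}
    return {
        "ok": result.returncode == 0,
        "returncode": result.returncode,
        "stderr": result.stderr,
    }


# A script that recurses without a base case.
RUNAWAY = "def f(n):\n    return f(n + 1)\nf(0)\n"

report = run_isolated(RUNAWAY)
print("child succeeded:", report["ok"])
print("parent is still running")
```

Whatever happens to the child, the parent only sees a non-zero exit code, at the cost of starting a new interpreter for each run.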

Hi Nathan,

I looked into it further, and it seems that a Python script run without debugging handles recursion errors correctly.

However, because TraceCallback is called for every line of code, a stack overflow exception is raised before the default recursion limit (1000) is reached.

If you execute the following lines in your script before the recursive code runs, Python raises its own recursion error before the stack overflows, and that error is handled correctly.

import sys
sys.setrecursionlimit(500)
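As a small illustration in standard CPython, lowering the limit turns runaway recursion into an ordinary RecursionError that the script or host can catch. The helper name run_with_limit is made up for this sketch, and 500 is just the value from above.

```python
import sys


def recurse(n):
    return recurse(n + 1)


def run_with_limit(func, limit=500):
    """Call func(0) under a temporary recursion limit and report the result."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        func(0)
        return "finished"
    except RecursionError:
        return "recursion limit hit"
    finally:
        sys.setrecursionlimit(previous)  # restore the original limit


print(run_with_limit(recurse))
```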

We can expose the default recursion limit as a property of the script debugger and set it internally.

However, this approach does not seem to work for IronPython.

Kind regards,
Dmitry

Thanks, Dmitry! That is encouraging. I admit I never tried infinite recursion outside of the debugger. Too bad about IronPython, but that should be OK for us.
Regards,
Nathan

Hi Nathan,

We will include this change in the next minor update, and I will keep you posted.

It also works with IronPython, but in my tests the recursion limit needed to be set to a fairly low number (100) to handle this sort of error correctly.

Kind regards,
Dmitry