Summary

Title: Heap corruption while rendering
Category: Crash/Critical
Status: Open
Posted By: magdreams ( Don Culwell )
Date Created: 28 August 2017

Problem

Description:

Hi,

We are using Ornatrix 1.3.1 with Redshift 2.5.27 and Maya 2017 Update 3. The first two frames render fine, but if we attempt to send batches of more than two frames to each render machine we are receiving errors. In the RoyalRender error logs we see:

R126| ++++ Executable returned -1073740940 (0x c0000374) as exit code for frame 1064.
R127| ++++ 'A heap has been corrupted.'

R128| ++++ Executable crashed
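
For reference, the negative exit code and the hex value in the log above are the same 32-bit number, and Windows names 0xC0000374 STATUS_HEAP_CORRUPTION. A minimal Python sketch of that conversion follows; the helper names are hypothetical and the lookup table is deliberately incomplete, so treat it as an illustration rather than part of RoyalRender or Maya.

```python
# Hypothetical helpers for reading Windows exit codes out of a render log.
# Only two well-known NTSTATUS values are listed; the table is not exhaustive.
KNOWN_STATUS = {
    0xC0000374: "STATUS_HEAP_CORRUPTION",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


def describe_exit_code(exit_code: int) -> str:
    """Format the exit code as hex and attach a name when one is known."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUS.get(status, "unknown")
    return f"0x{status:08X} ({name})"


if __name__ == "__main__":
    # Exit code taken from the log line for frame 1064.
    print(describe_exit_code(-1073740940))  # 0xC0000374 (STATUS_HEAP_CORRUPTION)
```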

 

It's almost as if memory is not being purged before the next frame renders, and this leads to heap corruption. We are rendering 16-bit Half-Float EXR frames with default compression. Confusingly, the frame number referenced in the error is the first frame that rendered successfully.

 

If we limit our render farm to render only two frames per system, the errors go away (but this takes longer and uses more network bandwidth).

 

Any ideas?

 

Thanks.

Steps to Reproduce:

Hello,

Does this happen only on the render farm? If you render more than two frames on a local computer, do you also get the heap corruption error? It would also be interesting to see what the memory usage is like during the rendering process. If it keeps increasing we might have a leak; otherwise something is writing to protected memory.

Marsel Khadiyev (Software Developer, EPHERE Inc.)
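
One rough way to tell the two cases apart is to note the render process's memory after each frame and check whether it keeps climbing. The sketch below is only a heuristic under stated assumptions: the function name and the 256 MB threshold are hypothetical, the sample values are made up for illustration, and the readings themselves would have to come from whatever monitoring tool is at hand.

```python
from typing import Sequence


def looks_like_leak(samples_mb: Sequence[float], min_growth_mb: float = 256.0) -> bool:
    """Heuristic: True if per-frame memory readings never drop and grow overall.

    samples_mb    -- memory used by the render process after each frame, in MB
    min_growth_mb -- hypothetical threshold for "meaningful" total growth
    """
    if len(samples_mb) < 3:
        # Too few frames to say anything about a trend.
        return False
    never_drops = all(b >= a for a, b in zip(samples_mb, samples_mb[1:]))
    total_growth = samples_mb[-1] - samples_mb[0]
    return never_drops and total_growth >= min_growth_mb


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from this thread.
    print(looks_like_leak([4000, 4600, 5200, 5900]))      # steadily rising
    print(looks_like_leak([12000, 11900, 12050, 11950]))  # roughly flat
```

A flat profile that still ends in a heap corruption exit code would point away from a leak and toward a stray write.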

Hi Marsel,

The RAM usage on a 16GB render machine averages around 12GB while rendering, and 14GB on a 32GB machine. So, it's not close to max, especially on the 32GB machines (which also experienced the errors). I also checked the VRAM usage during rendering, and it uses only 1-2GB on our 12GB Titan-X's.

We just tried a local render in Maya on 5 frames and it rendered without producing any errors. So the memory leak could be isolated to the Maya Batch.

 



-Don

Anyone?



-Don

Thank you for trying these. Did you try rendering in batch mode on your local machine too?

Marsel Khadiyev (Software Developer, EPHERE Inc.)

Yes, the local test render was in batch mode. No errors were received, and the RAM/VRAM memory consumption was well below max.



-Don

Very odd that it'd be specific to the computers on the farm. Are they running the same OS as your local machine?

Marsel Khadiyev (Software Developer, EPHERE Inc.)

Correct, the local and render farm machines all run Win10 Pro creators edition.



-Don

Ok, thank you. We'll discuss what can be done for further diagnosis of the problem.

Marsel Khadiyev (Software Developer, EPHERE Inc.)

Some of the other scenes we render get Assert Failed errors. We also received this error locally just now, when we opened a scene...

 



-Don

If you open that scene again after restarting Maya, do you still get the same assertion? From the code it looks like Maya has trouble finding the shape of the hair node for some reason.

If it is reproducible I'd like to take a look at the scene if possible.

Marsel Khadiyev (Software Developer, EPHERE Inc.)

Unfortunately, we just re-opened the scene and the error did not appear again.

I can send you a scene if it would help you diagnose the problem; however, client confidentiality prevents me from posting it on a forum. Do you have a direct email I can send it to?

 



-Don

Yes, I was hoping the issue would be reproducible. Please send the scene to marsel.khadiyev (at) ephere.com

Marsel Khadiyev (Software Developer, EPHERE Inc.)

Just emailed you a scene.



-Don