David Eng · 2mo ago

Kernel dies in a notebook that manipulates a large Pandas dataframe

The notebook loads a couple of Pandas dataframes, each with 5-10M rows, filters each down to 3-5M rows, samples 10% of the filtered rows, and plots various charts. I'm unsure whether to allocate more resources, make the notebook more memory efficient, or do something else. I can provide the stack trace if helpful.
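For context, the workflow looks roughly like this sketch (the column names, dtypes, and filter condition are placeholders, not my real data):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one of the real dataframes (~5M rows).
n = 5_000_000
df = pd.DataFrame({
    "value": np.random.rand(n).astype("float32"),        # float32 halves memory vs float64
    "group": pd.Categorical(np.random.choice(list("abc"), n)),
})
print(df.memory_usage(deep=True).sum() / 1e6, "MB resident before filtering")

# Filter, then drop the full frame so only the smaller copy stays in memory.
filtered = df[df["value"] > 0.3]
del df

# Sample 10% before plotting so the plotting call never sees millions of points.
sample = filtered.sample(frac=0.10, random_state=0)
sample["value"].plot.hist(bins=50)
```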
2 Replies
Hall · 2mo ago
Someone will reply to you shortly. In the meantime, this might help:
Akshay · 2mo ago
The stack trace would be helpful. If you could put together a notebook that reproduces the issue (using synthetic data if your data is private), that would be very helpful. We did push some improvements to dataframe performance recently, so:
1. Curious whether you're still seeing issues on the latest version (0.9.32 or above).
2. If you have a repro or stack trace, that would help.
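Something along these lines would work as a starting point for a repro (the sizes and filter below are guesses at the workload, not the actual notebook):

```python
import numpy as np
import pandas as pd

def make_frame(n_rows: int) -> pd.DataFrame:
    """Synthetic stand-in for one of the real dataframes."""
    return pd.DataFrame({
        "id": np.arange(n_rows),
        "value": np.random.rand(n_rows),
        "group": np.random.choice(list("abcde"), n_rows),
    })

# Two frames in the 5-10M row range described above: filter, sample 10%, plot.
for i, df in enumerate([make_frame(8_000_000), make_frame(6_000_000)]):
    filtered = df[df["value"] > 0.5]          # roughly halves the rows
    sample = filtered.sample(frac=0.10, random_state=0)
    sample["value"].plot.hist(bins=50, alpha=0.5, label=f"frame {i}")
```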