Kernel dies in a notebook that manipulates a large Pandas dataframe
The notebook loads a couple of Pandas dataframes, each with 5-10M rows, filters each of them down to 3-5M rows, samples 10% of the rows, and plots various charts. I'm unsure whether to allocate more resources, make the notebook more resource efficient, or something else. I can provide the stack trace if helpful.
2 Replies
Someone will reply to you shortly. In the meantime, this might help:
The stack trace would be helpful. If you could put together a notebook that reproduces the issue (using synthetic data if your data is private), that would help a lot.
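For example, a minimal sketch along these lines (assuming the load/filter/sample/plot workflow you described; the column names, sizes, and thresholds are placeholders, not your actual data):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: ~8M rows per dataframe.
def make_df(n=8_000_000):
    return pd.DataFrame({
        "value": rng.normal(size=n),
        "category": rng.integers(0, 10, size=n),
    })

dfs = [make_df(), make_df()]

# Filter each dataframe down to roughly half its rows, then sample 10%.
filtered = [df[df["value"] > 0] for df in dfs]
samples = [df.sample(frac=0.1, random_state=0) for df in filtered]

# Plot from the samples only.
for s in samples:
    s["value"].hist(bins=100)
plt.show()
```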
We did push some improvements to dataframe performance recently.
1. Curious whether you're still seeing issues on the latest version (0.9.32 or above).
2. If you have a repro or stack trace, that would help.
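If it does turn out to be memory pressure, a couple of things that often help at the 5-10M-row scale — a sketch, assuming mostly numeric and low-cardinality string columns; `shrink` is just an illustrative helper, not part of any library API:

```python
import pandas as pd

# Downcast numeric columns and convert low-cardinality string columns
# to categoricals; this often cuts a dataframe's memory use substantially.
def shrink(df: pd.DataFrame) -> pd.DataFrame:
    for col in df.select_dtypes("integer").columns:
        df[col] = pd.to_numeric(df[col], downcast="integer")
    for col in df.select_dtypes("float").columns:
        df[col] = pd.to_numeric(df[col], downcast="float")
    for col in df.select_dtypes("object").columns:
        if df[col].nunique() < 0.5 * len(df):
            df[col] = df[col].astype("category")
    return df

# Also worth dropping the full dataframes (e.g. `del df_full` with your
# actual variable names) once the 10% samples are taken, so the original
# 5-10M rows aren't held in memory while plotting.
```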