How to disable the “defined by another cell” error?
While experimenting and testing models, I usually reuse some variables from cell to cell. For example, when training models, I can set the device variable to cuda for a better coding experience.
device = torch.device("cuda")
And when I want to use this model later in the same notebook and leave the GPU for other processes, I’ll do:
device = torch.device("cpu")
But marimo forbids this type of operation!
The same applies to sklearn models, where I train models in several cells, reusing the “model” variable.
model = TypeOfModel()
model.fit()
Of course, it’s not clean or anywhere close to production-looking code, but this is what I want from notebooks: fast and simple “proof of concept” code.
P.S. Forgive me for the code blocks here, mobile Discord sucks
16 Replies
This isn't something we'll support, because it makes the notebook not a dataflow graph, which breaks a lot of assumptions that marimo needs to make in order to eliminate hidden state and make notebooks executable as apps/scripts.
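To make the "dataflow graph" point a bit more concrete, here is a rough toy sketch of why two cells defining the same global name is a problem. This is not marimo's implementation; `defined_names`, `find_conflicts`, and the cell ids are hypothetical names made up for illustration, and plain strings stand in for `torch.device(...)` to keep it self-contained.

```python
import ast


def defined_names(cell_source):
    """Return global names assigned at the top level of a cell.

    Names starting with an underscore are skipped, mirroring the
    cell-local convention discussed in this thread.
    """
    names = set()
    for node in ast.parse(cell_source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and not target.id.startswith("_"):
                    names.add(target.id)
    return names


def find_conflicts(cells):
    """Map each name defined by more than one cell to the cells defining it."""
    definers = {}
    for cell_id, source in cells.items():
        for name in defined_names(source):
            definers.setdefault(name, []).append(cell_id)
    return {name: ids for name, ids in definers.items() if len(ids) > 1}


# Hypothetical notebook: two cells assign `device`, one uses a local `_device`.
cells = {
    "train": 'device = "cuda"',
    "infer": 'device = "cpu"',
    "scratch": '_device = "cpu"',
}
conflicts = find_conflicts(cells)
print(conflicts)
```

With two definers of `device`, a cell that reads `device` has no single upstream cell to depend on, so the run order would be ambiguous. The underscore-prefixed name never enters the graph in this sketch.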
As you may know, you can use local variables (_device), or better yet encapsulate code in functions and/or give variables meaningful names.
Kinda sad, but okay, I can live with that
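For anyone reading along, the "encapsulate code in functions and/or give variables meaningful names" suggestion might look roughly like this. It is a minimal sketch in plain Python; `MeanRegressor` and `train_and_predict` are hypothetical stand-ins (not sklearn or marimo APIs), used so the example runs without extra dependencies.

```python
class MeanRegressor:
    """Toy model that always predicts the mean of its training targets."""

    def fit(self, targets):
        self.mean_ = sum(targets) / len(targets)
        return self

    def predict(self, n_predictions):
        return [self.mean_] * n_predictions


def train_and_predict(targets, n_predictions):
    # `model` is local to this function, so reusing the name elsewhere
    # does not create a second global definition.
    model = MeanRegressor().fit(targets)
    return model.predict(n_predictions)


# Distinct, descriptive global names instead of one name assigned twice.
baseline_predictions = train_and_predict([1.0, 2.0, 3.0], 2)
holdout_predictions = train_and_predict([10.0, 20.0], 1)
```

Only `baseline_predictions` and `holdout_predictions` end up as globals here, each defined exactly once.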
UPD. It's getting more uncomfortable...
If I do
in one cell, I'm forced to use _file, __file, ___file, and so on in other cells.
Is there anything we could do about this? This is really annoying
Variables starting with an underscore are local to a cell. In this example, _file is local to a cell. You can do
in every cell. No need to keep adding underscores.
Is that good enough?
Hm, so why was my marimo complaining about an already defined _file?
Well, I double-checked now and there is no problem with _file right now. Maybe that was a nightmare, idk. Sorry for bothering
haha no worries, it happens
@Akshay Well... another problem...
How do I use for i in range(...) loops in different cells if I always get “i was defined by another cell”? Using _i is very ugly and less readable
I personally use _i (it was a bit hard getting accustomed to it when migrating from traditional Jupyter notebooks); not sure what other workaround there is. If you want more insights, I would recommend referring to the best practices here - https://docs.marimo.io/guides/best_practices/index.html#best-practices
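Besides _i, one option in the spirit of the "encapsulate code in functions" advice above is to keep the loop inside a function, so a plain `i` stays local. A minimal sketch, with `squares_up_to` as a hypothetical helper name. The comprehension variant relies on standard Python 3 scoping; that marimo also treats a comprehension variable as not defining a global is my assumption, not something confirmed in this thread.

```python
def squares_up_to(n):
    results = []
    for i in range(n):  # `i` is local to the function, not a notebook global
        results.append(i * i)
    return results


squares = squares_up_to(4)

# In Python 3 a comprehension's loop variable is scoped to the comprehension,
# so this line does not bind a module-level `i` either.
cubes = [i ** 3 for i in range(4)]
```

Either way, the only globals defined are `squares_up_to`, `squares`, and `cubes`, and the code still reads normally if exported as a script.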
Yeah, but this is ugly looking and raises too many questions when I want to export my code somewhere.
Can I just disable “dataflow graph” when I don’t need it? Experimenting in notebooks should be easier
Not sure if that's an existing/viable option (the dataflow graph being embedded in the core values and principles upon which marimo was built); maybe the core contributors know a suitable workaround/alternative.
There's some more discussion here about read-only variables: https://github.com/marimo-team/marimo/issues/1477
But if you use the scratchpad, there are no definition constraints. Normally, I iterate and play around there, and make the minor changes when I insert the code into the notebook.
GitHub
Relax Uniqueness Constraint for Write-only Variables · Issue #1477 ...
Description I am hoping to be able to define cells like this, which is a common pattern in existing notebooks. cell1: fig, ax = plt.subplots() ax.plot(...) cell2: fig, ax = plt.subplots() ax.plot(....
I tried implementing relaxing the uniqueness constraint twice (I have two branches on my machine) and ran into many edge cases. It also introduces a tradeoff, increasing memory pressure because marimo has to keep a copy of each duplicated variable. Fine for loop indices, not fine for df = my_big_dataset(), X = my_cuda_tensor. In the end, in the spirit of Python, I think simple is better than magical, explicit better than implicit
Disabling the graph would mean disabling many other features that marimo provides — running as apps, elimination of hidden state, module hotreloading. We don't want to bifurcate our users, and the notebooks they share, into "graph" users and "non-graph" users.
Hopefully the scratchpad helps. If we can find a way to work around the memory pressure that comes from relaxing the uniqueness constraint, we could consider relaxing it. But we need to keep every copy of duplicated variables around, otherwise deleting a cell would mark all other cells using the duplicated variable as stale, which would then increase compute pressure, which also is unacceptable (imagine all your for loops rerunning just because you deleted a totally unrelated cell).
@Akshay I’m sorry, but this is now a showstopper for me
And inplace=True is not a great way to overcome this issue at all
Stack Overflow
In pandas, is inplace = True considered harmful, or not?
This has been discussed before, but with conflicting answers:
in-place is good!
in-place is bad!
What I'm wondering is:
Why is inplace = False the default behavior?
When is it good to change it?...
I’m really sorry, but I’m forced to quit marimo because of this 😦
Sorry to hear it! I'm glad you gave it a try.