dmad Comments - marimo

dmad

•Created by Ilya I. Lubenets on 10/1/2024 in #help-support

how to disable “defined by another cell” error?

There's some more discussion about read-only variables here: https://github.com/marimo-team/marimo/issues/1477

But if you use the scratchpad, there are no definition constraints. Normally I iterate and play around there, then make the minor changes when I insert the code into the notebook.

22 replies

•Created by Cumulus on 8/8/2024 in #help-support

jupyter-marimo-proxy issues with kubernetes deployed jupyterhub

marimo restricts the host too. You might have to patch jupyter-marimo-proxy to pass in a --host flag as well, or do some proxy forwarding. Just a thought, but it would explain why things work locally (127.0.0.1 is the default allowed host).

5 replies

•Created by ftc45 on 7/30/2024 in #help-support

"kernel not found" when port forwarding

This one should work:

marimo run --port=8080 --proxy=ec2-3-19-54-159.us-east-2.compute.amazonaws.com:80 lvof.py

What are your Apache rules again?

30 replies

•Created by ftc45 on 7/30/2024 in #help-support

"kernel not found" when port forwarding

Cool! Nice to see. @ftc45, how are you spinning up marimo? Through FastAPI, or directly with a background command / service? If it is the latter, you should be able to do:

marimo run --proxy=userside.domain:80 --port=8080  myapp.py

30 replies

•Created by ananis on 7/5/2024 in #help-support

Unable to connect to github copilot

Thanks!

15 replies

•Created by ananis on 7/5/2024 in #help-support

Unable to connect to github copilot

Did you check your firewall? Also, if you open the developer tools, they should let you know why the request fails.

15 replies

•Created by ananis on 7/5/2024 in #help-support

Unable to connect to github copilot

For full verbosity:

marimo -d -l DEBUG edit --host 100.106.136.158 --no-token --proxy 100.106.136.158 --port=2718

It could also be a firewall issue if 27180 is blocked.

15 replies

•Created by ananis on 7/5/2024 in #help-support

Unable to connect to github copilot

Yup. Copilot runs on another server in the background and assumes that it should connect the same way marimo is connecting. --host=localhost is the default, and it restricts access to the server unless the request comes from localhost. That's why you can't connect now; you still need --host={private IP}.

--proxy was set up for domains (e.g. your server is notebooks.company.co). Since you are running with a raw IP, I don't think it is needed, but give it a shot.

Do you have node installed on your server?
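To keep the flag combinations straight while experimenting, here is a minimal sketch that assembles the command line from the flags mentioned in this thread (--host, --port, --proxy, --no-token). The helper name marimo_edit_command is hypothetical and not part of marimo; it only builds the argument list and does not start anything.

```python
import shlex


def marimo_edit_command(host, port=2718, proxy=None, no_token=False):
    """Build the argv for `marimo edit` (hypothetical helper, not a marimo API)."""
    args = ["marimo", "edit", "--host", host, "--port", str(port)]
    if proxy is not None:
        # Per the thread, --proxy was designed for domains; a raw IP may not need it.
        args += ["--proxy", proxy]
    if no_token:
        args.append("--no-token")
    return args


# Example: bind to a private IP, as suggested above.
cmd = marimo_edit_command("100.106.136.158", proxy="100.106.136.158", no_token=True)
command_line = shlex.join(cmd)
print(command_line)
```

The list form can be handed to subprocess.run directly, which avoids shell quoting issues if a host or proxy value ever contains unusual characters.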

15 replies

•Created by ananis on 7/5/2024 in #help-support

Unable to connect to github copilot

You have to run with --proxy={private-ip}. --host just restricts the IP interface used for access to the server.

15 replies

•Created by dmad on 6/26/2024 in #help-support

Advanced memory management

I guess you can do this now without the persistent cache blocks, actually. Persistence would just potentially make things faster on secondary runs / restarted kernels.

10 replies

•Created by dmad on 6/26/2024 in #help-support

Advanced memory management

Here's another suggestion with the proposed persistent_cache feature

# cell
@functools.cache
def load_huge():
    return pl.read_parquet("huge.parquet")

# cell
with mo.persistent_cache(name="figures") as figures:
    # plt.subplots() returns a (figure, axes) pair; plt.figure() returns only the figure
    plot_fig1, _plot_ax1 = plt.subplots()
    plot_fig2, _plot_ax2 = plt.subplots()
    # Build diagrams without displaying
    sns.histplot(load_huge(), ..., ax=_plot_ax1)
    sns.boxplot(load_huge(), ..., ax=_plot_ax2)

# cell
with mo.persistent_cache(name="partial_df"):
    partial_df = load_huge().filter(...)

# cell
with mo.persistent_cache(name="partial_df2"):
    partial_df2 = load_huge().group_by(...).agg(...)

# cell
figures, partial_df, partial_df2  # Ensure the above cells run first
load_huge.cache_clear()  # functools.cache exposes cache_clear(), not clear_cache()

In theory, on a secondary run, load_huge will never have to be called, and the cells will automatically rerun and reload huge_df if the code changes.
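The part of this that relies only on the standard library can be checked in isolation. Below is a minimal sketch, assuming CPython's reference counting, showing that functools.cache keeps the loaded object alive until cache_clear() is called. HugeTable is a hypothetical stand-in for the real DataFrame, and persistent_cache is not involved at all.

```python
import functools
import gc
import weakref


class HugeTable:
    """Hypothetical stand-in for a large DataFrame, so the sketch runs anywhere."""

    def __init__(self, n_rows):
        self.rows = list(range(n_rows))


@functools.cache
def load_huge():
    return HugeTable(100_000)


table = load_huge()
same_object = load_huge() is table  # the second call is served from the cache

# Watch the object without keeping it alive ourselves.
probe = weakref.ref(table)
del table
gc.collect()
held_by_cache = probe() is not None  # the cache still holds a reference

load_huge.cache_clear()
gc.collect()
released = probe() is None  # nothing references the table any more

print(same_object, held_by_cache, released)
```

So deleting your own name for the data is not enough while the cached function still holds it; the final cache_clear() call is what actually lets the memory go.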

10 replies

•Created by dmad on 6/26/2024 in #help-support

Advanced memory management

If in a single cell

# For further exploration, we actually only need a subset
partial_df = huge_df.filter(...)
partial_df2 = huge_df.group_by(...).agg(...)
del huge_df

Works in both modes

# cell
huge_df = pl.read_parquet("huge.parquet")

# cell
# For further exploration, we actually only need a subset
partial_df = huge_df.filter(...)
partial_df2 = huge_df.group_by(...).agg(...)
required_del_ref = None # Trick marimo to always run this cell first

# cell
required_del_ref # included to ensure correct run order
del globals()["huge_df"]

Not recommended, but possible. Won't work in strict mode.

# cell
required_del_ref # included to ensure correct run order
huge_df.drop(huge_df.index, inplace=True)  # pandas-style in-place drop; Polars frames have no .index or inplace option

Still not recommended, and particular to DataFrames. Will not work in strict mode.

# cell
huge_df = zero_copy(pl.read_parquet("huge.parquet"))

# cell
required_del_ref # included to ensure correct run order
huge_df.drop(huge_df.index, inplace=True)

Will work in strict mode (not recommended).

mo.drop is not easily possible, since static analysis primarily works on variable names. You could just restructure your code, though:

# cell
huge_df = pl.read_parquet("huge.parquet")
# plt.subplots() returns the (figure, axes) pair; plt.figure() returns only the figure
plot_fig1, _plot_ax1 = plt.subplots()
plot_fig2, _plot_ax2 = plt.subplots()
# Build diagrams without displaying
sns.histplot(huge_df, ..., ax=_plot_ax1)
sns.boxplot(huge_df, ..., ax=_plot_ax2)

# Export partial views
partial_df = huge_df.filter(...)
partial_df2 = huge_df.group_by(...).agg(...)

del huge_df

# cell
plot_fig1

# cell
plot_fig2

That way huge_df is confined to a single cell. But this is annoying, because any change to partial_df or partial_df2 requires rerunning the whole cell, including the load.
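A variant of the same confinement idea, sketched here as an assumption rather than something proposed in the thread, is to put the load inside a function so the large object is only ever a local variable. Plain lists stand in for the Polars frame, and build_partials is a hypothetical name. It has the same drawback: changing either derived result means calling the function, and therefore the load, again.

```python
def build_partials(load):
    """Load the large data locally and return only the small derived pieces."""
    huge = load()  # local name: released when the function returns
    evens = [x for x in huge if x % 2 == 0]  # stand-in for huge_df.filter(...)
    total = sum(huge)  # stand-in for huge_df.group_by(...).agg(...)
    return evens, total


# The lambda plays the role of pl.read_parquet("huge.parquet").
partial_evens, partial_total = build_partials(lambda: list(range(10)))
print(partial_evens, partial_total)
```

No explicit del is needed here, because nothing outside the function ever holds a reference to the large object.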

10 replies

•Created by dmad on 6/26/2024 in #help-support

Advanced memory management

Can you add the cell divisions or is this all in one cell?

10 replies

•Created by dmad on 6/26/2024 in #help-support

Advanced memory management

Can we do even better? Maybe. One of the current experimental features of marimo is "strict mode", enabled with:

[experimental]
execution_type = "strict"

This mode actively manages the globals exposed to each cell, creating cell-specific "global" environments, and performs additional active cleanup. To prevent cross-cell memory mutation (which is possible but discouraged in marimo's normal mode), strict mode implicitly copies variables between cells (you can wrap variables with zero_copy in this mode to disable this behavior). One advantage of strict mode is that hidden state does not build up, but at the cost of copy overhead.

One of the edge cases normal-mode marimo does not catch is the following (maybe this is actually a bug, @Akshay?):

_my_var = 1

Then remove the reference to _my_var, and it will still secretly remain in memory. marimo doesn't clean this up since it has no context with respect to the rest of the graph. Since strict mode accounts for all references, private or not, it removes _my_var if it determines it is not needed.

Is strict mode worth it? I think it depends on your use case. You can try it out and, worst case, disable it. It's experimental for a reason, but the more feedback it gets, the better. If you frequently prototype with various private variables, strict mode will prevent this variable buildup, but potentially at the cost of the copy in other cases. You can fight against this with zero_copy, but you lose some of the mutation protections. Best case, you barely notice strict mode and get a possible memory boost due to the active gc; worst case, there's a performance issue.
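The copy versus zero_copy trade-off described above can be illustrated in plain Python. This is only a sketch of the idea, not marimo's implementation: run_cell and zero_copy_names are hypothetical names, and copy.deepcopy stands in for whatever copying strict mode actually performs.

```python
import copy


def run_cell(cell, inputs, zero_copy_names=frozenset()):
    """Call `cell` with deep copies of its inputs, except names marked zero-copy."""
    prepared = {
        name: value if name in zero_copy_names else copy.deepcopy(value)
        for name, value in inputs.items()
    }
    return cell(**prepared)


def mutating_cell(data):
    data.append(99)  # a cross-cell mutation that the copy is meant to contain
    return len(data)


shared = [1, 2, 3]

run_cell(mutating_cell, {"data": shared})
after_copy = list(shared)  # unchanged: the cell mutated its own copy

run_cell(mutating_cell, {"data": shared}, zero_copy_names={"data"})
after_zero_copy = list(shared)  # changed: the cell received the original object

print(after_copy, after_zero_copy)
```

The copy costs time and memory proportional to the data, which is the overhead mentioned above; skipping it is cheaper but gives up the mutation protection.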

10 replies

•Created by Eloi Torrents on 6/22/2024 in #archived-help-and-support

Python env packages

What happens when you run:

import sys

sys.path

in a cell?
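If sys.path alone is hard to read, a slightly fuller check along the same lines may help. This is a minimal sketch using only the standard library; describe_environment is a hypothetical helper name, and the package name passed in is up to you.

```python
import importlib.util
import sys


def describe_environment(package):
    """Report which interpreter is running and whether `package` is importable."""
    return {
        "executable": sys.executable,  # the Python binary behind this kernel
        "prefix": sys.prefix,  # root of the active environment
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "package_found": importlib.util.find_spec(package) is not None,
    }


# "json" ships with Python, so it should always be found.
stdlib_info = describe_environment("json")
missing_info = describe_environment("surely_not_an_installed_package_xyz")
print(stdlib_info)
```

If the executable or prefix shown in a marimo cell differs from the environment where the package was installed, the notebook is running under a different interpreter than expected.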

7 replies

•Created by Eloi Torrents on 6/22/2024 in #archived-help-and-support

Python env packages

Did you activate the environment and then install TensorFlow? You can also manually add to PYTHONPATH, but that's not recommended.

7 replies

•Created by Matteo on 6/12/2024 in #archived-help-and-support

What's the best way to remove unused

Not right now, but see my suggestion for marimo lint: https://github.com/marimo-team/marimo/issues/1543

1 reply

•Created by nellyg on 5/23/2024 in #archived-help-and-support

We are exploring using marimo for more

You could use markdown mode; it renders nicely: https://github.com/marimo-team/marimo/blob/main/marimo/_tutorials/markdown_format.md

You can also use pandoc to go from markdown to PDF, or to markdown with inline outputs.

5 replies

•Created by dmad on 5/20/2024 in #archived-help-and-support

With the new auth, is there a correct

Yes, the hostname redirect isn't always desirable, since the access hostname isn't bound to be the one provided to marimo.

--hostname=marimo.dev:80 --port=2718

would be fine, though. I just ended up turning off auth, since I'm only using it over my home VPN.

2 replies