mcburton
mcburton•10mo ago

This might be more of a Pyodide question

This might be more of a Pyodide question, but I am trying to load a CSV file using pandas read_csv in a WASM notebook. I've got the file on GitHub and am loading it via the raw URL. In the WASM notebook I'm getting a URLError: <urlopen error unknown url type: https>, and if I change it to an http URL I get a URLError: <urlopen error [Errno 26] Operation in progress>. Anyone know the issue here?
16 Replies
Myles Scolnick
Myles Scolnick•10mo ago
Can you share your notebook, if you are able to? Or send a minimal repro.
mcburton
mcburtonOP•10mo ago
GitHub
GitHub - mcburton/you-vs-mmlu: How do YOU fair against the MMLU ben...
How do YOU fair against the MMLU benchmark? Contribute to mcburton/you-vs-mmlu development by creating an account on GitHub.
mcburton
mcburtonOP•10mo ago
warning, it is a HACK job 😉
mcburton
mcburtonOP•10mo ago
marimo | a next-generation Python notebook
Explore data and build apps seamlessly with marimo, a next-generation Python notebook.
Myles Scolnick
Myles Scolnick•10mo ago
you can do:
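import pandas as pd
import pyodide.http  # available only when running under Pyodide (WASM)
# open_url fetches the file synchronously and returns an io.StringIO
csv = pyodide.http.open_url("https://raw.githubusercontent.com/mcburton/you-vs-mmlu/main/data.csv")
# columns: the list of column names defined elsewhere in the notebook
questions = pd.read_csv(csv, header=None, names=columns)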
mcburton
mcburtonOP•10mo ago
that worked! would that code work when not running in pyodide? or do I need some logic for the notebook to decide which approach to load the data?
Myles Scolnick
Myles Scolnick•10mo ago
Yeah, you would need some logic. You can check whether you are in Pyodide by trying to import pyodide and catching the error. We can add this to our docs, or maybe patch pd.read_csv if it's not too hacky.
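A minimal sketch of that check, assuming the try-import approach described above. The helper names `running_in_pyodide` and `csv_source` are hypothetical and not part of marimo or Pyodide; only `pyodide.http.open_url` comes from the thread.

```python
def running_in_pyodide() -> bool:
    """Return True when the pyodide package can be imported (i.e. under WASM)."""
    try:
        import pyodide.http  # noqa: F401  (only importable inside Pyodide)
    except ImportError:
        return False
    return True


def csv_source(url: str):
    """Return something pd.read_csv accepts for this URL in either environment.

    Under Pyodide: a file-like object fetched with pyodide.http.open_url.
    Elsewhere: the URL itself, which pandas can fetch on its own.
    """
    if running_in_pyodide():
        import pyodide.http

        return pyodide.http.open_url(url)
    return url
```

With this in place the notebook could call `pd.read_csv(csv_source(url), header=None, names=columns)` in both environments, so the branching stays in one place instead of being repeated per cell.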
mcburton
mcburtonOP•10mo ago
Is this a Marimo issue or a more general pyodide issue with Pandas...
Myles Scolnick
Myles Scolnick•10mo ago
It's Pyodide. They have particular ways to request data, and whatever pandas is using isn't compatible. Pyodide may add more compatibility with fetching libraries in the future. For example, maybe this is what we need: https://github.com/koenvo/pyodide-http (it supports urllib and requests).
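A rough sketch of how the pyodide-http package linked above might be used. It assumes the package is installed in the WASM environment and relies on its `patch_all()` function; the wrapper name `patch_http_for_pyodide` is hypothetical, and whether this makes `pd.read_csv` work with https URLs in a given notebook is untested here.

```python
import sys


def patch_http_for_pyodide() -> bool:
    """Patch urllib/requests via pyodide-http when running under Pyodide.

    Returns True if the patch was applied, False otherwise.
    """
    # Pyodide reports its platform as "emscripten"; do nothing elsewhere.
    if sys.platform != "emscripten":
        return False
    try:
        import pyodide_http  # assumption: installed in the environment
    except ImportError:
        return False
    pyodide_http.patch_all()
    return True
```

Calling `patch_http_for_pyodide()` once near the top of the notebook, before any `pd.read_csv(url)` call, keeps the same code path usable outside the browser, where the function simply returns False.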
mcburton
mcburtonOP•10mo ago
yeah, might be good to note this in the docs. I would think fetching data from the web would be a common pattern for WASM notebooks.
Myles Scolnick
Myles Scolnick•10mo ago
most definitely
mcburton
mcburtonOP•10mo ago
sweet, well I got my little POC running for my noon meeting. Thanks! https://marimo.app/l/m93ysb?mode=read
Akshay
Akshay•10mo ago
@mcburton, very glad to see you trying out WASM notebooks! Thanks for being patient with these kinds of issues. Offer still stands: I'd love to chat with you over a video call sometime. You have my email 🙂
mcburton
mcburtonOP•10mo ago
Yes! It's on my todo list. I've got spring break next week so I can catch up on things. I'll be in touch soon
Mady
Mady•10mo ago
Just wanted to add on, I also had this issue and struggled to figure it out 🙂
Akshay
Akshay•10mo ago
Thanks for letting us know — sorry about that! We will improve our documentation and work on finding a fix