map.get with no rows throws error, while struct.get does not #3891

Open
universalmind303 opened this issue Mar 3, 2025 · 0 comments · May be fixed by #3892
Labels: bug (Something isn't working)

Discussed in #3890

Originally posted by akshay-okahu March 3, 2025
During dataframe operations, if no rows remain after filtering and I then use map.get, I get this exception: daft.exceptions.DaftCoreException: DaftError::ValueError Need at least 1 series to perform concat

The only way I found to avoid the exception is to check df.count_rows() > 0 before using map.get, but count_rows() materialises the dataframe.

Is there a better way to avoid the exception without materialising the dataframe? Could map.get avoid throwing the exception when there are no rows?

Sample:

```python
import pyarrow as pa
import daft

data = pa.array([[("a", 1)], [("a", 2)]], type=pa.map_(pa.string(), pa.int64()))
table = pa.table({"map_col": data})
df = daft.from_arrow(table)

df = df.where(df["map_col"].map.get("a") == 3)
if df.count_rows() > 0:    # exception is thrown without this check
    df = df.with_column("a", df["map_col"].map.get("a"))
df.show()
```

Stacktrace:

  File "test/dafttest.py", line 25, in <module>
    df.show()
  File "test/.venv/lib/python3.11/site-packages/daft/api_annotations.py", line 26, in _wrap
    return timed_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "test/.venv/lib/python3.11/site-packages/daft/analytics.py", line 199, in tracked_method
    result = method(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "test/.venv/lib/python3.11/site-packages/daft/dataframe/dataframe.py", line 2891, in show
    dataframe_display = self._construct_show_display(n)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "test/.venv/lib/python3.11/site-packages/daft/dataframe/dataframe.py", line 2848, in _construct_show_display
    for table in get_context().get_or_create_runner().run_iter_tables(builder, results_buffer_size=1):
  File "test/.venv/lib/python3.11/site-packages/daft/runners/native_runner.py", line 90, in run_iter_tables
    for result in self.run_iter(builder, results_buffer_size=results_buffer_size):
  File "test/.venv/lib/python3.11/site-packages/daft/runners/native_runner.py", line 85, in run_iter
    yield from results_gen
  File "test/.venv/lib/python3.11/site-packages/daft/execution/native_executor.py", line 37, in <genexpr>
    return (
           ^
daft.exceptions.DaftCoreException: DaftError::ValueError Need at least 1 series to perform concat

I noticed struct.get doesn't throw the exception:

```python
import daft

df = daft.from_pydict({
    "data": [
        {"a": 1},
        {"a": 2}
    ]
})

df = df.where(df["data"].struct.get("a") == 3)
df = df.with_column("a", df["data"].struct.get("a"))
df.show()
```
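The error text suggests one possible cause: map.get may build one result per row and then concatenate them, so zero rows leaves nothing to concatenate. The plain-Python sketch below is only a toy model of that guess and does not use Daft. The helpers `concat`, `map_get_naive`, and `map_get_guarded` are hypothetical names invented for illustration, not Daft functions.

```python
from typing import Any, List, Tuple

Row = List[Tuple[str, Any]]  # one map value, as a list of (key, value) pairs


def concat(chunks: List[List[Any]]) -> List[Any]:
    """Toy stand-in for a series concat that rejects an empty list of chunks."""
    if not chunks:
        raise ValueError("Need at least 1 series to perform concat")
    return [value for chunk in chunks for value in chunk]


def map_get_naive(rows: List[Row], key: str) -> List[Any]:
    """Look up `key` in every row, then concatenate the per-row results."""
    chunks = [[dict(row).get(key)] for row in rows]
    return concat(chunks)  # raises when `rows` is empty


def map_get_guarded(rows: List[Row], key: str) -> List[Any]:
    """Same lookup, but return an empty result when there are no rows."""
    if not rows:
        return []
    return map_get_naive(rows, key)


rows = [[("a", 1)], [("a", 2)]]
filtered = [row for row in rows if dict(row).get("a") == 3]  # no rows match

print(map_get_naive(rows, "a"))
print(map_get_guarded(filtered, "a"))
```

Under this assumption, a fix would be an empty-input guard of the kind shown in `map_get_guarded`, which would bring the behaviour in line with struct.get.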
@universalmind303 universalmind303 self-assigned this Mar 3, 2025
@universalmind303 universalmind303 added the bug Something isn't working label Mar 3, 2025
@universalmind303 universalmind303 linked a pull request Mar 3, 2025 that will close this issue