agg_concat doesn't work on strings #2768

Closed
Vince7778 opened this issue Aug 29, 2024 · 3 comments
@Vince7778
Contributor

The docs seem to claim that agg_concat works on strings, when in fact it only works on List and Python types. A workaround is to call agg_list() and then .list.join(), but we should really add support for agg_concat directly on strings.

In [35]: df3 = daft.from_pydict({"a": ["the", " quick", " brown", " fox"]})
In [36]: df3.agg(col("a").agg_concat()).show()
---------------------------------------------------------------------------
DaftCoreException                         Traceback (most recent call last)
Cell In[36], line 1
----> 1 df3.agg(col("a").agg_concat()).show()

File ~/Documents/Programming/daft/Daft/daft/api_annotations.py:26, in DataframePublicAPI.<locals>._wrap(*args, **kwargs)
     24 type_check_function(func, *args, **kwargs)
     25 timed_method = time_df_method(func)
---> 26 return timed_method(*args, **kwargs)

File ~/Documents/Programming/daft/Daft/daft/analytics.py:198, in time_df_method.<locals>.tracked_method(*args, **kwargs)
    195 @functools.wraps(method)
    196 def tracked_method(*args, **kwargs):
    197     if _ANALYTICS_CLIENT is None:
--> 198         return method(*args, **kwargs)
    200     start = time.time()
    201     try:

File ~/Documents/Programming/daft/Daft/daft/dataframe/dataframe.py:2226, in DataFrame.agg(self, *to_agg)
   2223     if not isinstance(expr, Expression):
   2224         raise ValueError(f"DataFrame.agg() only accepts expression type, received: {type(expr)}")
-> 2226 return self._agg(to_agg_list, group_by=None)

File ~/Documents/Programming/daft/Daft/daft/dataframe/dataframe.py:1996, in DataFrame._agg(self, to_agg, group_by)
   1991 def _agg(
   1992     self,
   1993     to_agg: Iterable[Expression],
   1994     group_by: Optional[ExpressionsProjection] = None,
   1995 ) -> "DataFrame":
-> 1996     builder = self._builder.agg(list(to_agg), list(group_by) if group_by is not None else None)
   1997     return DataFrame(builder)

File ~/Documents/Programming/daft/Daft/daft/logical/builder.py:227, in LogicalPlanBuilder.agg(self, to_agg, group_by)
    221 def agg(
    222     self,
    223     to_agg: list[Expression],
    224     group_by: list[Expression] | None,
    225 ) -> LogicalPlanBuilder:
    226     group_by_pyexprs = [expr._expr for expr in group_by] if group_by is not None else []
--> 227     builder = self._builder.aggregate([expr._expr for expr in to_agg], group_by_pyexprs)
    228     return LogicalPlanBuilder(builder)

DaftCoreException: DaftError::External Unable to create logical plan node.
Due to: DaftError::TypeError We can only perform List Concat Agg on List or Python Types, got dtype Utf8 for column "a"
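
To pin down the behaviour being requested, here is a minimal plain-Python sketch of the semantics. It does not use Daft and is not Daft's implementation; `agg_concat_model` and `workaround_model` are hypothetical names introduced only for illustration. The assumption is that agg_concat on a string column should concatenate the values in row order, giving the same result as the agg_list() then .list.join() workaround with an empty delimiter.

```python
from typing import Any, List, Sequence


def agg_concat_model(values: Sequence[Any]) -> Any:
    """Hypothetical model of the requested agg_concat semantics (not Daft code)."""
    if all(isinstance(v, str) for v in values):
        # Requested behaviour (assumption): strings are concatenated in row order.
        return "".join(values)
    if all(isinstance(v, list) for v in values):
        # Behaviour implied by the error message: list values are concatenated.
        out: List[Any] = []
        for v in values:
            out.extend(v)
        return out
    # Anything else is rejected, similar in spirit to the reported TypeError.
    raise TypeError("agg_concat_model expects all strings or all lists")


def workaround_model(values: Sequence[str]) -> str:
    """Models the workaround: collect into one list, then join with ''."""
    collected = list(values)   # stands in for agg_list()
    return "".join(collected)  # stands in for .list.join("")


words = ["the", " quick", " brown", " fox"]
print(agg_concat_model(words))
print(workaround_model(words))
```

Under these assumptions, both functions return the same string for the column in the reproduction above, which is the equivalence a fix would be expected to preserve.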
Vince7778 added the bug label Aug 29, 2024
@vicky1999
Contributor

vicky1999 commented Sep 13, 2024

@Vince7778 Can I work on this issue?

@samster25
Member

@vicky1999 assigning the issue to you!

colin-ho pushed a commit to vicky1999/Daft that referenced this issue Sep 25, 2024
colin-ho added a commit that referenced this issue Sep 25, 2024
Solves #2768

---------

Co-authored-by: Colin Ho <chiuhong@usc.edu>
Co-authored-by: Colin Ho <colinho@Colins-MacBook-Pro.local>
@colin-ho
Contributor

colin-ho commented Oct 7, 2024

Closed with #2847

colin-ho closed this as completed Oct 7, 2024