Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Translate Distinct to SQL when applied after custom projection #7234

Closed
adzhiljano opened this issue Dec 12, 2016 · 6 comments
Closed
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Milestone

Comments

@adzhiljano
Copy link

I looked up for this problem and didn't find anything. Sorry if I'm making a dup or I messed up something.

This

context.Blogs
    .Select(t => new BlogDto
    {
        Url = t.Url
    })
    .Distinct()
    .ToList();

resulted in the following query:

SELECT [t].[Url]
FROM [Blogs] AS [t]

and I expected:

SELECT DISTINCT [t].[Url]
FROM [Blogs] AS [t]   

So, now I'm using the following workaround:

context.Blogs
    .Select(t => new
    {
        Url = t.Url
    })
    .Distinct()
    .Select(t => new BlogDto
    {
        Url = t.Url
    })
    .ToList();

which results in:

SELECT [t1].[Url]
FROM (
    SELECT DISTINCT [t0].[Url]
    FROM [Blogs] AS [t0]
) AS [t1]

Steps to reproduce

    public class BlogDto
    {
        public string Url { get; set; }
    }

    public class BlogContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder.UseSqlServer(@"Data Source=.;Initial Catalog=Blogging;Integrated Security=True;MultipleActiveResultSets=True");
            //optionsBuilder.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=CollectionsTest;Trusted_Connection=True;ConnectRetryCount=0");
        }
    }

    public class Program
    {
        private static void Main()
        {
            using (var context = new BlogContext())
            {
                var results = context.Blogs
                    .Select(t => new BlogDto
                    {
                        Url = t.Url
                    })
                    .Distinct()
                    .ToList();
            }
        }
    }

Just to be clear, the following works as expected:

context.Blogs
    .Select(t => new
    {
        Url = t.Url
    })
    .Distinct()
    .ToList();

produces:

SELECT DISTINCT [t].[Url]
FROM [Blogs] AS [t]

I have one simple question regarding terminology. When you say 'projection' ('projecting' to some object different from the model), does that include anonymous objects? If so, the title of this issue should be "Projecting to a specific object(class) with Distinct is not working as expected"

Thank you!

Further technical details

EF Core version: 1.1.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Win 8.1 64bit
IDE: Visual Studio 2015 Update 3

@smitpatel
Copy link
Contributor

Based on documentation of Distinct()

The Distinct(IEnumerable) method returns an unordered sequence that contains no duplicate values. It uses the default equality comparer, Default, to compare values.
The default equality comparer, Default, is used to compare values of the types that implement the IEquatable generic interface. To compare a custom data type, you need to implement this interface and provide your own GetHashCode and Equals methods for the type.

For anonymous type in C#,

Because the Equals and GetHashCode methods on anonymous types are defined in terms of the Equals and GetHashCode methods of the properties, two instances of the same anonymous type are equal only if all their properties are equal.

Select Clause in T-Sql

DISTINCT
Specifies that only unique rows can appear in the result set. Null values are considered equal for the purposes of the DISTINCT keyword.

For database, the rows are the same if all columns have same values.
The key here is the Equals function.
For anonymous type, Equals compares all property values, which is same as database, hence it is possible to evaluate Distinct on server.
When it comes to a class type (like BlogDto above), we don't really know the details of Equals on the class. 2 BlogDto can be equal even if their corresponding URLs are different or 2 BlogDto can be different even with same URL. Therefore doing Distinct URL on server may not give you correct results and it will require client side evaluation.
Therefore Distinct is actually working as expected to give you correct results.

@adzhiljano
Copy link
Author

Thanks for the response! You are right, but when working with a DbContext, I personally think of Distinct(IQueryable) as a way to force a distinct to be executed on the server(query provider). I'm sure that a lot of people are used to this way of query construction.

Also, if you run the following using EF 6

context.Blogs
    .Select(t => new BlogDto
    {
        Url = t.Url
    })
    .Distinct()
    .ToList();

you will get the following translation:

SELECT 
    [Distinct1].[C1] AS [C1], 
    [Distinct1].[Url] AS [Url]
    FROM ( SELECT DISTINCT 
        [Extent1].[Url] AS [Url], 
        1 AS [C1]
        FROM [dbo].[Blogs] AS [Extent1]
    )  AS [Distinct1]

which is the same query as the one I mentioned in the workaround and again I think a lot of people are used to this mindset when constructing a query. It's just confusing for me that somehow the server should be aware of comparing c# classes.

I saw the following

EF Core keeps the developer experience from EF6.x

at https://docs.microsoft.com/en-us/ef/efcore-and-ef6/
which is not true in this case. If you are transitioning from EF6 to EFCore you will have to refactor some of your queries including Distinct.

Even more, if you run:

context.Blogs
  .Distinct()
  .ToList();

you will get

SELECT DISTINCT [b].[BlogId], [b].[Url]
FROM [Blogs] AS [b]

and I didn't find an override of the Equals method in the DbSet.
Maybe I'm missing something conceptually or I'm just too used to EF6.
Thanks, again!

@divega divega added this to the 2.0.0 milestone Jan 4, 2017
@divega
Copy link
Contributor

divega commented Jan 4, 2017

EF Team Triage: We discussed this and we believe we should try to support this query and the translation to DISTINCT on the server.

@smitpatel smitpatel changed the title Projection with Distinct not working as expected Query: Translate Distinct to SQL when applied after custom projection Feb 3, 2017
@tuespetre
Copy link
Contributor

I think I may have addressed this at least somewhat with #7543:

https://github.com/aspnet/EntityFramework/pull/7543/files#diff-2c68636402e93e86a3d0108bbd4ba1ddR2131

@smitpatel
Copy link
Contributor

@tuespetre - If you think the PR resolves this thing then can you please add a test case.

@tuespetre
Copy link
Contributor

@smitpatel the linked spot in the diff shows two applicable tests, looks like they would need to be renamed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Projects
None yet
Development

No branches or pull requests

6 participants