LINQ thinks I need an extra INNER JOIN, but why?

I have a LINQ query that creates an extra / duplicate INNER JOIN for some reason. This results in the query not returning the expected result. If I manually comment out this extra JOIN from the generated SQL, I get the seemingly correct output.

Can you discover what I could have done in this LINQ to invoke this additional JOIN?

Thanks.

Here is my LINQ example

predicate=predicate.And(condition1);
predicate1=predicate1.And(condition2);
predicate1=predicate1.And(condition3);
predicate2=predicate2.Or(predicate1);
predicate=predicate.And(predicate2);
var ids = context.Code.Where(predicate); 
            var rs = from r in ids 
                     group r by r.PersonID into g 
                     let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count() 
                     where matchcount ==2 
                     select new 
                  { 
                      personid = g.Key 
                  };

      

and here is the generated SQL (duplicate connection - [t7])

Declare @p1 VarChar(10)='Home'
Declare @p2 VarChar(10)='111'
Declare @p3 VarChar(10)='Office'
Declare @p4 VarChar(10)='222'
Declare @p5 int=2

SELECT [t9].[PersonID] AS [pid]
FROM (
    SELECT [t3].[PersonID], (
        SELECT COUNT(*)
        FROM (
            SELECT DISTINCT [t7].[PhoneValue]
            FROM [dbo].[Person] AS [t4]
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t5] ON [t5].[PersonID] = [t4].[PersonID]
            INNER JOIN [dbo].[CodeMaster] AS [t6] ON [t6].[Code] = [t5].[PhoneType]
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t7] ON [t7].[PersonID] = [t4].[PersonID]
            WHERE ([t3].[PersonID] = [t4].[PersonID]) AND ([t6].[Enumeration] = @p0) AND ((([t6].[CodeDescription] = @p1) AND ([t5].[PhoneValue] = @p2)) OR (([t6].[CodeDescription] = @p3) AND ([t5].[PhoneValue] = @p4)))
            ) AS [t8]
        ) AS [value]
    FROM (
        SELECT [t0].[PersonID]
        FROM [dbo].[Person] AS [t0]
        INNER JOIN [dbo].[PersonPhoneNumber] AS [t1] ON [t1].[PersonID] = [t0].[PersonID]
        INNER JOIN [dbo].[CodeMaster] AS [t2] ON [t2].[Code] = [t1].[PhoneType]
        WHERE ([t2].[Enumeration] = @p0) AND ((([t2].[CodeDescription] = @p1) AND ([t1].[PhoneValue] = @p2)) OR (([t2].[CodeDescription] = @p3) AND ([t1].[PhoneValue] = @p4)))
        GROUP BY [t0].[PersonID]
        ) AS [t3]
    ) AS [t9]
WHERE [t9].[value] = @p5

      

+2


a source to share


3 answers


They are not duplicated. You are requesting two different values ​​from a data source.

let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count()

      

causes

        SELECT COUNT(*) 
        FROM ( 
            SELECT DISTINCT [t7].[PhoneValue] 
            FROM [dbo].[Person] AS [t4] 
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t5] ON [t5].[PersonID] = [t4].[PersonID] 
            INNER JOIN [dbo].[CodeMaster] AS [t6] ON [t6].[Code] = [t5].[PhoneType] 
            INNER JOIN [dbo].[PersonPhoneNumber] AS [t7] ON [t7].[PersonID] = [t4].[PersonID] 
            WHERE ([t3].[PersonID] = [t4].[PersonID]) AND ([t6].[Enumeration] = @p0) AND ((([t6].[CodeDescription] = @p1) AND ([t5].[PhoneValue] = @p2)) OR (([t6].[CodeDescription] = @p3) AND ([t5].[PhoneValue] = @p4))) 
            ) AS [t8] 

      

and



                     from r in ids   
                     group r by r.PersonID into g 

      

causes

    SELECT [t0].[PersonID]    
    FROM [dbo].[Person] AS [t0]    
    INNER JOIN [dbo].[PersonPhoneNumber] AS [t1] ON [t1].[PersonID] = [t0].[PersonID]    
    INNER JOIN [dbo].[CodeMaster] AS [t2] ON [t2].[Code] = [t1].[PhoneType]    
    WHERE ([t2].[Enumeration] = @p0) AND ((([t2].[CodeDescription] = @p1) AND ([t1].[PhoneValue] = @p2)) OR (([t2].[CodeDescription] = @p3) AND ([t1].[PhoneValue] = @p4)))    
    GROUP BY [t0].[PersonID]    
    ) AS [t3]   

      

as for INNER JOINS, the reason you get them has to do with the relationship between these tables. For example, Person is 1..1 with PersonPhoneNumber (or 1 .. *). Anyway, I am assuming the PersonID on PersonPhoneNumber is FK and PK value. Therefore, in this case, the data source must go out to this external table to see if the value of the PersonPhoneNumber navigation property actually exists. It does this by executing an INNER JOIN on this table.

0


a source


I get the feeling that .DISTINCT (). COUNT () is handled separately by translating linq to sql.



I'd also say the SQL execution plan just threw out the trick.

0


a source


Try to rewrite with an explicit condition instead of an abstract "predicate" construct. From what I can see in SQL, the composition may look odd to the parser in isolation, and one [t5] join you just named dupe :-) should serve that condition.

Also, try to tell us what you really want to find with this query and try to write regular SQL that does what you wanted. I have to be human :-) and it looks strange to me too :-))

Technically speaking, you are forcibly doubling the sharing by using a condition in two separate requests (each var assignment is technically a separate request).

Also doing a group by column without any aggregation is not always equivalent to selecting a single one. Specifically, choose an allocation on a join that allows it to take precedence over joins - the queries are declarative (can be reordered), and you were trying to get it to be procedural. So LINQ gave you the exact procedural one :-) and then the SQL was reordered according to the SQL rules :-))

So, write plain SQL first, and if you can't LINQ-iize put it in the sproc - that will make it faster anyway :-)

0


a source







All Articles