LINQ thinks I need an extra INNER JOIN, but why?
I have a LINQ query that creates an extra / duplicate INNER JOIN for some reason. This results in the query not returning the expected result. If I manually comment out this extra JOIN from the generated SQL, I get the seemingly correct output.
Can you discover what I could have done in this LINQ to invoke this additional JOIN?
Thanks.
Here is my LINQ example
predicate=predicate.And(condition1);
predicate1=predicate1.And(condition2);
predicate1=predicate1.And(condition3);
predicate2=predicate2.Or(predicate1);
predicate=predicate.And(predicate2);
var ids = context.Code.Where(predicate);
var rs = from r in ids
group r by r.PersonID into g
let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count()
where matchcount ==2
select new
{
personid = g.Key
};
and here is the generated SQL (duplicate connection - [t7])
Declare @p1 VarChar(10)='Home'
Declare @p2 VarChar(10)='111'
Declare @p3 VarChar(10)='Office'
Declare @p4 VarChar(10)='222'
Declare @p5 int=2
SELECT [t9].[PersonID] AS [pid]
FROM (
SELECT [t3].[PersonID], (
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t7].[PhoneValue]
FROM [dbo].[Person] AS [t4]
INNER JOIN [dbo].[PersonPhoneNumber] AS [t5] ON [t5].[PersonID] = [t4].[PersonID]
INNER JOIN [dbo].[CodeMaster] AS [t6] ON [t6].[Code] = [t5].[PhoneType]
INNER JOIN [dbo].[PersonPhoneNumber] AS [t7] ON [t7].[PersonID] = [t4].[PersonID]
WHERE ([t3].[PersonID] = [t4].[PersonID]) AND ([t6].[Enumeration] = @p0) AND ((([t6].[CodeDescription] = @p1) AND ([t5].[PhoneValue] = @p2)) OR (([t6].[CodeDescription] = @p3) AND ([t5].[PhoneValue] = @p4)))
) AS [t8]
) AS [value]
FROM (
SELECT [t0].[PersonID]
FROM [dbo].[Person] AS [t0]
INNER JOIN [dbo].[PersonPhoneNumber] AS [t1] ON [t1].[PersonID] = [t0].[PersonID]
INNER JOIN [dbo].[CodeMaster] AS [t2] ON [t2].[Code] = [t1].[PhoneType]
WHERE ([t2].[Enumeration] = @p0) AND ((([t2].[CodeDescription] = @p1) AND ([t1].[PhoneValue] = @p2)) OR (([t2].[CodeDescription] = @p3) AND ([t1].[PhoneValue] = @p4)))
GROUP BY [t0].[PersonID]
) AS [t3]
) AS [t9]
WHERE [t9].[value] = @p5
a source to share
They are not duplicated. You are requesting two different values from a data source.
let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count()
causes
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t7].[PhoneValue]
FROM [dbo].[Person] AS [t4]
INNER JOIN [dbo].[PersonPhoneNumber] AS [t5] ON [t5].[PersonID] = [t4].[PersonID]
INNER JOIN [dbo].[CodeMaster] AS [t6] ON [t6].[Code] = [t5].[PhoneType]
INNER JOIN [dbo].[PersonPhoneNumber] AS [t7] ON [t7].[PersonID] = [t4].[PersonID]
WHERE ([t3].[PersonID] = [t4].[PersonID]) AND ([t6].[Enumeration] = @p0) AND ((([t6].[CodeDescription] = @p1) AND ([t5].[PhoneValue] = @p2)) OR (([t6].[CodeDescription] = @p3) AND ([t5].[PhoneValue] = @p4)))
) AS [t8]
and
from r in ids
group r by r.PersonID into g
causes
SELECT [t0].[PersonID]
FROM [dbo].[Person] AS [t0]
INNER JOIN [dbo].[PersonPhoneNumber] AS [t1] ON [t1].[PersonID] = [t0].[PersonID]
INNER JOIN [dbo].[CodeMaster] AS [t2] ON [t2].[Code] = [t1].[PhoneType]
WHERE ([t2].[Enumeration] = @p0) AND ((([t2].[CodeDescription] = @p1) AND ([t1].[PhoneValue] = @p2)) OR (([t2].[CodeDescription] = @p3) AND ([t1].[PhoneValue] = @p4)))
GROUP BY [t0].[PersonID]
) AS [t3]
as for INNER JOINS, the reason you get them has to do with the relationship between these tables. For example, Person is 1..1 with PersonPhoneNumber (or 1 .. *). Anyway, I am assuming the PersonID on PersonPhoneNumber is FK and PK value. Therefore, in this case, the data source must go out to this external table to see if the value of the PersonPhoneNumber navigation property actually exists. It does this by executing an INNER JOIN on this table.
a source to share
Try to rewrite with an explicit condition instead of an abstract "predicate" construct. From what I can see in SQL, the composition may look odd to the parser in isolation, and one [t5] join you just named dupe :-) should serve that condition.
Also, try to tell us what you really want to find with this query and try to write regular SQL that does what you wanted. I have to be human :-) and it looks strange to me too :-))
Technically speaking, you are forcibly doubling the sharing by using a condition in two separate requests (each var assignment is technically a separate request).
Also doing a group by column without any aggregation is not always equivalent to selecting a single one. Specifically, choose an allocation on a join that allows it to take precedence over joins - the queries are declarative (can be reordered), and you were trying to get it to be procedural. So LINQ gave you the exact procedural one :-) and then the SQL was reordered according to the SQL rules :-))
So, write plain SQL first, and if you can't LINQ-iize put it in the sproc - that will make it faster anyway :-)
a source to share