Found an interesting little problem recently. We had a query we wanted to upgrade for the next release, and added an OR statement to allow us to pre-release the functionality for the new code drop. This had a really big impact on performance and caused lots of issues.
When we investigated it turns out that OR queries in a join are really expensive. Consider the following statement.
Select
child.FirstName,
child.Surname,
parent.FirstName,
parent.Surname
From Children child
Inner Join Parents parent
On child.MothersOriginalSurname = parent.MaidenName
Or child.Surname = parent.SurnameThe sql query analyser can’t do anything to optimize this query, and effectively runs the following…
Select
child.FirstName,
child.Surname,
parent.FirstName,
parent.Surname
From Children child
Inner Join Parents parent
On child.MothersOriginalSurname = parent.MaidenName
Union
Select
child.FirstName,
child.Surname,
parent.FirstName,
parent.Surname
From Children child
Inner Join Parents parent
On child.Surname = parent.SurnameIf you want to squeeze performance out of it - consider the following alternative…
Select
child.FirstName,
child.Surname,
Coalesce(father.FirstName, mother.FirstName) As Firstname,
Coalesce(father.Surname, mother.Surname) As Surname
From Children child
Left Outer Join Parents mother
On child.MothersOriginalSurname = mother.MaidenName
Left Outer Join Parents father
On child.Surname = mother.SurnameThis runs more quickly.
Obviously the most ideal scenario is you refactor the data and add a proper key!
Select
child.FirstName,
child.Surname,
Coalesce(father.FirstName, mother.FirstName) As Firstname,
Coalesce(father.Surname, mother.Surname) As Surname
From Children child
Left Outer Join Parents father
On child.FatherID = father.ID
Left Outer Join Parents mother
On child.MotherID = mother.ID