Not-equal joins and subqueries

Sometimes a not-equal join query is insufficiently restrictive and must be replaced by a subquery. For example, suppose you want to list the names of authors who live in a city where no publisher is located. For the sake of clarity, let us also restrict this query to authors whose last names begin with “A”, “B”, or “C”. A not-equal join query might be:

select distinct au_lname, authors.city 
from publishers, authors 
where au_lname like "[ABC]%" 
and publishers.city != authors.city 

The results are not an answer to the question that was asked:

au_lname             city 
----------------     ------------ 
Bennet               Berkeley  
Carson               Berkeley  
Blotchet-Halls       Corvallis 
 
(3 rows affected) 

The system interprets this query as: “find the names of authors who live in a city where a publisher is not located.” Only the authors who live in Berkeley and Corvalis qualify – towns that do not have publishers.

In this case, the way that the system handles joins (first finding every eligible combination before evaluating other conditions) causes this query to return undesirable results. Use a subquery to get the results you want. A subquery can eliminate the ineligible rows first and then perform the remaining restrictions.

Here is the correct statement:

select distinct au_lname, city 
from authors 
where au_lname like "[ABC]%" 
and city not in 
(select city from publishers 
where authors.city = publishers.city) 

Now, the results are what you want:

au_lname             city                  
-------------        ------------
Blotchet-Halls       Corvallis             
 
(1 row affected) 

Subqueries are covered in greater detail in Chapter 5, “Subqueries: Using Queries Within Other Queries.”