Not-equal joins and subqueries

Sometimes a not-equal join query is not sufficiently restrictive and needs to be replaced by a subquery. For example, suppose you want to list the names of authors who live in a city where no publisher is located. For the sake of clarity, let us also restrict this query to authors whose last names begin with “A”, “B”, or “C”. A not-equal join query might be:

select distinct au_lname, authors.city 
from publishers, authors 
where au_lname like "[ABC]%" 
and publishers.city != authors.city 

The results are not an answer to the question that was asked:

au_lname             city 
----------------     ------------ 
Bennet               Berkeley  
Carson               Berkeley  
Blotchet-Halls       Corvallis 
 
(3 rows affected) 

The system interprets this version of the SQL statement to mean: “find the names of authors who live in a city where some publisher is not located.” All the excluded authors qualify, including the authors who live in Berkeley, home of the publisher Algodata Infosystems.

In this case, the way that the system handles joins (first finding every eligible combination before evaluating other conditions) causes this query to return undesirable results. You must use a subquery to get the results you want. A subquery can eliminate the ineligible rows first and then perform the remaining restrictions.

Here is the correct statement:

select distinct au_lname, city 
from authors 
where au_lname like "[ABC]%" 
and city not in 
(select city from publishers 
where authors.city = publishers.city) 

Now, the results are what you want:

au_lname             city                  
-------------        ------------
Blotchet-Halls       Corvallis             
 
(1 row affected) 

Subqueries are covered in greater detail in Chapter 4, “Subqueries: Using Queries Within Other Queries.”