20130701

Filled Under:

IN = EXISTS, but beware: NOT IN is not the same as NOT EXISTS!

In SQL queries, IN and EXISTS are interchangeable. Many of us assume therefore, that NOT IN and NOT EXISTS are also interchangeable.
A big mistake.
See how NOT IN and NOT EXISTS behave differently in this small example.

Writing a query to find the number of employees in emp table who are not managers. The logic used is: get count where the employee id is not present in as mgr id in the same table.

The NOT IN version:

SQL> -- Using NOT IN
SQL> -- Count employees who
SQL> -- are not managers
SQL> select count(*)
  from emp e1
  where e1.empno not in
  4    (select e2.mgr
  5     from emp e2);
 
  COUNT(*)
----------
         0
OK, that says there are no such employees. Let’s try it with NOT EXISTS now.

The NOT EXISTS version:

 
              
SQL> -- Using NOT EXISTS
SQL> -- Count employees who
SQL> -- are not managers
SQL> select count(*)
  from emp e1
  where not exists
  4    (select 'Y'
  5     from emp e2
  6     where e2.mgr = e1.empno)
 
  COUNT(*)
----------
         8
Whoa! 8 such employees! Which answer is correct?
Let’s verify with the help of a flag against each employee – yes_mgr or no_mgr. We calculate this flag with help of the CASE statement.

SQL> select e.empno
  2       , e.mgr
  3       , case when
  4             (select 'Y' from emp m
  5              where m.mgr = e.empno
  6              and rownum = 1) is null
  7            then 'no_mgr'
  8            else 'yes_mgr'
  9         end is_mgr
 10  from emp e;
 
     EMPNO        MGR IS_MGR
---------- ---------- -------
      7369       7902 no_mgr
      7499       7698 no_mgr
      7521       7698 no_mgr
      7566       7839 yes_mgr
      7654       7698 no_mgr
      7698       7839 yes_mgr
      7782       7839 yes_mgr
      7788       7566 yes_mgr
      7839            yes_mgr
      7844       7698 no_mgr
      7876       7788 no_mgr
      7900       7698 no_mgr
      7902       7566 yes_mgr
      7934       7782 no_mgr
 
14 rows selected.
It turns out that there actually are eight employees who are flagged no_mgr, so the NOT EXISTS version is all right.

Why does NOT IN give a “wrong” result?!

Well, it isn’t a wrong result, that’s how it is supposed to work! The column ‘mgr’ in the emp table is nullable. As per Oracle’s treatment of NULLs:
  • where 3 in (3, null) is TRUE as 3 is in (3).
  • where 3 not in (1, null) is UNKNOWN as it is not known whether 3 is in null or not.
So, the NOT IN condition does not equate to true if the value being matched has nulls.

Conclusion

  • NOT IN and NOT EXISTS are not identical if the subquery can potentially return nulls.
  • Use NOT IN only if you are 100% certain that the subquery will not return NULLs.





0 comments:

Post a Comment