![]() |
![]() |
|
![]() |
Local SQL Part 10 Subqueries: Queries on Queries © 2002 Lawrence G. DiGiovanni Digico, Inc. www.digicoinc.com Preface Local SQL is a very powerful yet commonly underutilized tool that is provided with Paradox (as part of the Borland Database Engine). It exposes the power and flexibility of the SQL sublanguage to both Paradox developers as well as interactive end-users. This is the tenth in a planned series of articles on using Local SQL with Paradox/BDE applications. This article introduces the SQL subquery, which is another way of including multiple tables (or the results of multiple queries) into a single query. Introduction In Part 5 of this series, we saw how to include multiple tables in a query through the use of joins. Sometimes, you need to include information from multiple tables using more complicated constraints than can be accomplished using a single join. Subqueries provide this additional flexibility. Subqueries allow us to constrain a query to the result set of another query. The other query can be expressed such that it executes independently of the outer query, or it can correlate to specific rows in the outer queries result set. We will explore the different types of subqueries in this article, discuss some of the potential problems commonly associated with subqueries. As with the last article, our examples will use the DBDEMOS database, which contains the Biolife example database. Subquery: Simple Example Let's say we need to provide a list of all of our customers that have placed orders for more than $1000. We don't need the order information, just the customer information. We could accomplish this with a subquery as follows: SELECT * FROM customer WHERE CustNo IN (SELECT CustNo FROM orders WHERE ItemsTotal > 1000)The subquery is the portion enclosed in parentheses. The above expression tells Local SQL to execute the subquery first, which forms a list of customer numbers for all orders greater than $1000. The other query uses this result as the criteria list for the IN function. In this case, the same result could've been accomplished with an INNER JOIN and a SELECT DISTINCT. We were illustrating a simple example - not all subqueries are so trivial. Subselect vs Subquery Note that the subquery above (the portion in parentheses) is subordinate to a SELECT query (the outer query). A subquery to a SELECT query is also referred to as a SUBSELECT. Subqueries that are subordinate to other DML queries (UPDATE, DELETE, INSERT) are simply referred to as subqueries. Subquery In DELETE Since we brought it up, yes, subqueries can also be used to direct the criteria of UPDATE, DELETE, and INSERT queries. For example, say we wanted to (for whatever reason) to delete the customers who had never placed orders. We could accomplish this with a subquery as follows: DELETE FROM customer WHERE CustNo NOT IN (SELECT CustNo FROM orders)The above query instructs Local SQL to create a list of customer numbers from the orders table (customers who have placed orders), and delete from the customers table all customers whose customer number is not in this list (customers who have not placed orders). This operation could not, incidentally, have also been performed in Local SQL using a join as the earlier query could. Correlated Subquery As was indicated earlier, sometimes, you want the subquery to execute fully, one time only, and its result set supplied to the outer query in its entirety. That's not always the case. Consider the following example. Say we want to produce a list of the customers and the details of their most recent order. SELECT c.CustNo, c.Company, o.SaleDate, o.ItemsTotal FROM customer c, orders o WHERE c.CustNo = o.CustNo AND o.SaleDate = (SELECT MAX(o1.SaleDate) FROM orders o1 WHERE o1.CustNo = c.CustNo)This limits the join condition to only the order for which the order date is the latest order date of all orders for that customer. Let's break this whole query down and look at each line: SELECT c.CustNo, c.Company, o.SaleDate, o.ItemsTotalThis begins the outer query. FROM customer c, orders oThese are the two tables that the outer query is selecting from. Note that we've provided table alias names for the tables. This is required in this type of subquery. WHERE c.CustNo = o.CustNoThis is the beginning of a simple join condition as we've seen before, just relates the two tables by CustNo. AND o.SaleDate =Here's where it gets a little complicated. First, we have this additional condition for the outer query, which limits result set rows to those with a specific sale date (o.SaleDate). (SELECT MAX(o1.SaleDate) FROM orders o1Now it gets a little hairy. Note the use of the MAX function, which gets the highest date of the SaleDate column. But we need to get the highest date of the order for the corresponding customer, so we also have to limit it by the current customer number. The current customer number is available in the outer query only. WHERE o1.CustNo = c.CustNo)So we reference it here by using the table alias we assigned to the customer table in the outer query. Local SQL knows that this is now a correlated subquery, and resolves the subquery for each row returned by the outer query. This is a far cry from our earlier examples where the subquery executed independently of the outer query, and as you might expect, this type of query has significant resource requirements. We will discuss this and other performance considerations later in this article. Subqueries in a Column-List Subqueries can also be used in the column-list of the SELECT clause itself, just like any other projector. For instance, say we wanted to rank our customers to see who has supplied the greatest percentage of our revenue. That requires aggregating revenue by customer (summarize orders.ItemsTotal by customer number), and divide that by the total revenue for all customers (summarize orders.ItemsTotal across all customer numbers). The below query shows one approach: SELECT o.CustNo, SUM(o.ItemsTotal), (SUM(o.ItemsTotal)/ (SELECT SUM(o1.ItemsTotal) FROM orders o1)) * 100This is the projector with the subquery. It's an eyeful, but it basically tells Local SQL to execute a one time query producing a single result, the sum of all orders, and use that result in the calculations for each of the rows in the outer query's result set. The rest of the query is not new if you've been following these articles. FROM orders o GROUP BY CustNo ORDER BY 3 DESCThis last line tells Local SQL to order the result set in descending order of the 3rd column, using ordinal notation. Subquery and (In-Line) Views To this point, we've looked at subqueries in the WHERE clause of an outer query, and in the column list of the SELECT clause of an outer query. What about the FROM clause of an outer query? Can a subquery be used as one of the terms of a table list in the FROM clause? Yes and no. Most SQL dialects support the literal use of a query instead of an actual named table in the FROM clause, as in: SELECT c.CustNo, omax.SaleDate FROM customer c, (SELECT o1.CustNo, MAX(o1.SaleDate) FROM orders o1 GROUP BY o1.CustNo) omax WHERE c.CustNo = omax.CustNoThis is typically referred to as an in-line view, or implicit view. Local SQL does not support this directly. Local SQL does support the use of named query files, so if you were to save the SELECT o1.CustNo, MAX(o1.SaleDate) FROM orders o1 GROUP BY o1.CustNoas maxord.sql, you could include it in the FROM clause as follows: SELECT c.CustNo, omax.SaleDate FROM customer c, Maxord.sql omax WHERE c.CustNo = omax.CustNoAlso note that this approach can be used to include QBE files in the same fashion. Performance Considerations The use of a subquery will obviously impact the performance of a query, so subquery use should be limited, and if an equivalent simple query or compact tcursor approach is available, it would probably be faster. The use of a correlated subquery, at least in Local SQL, is almost always slower than an equivalent non-correlated subquery, which is almost always slower than an equivalent simple query or compact tcursor code. If an approach has already been assessed as requiring multiple single queries, an equivalent non-correlated subquery should offer similar performance. The author has never tested this assertion and would appreciate any feedback with examples that either support or refute this statement. With all of the above in mind, in cases where the performance difference is insignificant, subqueries do offer a compact approach to a wide variety of query needs. Summary Subqueries are SQL DML statements (SELECT, INSERT, UPDATE, and DELETE) that include a subordinate SELECT statement. If the SELECT is subordinate to a SELECT statement, the subquery is referred to as a subselect. Subqueries may form the right-hand-side of a WHERE criteria or may be used as a projector in the column list of an outer SELECT statement. Local SQL does not support the use of a literal SELECT statement in the FROM clause of an outer SELECT statement (also called an in-line or implicit view), but there is a workaround for nesting queries in this fashion. Subqueries may be formed to either execute once for the whole outer query or may execute once for every row evaluated by the whole outer query. The second case is referred to as a correlated subquery. Depending on how they are used, subqueries can impact performance adversely. Discussion of this article |
![]() Feedback | Paradox Day | Who Uses Paradox | I Use Paradox | Downloads ![]() |
|
![]() The information provided on this Web site is not in any way sponsored or endorsed by Corel Corporation. Paradox is a registered trademark of Corel Corporation. ![]() |
|
![]() Modified: 15 May 2003 Terms of Use / Legal Disclaimer ![]() |
![]() Copyright © 2001- 2003 Paradox Community. All rights reserved. Company and product names are trademarks or registered trademarks of their respective companies. Authors hold the copyrights to their own works. Please contact the author of any article for details. ![]() |
![]() |
|