![]() |
![]() |
|
![]() |
Subject: TIP:PdoxWin:Comparing SQL, QBE, and ObjectPAL:2001.05.26 Version 1.0 (2000.04.01) edited by Paradox FAQ Team Reposted by Mike Irwin: 2001.02.10 Version 1.1 (2001.05.26) Contributions from "Stu Bailey" <stu@maximarketing.co.nz> "Kevin Zawicki" <numberjack@asapnet.net> "Tom Krieg" <tomkrieg@bigpond.com> "Craig Futterman, MD" <craig.futterman@inova.com> edited by Paradox FAQ Team ==================== 0. Introduction ==================== This FAQ addresses the matter of deciding what sort of general technique to use when performing an operation on a number of records. ------------------------------- 0.1 Legal Info and Disclaimers ------------------------------- Paradox is a trademark of Corel. Borland Database Engine (BDE) is a trademark of Inprise. The information provided in this FAQ is provided "as is" and is not warranted in any way. The information provided in this FAQ is not endorsed or authorized by Corel or Inprise in any shape, form, or manner. The editors claim NO responsibility for ANY illegal activity regarding this file, or as a result of someone reading this file. You may distribute this file, as long as the copies are complete, unaltered, and are in electronic form only. ------------- 0.2 Feedback ------------- Please send feedback in a Corel Paradox newsgroup or the news:comp.databases.Paradox newsgroup to any of the FAQ Team mentioned in the "FAQ: FAQ FAQ" document. Please preface the subject of your post with the string "PDXWIN FAQ" to alert Team members to the function of the message. Please specify the FAQ name and section number the comment applies to, if any. ============================== 1. General Information ============================== In Paradox one can perform a repetitive operation on a RecordSet in a number of ways. These include: 1. Use a tCursor in a Scan loop 2. Use a tCursor in a While loop 3. Use a tCursor in a For loop 4. Use a QBE query 5. Use an SQL command Techniques 1, 2, and 3 are all, essentially, the same, in that you either will be scanning every record or else using either locate()/locateNext() or qLocate() to find subsequent records in a tCursor. Techniques 4 and 5 are radically different, both from the ObjectPAL techniques and also from each other. Note that up until Paradox moved to a 32-bit environment Borland devoted R&D to improving the performance of the QBE Engine. Since that time, considerable improvements have been made to the (newer) SQL Engine, so that it rivals the QBE Engine in many ways and even surpasses it in some cases. ======== 2. Problem Definition ======== The question asked by many is one of when to use one technique and when to use another. In many cases the circumstances define the answer, since very often they forbid the use of one or other of the techniques. For example, you cannot alter a substring using any QBE technique because the syntax just doesn't exist to do it with. However, if you look at the SQL Help files you will find a perfectly usable syntax that works very well. This means that you have a choice between set-based processing (SQL) and procedural processing (ObjectPAL). ------------------------------------- The syntax of SQL gives it a number of other advantages over QBE, in addition to SubString. For example, Update <TableName> set Name = upper(name) isn't possible in QBE, and takes the following code in ObjectPAL: tcTableName.open(<tcTableName>) tcTableName.edit() scan <tcTableName> : tcTableName."Name" = tcTableName."Name".upper() tcTableName.postRecord() endScan tcTableName.close() ------------------------------------- Section 5 of this FAQ includes a number of anecdotal accounts of tests made on various Paradox systems. ========== 3. Discussions ========== In old versions of Paradox (eg 5) this discussion has been held in several books. It was often said that in small organizations, with networks of about 10 Mb/s the use of a query was faster when it involved the retrieval of 100 or more records. This was because the use of the network to pull down the records dragged down the performance of the system severely, and that pulling them from a server one at a time was far more expensive than pulling them all down at one time. This argument has not vanished. In fact, with the ever-growing datasets involved, it has become stronger, despite increases in network speed. ========== 3a. Notes ========== The BDE Local SQL engine expects dates to be in mm/dd/yyyy format That is the SQL 92 standard. BDE doesn't use date modes for SQL. It does use date modes when you run a QBE query, In SQL the BDE uses mm/dd/yyyy format, but using mm/dd/yyyy format in the query builder with the BDE Date mode not set to 0 will fail. For example, if your date format is dd/mm/yyyy, the query builder does't allow you to enter a date. However having BDE configured to use a dd.mm.yyyy date will work, although some QBE queries may fail. Confused ? So are we at times ! ========== 4. Using Filters ========== In Spring 2001, various participants in the Paradox 10 NewsGroup again considered this subject. Stu Bailey raised the proposition of "preprocessing" the data being operated onby use of Filters before using ObjectPAL loops. -------------- 4.1 Foreword -------------- The rest of this section is from Stu. Please note that he is concentrating on the case where it is the BDE that is the database engine and the tables are all either Native or have Direct ODBC connections. When this happens, queries can be as slow as, or even slower than, using tCursors. You should not try this sort of operation, however, on a system where the backing database is, for example,MS SQL Server, as tCursors become very inefficient with Server systems (they are converted into SQL operations behind the scenes). Stealing Stu's example of the nested tCursors and scans, one might also want to look at running just the first loop from the net drive, ending in a local results table. One might then run SQL commands against that (locally). It is very possible that the end results might then not be that far apart, time-wise, but only real-world testing could tell. The other scenario, of course, is that the database involved is not Paradox, but a server system such as Oracle. Then you would try to use temporary tables on the server, and reduce net traffic to almost zero. -------------- 4.2 Introduction -------------- The subject matter is tCursors versus queries. In my humble opinion, using tCursors to scan, with switchIndex() and setRange, a number of tables to populate a local "Answer" table beats running queries in a multiuser application. The bigger the tables, and the higher the concurrent user count, the bigger the performance payoff. Coding isn't fast or easy, but the rewards with an application running quickly almost no matter the load on the network and server makes the whole thing more valuable to the Customer, and is a strong reason for the Customer to prefer a Paradox application over Access. -------------- 4.3 Hint -------------- One hint for the "Answer" table to be populated - delete it (IF istable()) at the start of the code, create it from scratch (build indexes if need be), and start with a fresh table. The main reason for this is that empty(), particularly if indexes are involved, generates a lot of activity locally, doesn't recover disk space from the biggest record count for the table ever run, and can leave you from time to time with corrupted tables and/or out of date indexes. Even better: store empty "shell" tables in a subdirectory and do a copy over the existing versions. This also gives you a stored copy of structure and indexes for use in emergencies. -------------- 4.4 Example 1 -------------- As an example of the multi-user performance benefits of TCursors vs queries, I have two clinets using the same application in two very different environments. In both cases, the size of their databases are within 10% of each other, and both happen to have the same number of concurrent users. Site1 runs NT Server on a server with FWSCSI drives, with a 100bT switched network, the users PCs are 64Mb RAM PIIs with IDE drives. Site2 runs peer-to-peer on Windows, 64Mb RAM PIIs, 100bT hubbed network, with the database host also running as user client workstation - and all with IDE drives. With spec differences like that, you'd expect Site2 to be noticeably slower. Fact is, there's less than 10% difference in time to accomplish even large tasks when TCursors are used for data retreival. But if "outside the app" queries are used, Site2 takes up to four times as long as Site1. About the only performance penalty Site2 has is up to a worst-case 10 seconds delay opening the main data editing form (lots of tables in it). Big plus for Paradox and TCursors versus any query-based platform. -------------- 4.5 Example 2 -------------- Multitable queries on networks, especially if the tables are large, put a high load on the network as much of the sorting to get to the final "Answer" table is done on the local PC - so that means large quantities of data have to be tranferred to the local machine. If you have multiple users trying to do the same thing, both the network and the server disks are having to cope with a high load. Using TCursors requires much less data transfer as they can pick up data blocks selectively from the server as and when required. In broad terms (and let's limit the example to three source data tables and one destination table - more complex stuff is really only adding more, similar code) it goes like this: priTC.open("primarysource.db") src1TC.open("detail1.db") src1TC.switchIndex("q1idx") src2TC.open("detail2.db") src2TC.switchIndex("q1idx") destTC.open("dest1.db") destTC.edit() ;.................................. scan priTC [for conditions]: src1TC.setRange(priTC.Fld2, priTC.Fld3, priTC.Fld3) scan src1TC [for conditions] src2TC.setRange(priTC.Fld4, src1TC.Fld3, src1TC.Fld3) scan src2TC: ;code the transfer of values of priTC, ;src1TC & src2TC to destTC here endscan endscan endscan Where you know you have 1:1 relationships, use qLocate instead of the setRange/scan loop process for a source data TCursor. I usually include a check on the number of records within a detail Tcursor's active set with nRecords(). If I want to emulate a symmetric join query, the code to write the data to the destTC is "buried" at the lowest level, with IF tc.nRecords()>0 THEN (next scan loop) ELSE quitLoop() ENDIF. If you want to emulate an assymetric join query, transfer the data to destTC at each level of scan loops. Because of the amount of coding required, and the level of indexing needed on large tables to make if really fly, it's really only worth doing for multiuser apps, or if the tables are huge and you run the query you used on a frequent basis. However, as an example of the payoffs, a group of 14 users using one of my apps with this approach can generate a "Customer Profile" report on-screen in 1.5 seconds - with the primary source table being 85,000 records, with three detail tables ranging from 285,000 to 650,000 records in a 1:M relationship with the primary table - and all concurrently. The coding can be mindbending - but it's worth it. ========== 5. Examples ========== In this section we include some examples, with narrative and code. Each is attributed as posted on the Paradox newsgroups, so that you can post messages requesting an answer from that specific person. Please do not abuse the email addresses by emailing people directly, because, although they have posted these addresses freely, they are also entitled to a private life ! If you have examples which, you believe, can show when using one technique or another has helped matters in your programming, please submit them and we'll test them and add them here. ========== 5a. QBE vs SQL ========== Frédéric BROUARD <frederic.brouard@amaris.fr> In an app I made for health care, I had to code a very long program to calculate the amount to pay for a service, based on 13 constraint parameters and tables of history or "sliced" age. The first version I made took about 8 hours to calculate this for about 120000 accounts. It was made only with TCursor and running on a P75 PC. After a complete review, it was recoded using as many queries as possible, leaving the use tCursors on only five occasions (maximum was three open simultaneously). The calculation time fell to 50 minutes. That was exactly the same time as an AS 400 did it, where the AS400 had only seven constraint parameters. After further optimization, it was running in just over 37 minutes, and the decision to terminate the use of the AS 400 was made... Take a look at this example : QBE version of CUSTOMER that live in state 92.. and have no order in January : Query commande.DB | NO_CLI | DATE_COM | ENSEMBLE | _nocli | @@/01/@@@@ | Client.db | NO_CLI | NOM_CLI | PRENOM_CLI | CP_CLI | | aucun _nocli | Check | Check | 92.. | EndQuery running time : 10,25 seconds The same in SQL : SELECT DISTINCT Client.NO_CLI FROM Client, commande WHERE (client.CP_CLI LIKE "92%") AND (client.NO_CLI = commande.NO_CLI) AND (client.NO_CLI not in (SELECT NO_CLI FROM commande WHERE EXTRACT(MONTH FROM DATE_COM) = 1 ) ) running time : 0,87 second Cutomer (client in french) table had 5 000 records ( and order (commande in french) table had 50 000 records Running over a PII 300 with 112 MB RAM. (BDE optimized) In another query involved 3 tables, the two sames plus one (lines of order) containing 500 000 records, the comparative time was in about 2 time faster in SQL rather than QBE... BDE configuration modifications included: FILL FACTOR: between 75 and 95. MAXBUFSIZE : can be set to 2/3 of the PC RAM MINBUFSIZE : always the minimum (small tables don't need big buffers !!!) MAXFILEHANDLES : 150 or more and you can also find reasons to modify : LOW MEMORY USAGE LIMITS MEMSIZE SHAREDMEMSIZE ========== 5b. ========== "Mark Jean Do" <neiscime@email.dot.si> I tested the speed of some ObjectPAL statements and found: 1. qLocate with 2 fields is disproportionate slower then with one field 2. qLocate one field when index contains two fields is slower than when index contains only one field 3. setGenFilter + scan is faster than setRange + scan. 4. setGenFilter + scan with suitable switchIndex is faster than without useing index Comments: Kasey Chang <kaseyc@nospam.discopy.com> 3. Disagree: SetGenfilter does NOT use index data. SetRange, once set, is MUCH faster during SCAN. However, the initial time will be longer. Mark's findings may be true with a small table. 4. Disagree: SetGenfilter does NOT use index data, see #3. Paradox Community Newsgroups |
![]() Feedback | Paradox Day | Who Uses Paradox | I Use Paradox | Downloads ![]() |
|
![]() The information provided on this Web site is not in any way sponsored or endorsed by Corel Corporation. Paradox is a registered trademark of Corel Corporation. ![]() |
|
![]() Modified: 15 May 2003 Terms of Use / Legal Disclaimer ![]() |
![]() Copyright © 2001- 2003 Paradox Community. All rights reserved. Company and product names are trademarks or registered trademarks of their respective companies. Authors hold the copyrights to their own works. Please contact the author of any article for details. ![]() |
![]() |
|