Welcome To The Home Of The Visual FoxPro Experts  
home. signup. forum. archives. search. google. articles. downloads. faq. members. weblogs. file info. rss.
 From: Ken Murphy
  Where is Ken Murphy?
 Springhill
 Canada
 Ken Murphy
 To: Yull67
  
 
 
 Yull67
 Tags
Subject: RE: How do i make queries Rush More?
Thread ID: 154045 Message ID: 154614 # Views: 17 # Ratings: 0
Version: Visual FoxPro 9 Category: Databases, Tables and SQL Server
Date: Tuesday, December 18, 2007 12:17:54 PM         
   


> >
> > John,
> >
> > The best way to utilize Rushmore is to understand how it works. You can then make design decisions that will best optimize your queries. A little overview might help.
> >
> > Lets start with the following query as an example
> >
> >
> > SELECT * FROM MyTable ;
> >    WHERE MyTable.nSomeField = lnSomeValue ;
> >    INTO CURSOR csrResults
> > 

> >
> > Now lets assume that we do not have any indexes over nSomeField. If you were to look at your table on the disk, you wouldn't actually see rows and columns. Indeed, what you would see would simply look like a long string of data. VFP knows the structure of the table (this is held in the header record) so it knows how long each record is. To move from record to record, it knows to move the disk drive's read head exactly that many bytes. For example, if you have a table with a record width of 500 bytes, to move to the next record, you must move the disk drive's read head 500 bytes. (More precisely, you would need to move the read head from where it currently is to the beginning of the next 500 byte block.)
> >
> > So far, this sounds reasonable, but here is where it gets a little more complex. When you add a record to a table, it goes on the end. If you browse a table with no index set and you append a new record, that record is added to the end of the table and you will see it as the last record in the BROWSE window. Because records are added to the end of the table, this means they are not sorted. To go from RecordID 1 to RecordID 2, the read head could need to move from the first physical record to the last physical record. We are talking about moving the read head here - mechanical motion (in other words, SLOW!) If the read head is skipping back and forth from the first to the last physical record in a file, and if it has to do this a lot of times, it is bound to take a fair amount of time.
> >
> > Worse still, on top of that mechanical slowness, VFP must read EVERY SINGLE RECORD! These records are not sorted, so there is no logic that VFP can apply to deterime if the next record will meet that WHERE criteria. The only way to determine this is to actually read that record.
> >
> > Rushmore doesn't look at the physical table. Instead, it looks at an index. An index, unlike the table is sorted. Not only is it sorted, it is also normally a lot smaller than the table. Let's say that we have two indexes on this table:
> >
> >
> > INDEX ON MyTable.nSomeField TAG SomeField
> > INDEX ON DELETED() TAG DelTag BINARY
> > 

> >
> > An index is basically a pointer to the record plus a reference to the field value. For example in your index you might see something like:
> >
> >
> > Pointer	FieldReference
> > 987654  10001
> > 123456  10002
> > ...
> > 

> >
> > Note that the index is actually sorted in FieldReference order. The pointer simply identifies where in the table that record is located. If you LOCATE 10002 on that table, Rushmore knows to move the read head to the beginning of that record and to do so, it needs to go to the disk address at 123456. Just like a DBF file, a CDX file or IDX file is just that, a physical file. The difference is that instead of this file having a record width of 500 bytes, in this index, it is only the width of two integers (8 Bytes.) Obviously, if you have less distance to move the read head, it is going to be faster. Indeed, that index may be small enough that it will fit into memory. Now VFP doesn't have to worry about moving the read head back and forth. It simply takes one pass over the index, stores the whole thing into memory and now we are working at light speed - not mechanical speed.
> >
> > That the index file is smaller is most definately a help, but the real speed comes from the fact that the index is sorted. Rushmore uses something called a b-tree algorythm to search for records. Instead of starting at record 1 in the index, Rushmore starts at the middle record. It then asks the question "Is the field reference for this index record greater than or less than the value I am searching for?" Remember, the index is sorted. If it is less than that value, Rushmore knows that the record it is searching for lies in the second half of the index. If it is more than that value, then the record it is searching for lies in the first half of the index. In the very first index read, Rushmore just elimiated half of the index.
> >
> > Rushmore then does the same thing in a recursive loop over the remaining records in that index. At the end of the second read, Rushmore will have eliminated 75% of the index records. At read 5, Rushmore only has about 3% of the index left that could contain the desired record. It continues to do this recursively until it finds the record it is looking for, or it runs out of records. If on read number 1, it finds that the field reference is exactly equal to the value it is searching for, then it it has completed the search in only one read so it returns a true. If, after only 100 reads (and 100 reads would represent a "large-ish" table - 2**100 records) it finds that there are no more records left to search, it returns a false. Obviously, with the index sitting in memory and by using a b-tree algorythm, Rushmore can VERY quickly find the record it needs.
> >
> > That tells you roughly "how" Rushmore works, but it doesn't tell you "where." Rushmore is used in two places - in the JOIN clauses, and in the WHERE clause. Take a look at the following queries:
> >
> >
> > * Query 1
> > SELECT * FROM MyTable1 ;
> >    INNER JOIN MyTable2 ON MyTable2.PKField = MyTable1.PKField ;
> >    WHERE MyTable.nSomeField = lnSomeValue ;
> >    INTO CURSOR csrResults
> > 
> > * Query 2
> > SELECT * FROM MyTable1, MyTable2 ;
> >    WHERE MyTable.nSomeField = lnSomeValue AND ;
> >          MyTable2.PKField = MyTable1.PKField ;
> >    INTO CURSOR csrResults
> > 

> >
> > Both of these queries return the same results. As can be seen, the JOIN condition acts as a sort of filter. Rushmore basically handles JOIN conditions and WHERE conditions the same way.
> >
> > In a JOIN however, Rushmore goes one step further. In a join, it looks at the indexes, and figures out which table to start with. It only wants to loop through the large table once (as you would, in nested SCAN loops.) It wants to loop though the smaller tables multiple times. This also speeds up the works, but unfortunately, in a complex SELECT statement, where you are using sub-queries and intermediate cursors, Rushmore can, at times, get this wrong. In some cases you can end up with it looping through the smaller cursor once and the larger cursor many times. Remember, those intermediate cursors that you generate in your sub-queries don't have any indexes for Rushmore to work with. For this reason, there are times when you are better off forcing the order in which the search is made yourself by breaking it down into a series of queries or by using the FORCE clause.
> >
> > Tamar mentioned that an index on DELETED() depends on the number of deleted records in the table. When you have SET DELETED ON, VFP actually adds an implied WHERE NOT DELETED() condition to your query. Having an index over DELETED() allows Rushmore to optimize that condition. Now, if you only have two or three deleted records in your table, will this make any difference in speed? Certainly not a noticable difference - it would probably be measured as "a few clock ticks." Certainly, she is correct about using a binary index for DELETED(). This gives you an index that is only 33 bits wide (4 bytes for the pointer at 8 bits each plus one bit for the field reference.)
> >
> > Hope this helps.
> >
> > Ken
> > You shall know the truth - and the truth shall set you free. (John 8:33)
>
>
> This looks like a good candidate for an Article/FAQ/Blog post. Just a thought.
>
> Yull

Yull,

Thank you for the kind words. I will see if I can gather up some time to do this.

Ken
You shall know the truth - and the truth shall set you free. (John 8:33)

ENTIRE THREAD

How do i make queries Rush More? Posted by John McLellan @ 12/11/2007 3:50:16 PM
RE: How do i make queries Rush More? Posted by Mike Gagnon @ 12/11/2007 5:18:14 PM
RE: How do i make queries Rush More? Posted by tushar @ 12/11/2007 5:19:15 PM
RE: How do i make queries Rush More? Posted by Tamar Granor @ 12/11/2007 10:55:14 PM
RE: How do i make queries Rush More? Posted by John McLellan @ 12/13/2007 10:52:15 AM
RE: How do i make queries Rush More? Posted by Vincent Byrne @ 12/14/2007 2:38:13 PM
RE: How do i make queries Rush More? Posted by Olaf Doschke @ 12/12/2007 11:05:50 PM
RE: How do i make queries Rush More? Posted by Ken Murphy @ 12/13/2007 2:14:29 AM
RE: How do i make queries Rush More? Posted by John McLellan @ 12/13/2007 10:50:01 AM
RE: How do i make queries Rush More? Posted by Ken Murphy @ 12/13/2007 12:44:17 PM
RE: How do i make queries Rush More? Posted by Yull67 @ 12/18/2007 4:48:42 AM
RE: How do i make queries Rush More? Posted by Ken Murphy @ 12/18/2007 12:17:54 PM
RE: How do i make queries Rush More? Posted by Ken Murphy @ 1/4/2008 1:22:51 AM
RE: How do i make queries Rush More? Posted by Yull67 @ 1/4/2008 11:27:00 PM
RE: How do i make queries Rush More? Posted by Ken Murphy @ 1/5/2008 12:15:32 AM
RE: How do i make queries Rush More? Posted by John McLellan @ 12/13/2007 10:59:27 AM