Notes 11:
Addendum to Essentials of Data Structures, Cpt 11
[Return to Math 61 Homepage |
Return to Notes Listing Page]
Contents
(DCS sec. 11.1, pp 128ff) (Cf. F&T, sec 13.2, pp 688ff)
The definition of the class Heap is found in F&T p 690 and
the core section of the HeapSort code is found on p 703
with other methods on previous pages.
(DCS sec. 11.2, pp 135ff) (Cf. F&T, sec 14.2, pp 786ff)
The QuickSort code is found on pg 791 in F&T. This
book implements the Quicksort algorithm in an "integrated" fashion in
this sense. The code that many books separate into an independent "split"
procedure has been included into the body of the Quicksort code.
The "split" section of the code works this way:
- First a "pivot" (i.e., comparand) item is chosen to be the
value in the middle of the array (i.e., pivot = A[(low+high)/2]).
- That value is exchanged with the value in the first location
of the array section.
- Other values in the array section are compared against the "pivot"
value and are relocated into the "correct" half of the array as
necessary.
- When the two pointers ("ScanUp" and "ScanDown"
corresponding to i and j) meet, this is the
"break" point of the two segments. The value at this location is
exchanged with the value in the initial location of the array.
Thus, for the initial array used in the other examples, we get the
following states:
- Starting state: 3 2 8 5 9 4 1.
The middle value, 5, is exchanged with the first value, 3.
- Next state: 5 2 8 3 9 4 1.
Pointers starting at location 1 (i.e.,
the value of 2) and location 6 (i.e., the value of 1), scan toward
the middle looking for items in the wrong "half" of the array (i.e.,
when compared to the "pivot" value of 5: everything in the first "half"
should be less than 5 and everything in the second "half" greater than
5).
first two identified are 8 and 1. These two values are exchanged.
- Next state: 5 2 1 3 9 4 8.
Pointers now start at location 2 (i.e.,
the value of 1) and location 6 (i.e., the value of 8), continue scanning toward
the middle looking for items in the wrong "half" of the array. The
next two identified are 9 and 4. These two values are exchanged.
- Next state: 5 2 1 3 4 9 8.
Pointers now start at location 4 (i.e.,
the value of 4) and location 5 (i.e., the value of 9), and continue
scanning toward the middle. They "meet" at location 4 which becomes the
"cut" point of this segment of the array. The initial item (which is
the "pivot" point) is
exchanged with the element in this location.
- Final state: 4 2 1 3 5 9 8.
- Quicksort is recursively invoked on the two "halves" of the
array: the "first" segment (i.e., 4 2 1 3), and the "second" segment
(i.e., 9 8). The "pivot" item is in its proper place and can be
ignored from further consideration.
The following chart and accompanying notes are
taken from Niklaus Wirth's first Data Structures
book, Data Structures + Algorithms = Programs.
(Wirth is the Swiss Computer Scientist who wrote the
language Pascal and its successors.) There are similar
figures given in F&T on pagees 794-95.
The numbers refer to the time (in
milliseconds) consumed by the various sorting methods executed in
Pascal on a CDC 6400 computer (in the early 1970's).
In each of the three categories,
the left column refers to a list of 256 items, while the right column refers
to a list of 512 items.
(It is an interesting exercise to approximate the difference
in time between the two numbers, based on the fact that the second
refers to a list that is double the first.)
It should be noted from examining the data that
- Bubblesort is the Worst sorting method.
- Quicksort is the Best sorting method.
- It is easy to identify the simple (i.e., O(n2))
methods versus the advanced (i.e., O(n log2n)) methods.
Sorting Method | Ordered | Random |
Inversely Ordered |
Straight insertion | 12 | 23 |
366 | 1444 |
704 | 2836 |
Binary insertion | 56 | 125 |
373 | 1327 |
662 | 2490 |
Straight selection | 489 | 1907 |
509 | 1956 |
695 | 2675 |
Bubblesort | 540 | 2165 |
1026 | 4054 |
1492 | 5931 |
Bubblesort w/flag | 5 | 8 |
1104 | 4270 |
1645 | 6542 |
Shakersort | 5 | 9 |
961 | 3642 |
1619 | 6520 |
Shellsort | 58 | 116 |
127 | 349 |
157 | 492 |
Heapsort | 116 | 253 |
110 | 241 |
104 | 226 |
Quicksort | 31 | 69 |
60 | 146 |
37 | 79 |
Mergesort | 99 | 234 |
102 | 242 |
99 | 232 |
Noteworthy are the following points:
- The improvement of binary insertion over straight insertion is
marginal indeed, and even negative in the case of an already existing
order.
- Bubblesort is definitely the worst sorting method among all
compared. Its improved version "Shakersort" is still worse than
straight insertion and straight selection (except in the
pathological case of sorting a sorted array).
- Quicksort beats Heapsort by a factor of 2 to 3. It sorts the
inversely ordered array with speed practically identical to the one
which is already sorted.
(Not in DCS) (cf. F&T, p. 645. prog 12.5)
Supppose we have two arrays, both sorted, and we would like to
combine them into another sorted array.
One option would be to copy one array to a third array, then
copy the other array, then sort the new array from scratch using an
efficient soring routine. This would be a straight-forward method, but
it would ignore the fact that the original arrays were already sorted.
Instead, the common mergin algorihtm used is similar to this:
1) set pointers of all 3 arays to 1;
WHILE (more items still exist in BOTH input arrays) DO
| 2) compare lead items to find the smaller of
| the input arrays;
| 3) copy the smaller item to the merged array;
| 4) increment pointer of the merged array,
|_________ and of the appropriate input array;
5) copy rest of unused input arrays to merged array.
For example,
Array A: 1 2 5 6 8 9
Array B: 3 4 7 10 11 12
void mergarr(int a[], int b[], int c[],
int an, int bn, int & cn, int climit)
{ int aptr, bptr, cptr;
if(an+bn > climit)
{ cerr << "too many elements to be merged" << endl;
exit(1);
}
else {
cn = an + bn;
// initialize pointers
aptr = 0; bptr = 0; cptr = 0;
// major loop -- test to see if we have
// reached the end of one of the input arrays
while (aptr < an) && (bptr < bn){
if (a[aptr] < b[bptr])
{ c[cptr] = a[aptr];
aptr++; }
else
{ c[cptr] = b[bptr];
bptr++; }
cptr++;
}
// time to copy left-overs
// copy from a if a has anything left
while (aptr < an) {
c[cptr] = a[aptr];
cptr++; aptr++;
}
// else copy from b if b has anything left
while (bptr < bn) {
c[cptr] = b[bptr];
cptr++; bptr++;
}
}
}
This page is maintained by Dennis C. Smolarski, S.J.
© Copyright 1998, 1999 Dennis C. Smolarski, SJ, All rights reserved.
Last changed: 7 April 1999. Minor correction 9 June 1999