Math 61 -- Notes 11

Notes 11:
Addendum to Essentials of Data Structures, Cpt 11

Math 61 -- D. C. Smolarski, S.J.
Santa Clara University, Department of Mathematics and Computer Science

[Return to Math 61 Homepage | Return to Notes Listing Page]

Heapsort
Quicksort
Addendum: Statistics on Various Sorting Routines
Merging
Code for Merging

Heapsort

(DCS sec. 11.1, pp 128ff) (Cf. F&T, sec 13.2, pp 688ff)

The definition of the class Heap is found in F&T p 690 and the core section of the HeapSort code is found on p 703 with other methods on previous pages.

Quicksort

(DCS sec. 11.2, pp 135ff) (Cf. F&T, sec 14.2, pp 786ff)

The QuickSort code is found on pg 791 in F&T. This book implements the Quicksort algorithm in an "integrated" fashion in this sense. The code that many books separate into an independent "split" procedure has been included into the body of the Quicksort code.

The "split" section of the code works this way:

First a "pivot" (i.e., comparand) item is chosen to be the value in the middle of the array (i.e., pivot = A[(low+high)/2]).
That value is exchanged with the value in the first location of the array section.
Other values in the array section are compared against the "pivot" value and are relocated into the "correct" half of the array as necessary.
When the two pointers ("ScanUp" and "ScanDown" corresponding to i and j) meet, this is the "break" point of the two segments. The value at this location is exchanged with the value in the initial location of the array.

Thus, for the initial array used in the other examples, we get the following states:

Starting state: 3 2 8 5 9 4 1.
The middle value, 5, is exchanged with the first value, 3.
Next state: 5 2 8 3 9 4 1.
Pointers starting at location 1 (i.e., the value of 2) and location 6 (i.e., the value of 1), scan toward the middle looking for items in the wrong "half" of the array (i.e., when compared to the "pivot" value of 5: everything in the first "half" should be less than 5 and everything in the second "half" greater than 5). first two identified are 8 and 1. These two values are exchanged.
Next state: 5 2 1 3 9 4 8.
Pointers now start at location 2 (i.e., the value of 1) and location 6 (i.e., the value of 8), continue scanning toward the middle looking for items in the wrong "half" of the array. The next two identified are 9 and 4. These two values are exchanged.
Next state: 5 2 1 3 4 9 8.
Pointers now start at location 4 (i.e., the value of 4) and location 5 (i.e., the value of 9), and continue scanning toward the middle. They "meet" at location 4 which becomes the "cut" point of this segment of the array. The initial item (which is the "pivot" point) is exchanged with the element in this location.
Final state: 4 2 1 3 5 9 8.
Quicksort is recursively invoked on the two "halves" of the array: the "first" segment (i.e., 4 2 1 3), and the "second" segment (i.e., 9 8). The "pivot" item is in its proper place and can be ignored from further consideration.

Addendum: Statistics on Various Sorting Routines

The following chart and accompanying notes are taken from Niklaus Wirth's first Data Structures book, Data Structures + Algorithms = Programs. (Wirth is the Swiss Computer Scientist who wrote the language Pascal and its successors.) There are similar figures given in F&T on pagees 794-95. The numbers refer to the time (in milliseconds) consumed by the various sorting methods executed in Pascal on a CDC 6400 computer (in the early 1970's).

In each of the three categories, the left column refers to a list of 256 items, while the right column refers to a list of 512 items.

(It is an interesting exercise to approximate the difference in time between the two numbers, based on the fact that the second refers to a list that is double the first.)

It should be noted from examining the data that

Bubblesort is the Worst sorting method.
Quicksort is the Best sorting method.
It is easy to identify the simple (i.e., O(n²)) methods versus the advanced (i.e., O(n log₂n)) methods.

Sorting Method Ordered Random Inversely Ordered

Straight insertion 12 23 366 1444 704 2836

Binary insertion 56 125 373 1327 662 2490

Straight selection 489 1907 509 1956 695 2675

Bubblesort 540 2165 1026 4054 1492 5931

Bubblesort w/flag 5 8 1104 4270 1645 6542

Shakersort 5 9 961 3642 1619 6520

Shellsort 58 116 127 349 157 492

Heapsort 116 253 110 241 104 226

Quicksort 31 69 60 146 37 79

Mergesort 99 234 102 242 99 232

Sorting Method	Ordered	Random	Inversely Ordered
Straight insertion	12	23	366	1444	704	2836
Binary insertion	56	125	373	1327	662	2490
Straight selection	489	1907	509	1956	695	2675
Bubblesort	540	2165	1026	4054	1492	5931
Bubblesort w/flag	5	8	1104	4270	1645	6542
Shakersort	5	9	961	3642	1619	6520
Shellsort	58	116	127	349	157	492
Heapsort	116	253	110	241	104	226
Quicksort	31	69	60	146	37	79
Mergesort	99	234	102	242	99	232

Noteworthy are the following points:

The improvement of binary insertion over straight insertion is marginal indeed, and even negative in the case of an already existing order.
Bubblesort is definitely the worst sorting method among all compared. Its improved version "Shakersort" is still worse than straight insertion and straight selection (except in the pathological case of sorting a sorted array).
Quicksort beats Heapsort by a factor of 2 to 3. It sorts the inversely ordered array with speed practically identical to the one which is already sorted.

Merging

(Not in DCS) (cf. F&T, p. 645. prog 12.5)

Supppose we have two arrays, both sorted, and we would like to combine them into another sorted array.

One option would be to copy one array to a third array, then copy the other array, then sort the new array from scratch using an efficient soring routine. This would be a straight-forward method, but it would ignore the fact that the original arrays were already sorted.

Instead, the common mergin algorihtm used is similar to this:

     1)    set pointers of all 3 arays to 1;
     WHILE (more items still exist in BOTH input arrays) DO
     |     2)    compare lead items to find the smaller of
     |           the input arrays;
     |     3)    copy the smaller item to the merged array;
     |     4)    increment pointer of the merged array,
     |_________  and of the appropriate input array;
     5)    copy rest of unused input arrays to merged array.

For example,

Array A:   1  2  5  6  8  9
Array B:   3  4  7  10 11 12

Code for Merging

   void mergarr(int a[], int b[], int c[], 
                int an, int bn, int & cn, int climit)
   {  int aptr, bptr, cptr;
      if(an+bn > climit)
      {   cerr << "too many elements to be merged" << endl;
          exit(1);
       }
      else {
         cn = an + bn;
	 // initialize pointers
	 aptr = 0; bptr = 0; cptr = 0;
	 // major loop -- test to see if we have
	 // reached the end of one of the input arrays
	 while (aptr < an) && (bptr < bn){
	    if (a[aptr] < b[bptr])
	    {   c[cptr] = a[aptr];
	        aptr++; }
	    else
	    {   c[cptr] = b[bptr];
	        bptr++; }
	    cptr++;
	    }
	 // time to copy left-overs
	 // copy from a if a has anything left
	 while (aptr < an) {
	    c[cptr] = a[aptr];
	    cptr++; aptr++;
	    }
	 // else copy from b if b has anything left
	 while (bptr < bn) {
	    c[cptr] = b[bptr];
	    cptr++; bptr++;
	    }
	 }
    }

Notes 11: Addendum to Essentials of Data Structures, Cpt 11

Math 61 -- D. C. Smolarski, S.J. Santa Clara University, Department of Mathematics and Computer Science