Finding an item of information needn't be as difficult as looking for the proverbial needle in a haystack. When we want to find someone's phone number in phonebook or a word in a dictionary, we utilize our experience and knowledge of how the data is organized. If the data isn't arranged in some logical order it would be more difficult or time consuming.
Organizing data alphabetically (or numerically) allows us to find a desired item more easily and quickly. Although people used to do this sorting by hand (an evermore labor-intensive task as the number of items increases), the computer is ideally suited to performing what would be a mind-numbing job for us.
Here we will look at three different algorithms used to sort data. In keeping with the KISS principle, only five items of data will be used. Thus, when using a debugger to trace through the program code, it will be possible to view all of the data simultaneously and the trace will not be overly long or tedious.
The three programs will use the following algorithms:
All three will sort a set of five integers into ascending order [17 20 31 34 47], i.e., the smallest will be the first element and the largest will be the last. (The reverse order is know as descending order.) Each of the programs follows the following pseudocode:
It is instructive to perform a manual sort of the data. This is easily done by first writing each of the numbers on a piece of paper and ordering them as follows.
| 20 | 47 | 17 | 31 | 34 |
There are many ways that you could use to place these numbers in ascending order. Three will be presented here.
This technique consists of successively finding the smallest number remaining in the row and moving it into its proper position in a second row.
| Initial Stage | 20| 47 | 17 | 31 | 34 | |
| Stage 1 | 20| 47 | | 31 | 34 | |
17| | | | | |
| Stage 2 | | 47 | | 31 | 34 | |
17| 20 | | | | |
| Stage 3 | | 47 | | | 34 | |
17| 20 | 31 | | | |
| Stage 4 | | 47 | | | | |
| 17 | 20 | 31 | 34 | | |
| Stage 5 | | | | | | |
17| 20 | 31 | 34 | 47 | |
Implementing this algorithm on the computer is not a difficult task. We would simply set up a second array. If desired, we could copy the sorted data in the second array back to the first. A problem arises, however, when the amount of data becomes large -- to perform the sort, we would have to declare double the amount of memory required to hold just the original array.
This technique makes use of the human brain's ability to see shortcuts. Only two moves are necessary to get the numbers in the desired order.
| Initial Stage | 20| 47 | 17 | 31 | 34 | |
Step 1: Move the 17 (the smallest value) to the left end.
| Stage 1 | 17| 20 | 47 | | 31 | 34 | | |
Step 2: Move the 47 (the largest value) to the right end.
| Stage 2 | 17| 20 | | | 31 | 34 | 47 | |
A couple of objections regarding this algorthm come to mind. First, we now have a gap in the middle of the numbers. On a table top, this is easily corrected with a bit of slight of hand; on a computer, we would need to explicitly copy values to close the gap. Second, we are using more than the original number of spaces. On a table top, this is not a problem; on a computer, we would have to allocate extra array elements and we might not be able to determine ahead of time how many would be needed.
Thinking more about how an algorithm might be implemented on a computer, consider this third technique. Instead of simply moving the individual numbers, let us pick up a number while we make a space for it. This can be done by sliding the numbers on the left which are larger one space to the right. Or we could repeatedly swap different pairs of adjacent numbers.
| Initial Stage | 20| 47 | 17 | 31 | 34 | |
Step 1: Pick up the 17.
| Stage 1 | 20| 47 | | 31 | 34 | 17 in hand | |
Step 2: Slide the 47 to the right.
| Stage 2 | 20| | 47 | 31 | 34 | 17 in hand | |
Step 3: Slide the 20 to the right.
| Stage 3 | | 20 | 47 | 31 | 34 | 17 in hand | |
Step 4: Place the 17 in the empty space.
| Stage 4 | 17| 20 | 47 | 31 | 34 | |
Step 5: Swap the 47 and 31.
| Stage 5 | 17| 20 | 31 | 47 | 34 | |
Step 6: Swap the 47 and 34.
| Stage 6 | 17| 20 | 31 | 34 | 47 | |
As it turns out, both of the methods used in this third example are utilized in the algorithms provided below. However, there are important differences. First, we cannot slide values in RAM -- we need to copy them, so there are never any "empty" slots in the array. Second, we cannot simply switch two values which are located in RAM -- instead, we must do a rotating three-step copy, which means we need a place to temporarily store a value. These methods are exemplified in the Insertion Sort and the Bubble Sort, respectively.
The Insertion Sort can be viewed as simulating the sorting of playing cards in a person's hand. Moving from left to right, starting with the second card, each card is examined in turn. If it is lower than any of the cards to its left, the appropriate space is opened and the card is inserted.
The loop control variable of the outer loop, current, is the index of the value currently being checked.
The body of the loop consists of three operations: (1) Save the value being checked in a temporary storage location; (2) Slide (copy) all values which are greater than the one being checked one location to the right; (3) Copy the saved value into its proper position.
Note that the final copy is performed even if it is not smaller than any of the values to its left, i.e., it is already in its proper position. This could be avoided if the copy is only performed conditionally when current is not equal to i + 1. Although this would make sense from the human perspective, the execution time of the sort would actually increase, because the check would be performed for every value of current.
Program InsertionSort(output);
const
LO = 1;
HI = 5;
type
IntArrayType = array [LO..HI] of integer;
{ Insertion sort procedure; Sorts an array of ints in ascending order }
procedure insertion_sort(var intarray : IntArrayType; arrayLength : integer);
var
i, current : integer;
temp : integer;
begin
for current := 2 to arrayLength do
begin
temp := intarray[current];
{ Move all values larger than key to the right one position }
i := current - 1;
while( (i >= 1) AND (intarray[i] > temp) ) do
begin
intarray[i + 1] := intarray[i];
i := i - 1
end;
{ insert current value into its proper place }
intarray[i + 1] := temp
end
end;
{ Procedure that simply displays each element of intarray }
procedure display_array(intarray : IntArrayType; arrayLength : integer);
var
i : integer;
begin
for i := 1 to arrayLength do
write(intarray[i]: 3);
writeln;
end;
var
nums : IntArrayType;
nums_length : integer;
begin
nums[1] := 20;
nums[2] := 31;
nums[3] := 17;
nums[4] := 47;
nums[5] := 34;
nums_length := 5;
writeln('Unsorted Array:');
display_array(nums, nums_length);
insertion_sort(nums, nums_length);
writeln('Sorted Array:');
display_array(nums, nums_length)
end.
The Bubble Sort can be viewed as individual elements "bubbling" up to their proper positions. This is accomplished by repeatedly scanning the array looking for adjacent values which are out of order and swapping them. Unlike with the Insertion and Selection Sorts, here no value is ever moved more than one location at a time.
During each execution of the while-do loop, the largest value in the range being scanned is guaranteed to arrive at its proper position , although others may also. Since more than one value could be "bubbled" into position during a single execution of the outer loop, it would be nice to terminate its execution if it is known that all values are already properly positioned. This is accomplished by using the fullysorted flag. For each iteration of the loop, fullysorted is initially set to True. If no swaps are performed during the execution of the loop, it will still be True and the execution of the loop will be terminated. If any swaps are performed (even the one which finishes the sort), fullysorted will be set to False. This will then require that at least one more scan of the array will be performed.
This means that the outer loop will be executed a minimum of once (if the array was already in sorted order) and a maximum of (arrayLength - 1) times.
N.B. This sort simulates the second technique illustrated above in Hand Sort III.
Program BubbleSort(output);
const
LO = 1;
HI = 5;
type
IntArrayType = array [LO..HI] of integer;
{ Bubble sort procedure; Sorts an array of ints in ascending order }
procedure bubble_sort(var intarray : IntArrayType; arrayLength : integer);
var
last, j : integer;
fullysorted : boolean;
temp : integer;
begin
fullysorted := false; {to get into while-do loop initially}
last := arrayLength;
while( (last > 1) AND (not fullysorted) ) do
begin
fullysorted := true; {say we are done until a swap is performed}
for j := 1 to (last - 1) do
begin
if (intarray[j + 1] < intarray[j]) then
begin
temp := intarray[j + 1];
intarray[j + 1] := intarray[j];
intarray[j] := temp;
fullysorted := false; {swap made; don't know if we are done}
end {if}
end; {for}
last := last - 1
end {while}
end;
{ Procedure that simply displays each element of intarray }
procedure display_array(intarray : IntArrayType; arrayLength : integer);
var
i : integer;
begin
for i := 1 to arrayLength do
write(intarray[i]: 3);
writeln;
end;
var
nums : IntArrayType;
nums_length : integer;
begin
nums[1] := 20;
nums[2] := 31;
nums[3] := 17;
nums[4] := 47;
nums[5] := 34;
nums_length := 5;
writeln('Unsorted Array:');
display_array(nums, nums_length);
bubble_sort(nums, nums_length);
writeln('Sorted Array:');
display_array(nums, nums_length)
end.
The Selection Sort can be viewed as finding the largest value in the left part of the array and swapping it into its final resting place, a process which is repeated on an ever-decreasing left side.
The loop control value of the outer loop keeps track of the current final resting place for the currently largest remaining value.
The body of the loop consists of two operations: (1) Find the largest remaining value; and (2) Swap it into position.
Note that the "swap" is performed even if the largest element is already in its proper position. This could be avoided if the swap is only performed conditionally only when largest is not equal to current. If, however, the vast majority of the elements are not in their final resting places, the execution time of the sort would increase, because the check would be performed for every value of current.
Program SelectionSort(output);
const
LO = 1;
HI = 5;
type
IntArrayType = array [LO..HI] of integer;
{ Selection sort procedure; Sorts an array of ints in ascending order }
procedure selection_sort(var intarray : IntArrayType; arrayLength : integer);
var
current, j : integer;
temp, large : integer;
begin
for current := arrayLength downto 2 do
begin
large := 1; { Initialize large to index of first element. }
{ Find the largest element between the positions 1 and current. }
for j := 2 to current do
begin
if (intarray[j] > intarray[large]) then
large := j;
end;
{ Swap the largest element found with element in position current. }
temp := intarray[large];
intarray[large] := intarray[current];
intarray[current] := temp;
end
end;
{ Procedure that simply displays each element of intarray }
procedure display_array(intarray : IntArrayType; arrayLength : integer);
var
i : integer;
begin
for i := 1 to arrayLength do
write(intarray[i]: 3);
writeln;
end;
var
nums : IntArrayType;
nums_length : integer;
begin
nums[1] := 20;
nums[2] := 47;
nums[3] := 17;
nums[4] := 31;
nums[5] := 34;
nums_length := 5;
writeln('Unsorted Array:');
display_array(nums, nums_length);
selection_sort(nums, nums_length);
writeln('Sorted Array:');
display_array(nums, nums_length)
end.