**COE428
Lab 3: Sorting**

** **

**1. IMPORTANT: Two week lab**

Before coming to the lab you should:

· Read the lab. Try to prepare any questions you may have about the lab.

·
Refer to **Lab Guide**.

·
Create your lab directory for lab3. (i.e. use **mkdir
lab3** within your coe428 directory.)

·
Change to your `coe428/lab3` and unzip the lab3.zip file with the command:

unzip /home/courses/coe428/lab3/lab3.zip

·
Ensure that you have downloaded properly by
entering the command: **make**

No errors should be reported..

** **

**3. Requirements**

The requirements to complete the lab are summarized below.

- Implement
*InsertionSort*using the mySort() function in the file: insertionSort.c. - Implement
*MergeSort*using the mySort() function in the file mergeSort.c. - Edit your README file to include:

1. A brief summary of what you accomplished and what (if any) parts you did not complete or bugs that you are aware of but have not fixed.

2. Analysis (including equations for number of moves, swaps
and compares as a function of *n*) for the best-, average- and worst-case
behaviors of *InsertionSort *and *MergeSort*,

** **

**4. Introduction**

This
lab revisits *sorting *algorithms. You will:

- Implement and analyze two sort algorithms:
*InsertionSort**MergeSort*- Using programming conventions (or patterns) that allow the performance of your implementations of the algorithms to be measured.
- Use the make development tool to keep your project up to date.

** **

**5. Theory**

**5.1. The Insertion Sort Algorithm**

You
are required to analyze and implement another sorting strategy: *InsertionSort*.
The

*InsertionSort *algorithm applied to a deck of cards can be described as:

**Step 1:**

If there are no cards to sort, then **Stop**.

**Step 2:**

Otherwise, remove the top card from the unsorted pile, figure out where it should go in the sorted pile and insert it there.

**Step 3:**

Go back to step 1.

The important characteristics of the algorithm are:

- As it runs, it divides the deck into a sorted portion (initially empty) and an unsorted portion (initially the whole deck).
- Each time Step 2 is performed, the sorted portion has one more card and the unsorted portion has one less card.
- Step 2 is
performed
*n*times (where*n*is the number of cards in the deck). - Step 2 describes how this is done: in this case, get the first unsorted card and insert it into the proper position in the sorted part.

It
is useful to compare this algorithm with the similar *SelectionSort *algorithm.

**Step 1:**

If there are no cards to sort, then **Stop**.

**Step 2:**

Otherwise, find the smallest card, remove it and place it on top of the sorted card pile.

**Step 3:**

Go back to step 1.

It should be evident that both algorithms share some basic characteristics. Both increase the sorted portion by one each time Step 2 is performed. They differ only in the way this is done:

*SelectionSort*has to examine all the cards in the unsorted portion, remove the smallest one and then place it at the end of the sorted portion.*InsertionSort*takes the first card from the unsorted portion (i.e. it does not have to scan through the unsorted cards) and inserts it into the sorted portion (and it has to scan the sorted portion to figure out where it should go).

** **

**5.2. Implementing sort algorithms with
metrics**

The
algorithms described previously use the *deck-of-cards *metaphor. This may
be useful for human understanding of the algorithm, but is incomprehensible to
computers.

In this lab, we require that all sort algorithms be implemented in C using a function mySort that sorts an array (or a subarray) of int's. The signature for mySort() is:

void mySort(int data[], unsigned int first, unsigned int last); |

** **

Note |

The signature for mySort is in the file mySort.h. An implementation of the mySort function must modify the data array such that all elements in the range data[first], data[first+1]...data[last] are in sorted order. |

** **

**5.2.1. Example: SelectionSort implementation**

The
*SelectionSort *algorithm is described in Chapter 2 of *Introduction to
Algorithms*, exercises 2.2-2 on page 27. It is **strongly **recommended
that you review the idea of *SelectionSort*.

** **

**5.2.1.1. The initial implementation**

For
reference, the initial implementation of *SelectionSort *is shown below:

void mySort(int array[], unsigned int first, unsigned int last) { int i; /* Step 1: Is there nothing to sort? */ while (first < last) /* Step 2: Swap... */ for(i = first+1; i <= last; i++) { /* Find smallest one in rest of array */ if(array[first] > array[i])) { /Step 2..continued...swap them */ int tmp; tmp = array[first] array[first] = array[i]; array[i] = tmp; } first++; } return; |

** **

**5.2.2. Using metrics**

All
sort algorithms require that elements in the array to be sorted can be compared
with something else of the same data type. Indeed, the number of times an
algorithm needs to compare a data item with something else (of the same type)
is fundamental metric (i.e. *measure-of-performance *or *benchmark*)
that is often used to evaluate different sort algorithms.

You
are provided with a framework that counts the total number of times that data
elements are compared. Although you do not need to write (or even understand)
this *metrics *framework, you **must **respect its contract.

In order to respect the contact so that the total number of comparisons is tracked, it is forbidden to compare a data element with something else using C comparison operators (such as ==, <, >, etc.) Instead, you must use the myCompare() function.

** **

Note: Replacing “(data[j]<foo)” with myCompare() |

Consider the following code that directly compares a data element with something else: if (data[j] < foo) { The myCompare(x, y) function returns a negative number if and only if x < y. Hence, the previous code can transformed to: if (myCompare(data[i], foo) < 0) { |

In addition to comparing elements in the data array, all sort algorithms also have to change the positions of individual data elements. In order to be able to keep count of these operations, we require that elements be moved using the myCopy() or mySwap() functions.

Note: Replacing "data[i] = tmp" with myCopy() |

For example, instead of writing something like: data[i] = tmp; you would use the myCopy(): myCopy(&tmp, &data[i]); |

Note: Using mySwap() to interchange elements |

Step 2 of int tmp; tmp = array[first] array[first] = array[i]; array[i] = tmp; The effect of this code was to interchange (or swap) the elements array[first] and array[i]. The same result can be achieved with the mySwap() function: mySwap(&array[first], &array[i]); |

** **

**5.3. Using make**

Unlike previously labs where you had to type in the compile and link commands, you can perform these steps efficiently with the single command make.

** **

**6. Suggested approach**

There are several requirements in the lab. A suggested approach is outlined below.

** **

**6.1. Use make**

The
files furnished to you include *stub *implementations of mySort() and a real main() function.

By default, the command make ensures that the commands insertionSort and mergeSort exist and correspond to the most recent modifications to source code files. The command make works "right out of the box" (which you should have verified in your prelab).

Although
the "out-of-the-box" commands are compiled without error, they do not
really work! (Actually, they do work if they are asked to sort *nothing*.
Try it! (i.e. the command insertionSort < /dev/null will "sort nothing", write "nothing" to
*stdout *and write statistics to *stderr *indicating that no
comparisons, move, or swaps were needed to "sort nothing".)

** **

**6.2. Implement InsertionSort without
metrics**

As
described previously, the *InsertionSort *sort algorithm is quite similar
to *SelectionSort*.

** **

**6.3. Modify InsertionSort to include
metrics**

Once you are reasonably confident that the command insertionSort works, modify the implementation so that the metric functions are used.

The
most important measurement is the number of comparisons; start by using myCompare() instead of direct comparisons. When this works,
the output should still be sorted and the statistics (displayed to *stderr*)
will indicate the number of comparisons required.

Finally, replace data element movements with myCopy() and/or mySwap(). The statistics should now reflect the number of copies/swaps performed by the algorithm.

** **

**6.4. Complete InsertionSort theoretical
analysis**

You
should now attempt to complete a theoretical analysis of the number of
compares/copies/swaps that your implementation performs for worst-, best- and
average-case inputs of *n*. Ideally, you should develop an equation (a
function of *n*) that gives these metrics.

**6.5. Implement and analyze MergeSort**

You
should aim to complete the implementation and analysis of *InsertionSort *in
the first week of the lab.

* *

*MergeSort *is
a bit trickier to implement and analyze. Note, in particular, that the
algorithm itself is actually *very easy *to implement in C (although, of
course, it does use recursion). The tricky part is the *merge *sub-algorithm.

Although
the *merge *is not itself recursive, it does require at least one
temporary array and it does require a fair amount of copying of elements to an from
the temporary. Note, however, that it is **not **necessary to use dynamic
memory allocation functions (such as malloc()); you
can simply declare a local variable for your temporary array as (for example):

int temp[MAX_SIZE_N_TO_SORT}; |

1. Go to your **coe428** directory

2. Zip your **lab3** directory by using the following
command:

** zip -r ****lab3.zip lab3/**

3. Submit the lab3.zip file using the following command:

** submit coe428 ****lab3**
**lab3.zip**

by Ken Clowes, revised by Ivan Lee, revised by Olivia Das