# Sorting And Searching

Henceforth, in the context of searching and sorting problems, we will assume that the elements are drawn from a linearly ordered set, for example the set of integers.

Let A[1..n] be a sequence of n elements. Consider the problem of determining whether a given element x is in A. This problem can be rephrased as follows. Find an index j; 1 ≤ j ≤ n, such that x = A[j] if x is in A, and j = 0 otherwise. A straightforward approach is to scan the entries in A and compare each entry with x. If after j comparisons, 1 ≤ j ≤ n, the search is successful , i.e., x = A[j], j is returned; otherwise a value of 0 is returned indicating an unsuccessful search. This method is referred to as sequential search. It is also called linear search.

# Algorithm Linear Search

**Input :** An array A [1..n] of n elements of an element x.

**
Output :** j if x = A [j], 1 ≤ j ≤ n, and 0 otherwise.

- i ← 1
- while (j < n) and
- j ← j + 1
- end while
- if x = A [j] then return
*j*else return 0

**Algorithm binary search** gives a more formal description of this method.

**
Input :** An array A [1..n] of n elements sorted in nondecreasing order and an element x.

**
Output :** j if x = A [j], 1 ≤ j ≤ n, and 0 otherwise.

- low ← 1; high ← n; j ← 0
- while (low ≤ high) and (j = 0)
- mid ← [(low + high)/2]
- if x = A [mid] then j ← mid
- else if x < A [mid] then high ← mid – 1
- else low ← mid + 1
- end while
- return
*j*

**Selection Sort**

Let A[1..n] be an array of n elements. A simple and straightforward algorithm to sort the entries in A works as follows. First, we find the minimum element and store it in A[1]. Next, we find the minimum of the remaining n -1 elements and store it in A[2]. We continue this way until the second largest element is stored in A[n-1]. It is easy to see that the number of element comparisons performed by the algorithm is exactly

.

**Input :** An array A [1..n] of *n* elements.

**
Output :** A [1..n] sorted in non-decreasing order.

- for i ← 1 to n – 1
- k← i
- for j ← i + 1 to n {Find the ith smallest element}
- if A[j] < A[k] then k ← j
- end for
- if then interchange A[i] and A[k]
- end for

It is also easy to see that the number of element interchanges is between 0 and n -1. Since each interchange requires three element assignments, the number of element assignments is between 0 and 3(n - 1). The number of element comparisons performed by Algorithm selection sort is n(n-1)/2. The number of element assignments is between 0 and 3(n - 1).

Insertion Sort

This algorithm, which is shown below, works as follows. We begin with the subarray of size 1, A[1], which is already sorted. Next, A[2] is inserted before or after A[1] depending on whether it is smaller than A[1] or not. Continuing this way, in the ith iteration, A[i] is inserted in its proper position in the sorted subarray A[1::i -1]. This is done by scanning the elements from index i ¡ 1 down to 1, each time comparing A[i] with the element at the current position. In each iteration of the scan, an element is shifted one position up to a higher index. This process of scanning, performing the comparison and shifting continues until an element less than or equal to A[i] is found, or when all the sorted sequence so far is exhausted.

**
Input :** An array A [1..n] of

*n*elements.

**
Output :** A [1..n] sorted in non-decreasing order.

- for i ← 2 to n
- x ← A[i]
- j ← i – 1
- while (j > 0) and (A[j] > x)
- A [j + 1] ← A [j]
- j ← j – 1
- end while
- A [j + 1] ← x
- end for

It is easy to see that the number of element comparisons is minimum when the array is already sorted in nondecreasing order. In this case, the number of element comparisons is exactly n - 1, as each element A[i], 2 <i <n, is compared with A[i - 1] only. On the other hand, the maximum number of element comparisons occurs if the array is already sorted in decreasing order and all elements are distinct. In this case, the number of element comparisons is as each element A[i], 2 <i <n, is compared with each entry in the subarray A[1..i-1]. The number of element comparisons performed by Algorithm insertion sort is between n-1 and n(n-1)/2.

Quicksort

This sorting algorithm has an average running time of Θ(n log n). One advantage of this algorithm over Algorithm merge sort is that it sorts the elements in place, i.e., it does not need auxiliary storage for the elements to be sorted.

Partitioning algorithm

Let A[low..high] be an array of n numbers, and x = A[low]. We consider the problem of rearranging the elements in A so that all elements less than or equal to x precede x which in turn precedes all elements greater than x. After permuting the elements in the array, x will be A[w] for some w; low ≤ w ≤ high. For example, if A = , and low = 1 and high = 7, then after rearranging the elements we will have Thus, after the elements have been rearranged, w = 4. The action of rearrangement is also called splitting or partitioning around x, which is called the pivot or splitting element.

**
Input :** An array of elements A [low..high].

**
Output :** (1) A with its elements rearranged, if necessary, as described above.

(2) w, the new position of the splitting element A [low].

- i ← low
- x ← A[low]
- for j ← low + 1 to high
- if A [j] ≤ x then
- i ← i + 1
- if i then interchange A[i] and A[j]
- end if
- end for
- interchange A [low] and A[i]
- w ← i
- return A and
*w*

The number of element comparisons performed by the above algorithm is exactly n - 1. Thus, its time complexity is Θ(n).

Sorting Algorithm

In its simplest form, Algorithm quicksort can be summarized as follows.The elements A[low::high] to be sorted are rearranged using Algorithm split so that the pivot element, which is always A[low], occupies its correct position A[w], and all elements that are less than or equal to A[w] occupy the positions A[low::w - 1], while all elements that are greater than A[w] occupy the positions A[w + 1::high]. The subarrays A[low::w - 1] and A[w+1::high] are then recursively sorted to produce the entire sorted array.

The formal algorithm is shown as Algorithm quicksort.

**
Input :** An array A[1..n] of

*n*elements.

**
Output :** The elements in A sorted in non-decreasing order.

- quicksort (A, 1, n)

**Procedure **quicksort (A, low, high)

- if low < high then
- SPLIT (A [low..high], w) {w is the new position of A[low]}
- quicksort (A, low, w – 1)
- quicksort (A, w + 1, high)
- if A [j] ≤ x then
- end if

# Heapsort

Given an array A[1..n], we sort its elements in nondecreasing order efficiently as follows. First, we transform A into a heap with the property that the key of each element is the element itself, i.e., key(A[i]) = A[i], 1 <i <n. Next, since the maximum of the entries in A is now stored in A[1], we may interchange A[1] and A[n] so that A[n] is the maximum element in the array. Now, the element stored in A[1] may be smaller than the element stored in one of its children. Therefore, we use Procedure shift-down to transform A[1..n –1] into a heap. Next, we interchange A[1] with A[n –1] and adjust the array A[1..n –2] into a heap. This process of exchanging elements and adjusting heaps is repeated until the heap size becomes 1, at which point A[1] is minimum.

**
Input :** An array A[1..n] of n elements.

**
Output :** Array A sorted in non-decreasing order.

- MAKEHEAP (A)
- for j ← n down to 2
- interchange A[1] and A[j]
- SIFT-DOWN (A[1..j – 1], 1)
- end for

Algorithm heapsort sorts n elements in O(n log n) time and Θ (1) space.

Hashing

- The integer
is called the hash value of key*h(x)**x* - A hash table for a given key type consists of

– Hash function*h*

– Array (called table) of size*N* - When implementing a dictionary with a hash table, the goal is to store item
**(**at index*k, o*)*i = h(k).*

Example

- We design a hash table fora dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
- Our hash table uses an array of size
= 10,000 and the hash function*N*

= last four digits of*h*(*x*)*x*

**Hash Functions**

- A Hash function is usually specified as the composition of two functions:

– Hash code map:

C : keys → integers

Compression map:

: integers → [0,– 1]*N* - The hash code map is applied first, and the compression map is applied next on the result, i.e.,
=*h*(*x*)(*h*(*c*))*x* - The goal of the hash function is to “disperse” the keys in an apparently random way