# A Voyage to the Kernel, Day 7

7
5410

We are about to enter the core part of this segment—algorithms. An algorithm could be termed a sequence of computational steps that can transform an input into the output. Here, it should be emphasised that any such sequence cannot be called an algorithm since a wrong methodology will give an incorrect output.

Almost all the code used in this segment will be in a pseudo-code format that is akin to C code. Sometimes the algorithms are given in C itself. We have to stick to this format, as our primary intention is to meddle with the kernel.
Those who wish to have an overview of the importance of bringing in innovative algorithms and building simulations (based on them) can look up the GNU Hurd project.

As I promised earlier, this voyage will not neglect the novices. So, let’s start with a few simple things. Let’s suppose we have a set of values ranging from x1, x2 x3…………..xn. If you are asked to arrange them in ascending order, you need to write an algorithm so as to get the output as xa, xb xc…….., which satisfies the condition xa < xb < xc…….

This is a simple case of sorting, which is a common algorithm that we employ in our programs.

Some fields that rely on algorithms
Writing algorithms is not just the headache of programmers alone. There are various other fields in which we need to rely on the algorithmic approach:

• The Human Genome Project has the goal of identifying all the 100,000 genes in the human DNA. It has to determine about three billion sequences of the chemical base pairs! This is virtually impossible unless we employ effective algorithms for pattern recognition and identification.
• Today we depend a lot on electronic commerce for the purchase of any commodity. A good system should have the ability to keep information (such as credit card numbers, passwords and bank statements) secure and encrypted. Algorithms are used for these cryptographic processes.
• Other branches like physical science (see the box on ‘Simulation building in physical science) require the use of algorithms for solving complex problems.
• Even a manufacturing industry (or any other commercial setting) needs algorithms for allocating scarce resources.

We just looked at an instance of the sorting problem. This may become more complex depending on the number of items to be sorted, the extent to which the items are to be sorted, the current state (sorting) of the elements, possible restrictions on the items, and even on the kind of storage device used.

Hence, while dealing with algorithms, we need to consider the data structures employed with which we can manipulate the way to store and organise data in order to facilitate access and modifications as per our requirement.

Finding the shortest route is a kind of sorting algorithm. Consider a trucking company with a central warehouse. Each day, it loads up a truck at the central warehouse and then sends it around to several locations to deliver the products. At the end of each day, the truck should return to the central warehouse so that it is ready to be loaded for the next day. To find out the lowest operating cost, the company needs an algorithm that will indicate a specific order of delivery stops such that the truck travels the lowest overall distance.

If you have enough data in your hands, you can write down an algorithm. Now, let’s look at how to write such an algorithm. For the sake of simplicity, let’s replace our problem with a simple one.

Consider five cards of clubs—2, 4, 5, 7 and 10—which are placed randomly (just like we had random stops). If you were to play the game, you would have arranged the cards as shown in Figure 1.

But what if a computer had to play your role? How would it arrange the cards? The answer is quite simple: by employing the sorting algorithm. The following code elucidates the algorithm:

```INSERTION-SORT(X)
for j ← 2 to length[X]
do key-select ← X[j]
Insert X[j] into  series X[1,.. [j-1]]

i ← j - 1
while i > 0 and X[i] > key-select
do X[i + 1] ← X[i]
i ← i - 1

X[i + 1] ← key-select```

The pseudo code carries elements that are quite akin to those in C, and the alphabets used for each iteration are conventional ones. The character ‘j’ indicates the ‘current card’ that is picked up and the ‘▹’ sign symbolises that the remainder of the line is a comment. The numbers that we wish to sort are represented as the key-select. By looking at the algorithm, you can see that the parameters are passed to a procedure by value.

As the algorithm is quite simple, it is self-explanatory. Now try to expand the same algorithm to another problem shown in Figure 2. If you have understood the first one, this will be quite simple, except that you need to bring in some additional rules as there are cards from different families and two cards here have the same value (of hearts and diamonds).

Analysing an algorithm has come to mean predicting the resources that the algorithm requires so as to get the desired output. It considers aspects like memory allocation, communication bandwidth, computer hardware, etc. We also need to take into account parameters like computational time when it comes to the practical side. These parameters may, in turn, depend on input size and the running time.

## Methodology: the divide-and-conquer approach

This is an effective methodology that we can adopt when we design algorithms. It involves:

• Divide: Divide the given sequence (with n elements) into two sub-sequences (of n/2 elements each)
• Conquer: Sort the two new sub-sequences recursively using the merge sort algorithm.
• Combine: Merge the two sorted sub-sequences to produce the desired result.

The merge mode is illustrated below:

```MERGE(A, p, q, r)

MERGE-SORT ( A, p, r)
if p < r                              To check for base case
then q < ( p + r)/2                For dividing
MERGE-SORT ( A, p, q)      Conquering
MERGE-SORT ( A, q + 1, r)  Conquering
MERGE ( A, p, q, r)          Combining```

The above pseudo code may not enlighten novice programmers, who can look at the expanded code given below and then use the above example to assimilate the core idea:

```MERGE ( A, p, q, r)
s1 ← q − p + 1
s2 ← r − q
make arrays L[1 .... s1 + 1] and R[1 ..... s2 + 1]
for i ← 1 to s1
do L[i] ← A[ p + i − 1]
for j ← 1 to s2
do R[ j ] ← A[q + j ]
L[s1 + 1] ← ∞
R[s2 + 1] ← ∞
i ←1
j ←1
for k ← p to r
do if L[i] ≤ R[ j ]
then A[k] ← L[i]
i ←i +1
else A[k] ← R[ j ]
j ← j +1```

Let’s look at an example to comprehend the idea better. I found the following problem in a book that deals with problem solving. A part of the problem involves arranging an array of values [5, 2, 4, 7, 1, 3, 2, 6]. This will be our initial (given) array. Now, we can use our methodology to solve the problem. Figure 3 shows the consecutive division and merging processes.

We can imagine the process in many different ways. If the set is random, then we could have another array with the same elements in a different order. The solution is shown in Figures 4, 5 and 6. Here, Figures 4 and 5 represent the initial steps and Figure 6 represents the final output. It should be noted that the intermediate stages are not shown here, as they are similar steps that can be envisaged in one’s mind. The step-by-step procedure will also throw ample light on the subs L and R that are created.

Once you are done with simple algorithms, you can apply the same ‘black box representation’ idea (that you employ while writing programs) and use the simple algorithms as sub-routine calls in your main algorithm (depending on the model you opt for).

Now we can look at another simple algorithm that can handle errors. The following code shows the multiplication of matrices:

```MATRIX-MULTIPLICATION(A, B)
if columns[A] ≠ rows[B]
then error “This operation is not allowed”
else for i ← 1 to rows[A]
do for j ← 1 to columns[B]
do C[i, j] ← 0
for p ← 1 to columns[A]
do C[i, j] ← C[i, j] + A[i, k] · B[k, j]
return C```

Before we move on to the complex stuff, you can test yourself by considering the problem in the following table. You are required to find out the maximum value of n that corresponds to each of the stipulated times. The data provided along with it is that you have f(n) in milliseconds. Try solving it!

 1 Sec 1 Min 1 Hr 1 Day 1 Month 1 Year n! n2 ln (n) log (n) n x ln (n) 2n

We can see that the style remains the same when we try to write algorithms in C. If we need to compare functions for a pointer to an integer, we can have the following:

```#include “compare.h”
int int_is_equal(void *vplace1, void *vplace2)
{
int *place1;
int *place2;

place1 = (int *) vplace1;
place2 = (int *) vplace2;

return *place1 == *place2;
}

int int_is_compare(void *vplace1, void *vplace2)
{
int *place1;
int *place2;

place1 = (int *) vplace1;
place2 = (int *) vplace2;

if (*place1 < *place2) {
return -1;
} else if (*place1 > *place2) {
return 1;
} else {
return 0;
}
}```

You may find a reference to a header file in the program. So you need compare.h along with it to get the desired result:

```#ifndef ALGORITHM_COMPARE_INT_H
#define ALGORITHM_COMPARE_INT_H

#ifdef __cplusplus
extern “C” {
#endif

int int_is_equal(void *place1, void *place2);

int int_is_compare(void *place1, void *place2);

#ifdef __cplusplus
}
#endif

#endif```

There could be problems that belong to the NP-complete category. These problems may not have an exact solution. We will be discussing these aspects once we are done with topics like asymptotic notations and complex algorithms.

## Simulation building in physical science

Solid-state physics largely employs simulation techniques for modelling. By taking data from experiments, crystal lattice structures can be constructed easily.

Figure 7 shows one such three-dimensional array of lattice points. The properties of the crystal structure can be inferred by building them. Simulations will give us a clear picture of the crystal properties by considering its primitives.

So is the case with complex bodies. It is found that Jupiter is accompanied, in its orbit, by two groups of asteroids that precede it and follow it at an angular distance of π/3 (see Figure 8). By building simulations we can show that these are positions of stable equilibrium. A Runge-Kutta procedure with automatic step control can be used for analysing the data from Table 2.

Data for analysing a Runge-Kutta procedure with automatic step control
Sun Jupiter Trojan 1 Trojan 2
Mass 1 0.001 0 0
x 0 0 -4.50333 4.50333
y 0 5.2 2.6 2.6
z 0 0 0 0
Vx 0 -2.75674 -1.37837 -1.37837
Vy 0 0 -2.38741 2.38741
Vz 0 0 0 0

Using the computational procedure we can get a prediction as shown in Figure 9.

You can also try solving simple problems of the following form:

The Runge-Kutta procedure is quite sufficient to handle these types of problems, provided you have enough data.