The Analytical Hierarchy Process (AHP) is a method for picking one option from a list of choices. Each choice is described by a few parameters; we assign a weight to each parameter and AHP picks the choice that best matches those weights. The following example shows how this works.
Analytical Hierarchy Process for College Selection
The strength of AHP is that it takes many parameters into account across many alternatives and returns the alternative that best matches them. We use it in our example of college selection. Suppose you are a student deciding which college to attend. Given the range of colleges and of personal preferences, it becomes very hard to settle on the right choice. In this example I have picked 8 colleges with 8 parameters each. A student therefore chooses among 8 colleges and assigns a weight (level of importance) to each parameter (e.g. tuition, social life...).
The following two tables list the preferences and the colleges:
We will use AHP to decide which college a student should pick. To make the example more interesting I have simulated AHP for 1000 students. The simulations were coded in MATLAB (see Appendix B for the source code).
Data Collection for Preferences
The preference matrices for the colleges were built from rankings and ratings found on a website. For the most part we are interested not in the data itself but in the results we can extract from it.
Note: the data used in this example is obsolete
The collected data is not in the format needed for AHP calculations. The following is the data extracted for the comparison matrices:
The first pass converts the letter grades (A, B, C…) into numbers. For this a grade key was created, giving A the maximum score and C+ the minimum score, since the data contains no grade higher than A and none lower than C+. The idea behind the mapping is to spread the scores evenly across the grades, as in the following table:
Applying the above table gives a numeric matrix for the school-parameter pairs, shown below:
The purpose of this matrix is to combine it pairwise with a student’s preferences. The only matrix a student has to provide is the pair-wise matrix comparing all the parameters with each other. The student does not have to provide a matrix comparing the schools on each parameter; those matrices are calculated in the following steps. There are 8 of them, one per parameter (tuition, acceptance, salary, education, etc.), and each compares all the schools with regard to that parameter.
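As a rough illustration of the grade-key idea, the sketch below spreads scores evenly between the lowest and highest grade. The original work was done in MATLAB; Python is used here only for illustration. The list of six grades, the 1 to 6 score range, and the names `GRADES` and `build_grade_key` are assumptions made for this example, not values taken from the original grade key.

```python
# Assumed grade scale, lowest to highest. The original grade-key table is not
# reproduced in the text, so these grades and the 1..6 range are hypothetical.
GRADES = ["C+", "B-", "B", "B+", "A-", "A"]

def build_grade_key(grades, low=1.0, high=6.0):
    """Map each grade to a score, spacing the scores evenly from low to high."""
    step = (high - low) / (len(grades) - 1)
    return {grade: low + i * step for i, grade in enumerate(grades)}

grade_key = build_grade_key(GRADES)

# Convert one (made-up) row of letter grades into numbers.
row = ["A", "B+", "C+"]
numeric_row = [grade_key[g] for g in row]
```

With a different score range only `low` and `high` change; the even spacing between adjacent grades is what the mapping is meant to preserve.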
To find these matrices we first need to normalize the values from the previous matrix. The normalization factor used is 1000. Such a large factor is used because, when large raw values are normalized with a small factor, the small differences between them all but disappear. This can be seen in the following table:
In the table above, the tuition values are very close to each other. This is because tuition differs little in the original table; the large normalization factor of 1000 preserves those small differences.
Ideally each column should total 1000, but in many cases it does not. This is due to decimal rounding and is acceptable here: because the values are large, the error has very little effect on our calculations.
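The normalization can be sketched as follows, assuming that each column is scaled so that its entries sum to the factor of 1000, which is what the remark about column totals suggests. The tuition figures and the function name `normalize_column` are hypothetical and only serve to show the rounding effect.

```python
def normalize_column(values, factor=1000):
    """Scale a column so that its entries sum to `factor`."""
    total = sum(values)
    return [v * factor / total for v in values]

# Hypothetical tuition figures for three colleges (not the original data).
tuition = [30000, 31000, 29000]

normalized = normalize_column(tuition)

# Rounding to two decimals, as a printed table would, makes the column total
# drift slightly away from 1000.
rounded = [round(v, 2) for v in normalized]
rounded_total = sum(rounded)  # about 999.99 rather than exactly 1000
```

The unrounded column still sums to 1000 up to floating-point error; only the rounded, tabulated values fall short.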
Pair-wise Matrix Creation
The eight different pair-wise comparison matrices were calculated using the following scheme:
- Fix the parameter value for one of the colleges
- Use the value from the previous step to find the comparison ratios
Each column of a matrix is tied to a single college, whose value is fixed. This fixed value is divided by each of the values in the corresponding column of the previous matrix. The following matrix illustrates the method:
The table above is computed by the procedure just described. For the first column, the value in the [Brown x Tuition] cell of Table 7 is fixed and divided by each row value in the ‘Tuition’ column. For the second column of Table 8, the [Columbia x Tuition] cell of Table 7 is fixed and divided by each row value in the ‘Tuition’ column. This procedure is repeated to populate all the columns of the table.
This table is very useful. It is not intuitive at first, but a closer look reveals a simple pattern. In Table 8, the cell [Brown x Brown] can be read as: “For tuition, I will choose Brown over Brown with a value of 1.” This seems odd because we are choosing an option over itself, but it carries a useful insight: a value of 1 means that both options have the same effect, so we can choose either of them.
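The column-by-column procedure can be sketched as below. The layout (the entry in a given row and column is the column college’s fixed value divided by the row college’s value) follows the description above. The normalized values and the name `pairwise_matrix` are hypothetical; Python is used for illustration only.

```python
def pairwise_matrix(values):
    """Build a pair-wise comparison matrix for one parameter.

    matrix[row][col] = values[col] / values[row], i.e. the value fixed for the
    column's college divided by the value of the row's college.
    """
    return [[col_value / row_value for col_value in values]
            for row_value in values]

# Hypothetical normalized tuition values for three colleges (not Table 7 data).
normalized_tuition = [120.0, 125.0, 130.0]

matrix = pairwise_matrix(normalized_tuition)

# Each diagonal entry compares a college with itself and is therefore 1;
# matrix[r][c] and matrix[c][r] are reciprocals of each other.
```

Repeating this for each of the 8 parameters yields the 8 comparison matrices.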
Now look at the cell [Harvard x Brown]: “I would choose Harvard over Brown with a value of 0.993182.” From our normalization we know that a value less than 1 means the column option is more desirable than the row option, so we will take Brown over Harvard in terms of tuition. Table 6 shows why: Brown’s tuition is lower than Harvard’s.
Another useful reading comes from two cells: [Brown x Columbia] and [Brown x Cornell]. The value in the first cell is greater than the value in the second, which means we prefer Brown over Columbia more strongly than we prefer Brown over Cornell. The other pair-wise comparison matrices are listed in Appendix A.
The whole AHP is implemented in MATLAB. The program simulates the selection process for ‘n’ prospective students and reports the result. The following simulations were run.
i. Creating the students’ preferences using uniform distribution
This simulation does not give very interesting results: all the students select Brown University.
The reason is that Brown scores very high on almost all the preferences in our data. So if the students’ preference weights are drawn uniformly, Brown wins by a big margin. The following is the matrix for one of the randomly generated students:
For this simulation there were 999 more students generated with the same settings. The following table shows the final score for 5 of the students.
The above table shows that the students selected Brown University out of the 8 options. This follows from the data that we extracted and formulated.
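A minimal sketch of such a simulation is given below. It is not the MATLAB program from Appendix B. It assumes that each student’s weights are drawn uniformly on a 1 to 9 scale, that priorities are approximated by normalizing columns and averaging rows, and that a college’s final score is the weighted sum of its per-parameter priorities. The toy priorities and all function names are hypothetical.

```python
import random

def priorities(matrix):
    """Approximate AHP priority vector: normalize each column, average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [sum(matrix[r][c] / col_sums[c] for c in range(n)) / n
            for r in range(n)]

def random_student_matrix(n_params, rng):
    """Pair-wise preference matrix built from uniformly drawn weights.

    The 1..9 range is an assumption; entry [r][c] is weight r over weight c.
    """
    w = [rng.uniform(1, 9) for _ in range(n_params)]
    return [[w[r] / w[c] for c in range(n_params)] for r in range(n_params)]

def simulate(college_priorities, n_students, seed=0):
    """Count how many simulated students pick each college.

    college_priorities[p][k] is the priority of college k under parameter p.
    """
    rng = random.Random(seed)
    n_params = len(college_priorities)
    n_colleges = len(college_priorities[0])
    wins = [0] * n_colleges
    for _ in range(n_students):
        weights = priorities(random_student_matrix(n_params, rng))
        scores = [sum(weights[p] * college_priorities[p][k]
                      for p in range(n_params))
                  for k in range(n_colleges)]
        wins[scores.index(max(scores))] += 1
    return wins

# Hypothetical priorities (3 parameters x 3 colleges), not the original data.
# College 0 leads on every parameter, so it wins for any positive weights.
toy = [[0.50, 0.30, 0.20],
       [0.40, 0.35, 0.25],
       [0.45, 0.30, 0.25]]

wins = simulate(toy, n_students=1000)
```

Because the first toy college leads on every parameter, every simulated student selects it regardless of the weights drawn, which mirrors the all-Brown outcome described above.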
ii. Creating the students’ preferences giving more weight to Salary
In this simulation we give more weight to Salary. The reason for choosing salary is that Table 10 shows U. Penn to be the closest competitor to Brown University, and the pair-wise comparison matrices show that U. Penn has a large advantage on starting salary. We therefore tilt the students’ preferences towards salary. This simulation has much more interesting results.
With more preference given to Starting Salary, more than half of the students choose U. Penn. We saw in Table 10 that the competition between Brown and U. Penn was very close, so we might expect that a bias towards U. Penn on one parameter would make U. Penn the winner for every student. The important thing to note, however, is that on all the other parameters Brown is equal or ahead, which has a large effect on the students’ decisions. Another important result of this simulation is that some students select Dartmouth, which is not a big surprise: Table 10 showed that the second closest competitor to Brown was Dartmouth, and Dartmouth also scores high on Starting Salary. The following are the pair-wise matrices of two students: the first student selected Brown and the second selected U. Penn.
Comparing the two tables without going into much depth shows why the second student selected U. Penn: the average of the salary row is 3.40 for student 1 and 4.33 for student 2. Since student 2 treats salary as more important, student 2 gets U. Penn and student 1 gets Brown University. The following table shows the final scores of the students with respect to the universities.
iii. Creating the students’ preferences with bias to many parameters
This simulation assumes that the student is more interested in starting salary, near-by city, going-well and faculty accessibility, and places less importance on tuition, acceptance rate, education and social life. Running the simulations in this setting gives us the following distribution:
We see more students selecting Dartmouth in this setup.