OR Insight Vol 10 Issue 4 October - December 1997
Scheduling small groups examinations -
Minimising resources required by optimal clustering of student groups Omar Ben-Ayed
Tb is a rtic!e prese; its a;; ii iteger programm ¡ng model that
n'as built to minimise cost-resources associated with the loc,istics of arganising college examinations featuring large numbers of class sections with sinai! numbers of stude; ifs in each. The problem being inherently NP-hard, its initial for;nuiat ion problem reveals a great deal of computational intractability, because of large ;nimbers of binary variables and costraints. To overcome this difficult i, a facility layout method is used to decompose the initial problem info smaller independent su bp roh! eins. An iterative heuristic approach is later designed to solve each et the su bprobl ems,
an examination during the so-called 'second session examinations' as a section. The number of sections may be as large as the number of all courses taught
by the college during the two semesters of the
academic year. The number of students per section, on the other hand, may be as small as one (depending on the number of students with marginal performances). A student is said to be registered in this examination round if he/she belongs to at least one section. Data from the college reveal that the major-
ity of students registered are in two or three sections. It is unusual to have a student belonging to more than five sections.
-ooOoo-
The problem The College of Business and Economics at the
University of Sfax in Tunisia organises at the end of each academic year what is called 'the second session examinations round'. This examination round
is effectively about giving a second chance to all students with marginal performances during the academic year. That is, if the grade of a certain student in a given course during one semester of the
year is neither good enough nor poor enough to warrant a clear-cut pass or a clear-cut fail, respectively, the student is then invited to sit a comprehensive examination in this round of examinations, for which the grade is decisive for that course.
The objective of the college is to minimise both the number of exam-days needed to process all sections
and the number of faculty members needed for administrating these examinations. Every exam-day
is divided into a given number of exam-periods (typically two to four). This number is decided by the College at the begining of the academic year. Here, the sections are put into groups that can be scheduled on the same exam-period of the same exam-day. We shall refer to such groups as clusters.
Obviously, by minimising the number of clusters, the number of different exam-periods is also reduced, and so is the number of exam-days (first objective). Also, by assigning the appropriate rooms
to each cluster (thereby assigning any appropriate
number of sections to every single room), the number of faculty members invigilating can be diminished (second objective).
The analyst at the computer center of the College, who is usually in charge of scheduling these exami-
nations and a former student of OR himself, inqui red once whether an optimisa tien technique could be designed to ease his task. This inquiry
prompted an interest in investigating this matter. In
the process, it was apparent that the problem required more than the simple application of one particular OR technique. So, an algorithm based on the specific charateristics of the problem was con-
cci ved using several OR techniques. This article describes the problem and the solution in what can be considered as a successful application of OR.
In the article, we shall refer to ans' group of students in a specific course who end-up being invited to take Copvi'it'hí (t) 1997 Operatiniwl Research So ('i('ÍV.
The formulation Given that the total number of students registered for a given examination is relatively small as compared to the capacity of the rooms in the college, no
capacity constraint is needed for this problem. However, two important constraints should be
taken into account in this problem. One is a phvsical constraint, ie no student can sit more than one examination in one single exam-period. That is, no student can belong to more than One section of the same cluster. The second contraint is of an operational nature. it is the condition that no examina-
tion can be given more than once in a given round of
28
OR Insight Vol /0 Issue 4 October - December 1997
Decomposing the problem
examinations. In other words, no section can belong to Im rc than one du ster.
When we have no value for q other than in, the integer programming formulation (1 )-(4) would include 1n2+in binary variables and inn+in con-
Let us denote ii as the number of all students regis-
tered for the exa intion round, and in as the
number of all sections. We are therefore given an n by ni binary matrix A whose entry a, is equal to 1 if the student ¡ belongs to the section j, and is equal to O other ise. A cluster (which is a group of sections) can be viewed as a group of columns. A necessary condition for any two columns to belong to the same
straints. Typical values for in and ii are 250 and 600, respectively. Our problem is therefore an NP-hard with effectively more than 60,000 binary variables
and more than 150,00() constraints. This surely
makes this problem intractable from a computational point of view. A first step towards handling this
cluster is that they must never have values of I
difficulty involves decomposing the problem into small subproblems. Such a decomposition is possi-
simultaneously in any of their rows (by the virtue of the physical constraint).
ble by the very nature of the problem. Indeed, generally no student is involved in all courses anyway, (for example, freshman students would not
Our objective is to minimise the number of clusters, which is bounded by a number that we denote by q;
take senior level courses). From this perspective, we need to come-up with independent supplementary' families of sections meeting the requirement tha t no student belong to more than one family.
the least optimal value we can give to q is the number m of sections (ie assigning every single section to one cluster). We refer to xk, j varying
from 1 to ni and k from 1 to q as a binary variable
Every such family would form a subproblem in our decomposition.
kth cluster and the value of O otherwise. We also refer to if, k varying from 1 to q, as another binary
The problem of finding independent families of
that takes the value of 1 if the section j belongs to the
sections to proceed with the decomposition is very
variable that takes the value of 1 if at least one
similar to the cell formation problem in facility layouts. The production flow analysis method
section belongs to the kth cluster and the value of ()
otherwise. In other words, / is equal to I if and
(Meredith, 1992), in particular, can be appropriately applied in our problem. The essence of this method is to determine the machine-component matrix and
only if there exists at least one value of j for which
XK is equal to 1; and if is equal to O if and only if x's aiie equal to I) for all values of j. Accordingly, oûr problem can be formulated as:
later on identify those components with common machine requirements. The machine-component matrix is formed by listing across its top all the components and down its side all the machines; then ones (l's) are placed in the matrix wherever a component uses a machine. The objective of the method is to reorder the components and machines so that 'blocks' of ones that identify the cells are
min
st
L nf
¡=1, ii;
k=l, q
formed along the diagonal. One approach for
attempting to reorder the matrix is to move rows with ones to va rd the left to the top o f the matrix and columns with ones toward the top to the left of
/.=1, iii
the matrix. if r {O;1 };
¡=1, iii;
ksl, q
Our matrix A can he regarded as a machine-compo-
nent matrix where students replace the machines
M is a number that is large enough to be an upper
and sections replace the components. We can apply
bound for the first member of (2). One possible
the same re-ordering technique to transform the
. The inequalitvexpressions (2) represent the physical constraints when if takes the aine of 1. Also, b the virtue of these same expressions, all v equal to I) when ti' is I). No constraint is value of M is
matrix A to the form:
n
needed to make if equal to I) as long as the f's are equal to O. This has to be tru e as the if' s are seing minimised in the obiective function, which requires all of them to take the value of I) (unless a constraint e. preent hieb would make some of them take the
A1
O
O
O
A,
1)
O
O
t
...
A
p
Each A, ¡ varying from 1 to
alue of 1). The equations (3) are the operational
columns such that
contriint.
29
,i =
n and
has n rows and in1
Z
in1
= in.
OR litsight Vol 10 Issue 4 October - December 1997
The smaller the values of n(s and m1's, the more advantageous the decomposi ton is. Our approach
has been to minimise these values through the maximisation of the number p of families. In the case of the College of Business and Economics at the
University of Sfax, the average value of pis lo.
matrix that we designate as A'1. The kth iteration of the procedure is therefore about acquiring the cluster number k+l from the matrix A1. The following
integer program is applicable to a cluster of this nature. In fact, it is a cluster wïth the largest possi-
ble number of columns:
Once the problem is decomposed, every subproblem is treated independently. To solve the /th subprob1cm, we simply replace in (l)-(4), A by A1 (which is
'nl
max
really replacing in by m1, ii by n7, j by i1, j by j7, k by k7, q by q1, and M by M1):
(9)
jI=1 n?!
<1;
s.t.
min
i=1,ii1
(10)
.11 = I
k/Si
st.
X1.7 E 1f); 1}; kl ¡I
< Mt
k1=1, cf1
The physical constraints (10) are the only constraints
included in the above formulation (9)-(1 1). The operational constraints are omitted on the ground that the procedure discards all assigned sections, thereby preventing their reinclusion in any further
J7=1, m,
kISI
xi Il
¡=1, ;i;
e {0;1
;
(11)
j=1, in1
j=1, t,?; k=1, q7
cluster. This integer program includes ¡ri1 variables and ii constraints. The largest values recorded for
these two parameters have never exceeded a few hundreds and a few tens, respectively. (9)-(1 1) are
All the variables and parameters in the subproblem formulation (5)-(8) continue to be defined as in the
solvable using standard integer programming
original problem formulation (1 )-(4).
software and a Pentium microcomputer.
However, the solution provided by this iterative
An iterative method to solve large subproblems
procedure is only near-optimal to (5)-(8). For
example, consider the following student-section matrix A7:
Unfortunately, the process of solving many of these subproblcms has not been as easy as originally anticipated. in some of these cases, the subproblem has
been as large as 40 or even 50'7 of the original problem. To handle these cases, (where (5)-(8) remains hard to solve), we propose the following
1
0
0
0
0
1
0
1
1
0
1
o
iterative heuristic method.
i
1
0 0
lt is clear that column 1 is incompatible with column
3 (in row 3), and column 2 is incompatible with column 4 (in row 2). So the minimum number of
The proposed method is based on a modificaton of the objective. Instead of minimising the number of clusters, we now maximise the number of sections within each cluster. We now try to get the largest possible cluster from the matrix A1. The iterativity
clusters we can get is 2. A close observation would reveal that columns I and 2 can be included in a first
cluster and columns 3 and 4 in a second cluster. When applying our iterative procedure, we are also
consists of discarding all the columns that were
able to notice that there is a maximum of two
assigned to previous clusters from A1, and then extracting the largest possible cluster from the remaining columns. The procedure is carried-on until all columns are discarded.
columns that could be included in the first cluster. These could well be the first two columns, the last two, or columns 1 and 4. So as far as the first iteration is concerned, there are three possible solutions from which one is chosen arbitrarily. If the chosen solu tien of the proced u re is one o f the fi rst t wo possibilities (columns I and 2, or columns 3 and 4), then it must be optimal. But if the chosen solution happens to be the third possibility (columns 1 and
Let us now denote Akf as the matrix obtained after
discarding all the columns that were already assigned to the first k clusters from A1. Our original matrix A1 can thus be designated as AJ. And once we discard the columns corresponding to the first cluster from A1, we obtain a new matrix A'7. We continue the process until we end-up with a final
4), then two incompatible columns (2 and 3) are left-
out, which would result in two more clusters, (ie a
30
OR Insight Vol 10 Issue 4 October - December 1997
has the highest number of clusters deter-
total of 3).
mines the number of exam-days needed.
Based upon our experience, the solution of the itera-
The procedure was implemented by the same
tive procedure is usually satisfactory. In case a
person whose initial query prompted the interest in the present research, with some help from a student of the author. In the process, three subroutines were developed, each corresponding to one of the three
better solution is sought, the near-optimal solution could be used as a starting point to (5)-(8), giving a significantly low value for and making the resolution much less tedious.
steps of the algorithm. The first subroutine was reading the data from the binary matrix A and
Implementation
decomposing the problem into small subproblems.
Because the decomposition process did involve major changes in the order of the rows and the columns, it was necessary to create a file keeping
This article has applied some operations research techniques in solving a scheduling problem of a special type of an examination round featuring a large number of sections and a small number of students within each. The objective has been to
track of the initial order of every column and every row. This file was needed by the third subroutine. However, the main product of the first subroutine was another output file including, for each subprob-
group the sections into the smallest number of clus-
ters possible. The data of the problem is summarized in a matrix A listing across its top all of the
1cm 1, the values of the parameters n1 and rn1 as well
as the entries of A1. This file was the input for the second subroutine, which is repeated as many times
sections, and down all of the involved students. The ¡th row jth column of the matrix takes the value of I if the ith student belongs to the jth section, and the
as the number of subproblems.
The second subroutine was an interface that provid-
value of O otherwise.
ed integer programming formulations for a commercial solver (LINDO was used in this case). lt was initially decided that each subproblem had to
Theoretically, the problem can be solved using
formulation (1 )-(4). However, because of the large number of variables and constraints in the formulation, it has not been possible for this NP-hard problem to be solved. An alternative procedure has been
be formulated according to (5)-(8) if its corresponding parameters n1 and m1 were both less than loo, and according to (9)-(1l) otherwise. Given that the programmer had also been a user himself, he often took the initiative to change this value depending on
designed here to cope with the computational intractability of this problem. lt is:
his evaluation of each situation. For example, he found it more efficient to choose a higher value when the microprocessor of his workstation was upgraded from an 80486 to a Pentium. In the case
Step 1 Use the production flow analysis method to decompose the problem mto small subprob-
of formulation (5)-(8), the subroutine simply called the solver and then recorded the number of sections and columns to be included in each section. In the alternative case of formulation (9)-(l1), the subroutine was supposed to use a temporary file where the newly deleted columns were expected to be recorded at each iteration. At the end of the last iteration, the number of sections as well as the columns to be included in each section were to be recorded in the permanent output file concurrently with the clearance of the temporary file.
lems. We designate A1 as the students sections matrix of the lth subproblem.
Step 2 For each suhproblem, 1: Solve (5)-(8) if the size is sufficiently small.
Use the following iterative procedure
otherwise: 1
At each iteration k, discard all columns
assigned to previous clusters from A1, and
The third subroutine focussed on the subproblem having the highest number of sections. Where this
solve (9)-(1 1) to get a new cluster. 2
Continue until the elimination of all columns from
subproblem was initially solved according to (5)-(8), its solution would be optimal (ie no more improve-
A1.
ment could be achieved). However, a better solution ought to be sought where the solution of the subproblem was derived from the iterative proce-
If the solution obtained is satisfactory, retain it. Otherwise use it as a starting solution to
dure. In the latter case, the second subroutine
(5)-(8).
would have to be called again to solve it according
Step 3 The clusters of all subproblems can be
to (5)-(8) using its current solution as a starting solution. As soon as the minimum number of dus-
scheduled in parallel. The subproblem that
31
OR Insight UoI /0 Issue 4 Oiio/er - De'c'nther / 997
ters is obtained, the third subroutine would use the first output file of the first subroutine to provide the actual students within each section and the actual sccti( ins within each cluster.
As a result of the implementation, the process of examina hon scheduling for the so-called 'second session examinations round', which in the past used to he a genii mclv straining task for the staff of the
college's computer centre, has now been eased dramatically. The long hours that used to be re-
quired by this task have been reduced to minutes. Fewer exam-days are now needed and much less human resources are involved. But more impor-
tantly, there is also great improvement quality-wise, as no more scheduling mistakes are occuring.
Ac kn ow legmen t
The au thor r. grateful to Mr Llafe'dh Ksibi, his ex-student at the Col lege of llusi ness and Economics in Sfax, and to Mr Rachid Karaa, anal vst at the computer centre of the same College, for their efforts in implementing the procedure. The author would like also to thank Dr Ah Arifa, senior lecturer at Lincoln tjniveristv in New Zealand, for his fruitful comments on the article. For the interested reader M cred j th J R t t t2 t Tu,' Mi;ian',iir'n t of ( ) pei1itlo)i s. .'l Conceptual
I n1pí1aI". olin Wiliv & Sons, nc, New York.
Errata The article by Richard Ormerod in Issue ll)/1 entitled 'OR models a.sist the Si,ewell lt public inqu try' un fortuna teli con tarried a few errors as follows:
OMAR BEN-AYED is a 'maitre de conferences' (associate professor) it the University of Sfax in Tunisia, and is currently an associate visiting professor at King Saud UniveNity in Saudi Arabia. F le received a IFS in Quantitative Methods from the University of
l'age 2, first line of the third paragraph 'In the late n inches and
Sfax in Tunisia, an MS in Applied Mathematics and a PhD in llusiness Administration from the University of Illinois at Urha-
early....' should read 'In the late seventies and ea.rlv...
na-U hampaign in the USA. I lis work has appeared in Operations
Page 3, two thirds down the right-hand column '... which h ad been set up in t 93' should read ' ....which had been et up in
ke.earch, Com puters and Operations kesea reh, Tran sporta ti Research, Annals of Operations Research, and Revue Tunisienne d'Economie et de ( ;etion.
1975'.
l'age 7, u rider For the interested read er, 'Long k ((QQ2) should he Long t lhi2l'
2