
Introduction

System administration is the branch of computer science which deals with the planning, configuration and maintenance of computer systems. It is presently a discipline founded mainly on the anecdotal experiences of system managers[1]. To date, no formal (mathematical) analysis of system administration has been undertaken with the aim of placing its study on a more scientific footing. This makes it difficult to express objective truths about the field which are free of the marketing assertions and vested interests of companies and individuals that are common in the commercial sector.

The aim of the present work is to establish a formal basis for the field: a framework for objective discussion of computer management. It will hopefully serve as a bridge between mathematical disciplines and system administrators. In this respect, the paper may be viewed mainly as a commentary which lays some foundations for future work, rather than one which provides immediate solutions.

In previous work, it has been shown how the average behaviour of systems of computers and users can be approximated by a blend of statistical models and thermodynamical ideas[2]. That work allows us to form a mathematical model of computer systems which can be used as a basis for modelling system administration. The study of computer behaviour has much in common with the physics of thermodynamics. From a coarse mathematical viewpoint, system administration can be viewed in much the same way as thermodynamical experiments with pistons and engines, i.e. as moving information and resources around in such a way as to change the state of the system. However, this viewpoint is mainly useful in a calculational setting. System administration also has much in common with medicine. In many ways, it is medical science for computers: a somewhat simpler problem than that of human physiology, but one which nonetheless involves many of the same themes, namely nutrition, regulation, immunity and repair.

What, then, should a theory of system administration be about? The task of elucidating this sounds straightforward, but it is a slippery business. System administration, in reality, is based mainly on qualitative, high-level concepts, which mix technical and sociological issues at many levels. Although it is clear to system administrators that there is a body of technical principles involved in the discipline, that body remains somewhat intangible from the viewpoint of a scientist. It is hard to find anything of general, reproducible value on which to base a more quantitative theory.

One of the obstacles to formulating such a theory is the complexity of interaction between humans and computers. There are many variables in a computer system, and they are controlled at distributed locations. Computer systems are complex in the sense of having many embedded causal relationships and controlling parameters. Computer behaviour is strongly affected by human social behaviour, which is often unpredictable. The task of identifying and completely specifying the ideal state of such a system is therefore a non-trivial one. It is nonetheless this task which this paper attempts to address. Can one formulate a quantitative theory of system administration which is general enough to be widely applicable, but specific enough to admit analysis?

If this already significant problem can be adequately addressed, one might then aim to look further towards general regulatory systems and approach more ambitious questions. It is not difficult to see many analogous questions in other areas of science which could be applicable to system administration. For instance: what is the effectiveness of generalized immunity and repair systems[3] (automatic repair and regulation)? Is there an optimal strategy for error detection and correction? Is a system administrator's human mind (playing the role of doctor/surgeon) better or worse than a mechanistic response or immune system? This last point is often a bone of contention in the system administration community. Should tasks be automated? Or should a human lawgiver always remain in manual control? Which is more efficient? Biological systems point to the need for both types of management: at any given moment, a doctor's intelligence and superior human cognition can compensate for a lack of adaptation in our programmed immune responses, but the automatic immune response is both faster and more capable than a doctor when its program is sufficient. Certainly the empirical evidence in biological information systems is compelling: after billions of years of evolution, nature has established immune systems in all vertebrates larger than a tadpole. Of course, this is no indication that the solution is optimal. No acceptable analysis has yet demonstrated this. It could be that vertebrate evolution is merely poised on some plateau between minima of much deeper importance.

The aim of this paper, then, is to elevate system administration from an expression of subjective opinion to a more objective, scientific level, hopefully without inflating it meaninglessly into pseudo-science or philosophy. In order to limit the length of this paper, solutions of the models and constraints will be kept to a minimum here. However, it will be possible to draw a few general conclusions, even without reference to specific models.

The outline of this paper is as follows. To begin the discussion, it is necessary to establish some basic axioms. It is important to restrict the scope of what a theory of system administration encapsulates; without such a restriction, one ends up with either many disjointed pieces or only vague, hand-waving notions. Having determined the ground rules, it is then appropriate to identify the basic operations which can be carried out within that scope. This identification is required in order to formulate a discussion of strategies for system management. Once this level of formality has been attained, strategies can be formulated based on types of action and timing, and the task of administering a computer system can be described in precise game-theoretical terms. This is the primary goal of this work.


Mark Burgess
2000-03-24