Knowledge Elicitation for Ordering of Design Steps

Knowledge Elicitation for Design Task Sequencing Knowledge

Master's Thesis Proposal

Janet E. Burge

Advisors:
David Brown
Eva Hudlicka (Adjunct)

Computer Science Department
Worcester Polytechnic Institute

April 23, 1998

Introduction

Knowledge Elicitation (KE) is the process of obtaining knowledge from a domain expert that describes how they perform a specific task and/or describes what general knowledge they have about the domain. More specifically, KE refers to obtaining knowledge from a person in order to transfer it to a computer program [McGraw & Harbison-Briggs, 1989]. For example, in order to build an expert system to perform medical diagnosis, physicians would be interviewed by a "Knowledge Engineer" to determine what symptoms they look for. The accessibility of this knowledge is dependent on the type of task/knowledge and the subject being questioned. If the task is one that primarily requires motor skills, the chances of the task being performed 'automatically' are much higher than for a task that involves analysis of a problem. Also, different subjects vary in how good they are at articulating their knowledge. This is affected both by the skill level of the subject and by how well they are able to verbalize their decisions. The higher the skill level, the more likely it is that the subject will be performing all or parts of the task automatically and will not be able to explain every action.

Different KE techniques have been developed in order to obtain knowledge [Boose, 1989], [Cordingley, 1989]. They can be classified along many dimensions. The most commonly used is "direct" versus "indirect," where direct techniques are used to obtain information that is easily verbalized and indirect techniques are used to obtain information that is not easily verbalized [Hudlicka, 1997]. Classification can also be based on the type of interaction with the subject and the type of knowledge most commonly obtained (sequencing, classification, etc.) [Burge, 1998].

This thesis is concerned with the knowledge required to perform design [Brown, 1993]. For design, different types of knowledge are required at different stages of the design process [Smithers, 1998]. Knowledge is required when creating or revising requirements, creating a problem statement, creating one or more solutions to the problem, and analyzing the results. Design plans [Chandrasekaran, 1990], which specify the actions taken to produce the design, are used to create solutions to the design problem. These plans require knowing the sequence in which the actions should be taken. The sequence of actions can depend on many factors, including dependencies between subproblems and designer preferences.

Sequencing knowledge can be obtained by using a direct technique such as interviewing. One drawback, however, is that a designer may not be aware of the order in which they perform the steps of their design or why the order is important. Indirect techniques are effective at getting information that is less easily expressed, but are better suited to obtaining information about classes and attributes, not knowledge about process. In this thesis proposal, a combination of direct and indirect techniques is proposed to overcome this limitation and more effectively elicit the information required to determine the sequencing of design subproblems.

Related Work

Knowledge Elicitation Techniques

Much work has been done to classify knowledge elicitation techniques. In [Cordingley, 1989], KE techniques were grouped into twelve categories. The primary means of classification was the type of interaction with the subject. In [Geiwitz et al., 1990], an expert system called KATALYST was proposed for selecting the most appropriate knowledge acquisition technique (KAT) for a specific problem. Both of these papers discussed using a single technique at a time.

Complementary techniques are often combined in order to achieve better results. Thordsen [1991] compared the Critical Decision Method and Concept Mapping and concluded that the two could be used very effectively together.

Knowledge Elicitation Systems

Performing knowledge elicitation manually has several drawbacks. It is time-consuming for the knowledge engineer and produces large amounts of data that must be analyzed later. It can also be difficult to remain consistent from session to session and from subject to subject. To overcome these problems, automated KE tools [Boose, 1989] have been developed. The following list gives a small subset of these tools:

Protégé, developed to generate domain-specific KE tools [Munsen, 1998].
DNA, used to generate domain-specific KE tools for creating computer-based tutoring systems [Shute, 1998].
VIEW, a general KE system currently under development [Zacharias et al., 1995]. This tool will initially be used to conduct computer-assisted KE sessions to obtain data about military decision-making. VIEW will both generate and administer the KE sessions.

Design Knowledge

In Smithers' model of engineering design [Smithers, 1998], the following types of design knowledge are described:

Requirements formation knowledge and requirements revision knowledge: these types of knowledge are required to write the system requirements.

Problem synthesis and specification knowledge, and problem revision knowledge: these types of knowledge are required to formulate a problem statement.

Problem solving knowledge: this type of knowledge is required to generate proposed solutions to the problem.

Problem analysis, assessment, and evaluation knowledge: this type of knowledge is required to analyze the design solutions.

In addition, design presentation knowledge is needed to keep the customer informed about the progress of each step, and design documentation and rationale recovery knowledge is required to document the design.

A similar model is used in the CommonKADS Library [Bernaras, 1993]. Design starts with the customer's needs and desires and then proceeds through an analysis phase, which produces both the initial requirements and an initial problem statement. A synthesis phase then produces a design solution.

Design Plans

The sequence of actions taken to produce a design is known as a design plan [Chandrasekaran, 1990]. These plans decompose the design problems into subproblems. If dependencies between subproblems exist, the subproblems will need to be solved in a particular order so that backtracking can be avoided. A useful heuristic is to solve heavily constrained subproblems first to reduce the solution space [Liu & Brown, 1994].
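The ordering idea above can be illustrated with a minimal Python sketch. It is not the method of any cited system: it simply places each subproblem after the subproblems it depends on and, among those that are ready, picks the most heavily constrained one first. The function name order_subproblems, the numeric constraint scores, and the example subproblem names are all hypothetical.

```python
import heapq


def order_subproblems(dependencies, constraint_score):
    """Return an order in which each subproblem follows its prerequisites.

    dependencies: dict mapping subproblem -> set of subproblems it depends on.
    constraint_score: dict mapping subproblem -> number (higher = more constrained).
    Both structures and the scoring scale are assumptions made for illustration.
    """
    remaining = {s: set(deps) for s, deps in dependencies.items()}
    # Min-heap keyed on the negated score, so the most constrained
    # subproblem that is ready to be solved is popped first.
    ready = [(-constraint_score[s], s) for s, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, current = heapq.heappop(ready)
        order.append(current)
        del remaining[current]
        for s, deps in remaining.items():
            if current in deps:
                deps.remove(current)
                if not deps:
                    heapq.heappush(ready, (-constraint_score[s], s))
    if remaining:
        # Anything left over is part of (or blocked by) a dependency cycle.
        raise ValueError("unresolvable dependencies among: " + ", ".join(sorted(remaining)))
    return order


# Hypothetical example data, not taken from any design domain in the proposal.
example_dependencies = {
    "frame": set(),
    "motor": {"frame"},
    "gearbox": {"motor"},
    "housing": {"frame"},
}
example_scores = {"frame": 1, "motor": 3, "gearbox": 2, "housing": 1}
example_order = order_subproblems(example_dependencies, example_scores)
```

In this sketch the dependencies act as hard ordering requirements, while the constraint score only breaks ties among subproblems that could be solved next. A real design plan might weigh these factors differently.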

Problem

Determining the subproblems in a design plan and a good order in which to perform them is a crucial step in the design process. Sequencing errors will cause the design system to perform unnecessary backtracking or, in the worst case, fail to solve the design problem. In order to determine the correct order, the following information is needed from the domain expert:

The decomposition of the problem,
The dependencies between the subproblems,
The degree to which the subproblem solutions are constrained, and
The order in which the subproblems are normally solved, and why.
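As a rough illustration of how two of these items relate, the following Python sketch cross-checks an expert's stated order against the elicited dependencies and reports any subproblem listed before something it depends on. Such a mismatch might point to a missing rationale or to an incorrectly recorded dependency. The function name find_order_violations and the data layout are assumptions made for this example only.

```python
def find_order_violations(stated_order, dependencies):
    """List (subproblem, prerequisite) pairs where the prerequisite comes later.

    stated_order: list of subproblem names in the order the expert gave them.
    dependencies: dict mapping subproblem -> set of subproblems it depends on.
    Names missing from stated_order are ignored in this simplified sketch.
    """
    position = {name: index for index, name in enumerate(stated_order)}
    violations = []
    for subproblem, prerequisites in dependencies.items():
        for prerequisite in sorted(prerequisites):
            if subproblem in position and prerequisite in position:
                if position[prerequisite] > position[subproblem]:
                    violations.append((subproblem, prerequisite))
    return violations


# Hypothetical data: the expert lists "motor" before the "frame" it depends on.
example_violations = find_order_violations(
    ["motor", "frame", "gearbox"],
    {"frame": set(), "motor": {"frame"}, "gearbox": {"motor"}},
)
```

An empty result means only that the stated order is consistent with the recorded dependencies. It does not show that the order is the best one.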

Ideally this information would be obtained using a direct technique. Asking the domain expert for exactly the information needed appears to be the most efficient way to get that knowledge and implies that a direct technique would be the best match between technique and knowledge type. Unfortunately, experts may not be able to readily articulate this knowledge. This could be due to several factors: the expert may have performed the task so often that they are no longer aware of the order of the steps, or they may be aware of the order but not know why (or if) the order is the best one. This suggests that indirect methods may be required to obtain this information.

Indirect techniques are best at classifying information, not identifying process. This results in a mismatch between the technique chosen and the information type required. There are several ways to address this problem. One is to modify the method to force it to get the type of information required. For example, repertory grid analysis [Kelly, 1955] could be used to compare plans, rather than entities. Another approach is to use multiple techniques.

Solution

In order to fully utilize both types of techniques when obtaining sequencing knowledge, a direct technique will be used to establish a base of knowledge and an indirect technique will be used to identify additional knowledge that is not readily accessible. This will be done in two phases: the first involves obtaining the information using multiple KE techniques, and the second involves verifying the results.

Knowledge Elicitation Phase

This phase will involve the following steps:

1. Use a direct technique to obtain a description of the task. This will serve to familiarize the domain expert with the task and system and will also provide insight into how they approach the task. The method used will be Forward Scenario Simulation [Cordingley, 1989], where the domain expert is presented with a description of the task and then asked to describe the procedures followed to solve it and the rationale underlying their decisions. This technique was chosen because it does not require that the designer be able to perform the task during the KE session or that the knowledge engineer have prior knowledge about how the task is performed.
2. Ask the domain expert to list the design subtasks, in sequence, forming a basis for the repertory grid analysis. In addition, preferences and constraints will be obtained.
3. Use the indirect technique of repertory grid analysis [Kelly, 1955] to determine dependencies between the subtasks. This will be done by presenting the subtasks to the expert in groups of three and asking the expert to describe the differences and similarities along the following dimensions:

what information is required to complete the subtask
what potential problems may occur when performing the subtask

4. The domain expert will then be given an opportunity to repeat Steps 2 and 3 to add more subtasks.
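The grouping of subtasks into threes can be sketched as follows. This minimal Python example assumes that every possible group of three is presented, which the proposal does not specify; a real session might present only a subset. The names DIMENSIONS and triad_prompts and the example subtasks are hypothetical.

```python
from itertools import combinations

# The two comparison dimensions named in the proposal, paraphrased as prompts.
DIMENSIONS = (
    "what information is required to complete the subtask",
    "what potential problems may occur when performing the subtask",
)


def triad_prompts(subtasks):
    """Pair every group of three subtasks with each comparison dimension.

    Presenting all possible triads is an assumption of this sketch.
    """
    prompts = []
    for triad in combinations(subtasks, 3):
        for dimension in DIMENSIONS:
            prompts.append((triad, dimension))
    return prompts


# Hypothetical subtask names supplied by the expert in Step 2.
example_prompts = triad_prompts(["frame", "motor", "gearbox", "housing"])
```

Because the number of triads grows quickly with the number of subtasks (n choose 3), a practical tool would probably need to limit or sample the triads it presents.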

Verification Phase

After the information gathered in the Knowledge Elicitation Phase has been analyzed to determine subproblems and subproblem ordering, the results will be shown to the domain expert to verify that they are correct. This will be done to both validate the results and to determine where the KE process requires adjustment.

Proposed Implementation

As stated above, the advantages of automated KE include easier data management and increased consistency. Another advantage is that, if necessary, sessions can be conducted without the presence of the knowledge engineer. This may allow remote KE to be performed by simply sending the KE software to the domain expert and having them return the results to the knowledge engineer. This is more convenient for the domain expert (increasing their level of cooperation) and less expensive for the knowledge engineer. The disadvantage of this approach is that it does not utilize the observations of the knowledge engineer or allow interaction between the expert and knowledge engineer during the KE session.

For the Knowledge Elicitation Phase, the system will record information identifying the subject and will record the date and time of the experiment. It will then guide the subject through the four steps described above and save their responses to a text file. The Verification Phase will be conducted through informal interviews if possible, otherwise via e-mail and phone interviews. If the results indicate that adjustments are frequently required, requirements will be written for a future system that automates the Verification Phase as well.
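A minimal Python sketch of the kind of session record described above is given here for illustration only. The proposal does not define a file layout, so the field names, the line format, and the function names format_session and save_session are all assumptions.

```python
from datetime import datetime


def format_session(subject_id, responses, started=None):
    """Build the text of one session record.

    subject_id: string identifying the subject.
    responses: list of (step label, answer text) pairs, in the order given.
    started: datetime of the session; defaults to the current time.
    The layout is hypothetical and is not specified by the proposal.
    """
    started = started or datetime.now()
    lines = [
        f"subject: {subject_id}",
        f"started: {started.isoformat(timespec='seconds')}",
    ]
    for step, answer in responses:
        lines.append(f"[{step}] {answer}")
    return "\n".join(lines) + "\n"


def save_session(path, subject_id, responses, started=None):
    """Write the session record to a plain text file at the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_session(subject_id, responses, started))


# Hypothetical example record.
example_record = format_session(
    "S01",
    [("step 1", "Described the overall task."), ("step 2", "frame, motor, gearbox")],
    started=datetime(1998, 9, 1, 10, 0, 0),
)
```

Keeping the formatting separate from the file writing lets the same record be returned from a web form handler or sent by e-mail instead of being written to a local file.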

Since one goal is to allow remote KE, the Knowledge Elicitation Phase will be implemented to conduct KE experiments over the Internet. Development will be done using CGI scripts, Java, or a combination of both. This will eliminate the need to distribute software and return the results.

Evaluation

The system will be evaluated for both the quality of the method and the usability of the system. Measuring the quality of the method is a difficult problem. There are two approaches that can be used: one is to compare the method to others; the other is to evaluate the results independently.

Comparing the results of one KE method to another has many problems. If different methods are used to elicit knowledge from the same expert, the results will be influenced by the order in which the sessions occur. This is likely to bias the results in favor of the second method applied, since the expert will have had more experience discussing the problem. If two different experts are used, the quality of results is likely to be influenced as much (if not more) by the ability and specific knowledge of the expert as it is by the method chosen.

The second approach, proposed for this thesis, is to have the results of the knowledge elicitation session evaluated by an independent domain expert (one not involved in the knowledge elicitation process). This expert will be asked to examine the results of the analysis to determine if the sequencing obtained is correct. The expert may also use the knowledge obtained to perform the design task and evaluate the results.

Usability of the system is an important criterion. While most domain experts are probably familiar with computers, they may or may not be experts with them. Therefore, it is essential that the system is "easy to use." If the system is not pleasant to use, the domain expert will be less likely to participate in future KE sessions. Usability will be evaluated by building a user survey into the KE system. After they have completed their task, the user will be asked to rate the system and make any comments they have. Results from the survey will be used to make future improvements to the system.

Schedule

The following tasks have been/will be completed by the approximate dates given:

(2/98: Literature Survey)

(4/98: KE Technique Classification and Selection)

6/98: Evaluation of implementation methods

8/98: System Implementation

9/98: Conduct Experiments

10/98: System Evaluation

12/98: Write up Results

References

Bernaras, A. (1993). Models of Design for the CommonKADS Library, ESPRIT Project P5248 KADS-II.

Boose, J.H. (1989). A survey of knowledge acquisition techniques and tools, Readings in Knowledge Acquisition and Learning, California, Morgan Kaufmann, pp. 39-56.

Brown, D. (1993). Intelligent Computer Aided Design, Encyclopedia of Computer Science and Technology, Marcel Dekker, Inc., pp. 153-166.

Burge, J. (1998). KE Tool Classification, http://cs.wpi.edu/~jburge/thesis/kematrix.html, Artificial Intelligence Research Group, Worcester Polytechnic Institute.

Chandrasekaran, B. (1990). Design Problem Solving: A Task Analysis, AI Magazine, pp. 59-71.

Cordingley, E. S. (1989). Knowledge elicitation techniques for knowledge-based systems. In D. Diaper (Ed.), Knowledge elicitation: Principles, techniques and applications. Chichester, England: Ellis Horwood Ltd., pp. 89-173.

Geiwitz, J., Kornell, J., McCloskey, B. (1990). An Expert System for the Selection of Knowledge Acquisition Techniques. TR 785-2, Contract No. DAAB07-89-C-A044, Anacapa Sciences, CA.

Hudlicka, E. (1997). Summary of Knowledge Elicitation Techniques for Requirements Analysis, Course Material for Human Computer Interaction, Worcester Polytechnic Institute.

Kelly, G. (1955). The Psychology of Personal Constructs. New York: Norton.

Liu, J., Brown, D. (1994). Generating Design Decomposition Knowledge for Parametric Design Problems, Proceedings of AID-94, Kluwer Academic Publishers, pp. 661-678.

McGraw, K., Harbison-Briggs, K. (1989). Knowledge Acquisition Principles and Guidelines, New Jersey, Prentice Hall.

Munsen, M. (1998). Protégé, http://smi-web.stanford.edu/projects/protege/, Knowledge Modeling Group, Stanford University School of Medicine.

Shute, V. J. (1998). DNA: Towards an Automated Knowledge Elicitation and Organization Tool, submitted for Cognitive Tools-II.

Smithers, T. (1998). Towards a Knowledge Level Theory of Design Process, to appear in Proceedings of AID-98, Kluwer Academic Publishers.

Thordsen, M. (1991). A Comparison of Two Tools for Cognitive Task Analysis: Concept Mapping and the Critical Decision Method. Proc. Human Factors Soc. 35th Ann. Mtg.

Zacharias, G., Illgen, C., Asdigha, A., Hudlicka, E. (1995). VIEW Visualization and Interactive Elicitation Workstation, Final Rep. No. R94371, Contract No. DASW01-95-C-0066, Charles River Analytics, Psychometrix, MA.