Muscle Based Modeling
The goal of the research is to devise an efficient and accurate model of the human face
and to develop a notational system to encode actions. This notation should be rich enough
to make computer recognition possible.
It is convenient to use two separate notations, one for actions and one for the structure of the face:
- action-based notation
- This high-level notation is used to describe which actions should be performed.
Several systems have been examined for features such as completeness and adaptability to computer use:
- Labanotation is used primarily for describing dance movements, so there is little or no
support for facial expressions.
- Sutton-notation is more pictorially based. It does support facial expressions.
- Birdwhistell proposes a large vocabulary of pictorial symbols for describing the
actions of both the body and the face.
- The Facial Action Coding System (FACS) is not graphically oriented. It describes
the set of all possible basic actions performable by the face. Each action is called an
Action Unit (AU). An AU is a basic action in the sense that it cannot be broken up into smaller
actions. Each action is caused by a minimal number of muscles and is therefore closely connected
to the anatomy of the face.
- structure-based notation
- This low-level notation is used to describe the physical model of the face. The
high-level actions are decomposed into lower-level structures in order to produce a simulation
of the face. This notation must be chosen carefully, since it is the most important part of the
simulation. Three known techniques are:
- A simple 2D surface patch technique breaks up the head into small patches of skin. A facial action
consists of warping a subset of these patches.
- Parke introduced a parametric approach to define the face and its actions.
This approach has produced some very impressive results. However, some subtle interactions of the face
require ever larger sets of parameters to be manipulated. Since all parameters have to be hardcoded
(their existence, not their values), the system loses some generality.
- The last representation involves a complete low-level simulation of the face. This model consists
of three levels:
- the bone
- the muscles
- the skin
Each muscle is connected to the bone and to one or more points in the skin. These connections
are represented by arcs, which hold information about the connection (elasticity, ...).
The basic action in this network of points and arcs is the application of a force (or tension) to
a point of the net. This force is propagated outwards to the adjacent points, and so on.
These networks are often referred to as Tension Nets.
Given the two separate notations, the authors propose a system that recognizes facial
actions on a given face (using a camera) and simulates that face.
Both the high-level and the low-level representations (before and after the Internal Model
Manipulator) can be used to store the face for further use, modification or reconstruction
on another computer.
Design of the System
The authors have chosen the tension net as the structure-based representation.
They believe it is the most usable (and general) approach since it is a naturally based
system. FACS has been chosen as the action-based notation as it is very compatible with
tension nets.
The FACS - Tension-net approach offers the following features:
- any performable facial action can be simulated
- it is a naturally based system
- there is a close relation between the cause of an action and its simulation
- FACS is face-independent
- the FACS decomposition is unique
- the representation is efficient
- the theory is extensible to cover other non-rigid objects
Subprocesses
- Although the camera processor had not yet been implemented, the authors feel that
it is a feasible task (although not a trivial one). The program should scan the input image
for certain facial features. These features are used to determine the AUs involved in the
facial action. One of the major problems here is that one AU may mask another (for example, raising an eyebrow
can make the detection of eyelid actions impossible).
- The AU parser takes a list of AUs and finds the muscle contractions for each one.
- The simulator performs the necessary contractions to simulate the expression.
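As a rough illustration of the AU parser step, the sketch below maps a list of AUs to muscle contractions through a lookup table. The table contents, the `(AU number, intensity)` input format, and the names `AU_TABLE` and `parse_aus` are assumptions made for this example; the muscle names are illustrative and the relative magnitudes are invented.

```python
# Hypothetical lookup table: AU number -> list of (muscle name, relative magnitude).
# Muscle names are illustrative; the magnitudes are made up for this sketch.
AU_TABLE = {
    1: [("frontalis_medial", 1.0)],
    2: [("frontalis_lateral", 1.0)],
    6: [("orbicularis_oculi", 0.8)],
    12: [("zygomaticus_major", 1.0)],
}


def parse_aus(aus, table=AU_TABLE):
    """Turn (AU number, intensity) pairs into (muscle, contraction) pairs.

    Each AU is looked up in the table and the requested intensity is
    scaled by the relative magnitude of every muscle involved.
    """
    contractions = []
    for au, intensity in aus:
        for muscle, magnitude in table[au]:
            contractions.append((muscle, intensity * magnitude))
    return contractions


# A half-strength AU 12 together with a full-strength AU 6.
result = parse_aus([(12, 0.5), (6, 1.0)])
```

The resulting list is what a simulator stage would consume: one contraction request per muscle involved.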
Data Structures
The basic structures used are:
- adjacent 3D points are connected by an arc. This arc holds information about
the elasticity, ... . If a force f is applied to a point, the change in location is
calculated by
$\Delta l = f / k$
where k is the sum of the spring constants at that point.
- a muscle fiber consists of a fiber point, a bone point and one or more
skin points. A force acting on a muscle is applied to the fiber point.
- a muscle typically consists of several fibers. When a muscle contracts, all its fibers
contract in parallel.
- The highest-level structure needed is the AU. This consists of one or more muscles and their
relative magnitudes (these indicate the importance of the muscle for this AU).
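The structures listed above could be laid out roughly as follows. This is a minimal sketch, not the authors' implementation: all class and field names are hypothetical, points are referred to by index, and the displacement helper assumes the change in location is the applied force divided by k, the sum of the spring constants at the point.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Arc:
    a: int      # index of the first point
    b: int      # index of the second point
    k: float    # spring constant (elasticity) of the connection


@dataclass
class Fiber:
    fiber_point: int        # point the contraction force is applied to
    bone_point: int         # anchor on the bone
    skin_points: List[int]  # one or more attachment points in the skin


@dataclass
class Muscle:
    name: str
    fibers: List[Fiber]     # all fibers contract in parallel


@dataclass
class ActionUnit:
    number: int
    muscles: List[Tuple[Muscle, float]]  # (muscle, relative magnitude)


def displacement(force: Vec3, point: int, arcs: List[Arc]) -> Vec3:
    """Displacement of a point under a force, assuming delta_l = f / k."""
    k = sum(arc.k for arc in arcs if point in (arc.a, arc.b))
    return (force[0] / k, force[1] / k, force[2] / k)


# Point 0 touches two arcs with spring constant 2.0 each, so k = 4.0.
arcs = [Arc(0, 1, 2.0), Arc(0, 2, 2.0), Arc(1, 2, 5.0)]
moved = displacement((4.0, 0.0, 8.0), 0, arcs)
```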
Algorithms
When a fiber contracts, a force is applied to its fiber point. The direction of this force
is towards the bone point. The displacement of the fiber point depends on the elasticity of the flesh. The
force is then propagated through all the arcs adjacent to that point.
To simulate a muscle or a set of muscles, the set of all fibers of all these
muscles is considered.
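One way to picture the contraction and propagation steps is the sketch below. The summary does not spell out the propagation rule, so several choices here are assumptions: each point moves by its force divided by the sum of adjacent spring constants, each arc passes on a force equal to its spring constant times that displacement, every point is moved at most once, bone points are held fixed, and propagation stops below a small threshold. The function names and the dictionary-based net are hypothetical.

```python
from collections import deque


def apply_force(positions, arcs, start, force, fixed=frozenset(), min_force=1e-3):
    """Apply a force at one point and propagate it breadth-first through the net.

    positions: dict point id -> [x, y, z]; arcs: dict (a, b) -> spring constant.
    """
    neighbours = {}
    for (a, b), k in arcs.items():
        neighbours.setdefault(a, []).append((b, k))
        neighbours.setdefault(b, []).append((a, k))

    visited = {start}
    queue = deque([(start, force)])
    while queue:
        point, f = queue.popleft()
        if point in fixed:
            continue  # bone points do not move (assumption)
        k_total = sum(k for _, k in neighbours.get(point, []))
        if k_total == 0:
            continue
        dl = [c / k_total for c in f]  # displacement = force / k
        positions[point] = [x + d for x, d in zip(positions[point], dl)]
        for other, k in neighbours[point]:
            if other in visited:
                continue
            f_other = [k * d for d in dl]  # force carried by this arc
            if sum(c * c for c in f_other) ** 0.5 < min_force:
                continue
            visited.add(other)
            queue.append((other, f_other))
    return positions


def contract_fiber(positions, arcs, fiber_point, bone_point, magnitude, fixed):
    """Pull the fiber point towards the bone point with the given magnitude."""
    direction = [b - a for a, b in zip(positions[fiber_point], positions[bone_point])]
    length = sum(c * c for c in direction) ** 0.5
    force = [magnitude * c / length for c in direction]
    return apply_force(positions, arcs, fiber_point, force, fixed)


# Tiny net: bone -- fiber -- skin along the x axis.
positions = {"bone": [0, 0, 0], "fiber": [2, 0, 0], "skin": [3, 0, 0]}
arcs = {("fiber", "bone"): 1.0, ("fiber", "skin"): 1.0}
contract_fiber(positions, arcs, "fiber", "bone", 1.0, fixed={"bone"})
```

In this toy net the fiber point and the skin point both shift towards the bone while the bone stays put. Simulating a whole muscle would amount to calling `contract_fiber` for each of its fibers.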
Animation
Since a force f can be applied to a muscle as described above, animation can be achieved simply
by applying a force f/n n times. The animation becomes smoother as n increases,
but the computational cost rises as well, so a tradeoff has to be found.
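The stepping scheme can be sketched as a generator that applies f/n at each of n steps and yields one frame per step. The `animate` function, its callback signature, and the single-point toy model used to exercise it are assumptions for illustration only.

```python
import copy


def animate(state, force, n, apply_force):
    """Yield n frames, applying force / n to the state at every step.

    apply_force(state, step_force) must return the updated state.
    """
    step = [c / n for c in force]
    for _ in range(n):
        state = apply_force(copy.deepcopy(state), step)
        yield state


def toy_apply(point, step_force, k=2.0):
    """Toy model: a single point that moves by step_force / k."""
    return [x + c / k for x, c in zip(point, step_force)]


# Four frames for a total force of 4.0 along x.
frames = list(animate([0.0, 0.0, 0.0], [4.0, 0.0, 0.0], 4, toy_apply))
```

A larger n gives more, smaller steps between the rest pose and the final pose, at the cost of running the force application n times.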
Problems
Solved
The choice of representations was backed up by the fact that solutions to several
expected problems came naturally. The problem of one AU masking the existence of another
(first smile, then raise cheek vs. raise cheek: since the cheek is already raised during a smile, the second
action in the first case should not have any effect) is handled naturally as a result of the way skin
elasticities are handled. A second problem, the creation of bulges, wrinkles and
furrows, was again handled very well (these are all caused by two forces pushing points
towards each other).
Unsolved
Several problems were not handled at the time of writing:
- muscles following the flow of a bone sheet are not handled well: the muscle should
flow over the bone, not through it.
This problem can be solved by altering the representation of the fiber by adding several fiber points.
- jaw actions are not handled well (yet)
- cheek actions such as sucking and puffing require a complex model of the face
including fluid (air) filled chambers
- totally non-rigid structures (such as the tongue) are not investigated. In the current
model, each fiber always has a bone point.