Distributed Processing

Two directions that traditional operating systems have gone in terms of processing:

  1. support for threads
  2. distributed processing

Threads

Have talked about threads in CS3013, but review a little.

Also called lightweight processes. Contain an execution state within a shared address space.

Threads are natural to use for a server handling requests. Each request can be handled by a thread. Can also have multi-threaded clients for handling user interaction along with network and file I/O.

Threads vs. Processes

Terminology among different systems:

Distributed OS kernel Thread name Exec. Env. Name
Amoeba Thread Process
Chorus Thread Actor
Mach Thread Task
V System Process Team
Unix - Process

System Models for Distributed Computing

Processor Pool Model

Amoeba. Best argument for using this approach comes from queueing theory. ``Replacing small resources by one big one that is times more powerful, reduces the average response time -fold.

Dedicated processors. Do not support user interaction. For example: Beowulf cluster of processors runnin Linux.

Workstation Model

Every user has their own computer. Can share computing resources.

Can show both theoretically and practically that many idle nodes exist at any one time. Look at Fig 11.1 and 11.2 from Singhal.

Would like to use these idle nodes. What is idle?

What is performance we are trying to optimize?

Scheduling Issues

Components

Examples

Co-Scheduling

co-scheduling or gang scheduling to get processes that are cooperating to run at the same time.

Load Sharing

Code Download by Helper Clients

Receiver-initiated approach for Internet-wide scale.

ELZ Paper

``Adaptive Load Sharing in Homogeneous Distributed Systems'' by Eager, Lazowska and Zahorjan

Sender-initiated policies.

It is adaptive in that policies react to system state. Basic idea is to understand how relatively simple load sharing policies work. Problems with complex policies:

Identified a transfer policy (when to transfer) and a location policy (where to transfer).

Assume a system model of 20 homogeneous node with some processor cost assigned to a task transfer.

Exponentially distributed arrival rates and service times. Used an analytical study backed up by a simulation.

Policies: all try to transfer a new task if the queue length is greater than or equal to a threshold .

Look at primary results. Basically show that simple is good.

Wills and Finkel Work

``Load Sharing Using Multicasting''

Buddy policy by Shin and Chang is to broadcast state changes to buddy set of nodes

Node states: underloaded (eligible to receive tasks), medium, full-loaded (try to transfer tasks)

Policies:

Look at results. Can also look at scaling.

Implemented as MQP on WPI's Beowulf cluster. System is called PANTS (PANTS Application Node Transparency System).