A disk drive consists of a disk pack containing one or more platters stacked like phonograph records. Information is stored on both sides of the platter.
Each platter is divided into concentric rings called tracks, and each track is divided into sectors. All transfers to and from the disk are performed at the sector level.
For example, to modify a single byte, the system reads the sector containing the byte off the disk, modifies the byte, and rewrites the entire sector.
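The read-modify-write cycle above can be sketched as follows. This is a minimal illustration; `SECTOR_SIZE`, `Disk`, `read_sector`, and `write_sector` are hypothetical names, not an actual disk interface.

```python
# Sketch: modifying one byte forces a whole-sector transfer.
SECTOR_SIZE = 512

class Disk:
    def __init__(self, n_sectors):
        self.sectors = [bytearray(SECTOR_SIZE) for _ in range(n_sectors)]

    def read_sector(self, n):
        return bytearray(self.sectors[n])   # whole-sector read

    def write_sector(self, n, data):
        assert len(data) == SECTOR_SIZE     # partial writes not allowed
        self.sectors[n] = bytearray(data)

def write_byte(disk, byte_offset, value):
    sector, offset = divmod(byte_offset, SECTOR_SIZE)
    buf = disk.read_sector(sector)   # 1. read the containing sector
    buf[offset] = value              # 2. modify the byte in memory
    disk.write_sector(sector, buf)   # 3. rewrite the entire sector
```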
The disk uses a read/write head to transfer data to and from the platter. The operation of moving the head from one track to another is called seeking, and the heads generally move together as a unit.
Because the disk heads move together as a unit, it is useful to group the tracks at the same head position into logical units called cylinders. Sectors in the same cylinder can be read without an intervening seek.
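The cylinder idea shows up in how a logical block number maps to a (cylinder, head, sector) triple. A sketch, assuming blocks are numbered cylinder-by-cylinder (so consecutive blocks fall in the same cylinder) and with an illustrative geometry:

```python
# Hypothetical geometry, for illustration only.
HEADS = 4              # tracks per cylinder (one per surface)
SECTORS_PER_TRACK = 64

def lba_to_chs(lba):
    """Map a logical block address to (cylinder, head, sector)."""
    cylinder, rest = divmod(lba, HEADS * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector
```

Note that blocks 0 through 255 here all live in cylinder 0, so reading them in order requires no seek.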
One or more disk drives connect to a disk controller, which handles the details of moving the heads, etc. The controller communicates with the CPU through a host interface. Moreover, through direct memory access (DMA), the controller can access main memory directly, transferring entire sectors without interrupting the CPU.
See Fig. 5-3 from Tanenbaum.
Three factors influence the delay, or latency, that occurs in transferring data to/from a disk:
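The three factors can be combined in the standard latency model: seek time + rotational latency + transfer time. A sketch; the breakdown into these three components and all numbers are assumptions for illustration, not measurements from the drive above.

```python
def transfer_time_ms(n_sectors,
                     avg_seek_ms=9.0,        # assumed average seek time
                     rpm=7200,               # assumed spindle speed
                     sectors_per_track=400): # assumed track capacity
    ms_per_rev = 60_000 / rpm
    rotational_ms = ms_per_rev / 2           # expect half a revolution
    transfer_ms = (n_sectors / sectors_per_track) * ms_per_rev
    return avg_seek_ms + rotational_ms + transfer_ms
```

With these numbers, transferring a full track (400 sectors) costs about 9 + 4.17 + 8.33 = 21.5 ms; note the fixed seek and rotational costs dominate small transfers, which motivates larger block sizes.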
Current technologies (Maxtor Diamond Max 1750 Drive)
An important (experimental) observation is that:
We want to handle both file types efficiently.
The operating system may choose to use a larger block size than the sector size of the physical disk. Each block consists of consecutive sectors. Motivation:
The unit of transfer convenient for the operating system is a disk block, which may be the same size as a sector or larger. The trend is toward larger block sizes. NTFS uses a 4K block size for disks larger than 2GB. FAT-32 uses 4K up to 8GB, 8K up to 16GB, 16K up to 32GB, and 32K above 32GB.
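The FAT-32 table above, written as code (helper name is illustrative):

```python
GB = 1 << 30
KB = 1 << 10

def fat32_block_size(disk_bytes):
    """Default FAT-32 block (cluster) size by disk size, per the notes."""
    if disk_bytes <= 8 * GB:
        return 4 * KB
    if disk_bytes <= 16 * GB:
        return 8 * KB
    if disk_bytes <= 32 * GB:
        return 16 * KB
    return 32 * KB
```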
Blocks can also be allocated contiguously, but the total space needed for a file may not be known at creation time. Contiguous allocation can also lead to fragmentation, with too many small block runs.
Policies for reducing latencies when retrieving blocks from disk were discussed in a previous course.
Approaches for keeping track of free blocks:
Sets of consecutive free blocks can also be grouped using an address/count approach.
Can look at space/time tradeoffs for each approach.
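One way to see the tradeoff: a bitmap costs one bit per block regardless of layout, while the address/count representation is compact exactly when free space occurs in long runs. A sketch converting between the two (function name is illustrative):

```python
def bitmap_to_runs(free_bitmap):
    """Convert a free-block bitmap (True = free) to (start, count) runs."""
    runs, start = [], None
    for i, free in enumerate(free_bitmap):
        if free and start is None:
            start = i                       # run begins
        elif not free and start is not None:
            runs.append((start, i - start)) # run ends
            start = None
    if start is not None:
        runs.append((start, len(free_bitmap) - start))
    return runs
```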
Caching is crucial to improve performance. Why?
Unfortunately, caching may cause data in memory to become out of step with data on disk. This is known as the cache coherency problem. The problem is most significant in following contexts:
One solution is for the operating system to provide a "flush" system call that writes to disk the file blocks associated with a file descriptor. The system might also flush the cache when the file is closed.
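A minimal sketch of the write-back behavior behind the coherency problem: writes land in memory, the disk lags behind until an explicit flush. The `BlockCache` interface here is hypothetical, not a real OS API.

```python
class BlockCache:
    def __init__(self, disk):
        self.disk = disk      # dict: block number -> data
        self.cache = {}
        self.dirty = set()

    def write(self, n, data):
        self.cache[n] = data  # memory updated...
        self.dirty.add(n)     # ...disk is now out of step

    def flush(self):
        for n in self.dirty:  # write dirty blocks back to disk
            self.disk[n] = self.cache[n]
        self.dirty.clear()
```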
To guard against complete file loss, most systems recommend that operators dump the file system to archival storage (floppy disks, Zip drives, tape, etc). Thus, some or all files could be restored after a disk failure.
Because of the expense of dumping the entire file system, most dumps are incremental, meaning that only files modified (or created) within the last N days are saved. Once a week (month) a complete dump of the file system is performed.
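Incremental dump selection can be sketched as follows; the file-record shape and function name are assumptions for illustration.

```python
import time

DAY = 86_400  # seconds per day

def files_to_dump(files, n_days, now=None):
    """files: iterable of (path, mtime_seconds).
    Returns paths modified within the last n_days."""
    now = time.time() if now is None else now
    cutoff = now - n_days * DAY
    return [path for path, mtime in files if mtime >= cutoff]
```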
Operators cycle through a sequence of dump tapes, reusing a tape only after the files stored on the tape have been archived to another tape. For instance, daily dump tapes might be recycled only after the weekend dump has saved all files modified since the last weekly dump.
Thus, it is not always possible to restore every file from a dump, but chances increase the longer the file has been in existence.
For additional safety, backups are often stored off-site, for example in a different geographic region, for increased reliability.
Traditional file systems maintain a number of data structures on disk (directory structures, free-block pointers, inodes, ...). These structures may be cached in memory, but ...
A crash in the middle of changes to files can leave these structures in an inconsistent state.
One solution is to run a consistency check on system reboot to ensure that file system structures are in a consistent state, but this is time consuming!
It is becoming more common to maintain a log of file system metadata changes. Changes are committed as a transaction once written to the log. Completed transactions are removed from the log; incomplete transactions can be replayed on system reboot.
Logging also gives faster performance for file requests: only the log needs to be written before committing a transaction, rather than waiting for all data structure updates to complete in the critical path of the request.
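The commit/replay idea can be sketched with a toy journal; the log record format here is an assumption for illustration, not any real file system's on-disk layout.

```python
class Journal:
    def __init__(self):
        self.log = []   # append-only list of records

    def begin(self, txid):
        self.log.append(("begin", txid))

    def write(self, txid, key, value):
        self.log.append(("update", txid, key, value))

    def commit(self, txid):
        self.log.append(("commit", txid))   # transaction now durable

    def replay(self, metadata):
        """On reboot: apply updates only from committed transactions;
        incomplete transactions are discarded."""
        committed = {r[1] for r in self.log if r[0] == "commit"}
        for rec in self.log:
            if rec[0] == "update" and rec[1] in committed:
                metadata[rec[2]] = rec[3]
        return metadata
```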
Section 14.5 of SGG
There are several levels of RAID disks. The simplest is mirroring (RAID level 1), where each disk is shadowed by a copy. This requires double the disk space, but reads can be faster by reading from both disks.
Another level is block-interleaved parity (RAID level 4), which needs fewer redundant disks. Assume nine disks: one parity block is kept for every eight data blocks, with data blocks spread across the disks. This is called disk striping or interleaving. If one of the disks fails, its data can be recovered using the remaining eight disks and the parity computation. Reads are quick because the data is spread among eight disks. Writes are expensive because parity must be recomputed.
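The parity computation is a bytewise XOR: the parity block is the XOR of the eight data blocks, so any single lost block can be rebuilt from the other eight. A sketch with tiny 4-byte "blocks":

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (the RAID-4 parity)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def recover(surviving_blocks, parity):
    """Rebuild the one missing data block from the survivors + parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

Recomputing parity on every write is why writes are expensive at this level.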