Chapter 10: Mass-Storage Structure

Overview

Mass-storage systems provide long-term, non-volatile storage for large amounts of data. The magnetic disk has traditionally been the most common form of secondary storage; it offers direct access to data through read/write heads.

The operating system (OS) manages mass storage by:
  - Organizing disks into logical units
  - Scheduling disk access efficiently
  - Ensuring reliability and data protection

Disk Structure and Attachment

Disk Structure
  • A magnetic disk consists of:
    - Platters: Circular disks coated with magnetic material.
    - Tracks: Concentric circles on the platter surface.
    - Sectors: Subdivisions of each track, typically 512 bytes or 4 KB.
    - Cylinders: Set of tracks located at the same position on all platters.
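
To make the geometry concrete, the following sketch maps a linear sector number onto a (cylinder, head, sector) triple. It assumes an idealized disk with the same number of sectors on every track; real drives vary the sector count per track and expose only logical block numbers. The function name `lba_to_chs` and the geometry values are illustrative assumptions, not taken from any particular drive.

```python
def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Map a linear block number to (cylinder, head, sector) on an idealized disk."""
    # One cylinder holds heads_per_cylinder tracks of sectors_per_track sectors each.
    cylinder = lba // (heads_per_cylinder * sectors_per_track)
    # Within a cylinder, consecutive tracks belong to consecutive heads.
    head = (lba // sectors_per_track) % heads_per_cylinder
    # Sectors within a track are conventionally numbered from 1.
    sector = lba % sectors_per_track + 1
    return cylinder, head, sector


# Hypothetical geometry: 16 heads, 63 sectors per track.
print(lba_to_chs(0, 16, 63))     # (0, 0, 1)
print(lba_to_chs(62, 16, 63))    # (0, 0, 63)
print(lba_to_chs(1008, 16, 63))  # (1, 0, 1)
```

Block 1008 is the first sector of the second cylinder because one cylinder in this assumed geometry holds 16 × 63 = 1008 sectors.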

Disk Access
  • To read or write data, the disk head must:
    1. Move to the correct track (seek time)
    2. Wait for the correct sector to rotate under the head (rotational latency)
    3. Transfer data (transfer time)

Disk Performance Metrics
  • Seek time: Time to move the arm to the desired track.

  • Rotational latency: Delay waiting for the sector to rotate into position.

  • Transfer time: Time to move data between disk and memory.

  • Access time = Seek time + Rotational latency + Transfer time
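
As a worked example of the access-time formula, the sketch below estimates the time for a single request. It relies on the standard observation that, on average, the target sector is half a revolution away. The drive figures (9 ms average seek, 7200 RPM, 150 MB/s sustained transfer) are assumed values chosen for illustration, and `average_access_time_ms` is a hypothetical helper name.

```python
def average_access_time_ms(seek_ms, rpm, transfer_bytes, transfer_rate_mb_s):
    """Estimate the average access time of one request, in milliseconds."""
    # One full revolution takes 60,000 / rpm milliseconds; on average
    # the desired sector is half a revolution away from the head.
    rotational_latency_ms = 0.5 * (60_000 / rpm)
    # Transfer time = data size / transfer rate (1 MB taken as 10**6 bytes).
    transfer_ms = transfer_bytes / (transfer_rate_mb_s * 1_000_000) * 1000
    return seek_ms + rotational_latency_ms + transfer_ms


# Assumed figures: 9 ms seek, 7200 RPM, one 4 KB sector at 150 MB/s.
t = average_access_time_ms(seek_ms=9.0, rpm=7200,
                           transfer_bytes=4096, transfer_rate_mb_s=150)
print(f"{t:.2f} ms")  # 13.19 ms
```

With these numbers the mechanical terms (seek and rotation) account for nearly all of the access time, which is why scheduling concentrates on reducing head movement.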

Disk Attachment
  • Host-Attached Storage: Connected directly to the computer via interfaces such as SATA, SCSI, or NVMe.

  • Network-Attached Storage (NAS): Provides file-level access over a network.

  • Storage Area Network (SAN): Provides block-level access via high-speed networks (e.g., Fibre Channel, iSCSI).

Disk Scheduling

Disk scheduling determines the order in which disk I/O requests are processed. Goals include minimizing seek time and improving system throughput.

Common Disk Scheduling Algorithms

  1. FCFS (First-Come, First-Served)
     - Processes requests in the order they arrive.
     - Simple and fair, but can lead to long average seek times.

  2. SSTF (Shortest Seek Time First)
     - Selects the request closest to the current head position.
     - Reduces seek time but may cause starvation of far-away requests.

  3. SCAN (Elevator Algorithm)
     - Disk arm moves in one direction to the end of the disk, servicing requests along the way, then reverses.
     - Provides more uniform response times than SSTF and prevents starvation.

  4. C-SCAN (Circular SCAN)
     - Like SCAN, but the arm returns to the beginning without servicing requests on the way back.
     - Provides more uniform wait times than SCAN.

  5. LOOK and C-LOOK
     - Variants of SCAN and C-SCAN in which the arm travels only as far as the last pending request in each direction, rather than to the end of the disk, before reversing (LOOK) or jumping back (C-LOOK).
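
The first three algorithms can be compared by counting total head movement for a fixed request queue. This is a minimal sketch, not a model of a real I/O scheduler: it assumes all requests are already queued, cylinders are numbered from 0, and SCAN starts by sweeping toward cylinder 0. The queue and starting head position are an assumed example, and the function names are hypothetical.

```python
def total_movement(start, stops):
    """Sum of cylinder distances travelled when visiting `stops` from `start`."""
    total, pos = 0, start
    for cyl in stops:
        total += abs(cyl - pos)
        pos = cyl
    return total


def fcfs(requests, head):
    """Service requests strictly in arrival order."""
    return list(requests)


def sstf(requests, head):
    """Repeatedly pick the pending request closest to the current head position."""
    pending, order = list(requests), []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def scan(requests, head, lowest_cylinder=0):
    """Sweep down to the disk edge, then reverse and sweep up.

    The returned path includes the edge cylinder as a stop, even though
    no request is serviced there, so that the travel to the edge is counted.
    """
    down = sorted((c for c in requests if c <= head), reverse=True)
    up = sorted(c for c in requests if c > head)
    return down + [lowest_cylinder] + up


# Assumed example: eight queued requests, head initially at cylinder 53.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
head = 53
for name, algorithm in [("FCFS", fcfs), ("SSTF", sstf), ("SCAN", scan)]:
    print(name, total_movement(head, algorithm(queue, head)))
# FCFS 640
# SSTF 236
# SCAN 236
```

For this particular queue SSTF and SCAN happen to travel the same distance; the difference between them shows up under continuous load, where SSTF can keep postponing distant requests.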

Comparison of Disk Scheduling Algorithms

Disk Scheduling Algorithm Comparison

Algorithm   Characteristics / Notes
FCFS        Simple, fair, but high average seek time.
SSTF        Selects nearest request; may cause starvation.
SCAN        Moves back and forth like an elevator; fairer.
C-SCAN      Circular movement; provides uniform response.
LOOK        SCAN variant that stops at last request.
C-LOOK      C-SCAN variant that jumps back after last request only.
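
The variants in the comparison differ mainly in where the arm turns around, which the following sketch makes explicit. It is an illustration under stated assumptions: a disk with cylinders 0 to 199, all requests queued up front, LOOK sweeping downward first, and C-SCAN and C-LOOK sweeping upward. The return seek is counted as head movement here, although some treatments exclude it. The queue, head position, and function names are assumed for the example.

```python
def path_length(start, stops):
    """Total cylinders travelled when the head visits `stops` in order."""
    total, pos = 0, start
    for cyl in stops:
        total += abs(cyl - pos)
        pos = cyl
    return total


def look(requests, head):
    """Sweep down only as far as the lowest request, then reverse and sweep up."""
    down = sorted((c for c in requests if c <= head), reverse=True)
    up = sorted(c for c in requests if c > head)
    return down + up


def c_scan(requests, head, lowest=0, highest=199):
    """Sweep up to the last cylinder, return to the first, and sweep up again."""
    up = sorted(c for c in requests if c >= head)
    wrapped = sorted(c for c in requests if c < head)
    # Both disk edges appear as stops so the full return seek is counted.
    return up + [highest, lowest] + wrapped


def c_look(requests, head):
    """Sweep up to the highest request, then jump directly to the lowest one."""
    up = sorted(c for c in requests if c >= head)
    wrapped = sorted(c for c in requests if c < head)
    return up + wrapped


# Assumed example: eight queued requests, head initially at cylinder 53.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
head = 53
for name, algorithm in [("LOOK", look), ("C-SCAN", c_scan), ("C-LOOK", c_look)]:
    print(name, path_length(head, algorithm(queue, head)))
# LOOK 208
# C-SCAN 382
# C-LOOK 322
```

The circular variants travel farther in total for this queue; what they buy is a more even wait across cylinders, since every request is served in the same sweep direction.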

RAID Structure

RAID (Redundant Array of Independent Disks) improves performance and reliability by combining multiple physical disks into a single logical unit.

Goals of RAID
  • Performance: Parallel data access increases throughput.

  • Reliability: Redundancy provides fault tolerance.

  • Capacity: Combines multiple disks into a larger virtual volume.

RAID Implementation Techniques
  • Data Striping: Distributes data across multiple disks to improve performance.

  • Mirroring: Duplicates data on two or more disks for redundancy.

  • Parity: Stores error correction information that can rebuild data after a disk failure.
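
Single-parity schemes commonly compute parity as the bitwise XOR of the data blocks in a stripe, which lets any one missing block be recomputed from the rest. The sketch below shows that idea in memory only; the block contents and the helper name `xor_blocks` are assumptions for illustration, and a real array would also handle stripe layout, partial writes, and detection of the failed disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    # zip(*blocks) groups the i-th byte of every block together.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, each stored on a different (hypothetical) disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on a fourth disk

# Simulate losing the second disk and rebuilding its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

The rebuild works because XOR-ing a value with itself cancels it out, so combining the survivors with the parity leaves only the missing block.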

Common RAID Levels

RAID Levels Summary

Level           Description / Features
RAID 0          Data striping with no redundancy; best performance, no fault tolerance.
RAID 1          Mirroring; data duplicated on two disks; high reliability, lower usable capacity.
RAID 5          Block-level striping with distributed parity; good balance between performance and redundancy.
RAID 6          Like RAID 5 but with double parity; can survive two disk failures.
RAID 10 (1+0)   Combines mirroring and striping; high speed and reliability but high cost.
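
The capacity cost of each level can be estimated with simple arithmetic. The sketch below assumes identical disks, ignores metadata and spare overhead, and treats RAID 10 as a stripe of two-way mirrors; the helper name `usable_capacity_tb` and the minimum disk counts it enforces are assumptions for this example.

```python
def usable_capacity_tb(level, disks, disk_size_tb):
    """Usable capacity in TB for an array of identical disks (idealized)."""
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "0":
        return disks * disk_size_tb        # striping only, no redundancy
    if level == "1":
        return disk_size_tb                # every disk holds a full copy
    if level == "5":
        return (disks - 1) * disk_size_tb  # one disk's worth of parity
    if level == "6":
        return (disks - 2) * disk_size_tb  # two disks' worth of parity
    if disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return disks // 2 * disk_size_tb       # striped set of two-way mirrors


# Assumed example: 2 TB disks, four per array (two for RAID 1).
for level in ("0", "5", "6", "10"):
    print(f"RAID {level}: {usable_capacity_tb(level, 4, 2)} TB")
print(f"RAID 1: {usable_capacity_tb('1', 2, 2)} TB")
# RAID 0: 8 TB
# RAID 5: 6 TB
# RAID 6: 4 TB
# RAID 10: 4 TB
# RAID 1: 2 TB
```

With four disks, RAID 6 and RAID 10 give the same usable capacity, but RAID 6 survives any two failures while RAID 10 survives two only if they fall in different mirror pairs.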

Advantages of RAID
  • Faster data access through parallelism.

  • Improved fault tolerance via redundancy.

  • Scalable storage capacity.

Disadvantages
  • Increased hardware cost.

  • Complexity in management and recovery.

  • Parity overhead in some configurations.

Summary

Mass Storage Summary

Concept           Key Points
Disk Structure    Disks consist of platters, tracks, sectors, and cylinders.
Disk Attachment   Can be local (SATA, NVMe) or network-based (NAS, SAN).
Disk Scheduling   Optimizes access order to reduce seek time.
RAID              Uses redundancy and striping for performance and reliability.

Key Takeaways
  • Disk performance depends on seek time, rotational latency, and transfer time.

  • Disk scheduling algorithms balance fairness and efficiency.

  • RAID improves reliability through redundancy (mirroring or parity) and performance through striping.