****************************************
Chapter 10: Mass-Storage Structure
****************************************

Overview
===========

Mass storage systems provide long-term, non-volatile storage for large amounts of data.
The most common form of secondary storage is the **magnetic disk**, which offers direct access to data through read/write heads.

The **operating system (OS)** manages mass storage by:

- Organizing disks into logical units
- Scheduling disk access efficiently
- Ensuring reliability and data protection

------------------------------
Disk Structure and Attachment
------------------------------

**Disk Structure**

- A magnetic disk consists of:

  - **Platters:** Circular disks coated with magnetic material.
  - **Tracks:** Concentric circles on the platter surface.
  - **Sectors:** Subdivisions of each track, typically 512 bytes or 4 KB.
  - **Cylinders:** Set of tracks located at the same position on all platters.

**Disk Access**

- To read or write data, the disk head must:

  1. Move to the correct track (**seek time**)
  2. Wait for the correct sector to rotate under the head (**rotational latency**)
  3. Transfer data (**transfer time**)

**Disk Performance Metrics**

- **Seek time:** Time to move the arm to the desired track.
- **Rotational latency:** Delay waiting for the sector to rotate into position.
- **Transfer time:** Time to move data between disk and memory.
- **Access time = Seek time + Rotational latency + Transfer time**

**Disk Attachment**

- **Host-Attached Storage:** Connected directly to the computer via interfaces such as SATA, SCSI, or NVMe.
- **Network-Attached Storage (NAS):** Provides file-level access over a network.
- **Storage Area Network (SAN):** Provides block-level access via high-speed networks (e.g., Fibre Channel, iSCSI).

------------------------------
Disk Scheduling
------------------------------

Disk scheduling determines the order in which disk I/O requests are processed.
Goals include minimizing seek time and improving system throughput.

**Common Disk Scheduling Algorithms**

1. **FCFS (First-Come, First-Served)**

   - Processes requests in the order they arrive.
   - Simple and fair, but can lead to long average seek times.

2. **SSTF (Shortest Seek Time First)**

   - Selects the request closest to the current head position.
   - Reduces seek time but may cause **starvation** of far-away requests.

3. **SCAN (Elevator Algorithm)**

   - Disk arm moves in one direction servicing requests, then reverses.
   - Provides better response uniformity and prevents starvation.

4. **C-SCAN (Circular SCAN)**

   - Like SCAN, but the arm returns to the beginning without servicing requests on the way back.
   - Provides more uniform wait times.

5. **LOOK and C-LOOK**

   - Variants of SCAN/C-SCAN that reverse direction or reset only after the last request rather than the disk end.

**Comparison of Disk Scheduling Algorithms**

.. list-table:: Disk Scheduling Algorithm Comparison
   :widths: 20 60
   :header-rows: 1

   * - **Algorithm**
     - **Characteristics / Notes**
   * - FCFS
     - Simple, fair, but high average seek time.
   * - SSTF
     - Selects nearest request; may cause starvation.
   * - SCAN
     - Moves back and forth like an elevator; fairer.
   * - C-SCAN
     - Circular movement; provides uniform response.
   * - LOOK
     - SCAN variant that stops at last request.
   * - C-LOOK
     - C-SCAN variant that jumps back after last request only.

------------------------------
RAID Structure
------------------------------

**RAID (Redundant Array of Independent Disks)** improves performance and reliability by combining multiple physical disks into a single logical unit.

**Goals of RAID**

- **Performance:** Parallel data access increases throughput.
- **Reliability:** Redundancy provides fault tolerance.
- **Capacity:** Combines multiple disks into a larger virtual volume.

**RAID Implementation Techniques**

- **Data Striping:** Distributes data across multiple disks to improve performance.
- **Mirroring:** Duplicates data on two or more disks for redundancy.
- **Parity:** Stores error correction information that can rebuild data after a disk failure.

**Common RAID Levels**

.. list-table:: RAID Levels Summary
   :widths: 15 70
   :header-rows: 1

   * - **Level**
     - **Description / Features**
   * - RAID 0
     - Data striping with no redundancy; best performance, no fault tolerance.
   * - RAID 1
     - Mirroring; data duplicated on two disks; high reliability, lower usable capacity.
   * - RAID 5
     - Block-level striping with distributed parity; good balance between performance and redundancy.
   * - RAID 6
     - Like RAID 5 but with double parity; can survive two disk failures.
   * - RAID 10 (1+0)
     - Combines mirroring and striping; high speed and reliability but high cost.

**Advantages of RAID**

- Faster data access through parallelism.
- Improved fault tolerance via redundancy.
- Scalable storage capacity.

**Disadvantages**

- Increased hardware cost.
- Complexity in management and recovery.
- Parity overhead in some configurations.

------------------------------
Summary
------------------------------

.. list-table:: Mass Storage Summary
   :widths: 25 60
   :header-rows: 1

   * - **Concept**
     - **Key Points**
   * - Disk Structure
     - Disks consist of platters, tracks, sectors, and cylinders.
   * - Disk Attachment
     - Can be local (SATA, NVMe) or network-based (NAS, SAN).
   * - Disk Scheduling
     - Optimizes access order to reduce seek time.
   * - RAID
     - Uses redundancy and striping for performance and reliability.

**Key Takeaways**

- Disk performance depends on seek time, rotational latency, and transfer time.
- Disk scheduling algorithms balance fairness and efficiency.
- RAID improves reliability through redundancy and performance through striping.
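
As a worked illustration of the access-time formula from the Disk Performance Metrics list, the following sketch adds the three components for a single request. The function name, the parameters, and the sample figures (9 ms seek, 7200 RPM, 100 MB/s) are assumptions chosen for illustration, not measurements of any particular drive. It also assumes the average rotational latency is half a rotation.

```python
def access_time_ms(seek_ms, rpm, transfer_bytes, transfer_rate_bytes_per_s):
    """Estimate access time = seek time + rotational latency + transfer time.

    All figures are illustrative assumptions supplied by the caller.
    """
    # One full rotation takes 60,000 ms / RPM; on average the target
    # sector is half a rotation away from the head.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    # Time to move the requested bytes at the sustained transfer rate.
    transfer_ms = transfer_bytes / transfer_rate_bytes_per_s * 1000
    return seek_ms + rotational_latency_ms + transfer_ms


# Hypothetical drive: 9 ms average seek, 7200 RPM, 100 MB/s, one 4 KB sector.
estimate = access_time_ms(9.0, 7200, 4096, 100e6)
print(f"Estimated access time: {estimate:.2f} ms")
```

With these assumed numbers the seek and rotational components dominate; the transfer of a single 4 KB sector contributes only a small fraction of a millisecond.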
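
The scheduling policies from the Disk Scheduling section can be compared with a small simulation. The sketch below is a simplified model, not a production scheduler: it assumes all requests are already queued, measures cost as cylinders traveled, sweeps toward higher cylinder numbers first, and counts the C-LOOK return jump as head movement. The function names, the request queue, and the starting head position are illustrative assumptions.

```python
def fcfs(requests, head):
    """FCFS: service requests in arrival order."""
    return list(requests)


def sstf(requests, head):
    """SSTF: repeatedly pick the pending request nearest the head."""
    pending = list(requests)
    order = []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def look(requests, head):
    """LOOK: sweep upward to the last request, then reverse."""
    upper = sorted(c for c in requests if c >= head)
    lower = sorted((c for c in requests if c < head), reverse=True)
    return upper + lower


def c_look(requests, head):
    """C-LOOK: sweep upward, jump back to the lowest request, sweep upward again."""
    upper = sorted(c for c in requests if c >= head)
    lower = sorted(c for c in requests if c < head)
    return upper + lower


def total_movement(order, head):
    """Total number of cylinders the head travels for a service order."""
    total = 0
    for cyl in order:
        total += abs(cyl - head)
        head = cyl
    return total


# Illustrative request queue (cylinder numbers) and starting head position.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53

for name, policy in [("FCFS", fcfs), ("SSTF", sstf), ("LOOK", look), ("C-LOOK", c_look)]:
    order = policy(queue, start)
    print(f"{name:7s} order={order} movement={total_movement(order, start)}")
```

For this queue the model reports 640 cylinders for FCFS, 236 for SSTF, 299 for LOOK, and 322 for C-LOOK. SSTF travels least here, but it offers no protection against starvation. SCAN and C-SCAN would add the extra travel to the disk edge, which needs the disk's cylinder count as a further parameter.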
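
To make the parity technique from the RAID section concrete, the sketch below uses byte-wise XOR, the operation commonly used for single parity, to rebuild one lost block from the survivors. It is a toy model under stated assumptions: each "disk" holds one equal-length block, parity lives on a dedicated block rather than being distributed as in RAID 5, and the helper name ``xor_blocks`` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per (hypothetical) disk, plus a parity block.
data = [b"disk", b"raid", b"xor!"]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # the lost block is recovered
```

Recovery works because XOR-ing a value with itself cancels it out, so combining the parity with every surviving block leaves only the missing one. A single parity block can recover from one failure only, which is why RAID 6 keeps a second, independent parity.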