****************************************
Chapter 10: Mass-Storage Structure
****************************************

Overview
===========

Mass-storage systems provide long-term, nonvolatile storage for large amounts of data.
A traditional and still widely used form of secondary storage is the **magnetic disk**, whose read/write heads give direct (random) access to any sector.

The **operating system (OS)** manages mass storage by:

- Organizing disks into logical units
- Scheduling disk access efficiently
- Ensuring reliability and data protection

------------------------------
Disk Structure and Attachment
------------------------------

**Disk Structure**
  - A magnetic disk consists of:

    - **Platters:** Circular disks coated with magnetic material.
    - **Tracks:** Concentric circles on the platter surface.
    - **Sectors:** Subdivisions of each track, typically 512 bytes or 4 KB.
    - **Cylinders:** The set of tracks at the same arm position across all platter surfaces.
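
The way these components nest can be illustrated with a capacity calculation. This is a minimal sketch under a simplifying assumption: every track holds the same number of sectors, which does not hold for drives that place more sectors on outer tracks. The function name ``disk_capacity_bytes`` and the geometry values are hypothetical.

.. code-block:: python

   def disk_capacity_bytes(cylinders, heads, sectors_per_track, bytes_per_sector=512):
       """Capacity of a disk with a fixed (idealized) geometry.

       Each cylinder contains one track per head (platter surface),
       and each track contains `sectors_per_track` sectors.
       """
       tracks = cylinders * heads
       sectors = tracks * sectors_per_track
       return sectors * bytes_per_sector

   # Hypothetical geometry: 16383 cylinders, 16 heads, 63 sectors per track.
   capacity = disk_capacity_bytes(16383, 16, 63)
   print(f"{capacity / 10**9:.2f} GB")

The default of 512 bytes per sector matches the smaller of the two sector sizes listed above; pass ``bytes_per_sector=4096`` for 4 KB sectors.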

**Disk Access**
  - To read or write data, the disk head must:

    1. Move to the correct track (**seek time**)
    2. Wait for the correct sector to rotate under the head (**rotational latency**)
    3. Transfer data (**transfer time**)

**Disk Performance Metrics**
  - **Seek time:** Time to move the arm to the desired track.
  - **Rotational latency:** Delay waiting for the sector to rotate under the head; on average, half a rotation.
  - **Transfer time:** Time to move data between disk and memory.
  - **Access time = Seek time + Rotational latency + Transfer time**
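
The access-time formula can be turned into a quick estimate. The following is a minimal sketch, not a model of any particular drive: the function name ``estimate_access_time_ms`` and its parameters are illustrative, and it assumes that the average rotational latency is half a rotation and that the transfer rate is constant.

.. code-block:: python

   def estimate_access_time_ms(seek_ms, rpm, transfer_mb_per_s, request_kb):
       """Estimate the average access time, in milliseconds, for one request."""
       rotation_ms = 60_000 / rpm               # time for one full rotation
       rotational_latency_ms = rotation_ms / 2  # assumption: half a rotation on average
       transfer_ms = (request_kb / 1024) / transfer_mb_per_s * 1000
       return seek_ms + rotational_latency_ms + transfer_ms

   # Hypothetical drive: 9 ms average seek, 7200 RPM, 100 MB/s, 4 KB request.
   print(round(estimate_access_time_ms(9, 7200, 100, 4), 2))  # prints 13.21

For a small request like this one, seek time and rotational latency dominate; the transfer itself takes about 0.04 ms.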

**Disk Attachment**
  - **Host-Attached Storage:** Connected directly to the computer via interfaces such as SATA, SCSI, or NVMe.
  - **Network-Attached Storage (NAS):** Provides file-level access over a network.
  - **Storage Area Network (SAN):** Provides block-level access via high-speed networks (e.g., Fibre Channel, iSCSI).

------------------------------
Disk Scheduling
------------------------------

Disk scheduling determines the order in which disk I/O requests are processed.  
Goals include minimizing seek time and improving system throughput.

**Common Disk Scheduling Algorithms**

1. **FCFS (First-Come, First-Served)**

   - Processes requests in the order they arrive.
   - Simple and fair, but can lead to long average seek times.

2. **SSTF (Shortest Seek Time First)**

   - Selects the request closest to the current head position.
   - Reduces seek time but may cause **starvation** of far-away requests.

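3. **SCAN (Elevator Algorithm)**

   - The disk arm moves in one direction, servicing requests along the way, then reverses at the end of the disk.
   - Gives more uniform response times than SSTF and avoids starvation, because every pending request is reached within one sweep in each direction.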

4. **C-SCAN (Circular SCAN)**

   - Like SCAN, but after reaching the end the arm returns to the beginning without servicing requests on the way back.
   - Provides more uniform wait times than SCAN, because requests are always serviced in the same direction.

5. **LOOK and C-LOOK**

   - Variants of SCAN and C-SCAN: LOOK reverses, and C-LOOK jumps back, at the last pending request in the current direction instead of travelling to the end of the disk.
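
The algorithms above can be compared by total head movement on one request queue. The sketch below is illustrative only: the function names, the request queue, the starting cylinder (53), and the 200-cylinder disk are assumptions, and the sweeping algorithms are assumed to start by moving toward higher cylinder numbers. Each function returns the sequence of head positions, including any edge of the disk the arm travels to.

.. code-block:: python

   def total_movement(start, path):
       """Total cylinders travelled when the head visits `path` in order."""
       total, pos = 0, start
       for cyl in path:
           total += abs(cyl - pos)
           pos = cyl
       return total

   def fcfs(start, requests):
       return list(requests)  # service in arrival order

   def sstf(start, requests):
       pending, pos, path = list(requests), start, []
       while pending:
           nearest = min(pending, key=lambda c: abs(c - pos))
           pending.remove(nearest)
           path.append(nearest)
           pos = nearest
       return path

   def _split(start, requests):
       """Requests at or above the head (ascending) and below it (ascending)."""
       upper = sorted(c for c in requests if c >= start)
       lower = sorted(c for c in requests if c < start)
       return upper, lower

   def scan(start, requests, max_cyl):
       upper, lower = _split(start, requests)
       # Assumption: the arm goes to the last cylinder only if requests remain behind it.
       edge = [max_cyl] if lower else []
       return upper + edge + lower[::-1]

   def c_scan(start, requests, max_cyl):
       upper, lower = _split(start, requests)
       # Travel to the last cylinder, return to cylinder 0, then sweep upward again.
       wrap = [max_cyl, 0] if lower else []
       return upper + wrap + lower

   def look(start, requests):
       upper, lower = _split(start, requests)
       return upper + lower[::-1]  # reverse at the last request

   def c_look(start, requests):
       upper, lower = _split(start, requests)
       return upper + lower        # jump back to the lowest pending request

   queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical request queue
   head, last = 53, 199                         # hypothetical start and last cylinder

   for name, path in [
       ("FCFS", fcfs(head, queue)),
       ("SSTF", sstf(head, queue)),
       ("SCAN", scan(head, queue, last)),
       ("C-SCAN", c_scan(head, queue, last)),
       ("LOOK", look(head, queue)),
       ("C-LOOK", c_look(head, queue)),
   ]:
       print(f"{name:7s}{total_movement(head, path):4d} cylinders")

On this queue FCFS moves the head 640 cylinders and SSTF 236. The return seek of C-SCAN and C-LOOK is counted as ordinary movement here, although it services no requests. The numbers depend heavily on the queue and the initial direction, so they should not be read as a general ranking.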

**Comparison of Disk Scheduling Algorithms**

.. list-table:: Disk Scheduling Algorithm Comparison
   :widths: 20 60
   :header-rows: 1

   * - **Algorithm**
     - **Characteristics / Notes**
   * - FCFS
     - Simple, fair, but high average seek time.
   * - SSTF
     - Selects nearest request; may cause starvation.
   * - SCAN
     - Moves back and forth like an elevator; fairer than SSTF.
   * - C-SCAN
     - Circular movement; provides uniform response.
   * - LOOK
     - SCAN variant that reverses at the last request in each direction.
   * - C-LOOK
     - C-SCAN variant that jumps back after last request only.

------------------------------
RAID Structure
------------------------------

**RAID (Redundant Array of Independent Disks)** combines multiple physical disks into a single logical unit to improve performance, reliability, or both, depending on the RAID level.

**Goals of RAID**
  - **Performance:** Parallel data access increases throughput.
  - **Reliability:** Redundancy provides fault tolerance.
  - **Capacity:** Combines multiple disks into a larger virtual volume.

**RAID Implementation Techniques**
  - **Data Striping:** Distributes data across multiple disks to improve performance.
  - **Mirroring:** Duplicates data on two or more disks for redundancy.
  - **Parity:** Stores error correction information that can rebuild data after a disk failure.
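
Parity-based recovery can be illustrated with byte-wise XOR, the operation commonly used for single-parity schemes. This is a minimal sketch: the helper ``xor_blocks`` and the block contents are hypothetical, and real arrays operate on fixed-size stripes rather than short byte strings.

.. code-block:: python

   from functools import reduce

   def xor_blocks(blocks):
       """Byte-wise XOR of equal-length blocks."""
       return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

   # Three data blocks, one per disk (hypothetical contents).
   data = [b"disk", b"raid", b"unit"]
   parity = xor_blocks(data)  # stored on a fourth disk

   # Simulate losing the second disk, then rebuild its block from the survivors.
   rebuilt = xor_blocks([data[0], data[2], parity])
   print(rebuilt == data[1])  # prints True

XOR-ing the parity with all surviving blocks cancels them out and leaves the missing block, which is why a single parity block can cover one failed disk but not two.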

**Common RAID Levels**

.. list-table:: RAID Levels Summary
   :widths: 15 70
   :header-rows: 1

   * - **Level**
     - **Description / Features**
   * - RAID 0
     - Data striping with no redundancy; best performance, no fault tolerance.
   * - RAID 1
     - Mirroring; data duplicated on two or more disks; high reliability, lower usable capacity.
   * - RAID 5
     - Block-level striping with distributed parity; good balance between performance and redundancy.
   * - RAID 6
     - Like RAID 5 but with double parity; can survive two disk failures.
   * - RAID 10 (1+0)
     - Combines mirroring and striping; high speed and reliability but high cost.
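
The capacity trade-off between these levels can be made concrete. The sketch below is a simplification under stated assumptions: all disks are identical, RAID 1 mirrors across every disk in the array, RAID 10 uses two-way mirrored pairs, and minimum disk counts are not validated. The function name ``usable_capacity`` is hypothetical.

.. code-block:: python

   def usable_capacity(level, n_disks, disk_size):
       """Usable capacity of an array of identical disks (same unit as disk_size)."""
       if level == "0":
           return n_disks * disk_size         # striping only, nothing reserved
       if level == "1":
           return disk_size                   # every disk holds a full copy
       if level == "5":
           return (n_disks - 1) * disk_size   # one disk's worth of parity
       if level == "6":
           return (n_disks - 2) * disk_size   # two disks' worth of parity
       if level == "10":
           return (n_disks // 2) * disk_size  # assumption: two-way mirrored pairs
       raise ValueError(f"unsupported RAID level: {level}")

   # Four hypothetical 2 TB disks (8 TB raw).
   for level in ("0", "1", "5", "6", "10"):
       print(f"RAID {level}: {usable_capacity(level, 4, 2)} TB usable")

With four disks, RAID 6 and RAID 10 both leave half the raw capacity usable; RAID 6 survives any two failures, while RAID 10 survives two only if they fall in different mirrored pairs.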

**Advantages of RAID**
  - Faster data access through parallelism.
  - Improved fault tolerance via redundancy.
  - Scalable storage capacity.

**Disadvantages**
  - Increased hardware cost.
  - Complexity in management and recovery.
  - Parity overhead in some configurations.

------------------------------
Summary
------------------------------

.. list-table:: Mass Storage Summary
   :widths: 25 60
   :header-rows: 1

   * - **Concept**
     - **Key Points**
   * - Disk Structure
     - Disks consist of platters, tracks, sectors, and cylinders.
   * - Disk Attachment
     - Can be local (SATA, NVMe) or network-based (NAS, SAN).
   * - Disk Scheduling
     - Optimizes access order to reduce seek time.
   * - RAID
     - Uses striping for performance and redundancy (mirroring or parity) for reliability.

**Key Takeaways**
  - Disk access time is the sum of seek time, rotational latency, and transfer time.
  - Disk scheduling algorithms balance fairness and efficiency.
  - RAID improves reliability through redundancy (mirroring or parity) and performance through striping; RAID 0 stripes without any redundancy.