# File-System Structure

I/O 在 memory 和 disk 之間的轉移以 block 為單位
- 一個 block 由一個或多個 sector 組成
- 一個 sector 通常 512 bytes
2 個 design problems in FS
- interface to user programs（上層）
- interface to physical storage （disk）（下層）
一個 OS 能支援多個 FS type
- NTFS, FAT32

# File System Implementation

# On-Disk Structure

Boot control block：開啟 OS 所需要的資訊
- 每個 partition 都有，位於 partition 的第一個 block（空的表示沒有 OS）
- UFS（Unix File Sys.）：boot block
- NTFS：partition boot sector
Partition control block：partition details
- 每個 partition 都有
- details：blocks 數量、block 大小、free-block-list、free FCB pointers
- UFS：superblock
- NTFS：Master File Table
File control block：file details
- 每個 file 都有
- details：權限、大小、location of data blocks（存在 physical 的哪裡）
- UFS：inode
- NTFS：stored in MFT (relational database)
Directory structure：organize files
- 每個 file system 都有

# In-Memory Structure

in-memory partition table：每個被掛載的 partition 的資訊
in-memory directory structure：最近被存取的 directories 的資訊
system-wide open-file table：opened file’s FCB copy
per-process open-file table：pointer（file handler/descriptor）指到 system-wide open-file table 對應的 entry

# File operation step

Create
1. OS allocate 一個新的 FCB
2. 更新 directory structure
  1. OS 讀進對應的目錄到 memory，檢查有沒有一樣的名字
  2. 更新 directory structure
  3. file 被 closed 之後 OS 將 directory structure 寫回 disk
3. file 出現在 user’s dir command
Open
1. 根據 path 去 traverse directory structure
2. 沒找到的話，on demand 去 disk load directory 進來，找到我們要的 leaf node，其指到該 fcb
Read
1. pass matadata 的 permission 後，per-process 會記目前的 pos

# Directory Implementation

Linear list - <file name, pointer to data block> 組成的 list

簡單的實作，但是效能差

Hash table – linear list w/ hash data structure
- searching 常數時間
- linked list for collisions on a hash entry
- 通常有固定大小的 entry 數

# Disk Allocation Methods

disk blocks 如何分配給 files？分配磁碟的三種方法：

連續 (contiguous)
鏈結 (linked)
索引 (index)

# 連續性配置 (Contiguous allocation)

若檔案大小為 n 個 blocks，則 OS 必須在 disk 上找到 $\geq n$ 個連續 free blocks 才能配置

記錄方式：File Name, Size, Starting Block #
優點：
- 可支援 sequential access 及 random access
- seek time 通常較短 (檔案所佔的連續區塊大都會落在同一個 track 上)
缺點：
- external fragmentation
  解法 - compaction
- 檔案大小無法任意增 / 刪
  解法 - extend-based FS
- 若檔案大小是緩慢增加的，將會有配置問題
  - 原先配置固定大小
  - 配置額外的連續部分，與原位置鍊結

# extent-based FS

這個方法用 extent 分配 disk blocks

如果用完了給他的 extent，再給他另一個 extent，用 linked list 串起來
許多新的 FS 使用 modified contiguous allocation scheme
若跳開，需要讀前一個 extent 的最尾的 extent block 中的 extent pointer（放 pointer 的位置由 designer 決定）
- 加總要 access 2 次
extent：連續的 disks blocks
You can’t use ‘macro parameter character #’ in math mode
缺點
- 隨機存取 costly
- 可能會發生 internal & external fragmentation

# 連結配置 (Link allocation)

若檔案大小為 n 個區塊，則 OS 必須在 disk 上找到 $\geq n$ 個可用區塊即可配置 (不一定要連續)，各配置區塊之間以 link list 做串連

記錄方式：File Name, Start Block #, End Block #
優點：
- 沒有 external fragmentation
- 檔案可任意增 / 刪空間建檔時無需事先宣告大小
缺點：
- 不支援 random access 只支援 sequential access，要讀某個檔案的某個區塊時需從頭找起
- seek time 比 contiguous 配置方式長，非連續性 blocks 可能散落在不同的 track
- Reliability 差，指標可能指錯，若連結斷裂則會資料丟失

# FAT (File Allocation Table) file system

類似鏈結配置，表格為每個磁碟區段保有一份紀錄
造成大量磁頭尋找 (FAT 一次、實際位置一次)

# 索引式配置 (Index Allocation)

將指標全部放在索引區段 (index block)，每個檔案皆會多配置一些額外的 blocks 作為 index blocks 內含各配置的 data block 編號，採非連續性配置方式

記錄方式：File Name, Index Block #
優點：
- 沒有 external fragmentation（linked list 的優點）
- 支援 random access 及 sequential access（contiguous 的優點）
- 可任意增 / 刪空間
- 建檔時無需事先宣告大小
缺點：
- 佔用額外空間：索引區塊佔用 free block 的空間
- index block 太小不夠存放一個檔案的所有 block 編號，太大又形成浪費

解決 index block 不夠大的三個方法：

鏈結方式（linked scheme）：索引區段鍊結，提供更多空間

index block 裡面放 pointer 指向其他 index blocks

缺點是資料區塊 IO 隨大小線性成長，無法 random access

多層索引（Multilevel index）：第一層索引再指向第二層索引

僅適用大檔案，小檔案不適合 (似 2-level cache table)

當存到小檔案時，有可能 index 的大小會大於檔案

組合方式（combined（inode in BSD UNIX））：直接指標 + 單一間接指標 + 雙重間接指標 + 三重間接指標

# Free-Space Management

可用空間串列 (free-space list)：紀錄可用磁碟空間
位元映像 (bit map)/ 位元向量 (bit vector)
1
2
bit[i] = 1// occupied
bit[i] = 0// free
- Bit Vector（bitmap）: 每個 block 用一個 bit 表示
- 優點：簡單、效率、硬體支援 bit-manipulation instruction
- 缺點：bitmap 為了效能必須快取
  A 1-TB（4KB block） disk needs 32MB bitmap
鏈結串列：將所有可用連結區鍊結在一起
組群 (Grouping)：將前 n 個可用區段的位址存放在第一個區段中，最後一個存放另外 n 個可用區段位址
計數 (Counting)：利用一般情況下許多連續區段同時被分配或同時被釋放的事實，只需記住第一個的位址與後面 n 個連續區段的數量

# Ref Notes

Kipper’s Note
Chang-Chia-Chi’s Note
陳品媛的筆記
Mr. Opengate. OS - Ch10
WillyWangkaa’s Note