本文是我在上UCSD的 CSE 120: Principles of Operating Systems (Winter 2020) 整理的笔记,这一课主要介绍了操作系统里面文件系统以及文件的概念,包括文件系统的组成以及访问具体文件/文件夹的方法。
Basic
Preview
File: logical unit of storage, container of data
- Accessed by (name, region within file)
File System: a structured collection of files
- Access control, name space, persistent storage
File System Abstraction
Repository of objects
- Objects are data, programs for system and users
- Objects referenced by name, to be read/written
More than a repository
- Objects can be r/w, protected, shared, locked
- Contains I/O devices: disk, keyboard, display
- Processes: memory
Pesistent: remains “forever”
- Large: “unlimited” size
- Sharing: controlled access
- Security: protecting information
Hierarchical File Name Space
Name space is organized as a tree
- Name has components, branches start from root
- No size restrictions
- Intuitive for users
Example: UNIX “Pathnames”
- Absolute: /a/b/c
- Relative: b/c relative to /a
- Not strictly a tree: links
File
Attributes
- Type (recognized by system or users)
- Times: creation, accessed, modified
- Sizes: current size, (maximum size)
- Access control (permissions)
Operations
- Creation: create, delete
- Prepare for access: open, close, mmap
- Access: read, write
- Search: move to location
- Attributes: get, set(e.g., permissions)
- Mutual exclusion: lock, unlock
- Name management: rename
Read/Write model
- (file descriptor)fd = open(fname, usage)
- nr = read(fd, buf, size)
- nw = write(fd, buf, size)
close
Memory-mapped model
Map file into address space
- mmap(addr, n, …, fd, …)
- addr = mmap(NULL, n, …, fd, …)
Use memory ops
- x = addr[5]
- strcpy(addr, “hello”)
Issues
- Efficient for multiple process sharing memory
- If memory is written, how is file actually updated?
Access Control
- Who can access file
- What operations are allowed
- User interface must be simple and intuitive
Example: Unix
- r/w/x permissions for owner, group and everyone
File System Implementation
Goals
Archival storage
- Keep forever, including previous versions
Support various storage technologies
- Disks (different types), remote disks, …
How to best achieve and balance:
- Performance
- Reliability
- Security
Storage Abstraction
Hide complexity of device
- Model as array of blocks of data
- Randomly addressable by block number
Typical block size: 1KB (also 4KB ~ 64KB)
- Generally multiple of disk sector size: 512B
Simple interface
- read(block_num, mem_addr)
- write(block_num, mem_addr)
Typical Implementation Structure
- Three major regions: Sequence of blocks for each one
Region 1: File System Metadata
- Information about file system
Sizes
- Files in use, free entries
- Data blocks in use, free entries
Free lists (or bitmaps)
- File control blocks
- Data blocks
Region 2: File Metadata (File Control Blocks)
- Information about a file
- Referenced by number/index
Contains
- Attributes: type, size, permissions,…
- References to data blocks: disk block map
Note: many file control blocks may fit in single storage block
Example:
- Number: 88 (index in file control block array)
- Size: 4096 bytes
- Permissions: rw-r—r—
- Data blocks: set of indexes into storage array, may not be contiguous (such as 567, 7076, 9201)
Region 3: Data Blocks
- File contents
File control blocks
Keeping track of allocated blocks
Contiguous blocks
- Single sequence of blocks
Extents
- Groups of contiguous blocks
Non-contiguous blocks
- Blocks individually named
Keeping track of free blocks
Free Block Map
- Compact if lots of free regions of space
Doubly Linked List
- Easy to keep ordered due to fast inserts and deletes
Bit Map
- Fixed size regardless of fragmentation
File Name to File Control Block
- Users access files using file names
Problem: how to translate
- from file name: “/sports/baseball/Padres”
- to file control block number: 88
Must parse file name
- Each branch corresponds to a directory/folder
- Each directory/folder may itself be a file
Example: UNIX v.7 Block Map
Block Map UNIX v.7
- Array of pointers to data blocks
13 Pointers
- 10 direct: references 10 data blocks
- 1 singly-indirect: references $n$ data blocks
- 1 doubly-indirect: reference $n^2$ data blocks
- 1 triply-indirect: reference $n^3$ data blocks
$n$ depends on how many pointers fit in a block
Example: 256 4-byte pointers will fit in 1KB block
Implementing UNIX Directories
Table where each entry contains
- name and attributes
- name and pointer to file control structure
Unix (name and pointer) - pre-BSD
- Each entry: branch name (14), i-node number (2)
- Berkeley Unix uses a more complex scheme to support long names
Example of parsing names in UNIX
Given pathname: /sports/baseball/Padres
- Inode 0 block map points to data block(s) of root directory
- Look up “sports” in root directory to get inode 22
- Inode 22 blocks map points to data block(s) of sports directory
- Look up “baseball” in sports directory to get inode 15
…
Storage
File Systems use disks for storage
- pros: persistent, random access, cheap
- cons: slow (mechanical)
Performance
Accesses are time expensive: 5 ~ 20 msec
- Rotational latency: 2 ~ 6 msec (5200 ~ 15000 RPM)
- Seek time: 3 ~ 13 msec
- Transfer rate: 100+ MB/sec
Reduced accesses by
- reading multiple blocks in one access (read ahead)
- maintaining a block cache
Cluster related blocks to reduce seek time
Solid State Drives (SSD)
- NAND-based flash memory, non-volatile
- Unaffected by shock, magnetic fields; no noise
- Limited number of writes, wears out with age
Performance
Caching
- Data blocks of files
- File system metadata (keep in memory)
File metadata
- Currently active file
- Recently used
Block maps
File names
- Name to file metadata translations
Clustering
Blocks that exhibit locality of reference
- Directory, and files within that directory
- The inodes of the directory and files
Strategy
- Place related blocks close to each other: clustering
- Reduces disk head movement and seek time
Block size
trade off
- the larger the block, the better the throughput
- The smaller the block, the less wasted space
technology trend
- Disk density is increasing faster than disk speed
- Make disk blocks larger: 1KB $\rightarrow$ 8KB, 64KB, 1MB
Reliability
Consistency
- Buffer cache reduces disk access
- If system crashes, block modifications lost
To improve file system consistency
- write out modified blocks periodically
- write out critical blocks
Critical blocks: file system meta-data
- Directories, i-nodes, free block lists
Journaling
- Journal: log of file (or file system) updates
- For every update, create log entry
- Write log entry out to disk (part of journal)
If crash occurs:
- Look at journal entries
- Check if mods properly reflected in file system
- Update appropriately