1.0 List the characteristics of block and character devices
- What are Block Devices?
- They store a set of numbered, fixed-size blocks (e.g., a hard disk).
- What are Character Devices?
- They produce or consume a stream of bytes (e.g., a keyboard).
1.1 Describe the hardware interfaces of I/O devices
- What are the two important components of I/O devices?
- Hardware interface.
- Internals.
- What is the hardware interface?
- The hardware interface lets the system software control the device's operation.
- It exposes registers that report the status of the device, accept commands to execute (read/write operations), and transfer data.
- What 3 registers are included in the hardware interface?
- status - holds the current status of the device.
- command - tells the device to perform a task.
- data - passes data to, or gets data from, the device.
- How does an operating system control device behavior?
- By reading and writing the above three registers.
- What are the internal components?
- The internals are implementation-specific and typically include:
- A micro-controller (CPU)
- Memory (DRAM/SRAM)
- Other hardware chips.
- The internals are hidden from the OS, which sees only the hardware interface.
1.2 Explain how the OS interacts with I/O devices.
- What two ways can you read/write to registers?
- Explicit I/O Instructions
- Memory mapped I/O
- What are explicit I/O instructions?
- The OS uses privileged instructions (e.g., in and out on x86) to read and write specific device registers directly.
- What is memory mapped I/O?
- The device's registers are made to look like memory locations.
- The OS can simply read from and write to those addresses with ordinary loads and stores.
- The hardware routes accesses to those addresses to the device instead of to main memory.
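The routing idea can be sketched as a toy model in Python. Everything here is hypothetical and only for illustration: the `Bus` and `ToyDevice` classes, the base address `0x1000`, and the three-register layout are assumptions, not a real hardware interface.

```python
class ToyDevice:
    """A hypothetical device with three registers: 0=status, 1=command, 2=data."""

    def __init__(self):
        self.registers = {0: 0, 1: 0, 2: 0}   # register offset -> value


class Bus:
    """Toy address decoder: some addresses go to the device, the rest to RAM."""

    def __init__(self, device, device_base=0x1000, device_size=3):
        self.ram = {}
        self.device = device
        self.device_base = device_base
        self.device_size = device_size

    def _is_device(self, addr):
        return self.device_base <= addr < self.device_base + self.device_size

    def store(self, addr, value):
        if self._is_device(addr):
            # The access never reaches RAM; it lands in a device register.
            self.device.registers[addr - self.device_base] = value
        else:
            self.ram[addr] = value

    def load(self, addr):
        if self._is_device(addr):
            return self.device.registers[addr - self.device_base]
        return self.ram.get(addr, 0)


dev = ToyDevice()
bus = Bus(dev)
bus.store(0x0010, 42)   # ordinary memory write
bus.store(0x1001, 7)    # same kind of store, but it reaches the command register
```

From the software's point of view both stores look identical; only the address decides where the value ends up.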
- How does the OS abstract I/O operations?
- Through device drivers: each driver encapsulates the SPECIFICS of interacting with its device, so the rest of the OS can stay device-neutral.
- What are the main problems with encapsulating I/O Operations?
- Every device needs its own driver, so the kernel carries a lot of driver code, and much of it goes unused on any given machine because the matching devices are not attached.
- Driver code is a main contributor to kernel crashes and bugs.
- What is polling? Why is it bad? What is an alternative?
- Polling is when the OS repeatedly reads the device's status register, waiting until the device is no longer busy or in use before it performs an action. It is bad because the CPU wastes cycles spinning while it waits. An alternative is to use interrupts: put the waiting process to sleep and context switch to another process.
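A minimal sketch of a polling loop, written in Python against a simulated device. The `ToyDevice` class, its method names, and the fixed number of busy reads are hypothetical stand-ins for real status, command, and data registers.

```python
BUSY, READY = 1, 0


class ToyDevice:
    """Hypothetical device: after a command it reports BUSY for `busy_reads` status reads."""

    def __init__(self, busy_reads=3):
        self.busy_reads = busy_reads
        self._remaining = 0
        self.command = None
        self.data = None
        self.completed = []          # log of (command, data) pairs the device finished

    def _finish(self):
        self.completed.append((self.command, self.data))
        self.command = None

    def read_status(self):
        if self._remaining == 0:
            return READY
        self._remaining -= 1
        if self._remaining == 0:
            self._finish()
        return BUSY

    def write_data(self, value):
        self.data = value

    def write_command(self, command):
        self.command = command
        self._remaining = self.busy_reads
        if self._remaining == 0:
            self._finish()


def polled_write(device, value):
    """Polling protocol; returns how many status reads were spent spinning."""
    spins = 0
    while device.read_status() == BUSY:   # 1. wait until the device is free
        spins += 1
    device.write_data(value)              # 2. place the data in the data register
    device.write_command("write")         # 3. start the operation
    while device.read_status() == BUSY:   # 4. wait for the operation to finish
        spins += 1
    return spins
```

The value returned by `polled_write` is the wasted work: every one of those status reads is CPU time that another process could have used.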
- Describe what happens during an interrupt of I/O Devices.
- The OS issues the I/O request, puts the calling process (task 1) to sleep, and context switches to another process (task 2). When the device finishes, it raises a hardware interrupt; the CPU jumps to the interrupt handler, which wakes task 1 so it can be scheduled on the CPU again. Overlapping computation with I/O this way keeps both the CPU and the disk utilized.
- When would you use interrupts vs polling?
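- Use interrupts when the device is slow and polling when the device is fast. For a fast device, the cost of the context switch and interrupt handling can exceed the time spent polling.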
1.3 Explain performance characteristics of hard drives.
- What is the speed of data processing in hard drives?
- Even though hard drives are the main form of persistent data storage, they are very slow compared with the CPU and memory, because every access involves mechanical movement.
- What are the components of HDD?
- 1. Platter
- 2. Spindle
- 3. Track
- 4. Disk head
- What is the Platter?
- It is a circular hard surface, usually made of aluminum, coated with a thin magnetic layer. It stores data persistently by inducing magnetic changes in that layer. Each platter has two sides, each called a surface.
- What does the Spindle do?
- It is connected to a motor that spins the platters at a constant rate, typically between 7,200 and 15,000 RPM.
- What are the Track and the Disk head?
- A track is one concentric circle of sectors on a surface; data is encoded in these circles. The disk head reads and writes the data. There is one head per surface of the drive.
1.4 Compute hard drive transfer rates for different workloads.
- What components must you consider to calculate the time taken for I/O operations (Complete I/O time)?
- 1. seek time
- 2. rotational latency
- 3. transfer time
- What is the seek time?
- It is the time to get the disk arm on the right track (few ms)
- This is a very costly operation.
- What are the phases of seek time?
- 1. acceleration - disk arm gets moving
- 2. coasting - arm moves at full speed
- 3. deceleration - arm slows down
- 4. settling - head is carefully positioned over the right track
- The settling time is often significant.
- What is rotational latency?
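- This is the time for the disk to spin until the correct sector on the track is under the head (few ms)
- With a full rotation taking time R, the worst case is nearly R and the average is about R/2.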
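Rotational latency follows directly from the spindle speed. A small sketch in Python; the function names are made up for this example, and the average assumes the target sector is equally likely to be anywhere on the track.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation (R), in milliseconds."""
    return 60_000 / rpm


def average_rotational_latency_ms(rpm):
    """On average the target sector is half a rotation away (R/2)."""
    return rotation_time_ms(rpm) / 2
```

For example, `rotation_time_ms(15000)` gives 4.0 ms for a full rotation, so the average latency of a 15,000 RPM drive is 2.0 ms.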
- What is the transfer time?
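- The time to read or write the sector once it is under the head. This is the fastest of the three: for a single sector it is typically tens of microseconds, not milliseconds.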
- What is a problem that you can encounter?
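- Crossing a track boundary during a sequential read: by the time the head has moved to the next track, the next block has already rotated past it, so the drive has to wait almost a full rotation.
- Track Skew solves this by offsetting the sectors on adjacent tracks, so the next block arrives under the head just after the move.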
- What does the Cache (Track Buffer) contain and do?
- A small amount of memory, typically 8 MB to 16 MB in size
- Holds data read FROM or WRITTEN to disk
- Lets the drive respond quickly to requests
- What are the two ways to handle writes with the cache?
- 1. Write-back (immediate reporting)
- Acknowledges that a write has completed as soon as the data is in the drive's cache memory.
- FAST but DANGEROUS: the data is lost if power fails before it reaches the disk.
- 2. Write-through
- Acknowledges that a write has completed only AFTER the data has been WRITTEN to disk.
- What are the relevant Formulas?
- I/O Time
- T_I/O = T_seek + T_rotation + T_transfer
- Rate of I/O
- R_I/O = Size_transfer / T_I/O
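The two formulas can be worked through for a random and a sequential workload. This is a sketch in Python under assumed drive parameters (4 ms average seek, 15,000 RPM, 125 MB/s peak transfer rate); the numbers and function names are illustrative, not measurements of a specific drive.

```python
def io_time_ms(seek_ms, rpm, size_bytes, transfer_mb_per_s):
    """I/O time = seek + rotation + transfer, all in milliseconds."""
    rotation_ms = (60_000 / rpm) / 2                             # average: half a rotation
    transfer_ms = size_bytes / (transfer_mb_per_s * 1e6) * 1000
    return seek_ms + rotation_ms + transfer_ms


def io_rate_mb_per_s(size_bytes, total_ms):
    """I/O rate = transfer size / I/O time."""
    return (size_bytes / 1e6) / (total_ms / 1000)


# Assumed drive: 4 ms average seek, 15,000 RPM, 125 MB/s peak transfer.
random_ms = io_time_ms(4, 15_000, 4096, 125)              # one random 4 KB read
sequential_ms = io_time_ms(4, 15_000, 100 * 10**6, 125)   # one 100 MB sequential read

print(f"random:     {random_ms:.2f} ms, {io_rate_mb_per_s(4096, random_ms):.2f} MB/s")
print(f"sequential: {sequential_ms:.0f} ms, {io_rate_mb_per_s(100 * 10**6, sequential_ms):.1f} MB/s")
```

Under these assumptions the random workload stays well below 1 MB/s, because seek and rotation dominate the tiny transfer, while the sequential workload runs close to the peak transfer rate.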
1.5 Describe the operation of some I/O schedulers.
- What performs the disk scheduling operation?
- The disk scheduler. Requests are not served in FIFO order; instead the scheduler picks an order that minimizes seek time and rotational delay.
- What are 3 methods of I/O Scheduling?
- 1. SSTF (Shortest Seek Time First)
- 2. SCAN or C-SCAN (Elevator)
- 3. SPTF (Shortest Positioning Time First)
- What is the SSTF (Shortest Seek Time First) method?
- Pick requests on the nearest track to complete first.
- What are the problems associated with SSTF?
- 1. Starvation of far away positions that may never get served.
- 2. Drive geometry is not available to the host OS.
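A minimal sketch of SSTF ordering in Python. It assumes requests are plain track numbers and that the scheduler knows the head's current track (which, as noted above, a host OS usually does not); the function name is made up for this example.

```python
def sstf_order(head, requests):
    """Return the order in which SSTF would service the requested tracks."""
    pending = list(requests)
    order = []
    while pending:
        # Pick the pending request whose track is closest to the head.
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest            # the head is now on that track
    return order
```

With the head on track 30 and requests for tracks 21, 2, 35, and 90, `sstf_order(30, [21, 2, 35, 90])` returns `[35, 21, 2, 90]`. Track 90 is served last, and it would keep being pushed back if nearer requests kept arriving, which is the starvation problem.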
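- What is the Elevator (SCAN or C-SCAN)?
- Service requests in order across the tracks, sweeping over the disk like an elevator moving between floors.
- What are the terms and variants of SCAN?
- 1. Sweep - a single pass across the disk.
- 2. F-SCAN - Freezes the queue while a sweep is in progress; requests that arrive during the sweep wait for the next one. This avoids starvation of far-away requests.
- 3. C-SCAN (Circular SCAN) - Sweeps in one direction only (e.g., outer to inner), then resets to the starting track and sweeps again.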
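The difference between SCAN and C-SCAN can be sketched in Python. Assumptions: requests are track numbers, the sweep moves toward higher track numbers first, and C-SCAN follows the common definition of servicing requests in one direction only and then jumping back to the start. The function names are made up for this example.

```python
def scan_order(head, requests):
    """SCAN: sweep toward higher tracks, then reverse and sweep back down."""
    up = sorted(t for t in requests if t >= head)
    down = sorted((t for t in requests if t < head), reverse=True)
    return up + down


def c_scan_order(head, requests):
    """C-SCAN: sweep toward higher tracks, jump back, and sweep the same way again."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted(t for t in requests if t < head)
    return ahead + behind
```

With the head on track 30 and requests for tracks 21, 2, 35, and 90, SCAN services `[35, 90, 21, 2]` while C-SCAN services `[35, 90, 2, 21]`.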
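- Why is SPTF (Shortest Positioning Time First) useful?
- On modern drives seek and rotation times are roughly equivalent, so the scheduler should consider both (the total positioning time) rather than the seek distance alone.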
1.6 Understand the characteristics of Redundant Array of Inexpensive Disks (RAID)
- What is RAID?
- A way to recover data after errors by storing it redundantly across multiple disks.
- It uses multiple disks together to build a faster, bigger, and more reliable disk system.
- What are the advantages?
- Performance & Capacity
- Use multiple disks in parallel.
- Reliability
- Can tolerate disk loss.
- How does it work?
- When RAID receives I/O request it
- 1. Calculates which disk to access
- 2. Issues one or more physical I/Os to do so.
- Example?
- Mirrored RAID System
- Keep two copies of each block on separate disks.
- On a write, perform two physical I/Os for every one logical I/O it is issued.
- What hardware does a RAID need?
- A microcontroller that runs firmware to direct the operation
- Volatile memory (DRAM) to buffer data blocks
- Non-volatile memory to buffer writes safely
- Logic to perform parity calculations
- What is the Fault Model?
- RAIDs detect and recover from CERTAIN kinds of disk faults; the fault model states which ones.
- Types:
- Fail-Stop
- A disk is in exactly one of two states: Working or Failed.
- Working = all blocks can be written to and read from.
- Failed = the disk is permanently lost.
- We evaluate RAID based on what criteria?
- 1. Capacity
- 2. Reliability
- 3. Performance
- What are the different RAID levels and what do they do?
- Level 0: Striping
- Spread blocks across the disks in a round-robin fashion; the blocks in the same row form a stripe.
- NO REDUNDANCY.
- Chunk Sizes
- Small
- + Increase Parallelism
- - Increase positioning time to access blocks
- Big
- - Reduce parallelism
- + Reduce Positioning time
- Rating:
- Capacity +
- N disks worth of useful capacity
- Reliability -
- Any disk failure will lead to data loss
- Performance +
- Disk utilized properly and in parallel
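How striping decides which disk to access can be sketched in Python. The function name and the `chunk_blocks` parameter (chunk size measured in blocks) are assumptions for this example.

```python
def raid0_locate(logical_block, num_disks, chunk_blocks=1):
    """Map a logical block to (disk index, block offset on that disk)."""
    chunk = logical_block // chunk_blocks     # which chunk the block falls in
    within = logical_block % chunk_blocks     # position inside that chunk
    disk = chunk % num_disks                  # chunks go round-robin over the disks
    stripe = chunk // num_disks               # row (stripe) number
    return disk, stripe * chunk_blocks + within
```

With 4 disks and a chunk size of 1 block, logical block 5 maps to `(1, 1)`: disk 1, second block on that disk. With a chunk size of 2 blocks, the same logical block maps to `(2, 1)`.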
- Level 1: Mirroring
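- Tolerates disk failures.
- Keeps more than one copy of each block in the system, each copy on a separate disk.
- Rating:
- Capacity: -
- Expensive (only N/2 disks of useful space with two copies).
- Reliability: +
- Tolerates any 1 disk failure, and up to N/2 failures if no two failed disks hold copies of the same blocks.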
- Level 4: Saving space with Parity
- Adds a single parity block to each stripe.
- The parity block stores redundant information for that stripe of blocks.
- Rating:
- Capacity
- N-1 disks of useful capacity.
- Reliability
- Tolerates 1 disk failure only.
- Problems:
- Small-write problem: every small write must also update the parity disk, which becomes a bottleneck.
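Parity is commonly computed with XOR, which is what lets the array rebuild a single lost block. A minimal sketch in Python, assuming XOR parity and equal-sized blocks; the function names and the sample bytes are made up for this example.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    """Rebuild the one missing block from the survivors plus the parity block."""
    return parity(list(surviving_blocks) + [parity_block])


stripe = [b"\x0f\x01", b"\xf0\x02", b"\x33\x04"]    # data blocks on three disks
p = parity(stripe)                                  # block stored on the parity disk
rebuilt = reconstruct([stripe[0], stripe[2]], p)    # pretend the second disk failed
```

Because XOR is its own inverse, XOR-ing the surviving data blocks with the parity block yields exactly the missing block. With two failed disks there is not enough information left, which is why this scheme tolerates only one failure.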
- Level 5: Rotating Parity
- Addresses the small-write problem of Level 4 by rotating the parity block across the disks, so no single disk is the parity bottleneck.
- Rating:
- Capacity: + (N -1 disk used).
- Reliability: - (Only tolerates 1 disk failure).
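Where the parity block of each stripe lives can be sketched in Python. Real arrays use several different rotation layouts; the one below (parity starts on the last disk and moves one disk to the left per stripe) is only one possible choice, and the function name is made up for this example.

```python
def raid5_parity_disk(stripe, num_disks):
    """One possible rotation: parity starts on the last disk and moves left each stripe."""
    return (num_disks - 1 - stripe) % num_disks
```

With 5 disks, stripes 0 through 5 place their parity on disks 4, 3, 2, 1, 0, and 4 again, so parity updates are spread over all disks instead of hitting one.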
- When do we use each type of RAID?
- RAID-0 : Performance matters and reliability does not.
- RAID-1 : Random I/O performance and reliability.
- RAID-5 : Capacity and reliability; also a good fit for sequential I/O when you want to maximize capacity.
1.7 Explain the file system abstractions for persistent storage (files, directories).
- What is the problem dealing with persistent storage?
- Keeping the data on the device (HDD/SSD) intact when there are accidents such as a power loss.
- What two abstractions do we use to virtualize storage?
- Files and Directories.
- What is a File?
- It is a linear array of bytes that is stored persistently.
- How do we identify Files?
- Each file has a human-readable file name and an OS-level identifier called the inode number.
- What is a Directory?
- It is a list of (name, inode number) pairs; each entry refers to a file or to another directory (a sub-directory).
- How do you know where the next read or write will occur?
- Each open file has a current offset that marks where the next read or write will begin.
- You can set it explicitly by using lseek().
- It is updated implicitly: a read or write of N bytes starts at the current offset and then advances it by N.
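Both ways of moving the offset can be seen through Python's `os` module, which wraps the same system calls. This sketch assumes a POSIX-like system and uses a temporary file so it leaves nothing behind.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()                     # open a scratch file for read/write
try:
    os.write(fd, b"hello world")                  # implicit: offset moves from 0 to 11
    after_write = os.lseek(fd, 0, os.SEEK_CUR)    # ask where the offset is now
    os.lseek(fd, 6, os.SEEK_SET)                  # explicit: move the offset to byte 6
    tail = os.read(fd, 5)                         # reads 5 bytes starting at byte 6
    after_read = os.lseek(fd, 0, os.SEEK_CUR)     # the read advanced the offset again
finally:
    os.close(fd)
    os.unlink(path)
```

Here `after_write` is 11, `tail` is `b"world"`, and `after_read` is 11 again because the 5-byte read advanced the offset from 6.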
- What 3 operations can you do on files?
- 1. Renaming a file
- 2. Deleting a file (Unlinking).
- unlink()
- 3. Get statistics on a file.
- stat()
- fstat()
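The three operations can be tried out with Python's `os` module, whose `rename`, `stat`, `fstat`, and `unlink` functions wrap the system calls of the same names. A sketch assuming a POSIX-like system; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "notes.txt")
    new = os.path.join(d, "notes-renamed.txt")
    with open(old, "w") as f:
        f.write("persistent data")

    os.rename(old, new)                      # 1. rename the file
    info = os.stat(new)                      # 3. statistics, looked up by path
    size, inode = info.st_size, info.st_ino
    with open(new) as f:
        # fstat() returns the same information, looked up by open file descriptor
        same_inode = os.fstat(f.fileno()).st_ino == inode
    os.unlink(new)                           # 2. delete (unlink) the file
    still_there = os.path.exists(new)
```

The stat structure carries the file's size and inode number among other fields; after `unlink()` the name no longer resolves.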
1.8 Explain the concepts of hard link, symbolic link, volume, mount.
- What is a hard link?
- A hard link is a directory entry that directly associates a name with a given file (its inode). Unlike a soft link, which stores a path and breaks when the target is renamed, a hard link still refers to the underlying file even if the file's other names change.
- Hard links are more persistent because every hard link to a file refers to the same inode, and the file's data is freed only when the last link is removed. Hard links resist file replacement. Having multiple hard links can result in the “alias effect” where one file is known under multiple names.
- Although soft links are often described as pointers or shortcuts, both hard links and soft links are technically pointers; hard links are the more persistent kind. For instance, if someone creates a hard link to a file named “cheese” and then renames “cheese” to “milk,” the hard link still works. A soft link to “cheese” would then point to a non-existent file.
- What is a symbolic link?
- A symbolic link (also symlink or soft link) is a file that contains a reference to another file or directory in the form of an absolute or relative path; the OS follows that path during pathname resolution.
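The “cheese” and “milk” example can be reproduced with Python's `os.link` and `os.symlink`. A sketch assuming a POSIX file system that supports both kinds of link; the names `hard` and `soft` are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    cheese = os.path.join(d, "cheese")
    milk = os.path.join(d, "milk")
    hard = os.path.join(d, "hard")
    soft = os.path.join(d, "soft")
    with open(cheese, "w") as f:
        f.write("data")

    os.link(cheese, hard)                      # second name for the same inode
    os.symlink(cheese, soft)                   # small file that stores the path to "cheese"
    link_count = os.stat(cheese).st_nlink      # names that refer to this inode
    os.rename(cheese, milk)                    # rename the original

    with open(hard) as f:
        hard_content = f.read()                # still reachable through the hard link
    soft_dangling = not os.path.exists(soft)   # exists() follows the symlink
    soft_still_listed = os.path.islink(soft)   # the symlink itself is still there
```

After the rename the hard link still reads `"data"`, while the symbolic link still exists as a directory entry but points at a path that is gone.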
- What is a volume?
- When referring to data storage, a volume refers to a logical drive (e.g., hard drive), which has a single file system and is usually on a single partition. For instance, on a typical Microsoft Windows computer, the volume named C: contains the operating system. In Windows, any drive which has an assigned drive letter is a volume.
- What is a mount?
- Mounting a file system attaches that file system to a directory (mount point) and makes it available to the system. The root (/) file system is always mounted. Any other file system can be connected or disconnected from the root (/) file system.
1.9 List key design goals for file systems.
- What two things do we need to keep in mind when we implement a file system?
- 1. Data structures
- 2. Access methods
- What are the goals in file organization?
- 1. Minimize the number of trips to the disk needed to get the desired information. Ideally we get what we need in one disk access, or in as few disk accesses as possible.
- 2. Grouping related information so that we are likely to get everything we need with only one trip to the disk (e.g. name, address, phone number, account balance).
- What characteristics describe a good File Structure Design?
- Fast access to great capacity
- Reduce the number of disk accesses
- Manage growth by splitting these collections
1.10 Describe the on-disk data structures for a simple file system.
- What data structure can you use for a simple file system?
- A tree is a natural data structure for a file system, because it directly models the hierarchy of a directory structure.
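How a tree models the directory hierarchy can be sketched in Python. This is a toy in-memory structure, not an on-disk layout; the `Node` class and `lookup` function are made up for this example.

```python
class Node:
    """A file or directory in a toy in-memory tree."""

    def __init__(self, name, is_dir=False):
        self.name = name
        self.is_dir = is_dir
        self.children = {}          # name -> Node, used only by directories

    def add(self, child):
        self.children[child.name] = child
        return child


def lookup(root, path):
    """Walk the tree one path component at a time; return None if missing."""
    node = root
    for part in path.strip("/").split("/"):
        if not part:
            continue                # skip empty components such as in "/"
        node = node.children.get(part) if node.is_dir else None
        if node is None:
            return None
    return node


root = Node("/", is_dir=True)
home = root.add(Node("home", is_dir=True))
home.add(Node("notes.txt"))
```

Resolving `/home/notes.txt` visits one node per path component, which mirrors how a file system resolves a pathname directory by directory.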