My Coding Journey: CST 334 Operating Systems Module 7 : Persistence

1.0 List the characteristics of block and character devices

What are Block Devices?

They store data in fixed-size, numbered blocks, each of which can be read or written independently (hard drive).

What are Character Devices?

They produce or consume a stream of bytes (keyboard).

1.1 Describe the hardware interfaces of I/O devices

What are the two important components of I/O devices?

Hardware interface.
Internals.

What is the hardware interface?

The hardware interface allows the system software to control the device's operation.
It uses registers to report the status of the device, accept commands to execute (read/write operations), and transfer data.

What 3 registers are included in the hardware interface?

status - shows the current status of the device.
command - tells the device to perform a task.
data - passes data to or gets data from the device.

How does an operating system control device behavior?

By reading and writing the above three registers.

What are the Internal components?

There is also an internal component which includes:

Micro-controller (CPU)
Memory (DRAM/SRAM)
Other hardware chips.

The internals are implementation specific and hidden from the OS.

1.2 Explain how the OS interacts with I/O devices.

What two ways can you read/write to registers?

Explicit I/O Instructions
Memory mapped I/O

What is explicit I/O Instruction?

The OS uses privileged instructions to read and write specific device registers directly.

What is memory mapped I/O?

The device makes its registers look like memory locations.
The OS can simply read from and write to those addresses.
The hardware routes those loads and stores to the device instead of to main memory.

How does the OS abstract I/O operations?

A device driver encapsulates any SPECIFICS of device interaction, so the rest of the OS does not need to know how each device works.

What are the main problems with encapsulating I/O Operations?

The kernel carries drivers for many devices you might never use, so a lot of that code sits unused.
Driver code is also a main contributor to kernel crashes and bugs.

What is polling? Why is it bad? What is an alternative?

Polling happens when the OS repeatedly reads the status register, waiting until the device is no longer busy before it performs an action. It is bad because it wastes CPU time waiting. An alternative is to use interrupts: put the waiting process to sleep and context switch to another process.
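The polling protocol can be sketched with a toy, simulated device. Everything here (the `ToyDevice` class, its `busy_ticks` parameter, and the `polled_write` helper) is hypothetical and only models the status, command, and data registers described above; it is not how a real driver is written.

```python
class ToyDevice:
    """Hypothetical device exposing status, command, and data registers."""
    BUSY, READY = 1, 0

    def __init__(self, busy_ticks=3):
        self.status = self.READY
        self.command = None
        self.data = None
        self._busy_ticks = busy_ticks  # assumed time one command takes
        self._remaining = 0
        self.log = []                  # commands the device has completed

    def read_status(self):
        # Each status read lets one tick of simulated time pass.
        if self._remaining > 0:
            self._remaining -= 1
            if self._remaining == 0:
                self.log.append((self.command, self.data))
                self.status = self.READY
        return self.status

    def write_data(self, value):
        self.data = value

    def write_command(self, cmd):
        # Writing the command register starts the device working.
        self.command = cmd
        self.status = self.BUSY
        self._remaining = self._busy_ticks


def polled_write(dev, value):
    """Write one value by polling; return how many status reads were wasted."""
    wasted = 0
    while dev.read_status() == dev.BUSY:  # wait until the device is ready
        wasted += 1
    dev.write_data(value)
    dev.write_command("write")
    while dev.read_status() == dev.BUSY:  # spin until the command completes
        wasted += 1
    return wasted
```

Every iteration counted in `wasted` is CPU time that another process could have used, which is the cost that interrupts are meant to avoid.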

Describe what happens during an interrupt of I/O Devices.

When a process issues an I/O request, the OS puts it to sleep and context switches to a second task, which runs on the CPU while the disk works. When the device finishes, it raises a hardware interrupt, and the interrupt handler wakes the first process so it can be brought back onto the CPU. This overlap ensures that both the CPU and the disk are utilized properly.

When would you use interrupts vs polling?

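Use interrupts when the device is slow, and polling when the device is fast, because for a fast device the cost of handling the interrupt and context switching outweighs the short wait.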

1.3 Explain performance characteristics of hard drives.

What is the speed of data processing in hard drives?

Even though hard drives are the main form of persistent data storage, they are very slow compared with the CPU and memory.

What are the components of HDD?

1. Platter
2. Spindle
3. Track
4. Disk head

What is the Platter?

It is a circular hard surface, typically aluminum, coated with a thin magnetic layer. It stores data by inducing magnetic changes on that layer. Each platter has two sides, and each side is called a surface.

What does the Spindle do?

It is connected to a motor that spins the platters at a fixed rate, typically between 7,200 and 15,000 RPM.

What is a Track, and what does the Disk head do?

A track is a concentric circle of sectors on a surface, and data is encoded in those sectors. There is one disk head per surface of the drive, and the head is what reads from and writes to the disk.

1.4 Compute hard drive transfer rates for different workloads.

What components must you consider to calculate the time taken for I/O operations (Complete I/O time)?

1. seek time
2. rotational latency
3. transfer time

What is the seek time?

It is the time to move the disk arm to the right track (a few ms).
This is one of the most costly disk operations.

What are the phases of seek time?

1. acceleration - disk arm gets moving
2. coasting - arm moves at full speed
3. deceleration - arm slows down
4. settling - head is carefully positioned over the right track

The settling time is significant, because the drive must be certain the head is on the right track.

What is rotational latency?

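This is the time for the disk to spin until the correct sector on the track is under the head (a few ms).
If a full rotation takes R, the average delay is about R/2 and the worst case is just under R.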

What is the transfer time?

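It is the time to read or write the data once the head is over the sector. It is the fastest of the three components (tens of microseconds for a single small block).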

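What is a problem that you can encounter when crossing tracks?

Track Skew

When the head moves to the next track, the next block may already have rotated past it, so the drive would have to wait almost a full rotation. Track skew solves this by offsetting the sectors on adjacent tracks, so the next block arrives under the head just after the move.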

What does the Cache (Track Buffer) contain and do?

8MB - 16MB in size
Holds data read FROM or WRITTEN to disk
Drive can quickly respond to requests

What are the two ways to write on cache?

1. Write back (immediate reporting)

Acknowledges that a write has completed as soon as the data is in the drive's cache memory.
FAST but DANGEROUS, because the data can be lost if power fails before it reaches the disk.

2. Write through

Acknowledges a write has completed AFTER the write has been WRITTEN to disk.

What are the relevant Formulas?

I/O Time

T_I/O = T_seek + T_rotation + T_transfer

Rate of I/O

R_I/O = Size_transfer / T_I/O
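As a worked example of the two formulas, here is a small sketch. The drive parameters used in the usage note (4 ms average seek, 15,000 RPM, 125 MB/s peak transfer) are assumed values for a hypothetical drive, and the function names are made up for illustration. The rotational term uses the average delay of half a rotation.

```python
def io_time_ms(seek_ms, rpm, transfer_bytes, transfer_rate_mb_s):
    """T_I/O = T_seek + T_rotation + T_transfer, all in milliseconds."""
    # One full rotation takes 60,000 / rpm ms; on average we wait half of it.
    rotation_ms = 0.5 * 60_000 / rpm
    # Time to move the bytes at the drive's peak rate (1 MB = 1,000,000 bytes here).
    transfer_ms = transfer_bytes / (transfer_rate_mb_s * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms


def io_rate_mb_s(transfer_bytes, t_io_ms):
    """R_I/O = Size_transfer / T_I/O, in MB/s."""
    return (transfer_bytes / 1_000_000) / (t_io_ms / 1000)


# Random workload: one 4 KB read per seek on the assumed drive.
random_t = io_time_ms(seek_ms=4, rpm=15_000, transfer_bytes=4096, transfer_rate_mb_s=125)
random_rate = io_rate_mb_s(4096, random_t)

# Sequential workload: one 100 MB read after a single seek and rotation.
seq_t = io_time_ms(seek_ms=4, rpm=15_000, transfer_bytes=100_000_000, transfer_rate_mb_s=125)
seq_rate = io_rate_mb_s(100_000_000, seq_t)
```

With these assumed numbers the random read takes about 6 ms and achieves well under 1 MB/s, while the sequential read gets close to the drive's peak rate, which is why seek and rotation dominate small random I/O.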

1.5 Describe the operation of some I/O schedulers.

What performs the disk scheduling operation?

Disk scheduling is performed by the disk scheduler. Requests are not simply served in FIFO order; the scheduler reorders them to minimize seek time and rotational delay.

What are 3 methods of I/O Scheduling?

1. SSTF (Shortest Seek Time First)
2. SCAN or C-SCAN (Elevator)
3. SPTF(Shortest Positioning Time First)

What is the SSTF (Shortest Seek Time First) method?

Pick the request on the nearest track to complete first.

What are the problems associated with SSTF?

1. Starvation: requests for far-away tracks may never get served.
2. Drive geometry is not available to the host OS.
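A minimal sketch of SSTF, assuming the scheduler knows the current head track and the track number of each pending request (the function name and the track numbers in the usage note are made up for illustration):

```python
def sstf_order(head, requests):
    """Return the order in which SSTF would service the requested tracks."""
    pending = list(requests)
    order = []
    while pending:
        # Always pick the pending request closest to the current head position.
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest  # the head is now parked on that track
    return order
```

For example, with the head on track 30 and requests for tracks 21, 2, 35, and 90, `sstf_order(30, [21, 2, 35, 90])` returns `[35, 21, 2, 90]`. If new requests near the head kept arriving, track 90 would keep being pushed back, which is the starvation problem.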

What is the Elevator (SCAN or C-SCAN)?

Service requests in order across the tracks, sweeping over the disk like an elevator.

What are the parts and variants of SCAN?

1. Sweep - a single pass across the disk.
2. F-SCAN - Freezes the queue being serviced during a sweep, so requests that arrive mid-sweep wait for the next one. This avoids starvation of far-away requests.
3. C-SCAN (Circular SCAN) - Sweeps in one direction only (outer to inner), then resets to the outer track and starts again.
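The difference between SCAN and C-SCAN can be sketched as follows. This assumes that C-SCAN services requests in one direction only and then jumps back to the start, that track numbers increase in the sweep direction, and that the head starts out moving toward higher tracks; the function names are made up for illustration.

```python
def scan_order(head, requests):
    """SCAN: sweep toward higher tracks, then reverse and sweep back."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted((t for t in requests if t < head), reverse=True)
    return ahead + behind


def cscan_order(head, requests):
    """C-SCAN: sweep toward higher tracks, jump back, and sweep the same way again."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted(t for t in requests if t < head)
    return ahead + behind
```

With the head on track 30 and requests for tracks 21, 2, 35, and 90, SCAN services `[35, 90, 21, 2]` while C-SCAN services `[35, 90, 2, 21]`.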

Why is SPTF (Shortest Positioning Time First) useful?

It is useful because on modern drives seek and rotation times are roughly equivalent, so the scheduler should consider both instead of seek time alone.

1.6 Understand the characteristics of Redundant Array of Inexpensive Disks (RAID)

What is RAID?

It uses multiple disks together to build a faster, bigger, and more reliable disk system.
It can recover from certain disk errors by keeping redundant data across multiple disks.

What are the advantages?

Performance & Capacity

Use multiple disks in parallel.

Reliability

Can tolerate disk loss.

How does it work?

When a RAID receives an I/O request, it:

1. Calculates which disk to access
2. Issues one or more physical I/Os to do so.

Example?

Mirrored RAID System

Keeps two copies of each block on separate disks.
Performs two physical I/Os for every logical write it is issued.

What does a RAID need internally to operate?

A microcontroller that runs firmware to direct operation
Volatile memory (DRAM) to buffer data blocks
Non-volatile memory to safely buffer writes
Logic to perform parity calculations

What is the Fault Model?

RAIDs are designed to detect and recover from CERTAIN kinds of disk faults.
Types:

Fail-Stop

A disk is in one of two states: Working or Failed.
Working = all blocks can be written to and read from.
Failed = the disk is permanently lost.

We evaluate RAID based on what criteria?

1. Capacity
2. Reliability
3. Performance

What are the different RAID levels and what do they do?

Level 0: Striping

Spread blocks across the disks in a round-robin fashion; the blocks in the same row form a stripe.
NO REDUNDANCY.
Chunk Sizes

Small

+ Increases parallelism
- Increases positioning time to access blocks

Large

- Reduces parallelism
+ Reduces positioning time

Rating:

Capacity +

N disks worth of useful capacity

Reliability -

Any disk failure will lead to data loss

Performance +

All disks are utilized, often in parallel
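The round-robin placement can be sketched as a small mapping function. The name `raid0_locate` and the `chunk_blocks` parameter (chunk size measured in blocks) are made up for illustration; this is one common way to lay out the stripes, not the only one.

```python
def raid0_locate(logical_block, num_disks, chunk_blocks=1):
    """Map a logical block number to (disk index, block offset on that disk)."""
    chunk = logical_block // chunk_blocks   # which chunk the block falls in
    within = logical_block % chunk_blocks   # position inside that chunk
    disk = chunk % num_disks                # chunks go to disks round-robin
    stripe = chunk // num_disks             # row (stripe) number
    offset = stripe * chunk_blocks + within
    return disk, offset
```

With 4 disks and a chunk size of one block, logical block 14 lands on disk 2 at offset 3. With a chunk size of two blocks, logical block 8 lands on disk 0 at offset 2, because blocks 0 and 1 already occupy the first chunk of that disk.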

Level 1: Mirroring

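Tolerates disk failures.
Keeps more than one copy of each block in the system, each on a separate disk.
Rating:

Capacity: -

Expensive (only N/2 disks' worth of useful space when keeping two copies).

Reliability: +

Tolerates 1 disk failure for certain, and up to N/2 if no two failed disks hold copies of the same block.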

Level 4: Saving space with Parity

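Add a single parity block to each stripe.

It stores redundant information for that stripe of blocks, and all parity blocks live on one dedicated disk.

Rating:

Capacity

N-1 disks' worth of useful capacity.

Reliability

Tolerates 1 disk failure only.

Problems:

Small-Write problem: every small write must also update the parity disk, which can become a bottleneck.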
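Parity is usually computed with XOR, which is the assumption behind this sketch (the function names are made up for illustration). It shows how the parity block is built, how a lost block is rebuilt from the survivors, and why a small write has to touch the parity disk.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the one missing block of a stripe from the survivors and the parity."""
    return parity(list(surviving_blocks) + [parity_block])


def small_write_parity(old_data, new_data, old_parity):
    """New parity after overwriting one block: read old data and old parity, XOR in the new data."""
    return parity([old_data, new_data, old_parity])
```

A small write therefore costs two reads (old data, old parity) and two writes (new data, new parity), and the parity read and write always go to the same disk.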

Level 5: Rotating Parity

Solves the small-write problem of Level 4 by rotating the parity block across the disks, so no single disk holds all the parity.
Rating:

Capacity: + (N-1 disks' worth of useful capacity).
Reliability: - (Only tolerates 1 disk failure).

When do we use each type of RAID?

RAID-0 : Performance matters and reliability does not.
RAID-1 : Random I/O performance and reliability.
RAID-5 : Capacity and reliability; sequential I/O and maximizing capacity.

1.7 Explain the file system abstractions for persistent storage (files, directories).

What is the problem dealing with persistent storage?

Keeping data on a device (HDD/SSD) intact when there are accidents such as a power loss.

What are the two key abstractions in the virtualization of storage?

Files and directories.

What is a File?
How do we identify Files?

It is a linear array of bytes that is stored persistently.
It is identified by a human-readable file name and an OS-level identifier called the inode number.

What is a Directory?

It contains a list of entries, each pairing a name with an inode number. Entries can refer to files or to other sub-directories.

How do you know where the next read or write will occur?

Each open file has a current offset that records where the next read or write will begin.
You can set it explicitly by using lseek().
It is also updated implicitly: a read or write of N bytes starts at the current offset and then advances the offset by N.
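A small sketch of both ways the offset moves, using Python's `os` wrappers around the underlying system calls. The helper name `offset_demo` and the temporary file are made up for illustration.

```python
import os
import tempfile


def offset_demo():
    """Show implicit offset updates from write/read and an explicit lseek."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")                 # implicit: offset advances by 10
        after_write = os.lseek(fd, 0, os.SEEK_CUR)  # seeking 0 from "current" reports the offset
        os.lseek(fd, 2, os.SEEK_SET)                # explicit: jump to byte 2
        chunk = os.read(fd, 3)                      # reads bytes 2..4, offset advances by 3
        after_read = os.lseek(fd, 0, os.SEEK_CUR)
        return after_write, chunk, after_read
    finally:
        os.close(fd)
        os.remove(path)
```

Calling `offset_demo()` returns `(10, b"234", 5)`: the write left the offset at 10, the explicit seek moved it to 2, and the 3-byte read advanced it to 5.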

What 3 operations can you do on files?

1. Renaming a file.

rename()
2. Deleting a file (Unlinking).

unlink()

3. Get statistics on a file.

stat()
fstat()
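The three operations can be tried from Python, whose `os.rename`, `os.stat`, and `os.unlink` functions wrap the corresponding system calls. The helper name `file_ops_demo` and the file names are made up for illustration.

```python
import os
import tempfile


def file_ops_demo():
    """Rename a file, read its metadata, then delete it."""
    with tempfile.TemporaryDirectory() as folder:
        old = os.path.join(folder, "old.txt")
        new = os.path.join(folder, "new.txt")
        with open(old, "w") as f:
            f.write("hello")

        inode_before = os.stat(old).st_ino
        os.rename(old, new)            # only the name changes
        info = os.stat(new)            # metadata such as size and inode number
        same_inode = info.st_ino == inode_before
        size = info.st_size

        os.unlink(new)                 # removes the name from the directory
        return same_inode, size, os.path.exists(new)
```

Calling `file_ops_demo()` returns `(True, 5, False)`: the inode number survives the rename, the file is 5 bytes long, and after `unlink` the name no longer exists.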

1.8 Explain the concepts of hard link, symbolic link, volume, mount.

What is hard link?

A hard link is a directory entry that directly associates a name with a file's inode. Unlike a soft link, which stores the path of its target and breaks when the target is renamed, a hard link still points to the underlying file even if the original file name changes.
Hard links are more persistent because every hard link to a file refers to the same inode, and the file's data is kept until the last link is removed. Having multiple hard links can result in the “alias effect”, where a file is known under multiple names.
Although some refer to soft links as pointers or shortcuts, both hard links and soft links are technically pointers; hard links are simply the more persistent kind. For instance, if someone creates a hard link to a file named “cheese” and then changes the filename to “milk,” the hard link would still work. However, if it is a soft link, the link would then point to a non-existent file.

What is a symbolic link?

A symbolic link (also symlink or soft link) is a file that contains a reference to another file or directory in the form of an absolute or relative path, and that affects pathname resolution.
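The “cheese” and “milk” example can be reproduced with a short sketch, assuming a POSIX-style system where creating symbolic links is permitted. The helper name `link_demo` and the link names `hard` and `soft` are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Compare a hard link and a symbolic link after the original file is renamed."""
    with tempfile.TemporaryDirectory() as folder:
        cheese = os.path.join(folder, "cheese")
        milk = os.path.join(folder, "milk")
        hard = os.path.join(folder, "hard")
        soft = os.path.join(folder, "soft")
        with open(cheese, "w") as f:
            f.write("data")

        os.link(cheese, hard)      # second name for the same inode
        os.symlink(cheese, soft)   # small file that stores the path to "cheese"
        same_inode = os.stat(hard).st_ino == os.stat(cheese).st_ino
        link_count = os.stat(cheese).st_nlink

        os.rename(cheese, milk)
        hard_works = os.path.exists(hard)      # still reaches the data
        soft_works = os.path.exists(soft)      # follows the stored path, which is gone
        soft_remains = os.path.islink(soft)    # the link file itself is still there
        return same_inode, link_count, hard_works, soft_works, soft_remains
```

Calling `link_demo()` returns `(True, 2, True, False, True)`: the hard link shares the inode and keeps working after the rename, while the symbolic link is left dangling.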

What is a volume?

When referring to data storage, a volume refers to a logical drive (e.g., hard drive), which has a single file system and is usually on a single partition. For instance, on a typical Microsoft Windows computer, the volume named C: contains the operating system. In Windows, any drive which has an assigned drive letter is a volume.

What is a mount?

Mounting a file system attaches that file system to a directory (mount point) and makes it available to the system. The root (/) file system is always mounted. Any other file system can be connected or disconnected from the root (/) file system.

1.9 List key design goals for file systems.

What two things do we need to keep in mind when we implement a file system?

1. Data structures
2. Access methods

What are the goals in file organization?

1. Minimize the number of trips to the disk needed to get the desired information. Ideally, get what we need in one disk access, or in as few disk accesses as possible.
2. Group related information so that we are likely to get everything we need with only one trip to the disk (e.g. name, address, phone number, account balance).

What characteristics describe a good File Structure Design?

Fast access to great capacity
Reduce the number of disk accesses
Manage growth by splitting these collections

1.10 Describe the on-disk data structures for a simple file system.

What data structure can you use for a simple file system?

A tree is a good data structure for a simple file system. If you think about it, it naturally models the hierarchy of a directory structure.

Tuesday, August 10, 2021
