Monday, May 31, 2021

CST 363 Intro to Database Systems Module 5: Indexing and Transactions

5.1 Review

This week, we worked under the hood of MySQL and learned what goes on behind the queries that are made. This week's topic we covered the concepts of Indexing and Transactions. 

5.2 Index

Indexing is a way to speed up data search and retrieval on relations with large amounts of data. Normally, when you are writing a query to search up data, MySQL does a linear search on that data, which (in an unsorted column) involves sequentially checking each element until the search value is found. This is rather slow.

To create a faster query, you can apply a binary search, which is a non-linear search that involves looking at a value in the middle and comparing the searched value of the query. This type of search reduces the amount of time by half. 

When creating an index, it is implemented as an additional table with two columns. One column contains the sorted values while the other contains pointers to the original column on the table. Indexes improves the search speed of queries significantly by searching through less items in the table. 

However, indexing has costs which include creating an additional space in the db, and taking more time to update the table.

5.3 Transactions

Databases have ACID characteristics. A stands for Atomic meaning that SQL statements are all or none, C stands for Consistency, I stands for Isolation (Concurrent programs do not interfere with each other), and D stands for Durability (Updates are not lost when a crash occurs).

Serializability is an important concept in this topic. In the topic of concurrency control, if the outcome of a transaction concurrently is equal to the outcome executed serially without overlapping. Non-serializability however can lead to lost update problems, inconsistent reads, phantom reads, and cascading transactions. 

The solution to this include pessimistic locking, multiple version concurrency control, and optimistic locking.

5.4 Questions

The web site "Use the Index Luke" has a page on "slow indexes".  If indexes are supposed to speed up performance of query,  what is meant by a slow index?

A slow index means that index lookup isn't as fast as it is expected to be. For example, there might be a case where an index lookup needs to perform a tree traversal AND follow a leaf node chain, and there might be a case where accessing a table might be slow because a single leaf node might have many hits. 

No comments:

Post a Comment

CST 499 Capstone - Week 8 Learning Journal Final Entry

This is the very last entry of the journal of your CS Online learning!  Keeping regular journals is a great way for us to grow, both profe...