Database: Difference between revisions

From Rice Wiki
A '''database''' stores, retrieves, and operates on a collection of related data. A '''database management system (DBMS)''' is a general-purpose system for implementing databases. This page describes why a dedicated system is wanted for this task, gives a high-level breakdown of its functionality, and outlines how it is implemented.
 
== Motivation ==
A separate system for managing collections of data is necessary for two reasons: 1) the task is prevalent in many applications, and 2) such a system is difficult to implement correctly and efficiently. The first point is relatively intuitive; the second is more involved.
 
Designing a way to manage collections of data means solving, among many others, the following problems.
 
* '''Data integrity:''' Ensure that stored data is valid and will not cause errors when it is operated on or retrieved. Type checking is one example.
* '''Efficient implementation:''' The implementation needs to be efficient for the task at hand. Relevant concerns include the choice of data structures and concurrency.
* '''Durability:''' Ensure that data stays safe when the system fails or is modified. A computer crash is one example of such a failure.
 
All of the above tasks are highly involved. As such, a layer of abstraction is very beneficial for development: you don't want to handle all of these problems when building your project!
 
== Description ==
The role of a general-purpose DBMS is to handle the definition, creation, querying, update, and administration of databases. DBMSes are widely tested and deployed, and as such allow developers to focus on other aspects of their program and rely on DBMSes to store the data.
 
== Data models ==
A '''data model''' is a collection of concepts that describes how data is represented in the database.
 
=== [[Relational model]] ===
The '''relational model''' abstracts away database implementation: developers access data through a high-level language, and the DBMS decides the strategy for carrying out each request. Data is stored in a simple structure called a '''relation''', while the physical implementation is left to the DBMS.
 
The relational model is known for its flexibility, and is the most frequently used model to this day.
 
=== [[NoSQL]] ===
'''NoSQL''' is a diverse category of data models. The following list gives some examples.
 
* Key/Value
* Graph
* Document
* Column-family
 
=== Other models ===
 
* Array/matrix models are useful for machine learning applications.
* Hierarchical, network, and multi-value models are used in legacy systems.
{{Database sources}}
[[Category:Computer Science]]

Latest revision as of 23:08, 20 July 2024

A database stores, retrieves, and operates on a collection of related data. A database management system (DBMS) is a general-purpose system for implementing databases. This page describes why a dedicated system is wanted for this task, gives a high-level breakdown of its functionality, and outlines how it is implemented.

Motivation

A separate system for managing collections of data is necessary for two reasons: 1) the task is prevalent in many applications, and 2) such a system is difficult to implement correctly and efficiently. The first point is relatively intuitive; the second is more involved.

Designing a way to manage collections of data means solving, among many others, the following problems.

  • Data integrity: Ensure that stored data is valid and will not cause errors when it is operated on or retrieved. Type checking is one example.
  • Efficient implementation: The implementation needs to be efficient for the task at hand. Relevant concerns include the choice of data structures and concurrency.
  • Durability: Ensure that data stays safe when the system fails or is modified. A computer crash is one example of such a failure.

All of the above tasks are highly involved. As such, a layer of abstraction is very beneficial for development: you don't want to handle all of these problems when building your project!
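
As a small illustration of the integrity and durability points above, the sketch below uses Python's built-in sqlite3 module. The account table, its columns, and the transfer helper are hypothetical names chosen for this example. It shows only part of the picture: the DBMS rejects an invalid value and undoes a half-finished modification, while crash recovery itself is not demonstrated.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (id, owner, balance) VALUES (1, 'alice', 100)")
conn.execute("INSERT INTO account (id, owner, balance) VALUES (2, 'bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; return False if the DBMS rejects it."""
    try:
        # The connection used as a context manager commits on success
        # and rolls back every statement in the block on an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit to dst was rolled back too.
        return False


ok = transfer(conn, 1, 2, 30)         # valid transfer
rejected = transfer(conn, 1, 2, 500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

The application code never checks balances itself. The constraint lives in the schema, and the rollback is handled by the database, which is the kind of work the abstraction layer takes off the developer.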

Description

The role of a general-purpose DBMS is to handle the definition, creation, querying, update, and administration of databases. DBMSes are widely tested and deployed, and as such allow developers to focus on other aspects of their program and rely on DBMSes to store the data.
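
The following is a minimal sketch of definition, creation, querying, and update, assuming SQLite through Python's built-in sqlite3 module as the DBMS. The student table and its contents are made up for illustration, and administration tasks are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Definition: declare what the data looks like (hypothetical schema).
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " year INTEGER NOT NULL)"
)

# Creation: populate the database with rows.
conn.executemany(
    "INSERT INTO student (id, name, year) VALUES (?, ?, ?)",
    [(1, "Ada", 1), (2, "Grace", 2), (3, "Edgar", 2)],
)

# Querying: ask for the data that is wanted, not how to find it.
second_years = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE year = ? ORDER BY name", (2,)
    )
]

# Update: change existing rows in place.
conn.execute("UPDATE student SET year = year + 1 WHERE name = ?", ("Ada",))
ada_year = conn.execute(
    "SELECT year FROM student WHERE name = ?", ("Ada",)
).fetchone()[0]
conn.commit()
```

Here second_years holds the names Edgar and Grace, and ada_year is 2 after the update.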

Data models

A data model is a collection of concepts that describes how data is represented in the database.

Relational model

The relational model abstracts away database implementation: developers access data through a high-level language, and the DBMS decides the strategy for carrying out each request. Data is stored in a simple structure called a relation, while the physical implementation is left to the DBMS.

The relational model is known for its flexibility, and is the most frequently used model to this day.
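
One way to see the DBMS deciding the strategy is to change the physical layout and leave the query alone. The sketch below assumes SQLite via Python's sqlite3 module. The enrollment relation, the index name, and the students_in helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation: a named set of rows that share the same columns.
conn.execute("CREATE TABLE enrollment (student TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO enrollment (student, course) VALUES (?, ?)",
    [
        ("Ada", "databases"),
        ("Grace", "databases"),
        ("Edgar", "compilers"),
    ],
)

# The query states what is wanted; it never mentions how to find it.
QUERY = "SELECT student FROM enrollment WHERE course = ? ORDER BY student"


def students_in(course):
    """Return the students enrolled in a course, sorted by name."""
    return [row[0] for row in conn.execute(QUERY, (course,))]


before = students_in("databases")

# Change the physical implementation by adding an index. The DBMS may now
# choose a different access strategy, but the query text stays the same.
conn.execute("CREATE INDEX enrollment_by_course ON enrollment (course)")
after = students_in("databases")

# SQLite can report the strategy it picked; the exact text varies by version.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("databases",)).fetchall()
```

The results before and after adding the index are identical. Only the access path the DBMS picks can differ, and the application code does not need to know about it.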

NoSQL

NoSQL is a diverse category of data models. The following list gives some examples.

  • Key/Value
  • Graph
  • Document
  • Column-family
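
To give a rough feel for how these models shape data, the sketch below mimics each one with plain Python data structures. It is an analogy only: the keys, field names, and values are invented, and no real NoSQL product's API is used.

```python
# Key/value: opaque values looked up by a unique key.
key_value = {"user:1": "Ada", "user:2": "Grace"}

# Document: each record is a self-contained, possibly nested structure,
# and records need not share the same fields.
documents = [
    {"_id": 1, "name": "Ada", "courses": ["databases", "compilers"]},
    {"_id": 2, "name": "Grace", "courses": ["databases"], "advisor": "Edgar"},
]

# Graph: nodes and the edges between them (here, an adjacency mapping).
graph = {"Ada": {"Grace"}, "Grace": {"Ada", "Edgar"}, "Edgar": set()}

# Column-family: row key -> column family -> column -> value.
column_family = {
    "user:1": {"profile": {"name": "Ada"}, "activity": {"logins": 3}},
    "user:2": {"profile": {"name": "Grace"}},
}

# Typical access patterns for each shape.
name = key_value["user:1"]
in_databases = [d["name"] for d in documents if "databases" in d["courses"]]
two_hops = {far for near in graph["Ada"] for far in graph[near]} - {"Ada"}
logins = column_family["user:1"]["activity"]["logins"]
```

Each structure favors a different access pattern: direct lookup by key, filtering on nested fields, following edges, or reading a group of related columns for one row.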

Other models

  • Array/matrix models are useful for machine learning applications.
  • Hierarchical, network, and multi-value models are used in legacy systems.

Sources

Information about databases on this wiki comes from the following sources: