Distributed Database

Working on since February 01, 2024

General Information

  • University: Salahaddin University-Erbil
  • Department: Software Engineering Dept.
  • My Status: Lecturer
  • Level: MSc
  • Year: 2024-Current

Course Description

This course provides an in-depth examination of the principles, design, and implementation of distributed database systems (DDBS). It covers essential concepts such as distributed data storage, query processing, transaction management, concurrency control, replication, reliability, and scalability in a distributed environment.

Prerequisites

  • Database Systems (or equivalent)

Course Objectives

Upon completion of this course, students will be able to:

  • Understand the core concepts, advantages, and challenges of distributed database systems.
  • Analyze different DDBS architectures and their suitability for various applications.
  • Design distributed databases, including fragmentation, replication, and allocation strategies.
  • Implement and optimize query processing techniques in a distributed environment.
  • Apply transaction management and concurrency control mechanisms to ensure data consistency.
  • Design and implement replication strategies for availability, fault tolerance, and scalability.
  • Evaluate reliability techniques for distributed databases.
  • Explore current trends and research directions in distributed database systems.

Course Outline

Module 1: Introduction to Distributed Database Systems

  • Definition and characteristics of DDBS
  • Advantages and disadvantages of DDBS
  • DDBS vs. centralized databases
  • Components of a DDBS
  • Common challenges in DDBS

Module 2: DDBS Architectures

  • Homogeneous and heterogeneous DDBS
  • Client-server architecture
  • Peer-to-peer architecture
  • Multi-database systems
  • Federated databases

Module 3: Distributed Database Design

  • Fragmentation (horizontal, vertical, mixed)
  • Data allocation and placement

Module 4: Distributed Query Processing and Optimization

  • Query decomposition and localization
  • Distributed join algorithms
  • Cost models for distributed query optimization
  • Techniques for minimizing communication overhead

Module 5: Distributed Transaction Management

  • ACID properties in a distributed setting
  • Concurrency control mechanisms (locking, timestamps, optimistic concurrency control)
  • Distributed commit protocols (Two-Phase Commit, Three-Phase Commit)

Module 6: Distributed Concurrency Control and Recovery

  • Serializable schedules and conflicts
  • Distributed deadlock detection and resolution
  • Replicated data consistency strategies
  • Failure recovery

Module 7: Replication in Distributed Databases

  • Fundamentals of data replication
  • Synchronous vs. asynchronous replication
  • Master-slave and multi-master replication models
  • Replication for availability, scalability, and fault tolerance
  • Consistency-performance trade-offs in replication

Module 8: Reliability and Security in Distributed Databases

  • Fault tolerance techniques
  • Availability in distributed systems
  • Security and privacy concerns in DDBS
  • Authentication, authorization, access control
  • Cloud-based distributed databases
  • Distributed NoSQL databases
  • NewSQL distributed databases
  • Edge and fog computing architectures for DDBS

Textbooks

  • [Recommended] “Distributed Databases: Principles and Systems” by Stefano Ceri and Giuseppe Pelagatti
  • [Optional] “Principles of Distributed Database Systems” by M. Tamer Özsu and Patrick Valduriez

Assessment

  • Assignments (20%)
  • Project (20%)
  • Midterm Exam (20%)
  • Final Exam (40%)

Polla Fattah