Database architecture and design review checklist
Let’s assume you’re starting work on a new database for a new project or planning to extend an existing one with additional tables. To make informed architectural and design decisions, you first need a solid understanding of the data. The following is my starting point - a set of key questions to ask before moving forward:
- Will the data support transactional, operational, and/or analytical business functions?
- What DBMS technology (SQL or NoSQL) is being planned?
- Does the data need to be centralized or distributed?
- What are the sources of the data (e.g., internal systems, external systems)?
- Does the data already exist elsewhere? If yes, why does the data need to be duplicated?
- What are the expected query patterns (read-heavy, write-heavy, mixed)?
- What are the ingestion and access requirements (per hour, per minute, etc.)? How much data (volume) will be ingested and kept?
- For heavy or high-frequency read scenarios, can caching layers (e.g., Redis, Memcached, etc.) or read-only replicas (e.g., AlwaysOn) be utilized?
- Are there relationships, constraints, or domain rules that must be enforced?
- How do international data requirements (e.g., language, currency, date/time formats) affect data design and management?
- What are the security requirements? Does the data include PII such as name, email, SSN, or DOB? What privacy or regulatory constraints apply? Does the data need encryption or obfuscation?
- What is the data retention policy?
- What are the backup and recovery requirements for the data, including the acceptable RPO and RTO for disaster recovery?
This is not an exhaustive list, and you may have additional valuable questions to consider. If you notice anything important that I may have missed, feel free to reach out and let me know!
Thanks for reading!