In this article, I will guide you step by step through creating a dynamic documentation site, adaptable to any project, in which the documentation connects to a database to extract and display data, so the information is always up to date. We will also explore how to automate the entire process, from content generation to deployment in the cloud with AWS.
The solution includes support for charts and diagrams, continuous integration (CI/CD) using a simple workflow in GitHub Actions, and automatic deployment using Terraform. Let's get started!
What is Documentation as Code?
Keeping documentation up to date is an important process in many companies that develop software, and it is often carried out with a mix of different tools, many of which are paid solutions.
For this reason, the concept of "doc as code" has emerged in recent years. It means using the same tools and workflows used in software development to manage, version, and deploy documentation.
This approach allows better tracking of documentation changes, simplifies maintenance, and applies to the documentation the same best practices already applied to the code.
Tools for Documentation as Code
For the development of these sites, it is essential to understand some practices and tools that allow us to implement this approach. Below is a detailed list of the most important aspects to cover in this tutorial.
- 📝 Markdown: The most common markup language for writing documentation due to its simplicity and integration with version control platforms and static site generators.
- 🗂️ Git: Git allows versioning of documentation just like code. Thanks to Git, every change in the documentation is recorded, enabling teams to track edits, revert changes, and collaborate more efficiently.
- 🔄 Gitflow: This methodology provides a structured workflow to manage versions and revisions of documentation, ensuring that changes are approved and tested before reaching production. Gitflow also facilitates collaboration between teams, allowing for safe and organized change management.
- ☁️ Cloud Services: Using services like AWS S3, Netlify, or GitHub Pages, you can deploy documentation at a low cost. These services allow the creation of fast, secure, and easily accessible static sites.
- 🌐 Static Site Generators: Tools like Docusaurus, Jekyll, or Hugo convert Markdown documentation into a navigable website, allowing you to create rich and organized documentation without a server.
- 🚀 Continuous Integration (CI/CD): CI/CD pipelines (e.g., GitHub Actions, GitLab CI, or Jenkins) allow you to automatically deploy documentation when a new version is merged or modifications are approved. This ensures the documentation is always up-to-date.
Advantages of Docs-as-Code
- ✅ Consistency and Quality: By using version control and change reviews, the documentation remains consistent and of high quality.
- ⚙️ Automation: CI/CD tools enable automation of documentation deployment, reducing update times and minimizing errors.
- 🤝 Efficient Collaboration: With tools like Git, teams can work on the documentation in parallel without overwriting each other's changes.
- 🔧 Simplified Maintenance: Maintaining documentation is integrated into the development workflow, making updates easier as the code evolves.
📄 MkDocs
MkDocs is a static site generator written in 🐍 Python, designed specifically for documenting projects. Its goal is to simplify creating documentation using Markdown files, which are easy to write and read.
With minimal configuration, MkDocs converts Markdown files into a navigable and well-structured documentation website, making it ideal for developers and teams who want to keep their documentation up to date.
✏️ MkDocs Material
MkDocs Material is an advanced theme for MkDocs that follows Google's Material Design guidelines.
🚀 Key features include:
- 📱 Responsive Design: Automatically adapts to any screen size.
- 🎨 Customization: Easily modify colors, fonts, favicon, and logo to match your project's visual identity.
- 🔍 Search Interface: Advanced search groups results and highlights searched terms, helping users find the information they need.
- ⚡ Lazy Loading: Implements lazy loading for search results, improving performance and reducing load times.
- 🔗 Integrations: Compatible with Google Analytics, Disqus, and GitHub, facilitating traffic analysis, user feedback, and direct connection to the project repository.
✏️ Mermaid
Mermaid is a JavaScript library for creating diagrams and charts from text. By integrating with MkDocs Material, Mermaid allows you to generate visualizations such as flowcharts, entity-relationship diagrams, and other charts within the documentation without external tools.
🧩 Dynamic Page: Jinja
Jinja is a Python templating engine that renders text templates by filling placeholders with variables and data, such as the contents of Python dictionaries. It is commonly used for generating dynamic HTML and sending personalized emails, and it works just as well for Markdown files.
🦖 Docusaurus Overview
Docusaurus is an open-source project released by Meta (then Facebook) in 2017 that simplifies the creation, deployment, and maintenance of documentation websites. It allows the use of Markdown and MDX to write content, while its React-based core enables full customization of the styles to fit the specific needs of the project.
Additionally, Docusaurus supports Mermaid through the @docusaurus/theme-mermaid plugin, enabling the inclusion of charts and diagrams directly within the documentation.
🎨 Diagram as Code
Diagram as Code is an approach that allows you to create diagrams through code, rather than using traditional graphic tools. Instead of manually building diagrams, you write code in a text file to define the structure, components, and connections of your diagrams.
This code is then translated into graphical images, making it easier to integrate and document in software projects. It's especially useful for creating and updating architectural and flow diagrams programmatically.
🎨 Diagram as Code: Example of Creating Cloud Diagrams
The Diagrams library lets you generate architecture blueprints using the icons of the major cloud technologies. These diagrams are built from nodes, and in our example we'll use the cloud-related nodes for AWS services.
For more details on how I created this, you can read my article about Diagram as Code, and the full implementation can be found in this repository:
r0mymendez / diagram-as-code
A tutorial on how to create a documentation project using the 'Doc as diagram' methodology
What is Diagrams?
Diagrams is a 🐍 Python library that implements the Diagram as Code approach, enabling you to create architectural infrastructure diagrams and other types of diagrams through code. With Diagrams, you can easily define cloud infrastructure components (such as AWS, Azure, and GCP), network elements, software services, and more, all with just a few lines of code.
📚 Use Case: Creating a Documentation Site for a Machine Learning Project
In this use case, I will create a documentation site for a machine learning project involving 🏥 hospital data. The goal is to build an interactive documentation site using MkDocs initially and later migrate it to Docusaurus. The site will include both static and dynamic components to meet specific requirements, such as embedding visual diagrams and updating data dynamically from a SQLite database.
🚀 Key Features of the Documentation Site
- Visual Representations: I will embed diagrams created with Diagrams (Diagram as Code) to illustrate the architecture of the machine learning pipeline.
- Dynamic Data Updates: The documentation will display the version and last update date dynamically, pulling this information from a SQLite database to ensure accuracy and relevance.
- Sample of data: The documentation will include a sample from the Synthea patient table, showcasing synthetic data as an example.
📄 Pages of the Site
Based on these requirements, our documentation site will have the following pages:
- 📄 Home: The homepage of the documentation.
- 📄 Tables: Explanation of the Synthea data tables and their uses.
- 📄 Architecture: A detailed overview of the data processing architecture, hosted on AWS.
- 📄 Glossary: A glossary of terms used throughout the project.
MkDocs Implementation
In this section, we'll walk through the steps to set up a documentation project using MkDocs from scratch and explain its directory structure.
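🔧 Prerequisites for MkDocs
To get started, you'll need to install the following 🐍 Python libraries:
Install MkDocs and the Material theme
pip install mkdocs mkdocs-material
Install the additional libraries that enable dynamic content updating (sqlite3 and shutil are part of Python's standard library and need no installation; tabulate is required by the pandas to_markdown method used later)
pip install aiosql pandas jinja2 tabulate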
🔧 MkDocs: Project Setup
Initialize the Project
Start by creating a new MkDocs project. Run the following commands in your terminal:
mkdocs new mkdocs
cd mkdocs
The first command creates a basic MkDocs project with a default structure in a folder named mkdocs, and the second moves into it.
Explore the Directory Structure
Once the MkDocs site is created, you need to add the following files and folders, as they are not included by default.
Remember, the links to the repository are provided at the end of this post for your reference, and each component will be explained in detail below.
📁 docs/
├── 📁 img/
├── `architecture.md`
├── `glossary.md`
├── `index.md`
├── `tables.md`
├── 📁 template/
│   ├── 📁 db/
│   │   ├── 📁 data/
│   │   │   ├── hospital.db
│   │   ├── 📁 queries/
│   ├── `architecture.md`
│   ├── `glossary.md`
│   ├── `index.md`
│   ├── `tables.md`
│   └── `update.py`
📁 infraestructure/
📁 .github/
├── 📁 workflows/
│   ├── main.yml
📄 mkdocs.yml
📂 MkDocs: Component Overview
Component | Directory | Description |
---|---|---|
Database (db) | db | Contains the SQLite database (hospital.db) and queries (metadata.sql, person.sql) to manage dynamic data. Learn more about managing SQL queries in Python in my previous article: Python Projects with SQL. |
🖋️ Templates & Pages | template | Markdown templates: index.md, tables.md, architecture.md, glossary.md. Supports Mermaid diagrams, embedded images, and database-driven content. |
🖼️ Static Content (docs) | docs | Final site content generated by update.py, including images (img/) and dynamic content populated from template. |
🌐 Infrastructure (infraestructure) | infraestructure | Terraform scripts (main.tf, variables.tf) to deploy an S3 bucket for documentation hosting. |
📄 MkDocs: Configuring mkdocs.yml
Once we have our project structure set up, we will configure it step by step, starting with the mkdocs.yml file. This file defines the structure and settings for your documentation site. Here's how it should be structured:
mkdocs.yml
site_name: Hospital Documentation
nav:
  - Home: index.md
  - Synthea Tables: tables.md
  - AWS Architecture: architecture.md
  - Glossary: glossary.md
markdown_extensions:
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid
          # Formatter documented by Material for MkDocs for Mermaid fences
          format: !!python/name:pymdownx.superfences.fence_code_format
theme:
  name: material
In this configuration file, the nav section lists the pages that will be accessible from the menu. The markdown_extensions section then enables Mermaid through a custom fence, which is explained in the next section. Finally, the theme section applies the Material theme, enabling the styling and components available within this library.
✏️ MkDocs: Mermaid Extension
As mentioned earlier, Mermaid is a JavaScript library for creating diagrams and charts from text. In our case, we will use it to generate an Entity Relationship Diagram (ERD) on the tables page of the documentation.
In the repository, you can see how this code is built from the Entity Relationship Diagram (ERD) found in the official Synthea documentation. You can also check the example of the tables page in the following link: tables.md.
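One way to keep an ERD like this in sync with the database is to generate the Mermaid text from the schema itself. The sketch below is my own assumption about how that could be done, not the code from the repository: it reads table and foreign-key metadata from SQLite with PRAGMA statements and emits an erDiagram block. The table names in the usage example are hypothetical, and column types containing spaces or parentheses would need cleaning before Mermaid accepts them.

```python
import sqlite3


def sqlite_to_mermaid_erd(connection):
    """Build Mermaid erDiagram text from the tables in a SQLite database."""
    lines = ["erDiagram"]
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    for table in tables:
        lines.append(f"    {table} {{")
        # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk)
        for _, name, col_type, _, _, pk in connection.execute(
            f'PRAGMA table_info("{table}")'
        ):
            key = " PK" if pk else ""
            lines.append(f"        {col_type or 'TEXT'} {name}{key}")
        lines.append("    }")
    for table in tables:
        # PRAGMA foreign_key_list returns (id, seq, table, from, to, ...)
        for fk in connection.execute(f'PRAGMA foreign_key_list("{table}")'):
            parent, column = fk[2], fk[3]
            # One parent row relates to zero or more child rows
            lines.append(f"    {parent} ||--o{{ {table} : {column}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical two-table schema, used only to demonstrate the output
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patients (id TEXT PRIMARY KEY, birthdate TEXT);
        CREATE TABLE encounters (
            id TEXT PRIMARY KEY,
            patient TEXT REFERENCES patients(id)
        );
        """
    )
    print(sqlite_to_mermaid_erd(conn))
```

The returned string can be placed inside a mermaid fence in the template, so the diagram changes whenever the schema does.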
⚙️ MkDocs: Dynamic Content with Jinja
To enable dynamic content generation for our documentation site, we'll use Jinja to process templates and replace placeholders with actual data. Below is a step-by-step breakdown:
- Set Up a template Folder: Create a folder named template to store all Markdown files for the site. These files should include placeholders. For instance, in index.md, you might have placeholders like {{database.version_date}} and {{database.version}}.
- Utilize Placeholders: Placeholders are dynamic variables in the Markdown files. These variables will be updated automatically using Python dictionaries to inject relevant data.
- Generate Dynamic Content with update.py:
  - Prepare your Markdown templates by identifying the sections where dynamic data is required.
  - Use a Python script (update.py), available in my repository, to process the templates. The script performs the following tasks:
    - Database Connection: Connects to a SQLite database to fetch the latest values.
    - Template Rendering: Uses the Jinja library to substitute placeholders with data from the database.
    - File Generation: Outputs updated Markdown files to the docs folder, ready for rendering in MkDocs.
from jinja2 import Template


def render_template(template_str, data):
    """Render the template with the data."""
    template = Template(template_str)
    return template.render(data)


# Data structure
data_dict = {
    'database': {
        'version': 1,
        'version_date': '2024-01-01'
    }
}

# Render the template
rendered_content = render_template("Data updated to {{database.version_date}}", data_dict)
print(rendered_content)  # Data updated to 2024-01-01
By following these steps, you can automate the updating process for your documentation site, ensuring the content remains dynamic and relevant without manual edits.
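The three tasks performed by update.py (database connection, template rendering, file generation) can be sketched end to end as follows. This is a minimal illustration, not the script from the repository: the metadata table and its columns are assumptions, and a small regex-based render function stands in for Jinja so the example runs with the standard library only. It handles plain {{a.b}} placeholders and nothing more.

```python
import re
import sqlite3
from contextlib import closing
from pathlib import Path

# Matches {{ a.b.c }} style placeholders (stand-in for Jinja syntax)
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(text, data):
    """Replace dotted placeholders with values from a nested dict."""
    def lookup(match):
        value = data
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)
    return PLACEHOLDER.sub(lookup, text)


def fetch_metadata(db_path):
    """Read the latest version row from a hypothetical 'metadata' table."""
    with closing(sqlite3.connect(db_path)) as connection:
        version, version_date = connection.execute(
            "SELECT version, version_date FROM metadata "
            "ORDER BY version DESC LIMIT 1"
        ).fetchone()
    return {"database": {"version": version, "version_date": version_date}}


def update_docs(template_dir, output_dir, db_path):
    """Render every Markdown template and write the result to output_dir."""
    data = fetch_metadata(db_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for template in sorted(Path(template_dir).glob("*.md")):
        rendered = render(template.read_text(encoding="utf-8"), data)
        (output_dir / template.name).write_text(rendered, encoding="utf-8")
```

Replacing render with jinja2's Template(text).render(data) gives the behaviour described in this section, with the added benefit of Jinja filters and loops.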
Dynamic Update of Data Tables
In the next example, we will update the content of the tables.md file to show a sample of the patients table from the database. To do this, we will create a placeholder {{table.person}} within the Markdown file. The idea is to fetch the data from the patients table dynamically, and then use the Jinja library along with pandas to convert the query results into a Markdown table.
Here's an example of how the tables.md file looks with the placeholder:
#### Example Person Table
## Person Table
{{table.person}}
The process is as follows:
- Query the Database: The script queries the patients table in the SQLite database to fetch relevant records.
- Convert to Markdown: Using pandas, the results of the query are converted into a Markdown table.
- Replace the Placeholder: The {{table.person}} placeholder in the tables.md file is replaced by the generated Markdown table.
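import sqlite3
from pathlib import Path

import aiosql
import pandas as pd
from jinja2 import Template


def get_queries():
    """Load the SQL queries stored in template/db/queries."""
    return aiosql.from_path('template/db/queries', 'sqlite3')


def get_table_person(db_name):
    """Get a DataFrame from the PATIENTS table."""
    query = get_queries().get_example_patients.sql
    connection = sqlite3.connect(db_name)
    try:
        df = pd.read_sql_query(query, connection)
    finally:
        connection.close()
    return df


def update_tables_file(template_path, output_dir, db_name):
    """Render the tables.md template and write the result to output_dir."""
    df = get_table_person(db_name)
    # Convert the DataFrame to Markdown (to_markdown needs the tabulate package)
    data_dict = {'table': {'person': df.to_markdown(index=False)}}
    template = Template(Path(template_path).read_text(encoding='utf-8'))
    output_file = Path(output_dir) / Path(template_path).name
    output_file.write_text(template.render(data_dict), encoding='utf-8')


print(get_table_person("hospital.db"))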
This way, the documentation always reflects up-to-date data, displaying dynamic examples based on the actual content from the database.
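If you would rather not depend on pandas for this step, the same Markdown table can be built directly from the query results. The sketch below is an alternative I am suggesting, not what the repository does, and the query you pass in is up to you.

```python
import sqlite3


def query_to_markdown(connection, query):
    """Run a query and return its result as a Markdown table string."""
    cursor = connection.execute(query)
    # cursor.description holds one tuple per column; the name comes first
    headers = [column[0] for column in cursor.description]
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in cursor:
        # Escape pipes so cell values cannot break the table layout
        cells = ["" if v is None else str(v).replace("|", "\\|") for v in row]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

The returned string can be stored under data_dict['table']['person'] exactly like the output of to_markdown.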
⚙️ MkDocs: Final Workflow
- Create Templates: Develop your pages in the docs/template directory.
- Run update.py: Populate dynamic content and generate the final files in docs/.
- Preview Locally: Use mkdocs serve to preview the site on localhost.
- Build for Deployment: Use mkdocs build to generate the static site (written to the site/ folder by default).
- Deploy: Use Terraform to deploy the site to an AWS S3 bucket. Refer to the deployment section of this post for detailed instructions.
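The update and build steps of this workflow can be chained in a small helper so that they always run in the same order. This is only a sketch under my own assumptions: the path to update.py is taken from the directory tree shown earlier, and the dry_run flag is a convenience I added for checking the commands without executing them.

```python
import subprocess
import sys

# Update and build steps; the script path is an assumption based on the
# directory tree shown earlier in this post.
STEPS = [
    [sys.executable, "docs/template/update.py"],
    ["mkdocs", "build"],
]


def run_workflow(steps=STEPS, dry_run=False):
    """Run each step in order and stop at the first failure."""
    for command in steps:
        print("$", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code
            subprocess.run(command, check=True)
    return [list(command) for command in steps]
```

Calling run_workflow(dry_run=True) prints the commands without running them, which is handy when wiring the same steps into a CI job.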
🦖 Docusaurus Implementation
In the following sections, I will provide detailed steps and insights on how to implement a documentation site using Docusaurus. This includes setup, customization, and deployment options.
🚀 Key Features of Docusaurus
- 📶 Mermaid Support: Similar to MkDocs, Docusaurus supports Mermaid for embedding diagrams.
- ⚛️ React Components: Built on React, Docusaurus enables the integration of dynamic components into your documentation.
- 🔄 Dynamic Content: Leverages Python scripts to fetch and update content dynamically from a SQLite database.
🔧 Docusaurus Setup: From Scratch
To get started with Docusaurus, we follow a quick setup process, which is very similar to the steps we used for MkDocs but with different tools.
- Create a New Docusaurus Project: First, install Node.js and run the following command to create a new Docusaurus site:
npx create-docusaurus@latest my-website classic
- Install Mermaid Package: To enable Mermaid diagrams, move into the project directory and install the required package:
cd my-website
npm install @docusaurus/theme-mermaid
- Run the Development Server: From the project directory, start the development server:
npx docusaurus start
- Visit the Site: Your site will be live locally at http://localhost:3000.
🔧 Docusaurus Customization: Configuration
The configuration file docusaurus.config.js is where we customize the title, theme, and navigation, and enable features like Mermaid for diagram rendering.
Example snippet for enabling Mermaid:
module.exports = {
  title: 'Hospital Documentation',
  tagline: 'Documentation for Hospital Data ML Project',
  favicon: 'img/favicon.ico',
  url: 'https://your-site-url.com',
  markdown: {
    mermaid: true, // Enable Mermaid diagrams
  },
  // The Mermaid theme must also be registered for diagrams to render
  themes: ['@docusaurus/theme-mermaid'],
  themeConfig: {
    navbar: {
      title: 'Hospital Docs',
      items: [
        { to: 'docs/', label: 'Home', position: 'left' },
        { to: 'docs/tables', label: 'Tables', position: 'left' },
        { to: 'docs/architecture', label: 'Architecture', position: 'left' },
        { to: 'docs/glossary', label: 'Glossary', position: 'left' },
      ],
    },
    footer: {
      style: 'dark',
      links: [
        { label: 'GitHub', href: 'https://github.com/your-repo' },
      ],
    },
  },
};
🔧 Docusaurus: Customizing the Homepage
To customize the homepage, we modify the src/components/HomepageFeatures/index.js file. Here, you can adjust the FeatureList array to update the features displayed on the homepage.
📂 Docusaurus Content Organization and Structure
Just like in MkDocs, Docusaurus supports Markdown files for content, and we organize the structure as follows:
- Template Folder: Store your Markdown files in the docs/template directory, and create a Python script (similar to update.py) to fetch and populate dynamic data into these templates.
- Category File (_category_.json): To manage the label and order of each folder in the sidebar, create a _category_.json file in each folder. For example:
Architecture
├── architecture.md
├── img
└── _category_.json
_category_.json example:
{
  "label": "Architecture",
  "position": 2,
  "link": {
    "type": "generated-index",
    "description": "AWS Data Processing Blueprint"
  }
}
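Since the category file has the same shape in every folder, it can also be generated by the same Python tooling that fills the templates. The helper below is a hypothetical sketch, not part of the repository: it writes one category file per folder, named _category_.json as Docusaurus expects, and derives the sidebar position from the order in which the folders are listed.

```python
import json
from pathlib import Path


def write_category_files(docs_dir, categories):
    """Write a _category_.json file into each listed sub-folder of docs_dir.

    categories maps a folder name to (label, description); sidebar positions
    follow the order of the mapping, starting at 1.
    """
    written = []
    for position, (folder, (label, description)) in enumerate(
        categories.items(), start=1
    ):
        target = Path(docs_dir) / folder
        target.mkdir(parents=True, exist_ok=True)
        config = {
            "label": label,
            "position": position,
            "link": {"type": "generated-index", "description": description},
        }
        path = target / "_category_.json"
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
        written.append(path)
    return written
```

For example, passing {"Tables": ("Tables", "Synthea tables"), "Architecture": ("Architecture", "AWS Data Processing Blueprint")} produces an Architecture file equivalent to the one shown above.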
⚙️ Docusaurus: Dynamic Data with Jinja
To incorporate dynamic content, such as database tables, we use a 🐍 Python script named update.py, which you can find in the repository.
This script fetches data from a SQLite database and processes the Markdown files stored in the template folder. It then fills these files with the fetched data and copies them into the docs folder, preparing them for site rendering.
This workflow ensures that the content remains up-to-date and ready for deployment, following a similar approach to what we implemented with MkDocs.
⚙️ Docusaurus: Final Workflow
- Create Templates: Develop your Markdown files within the docs/template directory.
- Run Python Script: Use the script to dynamically populate data into the templates.
- Preview Locally: Run npx docusaurus start to preview the site.
- Build for Deployment: Once ready, use npx docusaurus build to generate the static site.
- Deploy: Host the static files on your preferred platform, such as AWS S3.
🚀 Deployment
In this section, we will cover the deployment process for both MkDocs and Docusaurus using AWS S3 for hosting. While the deployment steps are the same for both tools, the installation processes differ, with MkDocs being Python-based and Docusaurus being JavaScript-based.
Infrastructure Setup with Terraform
To deploy a static documentation site to AWS S3, we use Terraform to provision and configure the required resources. The setup defines the S3 bucket, enables static website hosting, and configures public access with a bucket policy to allow read-only access. You can find the main.tf file in the repository.
🚀 Key Components for S3 Deployment
- S3 Bucket Creation: The resource to create the S3 bucket where the documentation will be hosted.
- Static Website Hosting: Configuration for static web hosting, setting index.html and error.html as the main and error documents.
- Public Access Configuration: Manages public access to the S3 bucket, ensuring it is configured for read-only access.
- Bucket Policy: Allows public access to retrieve the documentation content from the S3 bucket.
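To make the read-only bucket policy concrete, the sketch below builds the JSON document that such a policy usually contains: a single statement allowing anyone to call s3:GetObject on the objects in the bucket. It is an illustration rather than the policy from the repository, and the bucket name is a placeholder. In Terraform, a document like this is what the aws_s3_bucket_policy resource receives in its policy argument.

```python
import json


def public_read_policy(bucket_name):
    """Return an S3 bucket policy that lets anyone read objects, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # The trailing /* targets the objects, not the bucket itself
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "my-docs-bucket" is a placeholder, not the bucket used in the repository
    print(json.dumps(public_read_policy("my-docs-bucket"), indent=2))
```

Because the only allowed action is s3:GetObject, visitors can download pages but cannot list, upload, or delete anything.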
The complete Terraform configuration file for deploying the site is available in the repository, together with the GitHub Actions workflow: a CI/CD pipeline that automates the deployment process.
GitHub Actions Configuration
Make sure to configure your AWS credentials in the GitHub repository secrets under Settings > Secrets and variables > Actions. This will allow GitHub Actions to securely access your AWS account and perform actions like uploading files to S3 when you push changes to the main branch.
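The workflow itself lives in the repository. The sketch below only illustrates, under my own assumptions, what the upload step has to get right: every file in the build output needs an object key relative to the site root and a suitable Content-Type, otherwise browsers may download pages instead of rendering them. The function builds the upload plan and nothing else; the transfer itself (for example with the AWS CLI command aws s3 sync, or with boto3) is left out on purpose.

```python
import mimetypes
from pathlib import Path


def plan_uploads(site_dir):
    """List (local_path, s3_key, content_type) for every file under site_dir."""
    site_dir = Path(site_dir)
    plan = []
    for path in sorted(p for p in site_dir.rglob("*") if p.is_file()):
        # Keys always use forward slashes, whatever the local OS uses
        key = path.relative_to(site_dir).as_posix()
        content_type = (
            mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        )
        plan.append((path, key, content_type))
    return plan
```

Running plan_uploads on the build folder before a deployment is also a quick way to review exactly which files would be published.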
Repositories
Below are the links to all the code to deploy your documentation site. If you find it useful, you can leave a star ⭐️ and follow me to receive notifications of new articles. This will help me grow in the tech community and create more content.
- MkDocs Deployment: GitHub Repository for MkDocs
r0mymendez / doc-as-code-mkdocs
A tutorial on how to create a documentation project using the 'Doc as Code' methodology
- Docusaurus Deployment: GitHub Repository for Docusaurus
r0mymendez / doc-as-code-docusaurus
A tutorial on how to create a documentation project using the 'Doc as code' methodology
🔍 Final Conclusions: MkDocs vs. Docusaurus
Both solutions are easy to implement. The points below summarize the main differences; the best choice depends on your context, your team's knowledge, and the complexity you need to support.
- 💻 Language & Customization: MkDocs is Python-based, with simple YAML configurations and templates, ideal for quick setups. On the other hand, Docusaurus is React-based, offering advanced customization and interactive components, making it more suitable for users needing more control over visuals.
- 📑 Markdown & Rendering: Both use Markdown, but Docusaurus allows for interactive elements, making it better for dynamic content.
- ⚙️ Complexity: Docusaurus is better for complex documentation applications, such as those with login systems. MkDocs is simpler, but Docusaurus offers more flexibility for styling and features.
- 👥 Community: Docusaurus has a strong community with Discord and 74 plugins, while MkDocs relies on GitHub discussions for community support.
- ☁️ Amazon Deployment: You can deploy a static site to S3, reducing deployment costs, and also use CI/CD for automatic deployment.
📚 References
- MkDocs: https://www.mkdocs.org/
- MkDocs Material: https://squidfunk.github.io/mkdocs-material/
- Diagrams: https://diagrams.mingrammer.com/
- Docusaurus: https://docusaurus.io/
- Jinja: https://jinja.palletsprojects.com/en/stable/
- Git Book - What is doc as code: https://www.gitbook.com/blog/what-is-docs-as-code
- Write the docs: https://www.writethedocs.org/guide/docs-as-code/