Click here to skip to the meat-and-potatoes. Otherwise you will end up reading a story about why I wrote this article. It reads like a preface to an online recipe and ymmv. Last chance, click here NOW to go to the good shit.
... ok now that the nerds are gone ... this one is for all the online recipe fans out there ... click here to reveal a bonus preface ...
oof, got'ya ... ok for real now, let's skip ahead ...
Writing can be as tedious as publishing, but the pain of publishing can be turned into a quick and easy experience with the help of the right software. Sites like Blogger and Medium, as well as products like WordPress, are among the easiest and most popular solutions for blogging. In my case, I wanted to make my own solution because damn it, I'm a web developer and this is my website. How hard can it really be to write software to publish blog posts?
So I took a stab at writing my own solution, and that's what I'll describe in this blog post. I set out to create a solution that matched a few criteria for my ideal self-publishing experience. In particular, I needed my custom blogging software to give me the ability to:
- write blog posts in plain HTML, nothing fancy like markdown,
- save my blog posts to files, avoiding the need for databases,
- easily integrate into my current site styles, for consistency,
- avoid javascript frameworks, to keep my site light and quick,
- create the system myself, for the sake of fun.
I realized I needed static site generation.
Static Site Generation (SSG)
What the he|| is SSG and why would we use it? Cloudflare defines SSG as ...
... a tool that generates a full static HTML website based on raw data and a set of templates. Essentially, a static site generator automates the task of coding individual HTML pages and gets those pages ready to serve to users ahead of time. Because these HTML pages are pre-built, they can load very quickly in users' browsers.
So SSG is a tool that generates static files and because those files are static, pages load quickly. But what exactly do SSG tools do to generate those static files, and when does this process happen? FreeCodeCamp describes SSG as a process of ...
... compiling and rendering a website or app at build time. The output is a bunch of static files, including the HTML file itself and assets like JavaScript and CSS.
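To make those two definitions concrete, here is a minimal sketch of the idea: raw data plus a template, turned into static HTML ahead of time. The `{{placeholder}}` syntax, the `posts` data, and the `render` function are all made up for illustration; this is not the approach I ended up with, just the smallest version of "SSG" I could think of.

```javascript
// Hypothetical template and data, for illustration only.
const template = "<article><h1>{{title}}</h1>{{body}}</article>";
const posts = [
  { slug: "hello", title: "Hello", body: "<p>First post.</p>" },
  { slug: "ssg", title: "SSG", body: "<p>Second post.</p>" },
];

// Fill each {{key}} placeholder with the matching field from the post.
function render(template, post) {
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => post[key] ?? "");
}

// "Build time": produce one static page per post, keyed by its output path.
// A real tool would write these strings to disk instead of logging them.
const pages = Object.fromEntries(
  posts.map((post) => [`${post.slug}/index.html`, render(template, post)])
);

console.log(pages);
```

Everything is computed before a browser ever asks for a page, which is where the speed comes from.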
The same FreeCodeCamp article above mentions Next.js as an option for applications built in React, which would give me excellent control and out-of-the-box config for performing SSG.
Unfortunately Next.js is not a good option for me simply because my personal website does not use React. I briefly unpacked this tangent, click here to read more.
This probably deserves a blog post in and of itself, but for illustration, most interactivity on my site is either written in typescript and compiled down to a javascript module (e.g., see my show-html-comments module; it's responsible for the HTML comment easter egg on my about page), or a small web component (e.g., see my Polaroid-like carousel). (And because I wanted my blog posts to be written in HTML, I could even write one-off javascript interactivity, like the "cooking recipe" easter egg functionality at the beginning of this post.)
I just haven't written interactivity that requires stateful layers or super complex user experiences to justify reaching for javascript frameworks (yet). Instead, my use cases are small and discrete, which can be perfectly addressed by importing small and discrete global javascript modules.
This allows me to easily add-and-remove scripts to different pages, as well as gain a performance boost because each script is bundled and loaded separately. So although I am a big fan of React, for now I'd prefer an option that will keep my javascript overhead light.
Anyways, I'm in the mood to build an SSG tool myself with vanilla javascript and node. So let's begin.
My Solution
Currently my personal website follows a simple folder structure that is traditionally used in server directories, which is recommended in the modern web dev guide:
www
└── css
└── ... css assets in here
└── html
├── index.html
└── ... other html
└── js
└── ... js assets in here
To get started, I updated my package.json with the following scripts that will trigger my custom SSG script in two situations, hot module reloading and builds:
"start": "yarn start:server & yarn start:scss & yarn start:ssg",
"start:server": "wds --config ./wds.config.mjs --app-index www/html/index.html",
"start:scss": "sass --no-source-map -w www/css --style compressed",
"start:ssg": "yarn start:ssg:blog",
"start:ssg:blog": "esr .bin/watch.mjs --dir www/html/blog --ignore index.html --cmd 'yarn build:ssg:blog'",
"build": "rm -rf public && yarn build:ssg && yarn sass && rollup -c ./rollup.config.mjs",
"build:ssg": "yarn build:ssg:blog",
"build:ssg:blog": "esr .bin/ssg.mjs --template www/html/blog/template.html && prettier --write www/html/blog/**/index.html",
A couple critical additions here:
- start runs start:ssg and therefore runs start:ssg:blog in parallel with other local development config.
  - Notice that start:ssg:blog calls .bin/watch.mjs.
  - Also notice that yarn build:ssg:blog is passed as the "--cmd" argument to the .bin/watch.mjs script.
- build runs build:ssg and therefore runs build:ssg:blog.
  - Notice that build:ssg:blog calls .bin/ssg.mjs.
watch.mjs is a handy script that will watch a directory for file changes and trigger a side-effect command in response. In this case, the command yarn build:ssg:blog runs in response to file changes. Essentially this provides hot module reloading (HMR) in my local development environment for all SSG'd content. This means that I can make changes to a blog file, save it, and immediately see the SSG'd version of that blog post in my browser.
Much of the heavy lifting for this functionality is provided by the chokidar node library, meaning the source code for this watch script is as little as 20 lines of node code. Not bad!
/**
 * watch.mjs -- file change side-effect script
 *
 * Given a directory and a command, this script will watch the target
 * directory and run the provided command whenever any files change.
 *
 * @arg {string} cmd The command to run, e.g., `--cmd 'yarn build'`
 * @arg {string} dir The target directory, e.g., `--dir www/html/blog`
 * @arg {string | undefined} ignore File patterns to ignore, e.g., `built-file.html`
 *
 */
import chokidar from "chokidar";
import { spawn } from "child_process";
import args from "./args.mjs";

const {
  dir,
  cmd,
  ignore: ignored,
} = args(["dir", "cmd"], { optional: ["ignore"] });

console.log(
  `watch.mjs watching for changes in "${dir}" and will run "${cmd}" in response \n`
);

chokidar.watch(dir, { ignoreInitial: true }).on("all", function (event, path) {
  if (ignored && path.match(ignored)) return; // break infinite loops
  // run the provided command with its args
  // (named commandArgs so it does not shadow the imported args helper)
  const [command, ...commandArgs] = cmd.split(" ");
  spawn(command, commandArgs, {
    stdio: "inherit",
  });
});
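Both of my scripts import a small args.mjs helper whose source isn't shown in this post. For the curious, here is a rough sketch of what such a helper could look like. To be clear, this is a hypothetical stand-in: the name `parseArgs`, the `--key value` parsing rules, and the error message are assumptions, and only the call shape (a list of names plus an `optional` list) is taken from the scripts in this post.

```javascript
// Hypothetical stand-in for args.mjs: parses `--key value` pairs from argv.
// `names` lists the expected arguments; anything in `optional` may be omitted.
function parseArgs(names, { optional = [] } = {}, argv = process.argv.slice(2)) {
  const known = new Set([...names, ...optional]);
  const parsed = {};
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (!token.startsWith("--")) continue;
    const key = token.slice(2);
    if (!known.has(key)) continue; // skip flags the caller did not ask for
    const next = argv[i + 1];
    if (next !== undefined && !next.startsWith("--")) {
      parsed[key] = next;
      i++; // the value has been consumed
    }
  }
  // every name that is not marked optional must be present
  for (const name of names) {
    if (!optional.includes(name) && parsed[name] === undefined) {
      throw new Error(`Missing required argument: --${name}`);
    }
  }
  return parsed;
}

// The shell delivers a quoted value like 'yarn build:ssg:blog' as one argv entry.
const watchArgs = parseArgs(["dir", "cmd"], { optional: ["ignore"] }, [
  "--dir",
  "www/html/blog",
  "--cmd",
  "yarn build:ssg:blog",
]);
console.log(watchArgs);
```

Leaving omitted optional arguments out of the returned object (rather than setting them to null) matters, because it lets the caller fall back on destructuring defaults.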
For both HMR and builds, I call ssg.mjs. This script accepts two arguments: "--template" (required) and "--base" (optional). If a base is not provided, the template's directory is assumed to be the base directory. The template is HTML that includes content "slots" within HTML comments using the following syntax: <!--ssg:xyz.html-->. For example, here is the template I used when this blog post was generated.
Content slots are identified by parsing the template for the matching syntax. For example, if the slot <!--ssg:article.html--> is defined, then the SSG script will walk recursively from the base directory to find files matching "article.html". For each file found that matches the slot, a page is generated. If multiple slots are defined and multiple matches are found in a specific directory, the content of every matching file is injected into the template and one file is generated.
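Here is a small, standalone sketch of that slot parsing. The template string and the slot content are made up for illustration, and `fill` is a simplified helper rather than the code from my script; only the `<!--ssg:xyz.html-->` syntax is the real thing.

```javascript
// Hypothetical template with two slots.
const template = [
  "<header><!--ssg:header.html--></header>",
  "<main><!--ssg:article.html--></main>",
].join("\n");

// Match only the file name between "<!--ssg:" and "-->".
const slotSyntax = /(?<=<!--ssg:).*?(?=-->)/g;
const slots = template.match(slotSyntax) ?? [];
console.log(slots); // [ 'header.html', 'article.html' ]

// Swap each slot comment for its content; unknown slots become empty strings.
function fill(template, content) {
  return template.replace(/<!--ssg:(.*?)-->/g, (_, slot) => content[slot] ?? "");
}

const page = fill(template, {
  "header.html": "<h1>Blog</h1>",
  "article.html": "<p>Hi</p>",
});
console.log(page);
```

The lookbehind and lookahead keep the comment delimiters out of the match, so each result can be used directly as a file name to look for on disk.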
Global slot content can be defined at the base directory. For example, if the template defines the slot <!--ssg:header.html--> and the file "/base-dir/header.html" exists, then the content of "/base-dir/header.html" is treated as the global, or fallback, content for that slot on every page. So if the SSG script finds a directory with only "article.html" defined, the global header from "/base-dir/header.html" is used for that page. If, however, that same directory contains its own "header.html", that content is used for the slot instead.
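The fallback rule boils down to "local wins, global fills the gaps". As a sketch, assuming the slot content has already been read into plain objects keyed by file name (the object shapes and the `resolveSlots` name are hypothetical):

```javascript
// Hypothetical global slot content, as if read from the base directory.
const globalContent = {
  "header.html": "<h1>My Blog</h1>",
  "article.html": "",
};

// Local values win; anything missing locally falls back to the global value.
// Spreading into a fresh object leaves globalContent untouched between pages.
function resolveSlots(globalContent, localContent) {
  return { ...globalContent, ...localContent };
}

// A directory with only article.html keeps the global header...
const october = resolveSlots(globalContent, {
  "article.html": "<p>October</p>",
});

// ...while a directory with its own header.html overrides it.
const november = resolveSlots(globalContent, {
  "article.html": "<p>November</p>",
  "header.html": "<h1>November</h1>",
});

console.log(october["header.html"], november["header.html"]);
```

Building a new object per directory is the important part: if the global object were mutated instead, one directory's override would leak into every page generated after it.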
Here is a diagram to further illustrate how the script will take a template and apply it to all matching slots from the base directory:
blog -- "base" - nested folders are targeted for iteration
└── index.html -- this file is ignored
└── blog-template.html -- "template" - this file is the template
| -- within this file are two slots:
| -- <!--ssg:article.html--> and <!--ssg:header.html-->
└── article.html -- this is the GLOBAL article slot default / fallback
| -- (global slots are not required; script will fail gracefully)
└── header.html -- this is the GLOBAL header slot default / fallback
└── 2020/10
└── article.html -- this is the LOCAL article slot override
└── index.html -- SSG'd file w/ LOCAL article and GLOBAL header
└── 2020/11
└── article.html -- this is the LOCAL article slot override
└── header.html -- this is the LOCAL header slot override
└── index.html -- SSG'd file w/ both LOCAL article AND header
For a more complex example, take a look at my blog directory as-of the publication of this blog post. (It looks similar to the above diagram, but with more slots!)
With just under 60 lines of source code, ssg.mjs will take a template, walk a directory, and generate static content for all matching slots.
/**
 * ssg.mjs -- static site generation script
 * Given a template and a target directory base, this script will
 * iterate over the nested directories from the base directory and
 * find files that match the slots specified in the template.
 * @arg {string | undefined} base
 * The directory to target for ssg iteration. If not defined,
 * the template's directory is used for iteration.
 * @arg {string} template The path to the template.html file.
 * The template file should include one or more HTML comments with
 * slots defined using the syntax: `<!--ssg:xyz.html-->`.
 * For each directory in the base path that includes at least 1 slot,
 * an `index.html` is generated with the slot content injected into
 * the template.
 * @example Folder structure:
 blog <-- "base" - nested folders are targeted for iteration
 └── index.html <-- this file is ignored
 └── blog-template.html <-- "template" - this file is the template
 | - within this file are two slots:
 | <!--ssg:article.html--> and <!--ssg:header.html-->
 └── article.html <-- this is the GLOBAL article slot default / fallback
 | (global slots are not required; script will fail gracefully)
 └── header.html <-- this is the GLOBAL header slot default / fallback
 └── 2020/10
 └── article.html <-- this is the LOCAL article slot override
 └── index.html <-- SSG'd file w/ LOCAL article and GLOBAL header
 └── 2020/11
 └── article.html <-- this is the LOCAL article slot override
 └── header.html <-- this is the LOCAL header slot override
 └── index.html <-- SSG'd file w/ both LOCAL article AND header
 */
import { readFileSync, writeFileSync } from "fs";
import glob from "glob";
import { dirname, join } from "path";
import args from "./args.mjs";

// setup helper data / fxs
// matches the file name inside a slot comment, e.g. "article.html"
const ssgSlotSyntax = /(?<=<!--ssg:)(.*?)(?=-->)/g;
function content(slots, __path, content = {}) {
  slots.forEach((slot) => {
    try {
      // if there is content in the target directory, use it
      const __file = readFileSync(join(__path, slot), "utf-8");
      if (__file) content[slot] = __file;
    } catch {
      // else if no global content, use empty string to clear html comments
      if (!content[slot]) content[slot] = "";
    }
  });
  return content;
}

// destructure script args
const {
  template = null,
  base: __base = join(process.cwd(), dirname(template)),
  __template = join(process.cwd(), template),
} = args(["base", "template"], { optional: ["base"] });

// derive ssg slots
const html = readFileSync(__template, "utf-8");
const slots = html.match(ssgSlotSyntax);

// gather global slot content from base into obj
// keys are file names for the content, values are the content within that file
const globalSlotContent = content(slots, __base);

// iterate over relevant nested files from the base directory
glob(
  `${__base}/**/*.html`,
  { ignore: `${__base}/*.html` },
  function (_, files) {
    const __directories = new Set(files.map((file) => dirname(file)));
    __directories.forEach((__dir) => {
      // update slots; pass a copy of the global content so that a local
      // override in one directory does not leak into the next directory
      const localSlotContent = content(slots, __dir, { ...globalSlotContent });
      const regex = new RegExp(
        Object.keys(localSlotContent)
          .map((key) => `<!--ssg:${key}-->`)
          .join("|"),
        "gi"
      );
      let sscontent = html.replace(
        regex,
        (matched) => localSlotContent[matched.match(ssgSlotSyntax)]
      );

      // update relative paths
      const relativity = (__dir.split(__base)[1].match(/\//g) || []).length + 1;
      const relativePathStr = /(?<=)("\.\.\/)/g;
      sscontent = sscontent.replace(
        relativePathStr,
        () => `"${'../'.repeat(relativity)}`
      );

      // write file
      writeFileSync(`${__dir}/index.html`, sscontent);
    });
  }
);
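The relative path step is the least obvious part of the script, so here it is pulled out into a standalone sketch. The template lives in the base directory, so its "../" paths are one level short for every folder a generated page sits below the base. The helper names and the example paths below are made up for illustration, and the sketch assumes `dir` always starts with `base`.

```javascript
// Count how many folders `dir` sits below `base`.
// e.g. "www/html/blog/2020/10" is 2 levels below "www/html/blog".
function depthBelow(base, dir) {
  const rest = dir.slice(base.length); // "/2020/10"
  return (rest.match(/\//g) || []).length;
}

// Rewrite every quoted "../ prefix so that it still resolves from `dir`.
function rebase(html, base, dir) {
  // one "../" for the template's own prefix, plus one per level of depth
  const prefix = "../".repeat(depthBelow(base, dir) + 1);
  return html.replace(/"\.\.\//g, `"${prefix}`);
}

const rebased = rebase(
  '<link rel="stylesheet" href="../css/site.css">',
  "www/html/blog",
  "www/html/blog/2020/10"
);
console.log(rebased); // <link rel="stylesheet" href="../../../css/site.css">
```

This only touches paths that start with exactly one quoted "../", so a template that already uses deeper or absolute paths would need different handling.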
Wrapping Up
It is not much but it'll do. What I have described here is my high-level approach to designing a custom SSG tool. Credit owed to this article that describes SSG in node, which I drew a little bit of influence from when creating my solution.
There are certainly opportunities to improve my solution. Some ideas of mine include:
- Single Page SSG: rather than recursively generating files from a base directory, it would be nice to target just a single directory for cases where there may be many subdirectories (e.g., generation on the fly).
- SSG from a fountain: pull data from a remote source and plug the returned content into a template.
- Hybrid SSG: pull some data from a remote source (e.g., define an endpoint within a file to ping, parse, and print).
- SSR On Top: If the above 3 tasks are implemented then it would be interesting to respond to unfound pages within a known SSG target with an SSG runner, for SSG on-the-fly. This might be helpful for cases where there are hundreds or more entries to generate. (Is this when SSG becomes SSR??)
- CMS layer, but let's not get too ahead of ourselves.