XSLT Static Site Generator
For a project where our agency was contracted to design and deliver HTML, CSS and JavaScript to the client, I created a static site generator to quickly prototype web page layouts as static HTML files. The generated files still carry all the features of the layout that we intended to deliver to the client, such as current classes on selected navigation items.
The best way to deliver quality code to the client was to rely on a W3C standard for templating: XSLT. Rather than manually editing pages across the entire set of layouts, I was able to run a series of commands on the command line to generate the pages for the site. Because these commands rely on xsltproc, which ships with macOS and is installed or readily available on most Linux distributions, it was a great way to use HTML preprocessing in our front end development process.
Preprocessors for HTML
The Preprocessors for HTML article describes the basic process of using xsltproc to transform XML data with XSLT stylesheets to output HTML.
Most of my work has involved the development of static HTML prototypes to be integrated into custom web applications or content management systems. Manually developing the prototypes as static HTML is usually fine when I am dealing with a relatively small number of pages, but as the requirements of the application grow, it is usually necessary to abandon the static work on the HTML files and work directly within the application or CMS. This often involves a fair amount of overhead just to get a local development environment set up. And the design process often gets bogged down in the complexities of building and maintaining the system, rather than staying focused on the design itself.
As the page complexity increases and the number of pages increases, there is a point where maintaining a repository of static HTML files, in the midst of a design process, becomes impractical. But the requirements of this project demanded the delivery of static HTML files where the site had a very complex and deeply nested navigation structure. I needed to quickly find a solution to managing the HTML files in a way that our team could easily maintain without the overhead of installing a CMS.
XML and XSLT
The system that we are integrating with is a custom PHP-based solution built on XML and XSLT. So, it made sense to enable easy integration into the client CMS by building our templates with XML and XSLT.
The solution I came up with has evolved over time, but I eventually came to pattern the file structure after what I am most familiar with, the XSLT-powered open source content management system, Symphony CMS.
Directory Structure
The main difference between the directory structure of the Symphony workspace and the XSLT static site generator is the addition of the data directory, which contains XML data used by all pages. Also, for static site generation, there is no need for any of the PHP files used by the system, so the only other directories needed are the pages and utilities directories.
site/
├── index.xml
├── workspace/
│ ├── data/
│ │ ├── navigation.xml
│ ├── pages/
│ │ ├── index.xsl
│ ├── utilities/
│ │ ├── master.xsl
XML Data
When creating HTML output with XSLT, you always need to start with XML data. XSLT (Extensible Stylesheet Language Transformations) is itself written in XML (Extensible Markup Language). With XSLT, you can output any text-based format that you like, but it shines in its ability to transform well-formed XML data into other forms of XML, such as XHTML.
You don’t need much data to transform into output. You could start with something as simple as a single node.
data.xml
<?xml version="1.0" encoding="utf-8" ?>
<data/>
Then, use the simplest XSLT stylesheet.
hello.xsl
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">Hello</xsl:template>
</xsl:stylesheet>
Run the following command in the same directory as these files and the output will be a simple text file containing the word Hello:
xsltproc -v -o hello.txt hello.xsl data.xml
hello.txt
Hello
Static Site Generation Files
For the purposes of what I needed to accomplish, I only needed a small set of data to set specific properties.
index.xml
I create an XML file that will serve as the data source for page-specific parameters, such as page title, the URL handle for the current page, and how to navigate to the root directory relative to the current directory.
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="workspace/pages/index.xsl" ?>
<!-- xsltproc -v -o index.html workspace/pages/index.xsl index.xml -->
<data>
<params>
<website-name>Site Name</website-name>
<page-title>Home</page-title>
<current-page>home</current-page>
<root>./</root>
</params>
</data>
The file also includes the xml-stylesheet processing instruction, which lets a browser transform the XML with the specified XSL file when it opens the XML file.
And, in case you want to run the xsltproc command from the directory that contains the index.html file, a comment stores the command.
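The value of the root parameter depends on how deep the page sits below the site root. Here is a minimal sketch of how that value could be derived from the path of a data file, assuming the path is given relative to the site root. The root_for function and the products/widgets path are hypothetical and not part of the original project.

```shell
#!/bin/bash
# Hypothetical helper: print the relative path from a page's directory
# back to the site root, for use as the <root> value in its XML file.
root_for() {
  local dir
  dir=$(dirname "$1")            # "." for pages in the site root
  if [ "$dir" = "." ]; then
    printf './\n'
    return
  fi
  # Replace every segment of the directory path with "..".
  printf '%s/\n' "$(printf '%s' "$dir" | sed 's|[^/][^/]*|..|g')"
}

root_for index.xml
root_for about/index.xml
root_for products/widgets/index.xml
```

For a page in the site root this prints ./ and for a page one directory down it prints ../, which matches the way the root value is used to prefix asset paths.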
index.xsl
The XML file is transformed into HTML by XSLT stylesheets stored in the workspace/pages directory.
A page template can start out very basic:
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="../utilities/master.xsl" />
<xsl:template match="data">
<h2><xsl:value-of select="params/page-title" /></h2>
</xsl:template>
</xsl:stylesheet>
There’s not much going on in this template other than an xsl:import instruction and a match template that outputs an h2 element with the page title. The intention is that the page template will contain only page-specific content.
The important thing to note in this file is that the match template uses an XPath pattern as the value of the match attribute. This template matches the data element of the XML file. As the processor walks the XML tree and finds the data element, it creates output by following the instructions found in the xsl:template element.
While the data element is the outermost element of the XML document, the root node of the document sits above it as its parent. The root node is matched by the template with a forward slash (/) as the value of the match attribute. We find this template in the imported master.xsl stylesheet.
master.xsl
The index.xsl page template imports the master.xsl stylesheet from the workspace/utilities directory.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"
doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
omit-xml-declaration="yes"
encoding="UTF-8"
indent="yes" />
<xsl:template match="/">
<html>
<head>
<title><xsl:value-of select="data/params/website-name" /></title>
</head>
<body>
<h1><xsl:value-of select="data/params/website-name" /></h1>
<xsl:apply-templates />
</body>
</html>
</xsl:template>
</xsl:stylesheet>
The master template is being used to manage the structure of the page outside of the main content area.
HTML Output
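If you view the XML file in a browser that supports XSLT, you’ll see the generated HTML. Some browsers refuse to load a stylesheet from a local file, so you may need to serve the directory over HTTP. If you view the source, you’ll probably see the original XML file. You can view the rendered source with browser developer tools.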
To actually render the HTML file and save the file to the directory, run this command inside the directory containing the index.xml file:
xsltproc -v -o index.html workspace/pages/index.xsl index.xml
The command will use the XSLT processor to process the index.xml file with the workspace/pages/index.xsl file and output to the index.html file. This is the command that is documented at the top of the XML file.
index.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Site Name</title>
</head>
<body>
<h1>Site Name</h1>
<h2>Home</h2>
</body>
</html>
Now, the directory structure should look like this:
site/
├── index.html
├── index.xml
├── workspace/
│ ├── pages/
│ │ ├── index.xsl
│ ├── utilities/
│ │ ├── master.xsl
The index.html file was created by the XSLT processor. At this point we have only a single page and no data shared by all pages, so I haven’t included the data directory yet.
The Build File
We can set up a simple shell script that runs the XSLT processing with a single command.
Create a file called build in the root directory with the following contents:
#!/bin/bash
xsltproc -v -o index.html workspace/pages/index.xsl index.xml;
I have added a semicolon to the end of the command as a visible separator for when we eventually add more pages to the list. The shell does not require it, because each command sits on its own line.
Make sure the file has the correct permissions to be able to execute the script:
chmod 755 build
Then, simply call the shell script from the root directory.
./build
That’s the basic example. The actual templates are a little more complex. Let’s see if we can break it down into understandable pieces.
XSLT Templates
What makes XSLT really interesting is the ability to set variables and parameters, and the ability to set modes on templates and override templates. All of these things help to keep the process of building pages DRY (Don’t Repeat Yourself). So, let’s start by breaking down the page into manageable pieces, so we need only edit each part in one place.
master.xsl
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fb="http://ogp.me/ns/fb#">
<xsl:import href="../utilities/head.xsl" />
<xsl:import href="../utilities/body.xsl" />
<xsl:output method="xml"
doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
omit-xml-declaration="yes"
encoding="UTF-8"
indent="yes" />
<!-- Page Parameters -->
<xsl:param name="root" select="/data/params/root" />
<xsl:param name="website-name" select="/data/params/website-name" />
<xsl:param name="page-title" select="/data/params/page-title" />
<xsl:param name="current-page" select="/data/params/current-page" />
<xsl:param name="parent-page" select="/data/params/parent-page" />
<xsl:param name="root-page">
<xsl:choose>
<xsl:when test="/data/params/root-page">
<xsl:value-of select="/data/params/root-page" />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$current-page" />
</xsl:otherwise>
</xsl:choose>
</xsl:param>
<xsl:param name="navigation" select="document('../data/navigation.xml')" />
<xsl:param name="has-section-nav" select="false()" />
<!-- Directories -->
<xsl:param name="css" select="concat($root, '_assets/stylesheets/css/')" />
<xsl:param name="scripts" select="concat($root, '_assets/scripts/')" />
<xsl:param name="images" select="concat($root, '_assets/images/')" />
<xsl:template match="/">
<xsl:comment><![CDATA[[if lt IE 7]> <html class="ie ie6 lt-ie7 no-js" lang="en" xmlns:fb="http://ogp.me/ns/fb#"> <![endif]]]></xsl:comment>
<xsl:comment><![CDATA[[if IE 7]> <html class="ie ie7 lt-ie8 no-js" lang="en" xmlns:fb="http://ogp.me/ns/fb#"> <![endif]]]></xsl:comment>
<xsl:comment><![CDATA[[if IE 8]> <html class="ie ie8 lt-ie9 no-js" lang="en" xmlns:fb="http://ogp.me/ns/fb#"> <![endif]]]></xsl:comment>
<xsl:comment><![CDATA[[if IE 9]> <html class="ie ie9 lt-ie10 no-js" lang="en" xmlns:fb="http://ogp.me/ns/fb#"> <![endif]]]></xsl:comment>
<xsl:comment><![CDATA[[if gt IE 9]><!]]></xsl:comment><html class="no-js" lang="en" xmlns:fb="http://ogp.me/ns/fb#"><xsl:comment><![CDATA[<![endif]]]></xsl:comment>
<xsl:apply-templates select="." mode="head" />
<xsl:apply-templates select="." mode="body" />
</html>
</xsl:template>
</xsl:stylesheet>
The master template is a single template that acts as the central station for managing all the page layouts of the site. All the elements that are consistent across the site design should be accounted for with this template, at least as a reference to other templates.
The xsl:stylesheet element declares the XML namespace for the Facebook Open Graph Protocol. For the XSLT processor to run without errors, the XML namespaces used in the document must be properly declared.
xmlns:fb="http://ogp.me/ns/fb#"
The xsl:import instructions include href attributes with values that point to the templates on which the master template depends. The html element of any HTML document has two child elements, head and body. So, we create a separate template to be responsible for each element and its descendant elements and import these templates into the master template.
head.xsl
body.xsl
The xsl:output element sets the properties to be used to render the output of the XSL transformation.
- Format: XML
- Doctype: XHTML 1.0 Strict
- Document Type Declaration Schema URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
- Omit the XML declaration: yes
- Character encoding: UTF-8
- Code Indenting: yes
In this case, the template collects data from the XML file in the form of page parameters:
$root
$website-name
$page-title
$current-page
$parent-page
$root-page
$navigation
$has-section-nav
There is also a set of parameters to render the paths to directories storing the site design assets:
$css
$scripts
$images
These paths will be output as relative URLs, with the $root rendered as a relative path.
Finally, we come to the match template, which matches the root of the XML document. This is where the transformation begins. We would usually start with the root element of the HTML document, the html element. However, because we want to serve IE conditional comments, the template starts with xsl:comment instructions. Each one wraps its content in a CDATA section, so that the markup inside the conditional comment is treated as literal text rather than parsed as elements. Ordinary XML comments in a stylesheet are ignored by the XSLT processor and never reach the output, so comments must be generated with the xsl:comment instruction.
The opening tag of the html element sits between the two comments of the last conditional, the one served to versions of IE later than 9 and to all other browsers. Contained within the html element are two xsl:apply-templates instructions that have a dot (.) as the value of the select attribute and different values for the mode attribute. The dot is an XPath expression referring to the current node. This means that the XSLT processor continues from this point by finding the templates with the given mode value that match the current node, that is, the root node (/).
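The way modes let several templates match the same node can be tried in isolation. The following is a minimal sketch, not the project's actual stylesheet: the file names and the page, head-part and body-part elements are made up for illustration, and the script only runs the transformation when xsltproc is installed.

```shell
#!/bin/bash
# Sketch: three templates match the same root node and are told apart
# by their mode. All file and element names here are illustrative.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > data.xml <<'EOF'
<?xml version="1.0" encoding="utf-8" ?>
<data/>
EOF

cat > modes.xsl <<'EOF'
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" />
  <!-- No mode: the entry point of the transformation -->
  <xsl:template match="/">
    <page>
      <xsl:apply-templates select="." mode="head" />
      <xsl:apply-templates select="." mode="body" />
    </page>
  </xsl:template>
  <!-- Same node, selected only when the matching mode is requested -->
  <xsl:template match="/" mode="head"><head-part /></xsl:template>
  <xsl:template match="/" mode="body"><body-part /></xsl:template>
</xsl:stylesheet>
EOF

if command -v xsltproc >/dev/null 2>&1; then
  xsltproc -o modes.xml modes.xsl data.xml
  cat modes.xml
else
  echo "xsltproc not found; skipping the transformation" >&2
fi
```

The result contains one head-part and one body-part element inside page, in the order of the two xsl:apply-templates instructions, even though the processor never left the root node.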
We’ll describe these two XSLT stylesheets next.
head.xsl
The head element contains the meta, title, link and script elements.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="../utilities/js.xsl" />
<xsl:template match="/" mode="head">
<head>
<meta name="description" content="{$website-name}" />
<meta name="author" content="{$website-name}" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<xsl:apply-templates mode="page-title" />
<xsl:apply-templates mode="css" />
<xsl:apply-templates mode="js" />
</head>
</xsl:template>
<xsl:template match="data" mode="page-title">
<title>
<xsl:value-of select="$page-title" />
<xsl:text> | </xsl:text>
<xsl:value-of select="$website-name"/>
</title>
</xsl:template>
<xsl:template match="data" mode="css">
<link rel="stylesheet" href="{$css}screen.css" />
<xsl:comment><![CDATA[[if IE]> <link href="]]><xsl:value-of select="$css" /><![CDATA[ie.css" media="screen, projection" rel="stylesheet" type="text/css" /> <![endif]]]></xsl:comment>
</xsl:template>
</xsl:stylesheet>
At the top of the file, the xsl:import instruction refers to the js.xsl stylesheet, containing instructions for handling the script elements for JavaScript files.
This XSLT stylesheet contains three templates. The first is the match template with a mode attribute value of head that matches the root node (/). This is the next step for the XSLT processor after processing the template in the master.xsl stylesheet.
Using a mode makes it possible to apply several templates to the same node, so we can add more structure and process additional instructions before traversing further down the XML tree.
The three xsl:apply-templates instructions inside the head match template do not have select attributes, so they process the children of the current node. The only child element of the root node is the data element. The templates that match it output the title, link and script elements, using the page-title, css and js modes, respectively.
The second template is the match template with a mode of page-title. This template matches the data element of the XML. The parameters $page-title and $website-name that were declared in the master.xsl stylesheet are output by the xsl:value-of instructions.
body.xsl
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="../utilities/global-nav.xsl" />
<xsl:import href="../utilities/header.xsl" />
<xsl:import href="../utilities/sub-nav.xsl" />
<xsl:import href="../utilities/footer.xsl" />
<xsl:template match="/" mode="body">
<body class="section-{$root-page}">
<xsl:if test="$current-page = 'home'">
<xsl:attribute name="id">home-page</xsl:attribute>
</xsl:if>
<div class="page" id="{$current-page}-page">
<xsl:call-template name="global-nav" />
<xsl:call-template name="header" />
<xsl:apply-templates />
<xsl:call-template name="sub-nav" />
<xsl:call-template name="footer" />
</div>
</body>
</xsl:template>
</xsl:stylesheet>
The match template for the body element sets the class on the body element and an id attribute on the div element that serves as the container for the rest of the page structure.
The xsl:call-template instructions refer to the named templates contained by the stylesheets imported at the beginning of the body.xsl stylesheet. These stylesheets contain templates to manage the following pieces of the page layout:
- global navigation
- header
- subnavigation
- footer
In the middle of these xsl:call-template instructions is another xsl:apply-templates instruction that has no select or mode attributes. The XSLT processor will continue traversing the XML document, matching elements to templates to process the remaining instructions. Here is where the main content area of the page is built.
index.xsl
On very complex pages, such as the home page, the structure of the page is further broken down into the pieces that build up this page.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="../utilities/master.xsl" />
<xsl:import href="../utilities/notifications.xsl" />
<xsl:import href="../utilities/main.xsl" />
<xsl:import href="../utilities/overview.xsl" />
<xsl:import href="../utilities/supplementary.xsl" />
<xsl:import href="../utilities/secondary.xsl" />
<xsl:param name="notifications" select="true()" />
<xsl:template match="data">
<xsl:if test="$notifications">
<xsl:call-template name="notifications" />
</xsl:if>
<xsl:call-template name="main" />
<xsl:call-template name="overview" />
<xsl:call-template name="supplementary" />
<xsl:call-template name="secondary" />
</xsl:template>
</xsl:stylesheet>
Each HTML page is built by processing many different templates that can be modified in one place. By building each page on these same templates, changes can be reflected across the entire set of HTML pages by running a build script that maintains a record of all the xsltproc commands to process each HTML file.
The file structure currently looks something like this, although I have modified the page structure to be more generic, simpler and not quite as deeply nested:
site/
├── _assets/
│ ├── images/
│ ├── scripts/
│ ├── stylesheets/
│ │ ├── css/
│ │ ├── sass/
├── about/
│ ├── index.html
│ ├── index.xml
├── build
├── contact/
│ ├── index.html
│ ├── index.xml
├── index.html
├── index.xml
├── products/
│ ├── index.html
│ ├── index.xml
├── services/
│ ├── index.html
│ ├── index.xml
├── workspace/
│ ├── data/
│ │ ├── navigation.xml
│ ├── pages/
│ │ ├── index.xsl
│ │ ├── about.xsl
│ │ ├── contact.xsl
│ │ ├── products.xsl
│ │ ├── services.xsl
│ ├── utilities/
│ │ ├── body.xsl
│ │ ├── footer.xsl
│ │ ├── global-nav.xsl
│ │ ├── head.xsl
│ │ ├── header.xsl
│ │ ├── js.xsl
│ │ ├── main.xsl
│ │ ├── master.xsl
│ │ ├── notifications.xsl
│ │ ├── overview.xsl
│ │ ├── page-title.xsl
│ │ ├── secondary.xsl
│ │ ├── section-navigation.xsl
│ │ ├── sub-nav.xsl
│ │ ├── supplementary.xsl
Modifying the Build File
As more pages are added to the site, the instructions for processing each XML file to output the HTML should be added to the build script. With the site structure described above, the build file would look something more like this:
#!/bin/bash
# These commands will process HTML files with XSLT when you run "./build" from the root of your site directory.
# Modify this list as you add more pages to the site.
xsltproc -v -o index.html workspace/pages/index.xsl index.xml;
xsltproc -v -o about/index.html workspace/pages/about.xsl about/index.xml;
xsltproc -v -o contact/index.html workspace/pages/contact.xsl contact/index.xml;
xsltproc -v -o products/index.html workspace/pages/products.xsl products/index.xml;
xsltproc -v -o services/index.html workspace/pages/services.xsl services/index.xml;
Again, it requires just a simple command to process all HTML files:
./build
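Because every page follows the same naming pattern, the list of commands could also be derived from the directory names rather than typed by hand. This is a sketch under that assumption, with one level of page directories and a stylesheet named after each directory. The stylesheet_for and command_for helpers are hypothetical names, and the script only prints the commands so they can be checked before anything is run.

```shell
#!/bin/bash
# Sketch: derive each xsltproc call from the page directory name instead
# of listing every command by hand. Assumes one level of page directories.

# Map a data file to its page stylesheet (hypothetical helper).
stylesheet_for() {
  local dir
  dir=$(dirname "$1")
  if [ "$dir" = "." ]; then
    echo "workspace/pages/index.xsl"
  else
    echo "workspace/pages/$dir.xsl"
  fi
}

# Print the command for one data file without running it.
command_for() {
  local xml=$1
  echo "xsltproc -v -o ${xml%.xml}.html $(stylesheet_for "$xml") $xml"
}

for xml in index.xml */index.xml; do
  [ -f "$xml" ] || continue      # skip patterns that matched nothing
  command_for "$xml"
done
```

Run from the site root, this lists one command per index.xml file. A page that does not follow the naming pattern would still need its own line in the build file.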
Automating the Process
Theoretically, it wouldn’t take much to automate the process. Similar to the common setup for compiling Sass files, watched folders could be set up for the directories containing the XSL files, so that all the HTML files are processed whenever an XSL file is modified. But automatic runs would probably increase the number of XSLT processing errors, so it probably makes more sense for the build script to be run on demand rather than automatically.
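For anyone who wants to experiment with the automated approach anyway, here is a rough sketch of a polling watcher. It is not something I use in the project: the .last-build stamp file, the two-second interval and the --run switch are all assumptions made for the example.

```shell
#!/bin/bash
# Sketch: rebuild only when a stylesheet is newer than a stamp file.
# The stamp file name and the polling interval are assumptions.

# Succeeds when any .xsl file under directory $1 is newer than the
# stamp file $2, or when the stamp file does not exist yet.
stylesheets_changed() {
  local dir=$1 stamp=$2
  [ -e "$stamp" ] || return 0
  [ -n "$(find "$dir" -name '*.xsl' -newer "$stamp" -print)" ]
}

watch_and_build() {
  while true; do
    if stylesheets_changed workspace .last-build; then
      # Update the stamp first so a failing build is not retried
      # until a stylesheet changes again.
      touch .last-build
      ./build || echo "build failed; waiting for the next change" >&2
    fi
    sleep 2
  done
}

# Start watching only when explicitly requested.
if [ "${1:-}" = "--run" ]; then
  watch_and_build
fi
```

Saved as a script in the site root and started with the --run argument, it calls ./build whenever a file ending in .xsl under workspace has a newer modification time than the stamp.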
I will typically run the XSLT processor one file at a time, so I can trace errors and ensure the output is working as expected before processing all files.
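The same habit can be built into the script itself. The sketch below stops at the first page that fails and names it, so the error output belongs to a known file. The build_page and build_all functions, the XSLTPROC override and the all and page arguments are assumptions for illustration, and only three of the pages are listed.

```shell
#!/bin/bash
# Sketch: process pages one at a time and stop at the first failure.
# XSLTPROC can be overridden, for example to try the script without
# running any transformation.
XSLTPROC=${XSLTPROC:-xsltproc}

# build_page <output.html> <stylesheet.xsl> <data.xml>
build_page() {
  if ! "$XSLTPROC" -v -o "$1" "$2" "$3"; then
    echo "build stopped: $3 failed with $2" >&2
    return 1
  fi
}

# Each page is only attempted when the previous one succeeded.
build_all() {
  build_page index.html workspace/pages/index.xsl index.xml &&
  build_page about/index.html workspace/pages/about.xsl about/index.xml &&
  build_page contact/index.html workspace/pages/contact.xsl contact/index.xml
}

# "./build all" processes every page, "./build page <out> <xsl> <xml>"
# processes a single one.
case "${1:-}" in
  all)  build_all ;;
  page) shift; build_page "$@" ;;
esac
```

Once a single page has been checked with the page form, running the script with all processes the rest and reports the first file that needs attention.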
Modular Site Development
I haven’t spent time on the exact code used to build the content of the site. Instead, this article has focused on how to build the basic framework of a flexible and modular templating system for a static site generator.
There are specific issues involving setting the XML data for each page and how to write the templates to set the current class on navigation items. This may best be demonstrated through working examples. However, I first wanted to walk through the overall structure before delving deeper into the details of specific templates.
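As a first taste of the current class technique, here is a minimal sketch that can be run on its own. It is not the template used in the project. The structure of navigation.xml, with page elements that carry handle and href attributes, is an assumption, because the real file is not shown in this article, and nav.xsl is a made-up file name.

```shell
#!/bin/bash
# Sketch: add class="current" to the navigation item whose handle equals
# the current page. The navigation.xml structure is an assumption.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > navigation.xml <<'EOF'
<?xml version="1.0" encoding="utf-8" ?>
<navigation>
  <page handle="home" href="">Home</page>
  <page handle="about" href="about/">About</page>
</navigation>
EOF

cat > index.xml <<'EOF'
<?xml version="1.0" encoding="utf-8" ?>
<data>
  <params>
    <current-page>home</current-page>
    <root>./</root>
  </params>
</data>
EOF

cat > nav.xsl <<'EOF'
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" />
  <xsl:param name="root" select="/data/params/root" />
  <xsl:param name="current-page" select="/data/params/current-page" />
  <xsl:param name="navigation" select="document('navigation.xml')" />
  <xsl:template match="/">
    <ul>
      <xsl:apply-templates select="$navigation/navigation/page" />
    </ul>
  </xsl:template>
  <xsl:template match="page">
    <li>
      <!-- The attribute is added only for the page being generated -->
      <xsl:if test="@handle = $current-page">
        <xsl:attribute name="class">current</xsl:attribute>
      </xsl:if>
      <a href="{$root}{@href}"><xsl:value-of select="." /></a>
    </li>
  </xsl:template>
</xsl:stylesheet>
EOF

if command -v xsltproc >/dev/null 2>&1; then
  xsltproc -o nav.html nav.xsl index.xml
  cat nav.html
else
  echo "xsltproc not found; skipping the transformation" >&2
fi
```

Changing the current-page value in index.xml to about moves the class to the second item, and changing root to ../ rewrites both links, without touching the stylesheet.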