Blogging with Gatsby & MDX


So, you want to start a blog in 2019?

With free hosting options like Firebase Hosting (what we use), Netlify, and GitHub Pages, plus static site generators like Gatsby and Next.js for React, hosting and building your own powerful blog is a lot easier these days. You just need a few command-line skills.

Also, these steps happen to be how we set up our new blog here at React Training. In this article I'll assume that you are already somewhat familiar with:

  • React
  • Markdown
  • Very basic JavaScript
  • Command-line stuff like npm

Just show me the code!

This is what we're building: https://github.com/bradwestfall/gatsby-mdx-blog

What is MDX?

Many static site generators use Markdown for content. MDX is a superset of Markdown that lets you import and render React components!

Check it out:

import CustomComponent from './CustomComponent'

# Here is our markdown header

Here is a markdown paragraph.

<CustomComponent />

More markdown content.

The magic of MDX is that it knows how to treat the above file as a Markdown file but also as a JavaScript module in the sense that it can import and use React's JSX.

Using MDX with Gatsby

Even though MDX isn't Gatsby-specific, this article is about setting up MDX with Gatsby specifically and some of the hurdles you might face. After all, MDX is still fairly new.

To start, let's install Gatsby. These steps will be somewhat terse, so if you're looking for more detail, be sure to review the Official Setup Guide.

# Globally install Gatsby's command-line interface (CLI)
npm i -g gatsby-cli

# Make sure it installed. We're using v2 for this article
gatsby --version

The next step is to generate the Gatsby site based on a "starter" -- which is just their term for a GitHub repo that gives you a nice starting point for a Gatsby project. There are many starters you can choose from, but we'll be using the most basic one:

gatsby new my-blog https://github.com/gatsbyjs/gatsby-starter-default

Now you can cd my-blog and run the blog using npm start. If you examine package.json you'll see that this is just an alias to gatsby develop.

If all went well, you should be able to go to localhost:8000 to see your Gatsby site.

MDX "Hello World"

Next, install these three packages for using MDX with Gatsby:

npm i gatsby-mdx @mdx-js/mdx @mdx-js/tag

Let's create a basic "hello world" for MDX to make sure it's working.

First open up the gatsby-config.js file at the root of the project. This is a configuration file where we can add plugins to Gatsby. Installing a plugin might remind you of Babel plugins, where often we just need to do the npm install and then add the name of the plugin to the config array:

module.exports = {
  siteMetadata: {},
  plugins: ['gatsby-mdx']
}

Next, go to the pages folder and create a new folder called /blog with a file called first-blog-post.mdx. Be sure to put some valid Markdown in the file for testing:

/pages
 └─ /blog
     └─ first-blog-post.mdx

You should be able to go to localhost:8000/blog/first-blog-post to see the rendered result of the Markdown in that file.

If you're having issues, you may have forgotten to do npm start again after installing those packages in the steps above.

Nested Layouts and Routes

If you're new to Gatsby but you have some experience with React Router, you might be wondering how nested layouts work. After all, based on what we just learned, we can see that URL routes are generated automatically by filename convention and the folder structure of anything inside of pages. This is far different from conventional React where something like React Router is manually configured within layouts to eventually work its way down to a page (in a top-to-bottom approach).

Gatsby takes more of a bottom-up approach by focusing on the pages first and then wrapping them in any number of layouts.

For example, look at the pages/index.js file (which is the home page). You'll see that it's wrapped in a <Layout>. In theory, you could have primary layouts and sub-layouts by using similar wrapping techniques.

Keep in mind this is just the default starter for Gatsby and is not meant to be the only way you should do things. Feel free to change these existing pages and layouts to your liking.

Where do we write the HTML file?

We don't. Gatsby handles all of that for us and lets us focus on pages. This might leave you wondering "How can I change the contents of <head> like meta tags since I don't have direct access to an index.html file?"

react-helmet is a great tool for solving this. It lets you add a <Helmet> element anywhere in your React code and it will put those values in the document's <head>. In theory you can add stuff like this to your layout:

<Helmet>
  <title>My Custom Title</title>
  <meta name="description" content="My Blog's Description" />
</Helmet>

The Gatsby Starter we used does a few more steps though. They abstract <Helmet> away into an <SEO> element and then they populate it with props and content that exists in gatsby-config.js. You can probably follow the trail from pages/index.js to see how they use <SEO>. As with layouts, feel free to customize or simplify this strategy however you see fit.

MDX Layouts

Layouts for MDX pages are a bit different. We let Gatsby know what layout to use for MDX files in gatsby-config.js. This change to our configuration indicates we want to find .mdx or .md files and to render their contents into blog-post-layout.js:

plugins: [
  {
    resolve: `gatsby-mdx`,
    options: {
      // Apply gatsby-mdx to both .mdx and .md files
      extensions: ['.mdx', '.md'],
      defaultLayout: require.resolve('./src/components/blog-post-layout.js')
    }
  },
  ...
]

Important! Previously the plugin for gatsby-mdx was just a string in the array. That's fine unless you want to add options like we do now. When options are required of a plugin, we use an object instead of a string. The plugin name goes in the resolve property like above.
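
To make the two shapes concrete, here's a small sketch that treats a string entry and an object entry as the same thing. Note that normalizePlugin is a hypothetical helper I made up for illustration, not a Gatsby API. It only assumes that a bare string is shorthand for an object with a resolve name and no options:

```javascript
// Hypothetical helper (not part of Gatsby): shows that a string entry
// is shorthand for an object entry with no options.
function normalizePlugin(entry) {
  if (typeof entry === 'string') {
    return { resolve: entry, options: {} }
  }
  return { resolve: entry.resolve, options: entry.options || {} }
}

// A plugins array that mixes both styles
const plugins = [
  'gatsby-plugin-react-helmet',
  {
    resolve: 'gatsby-mdx',
    options: { extensions: ['.mdx', '.md'] }
  }
]

const normalized = plugins.map(normalizePlugin)
console.log(normalized.map(plugin => plugin.resolve))
```

Either way, the plugin name ends up in resolve. The object form just gives you somewhere to put options.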

The above configuration states that we want the contents of any Markdown or MDX file to be put into our <BlogPostLayout> element (via props.children), seen here:

// src/components/blog-post-layout.js
import React from 'react'
import Layout from './layout'

function BlogPostLayout({ children }) {
  return (
    <Layout>
      <article>
        <header>
          <h1>Todo: Main Title Will Go Here</h1>
        </header>
        {children}
      </article>
    </Layout>
  )
}

export default BlogPostLayout

We could call this a "sub-layout" since it also wraps itself in the main layout.

Frontmatter

"Frontmatter" is a strange word unless you've used a Markdown-based static site generator before. Frontmatter is metadata in YAML format that goes at the beginning of the Markdown file (or MDX in our case). Apparently the word comes from traditional paperback books, and I first heard the term when using Jekyll years ago -- one of the first really popular static site generators, from the folks at GitHub.

Here's what it looks like:

---
title: First Blog Post
author: Brad Westfall
date: 2019-02-05

---

The above area fenced in by three dashes is frontmatter. Below it (this paragraph) is Markdown.

The frontmatter goes at the top of the MDX file but doesn't get rendered there. Remember it's just meta information to be used elsewhere.
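
If it helps to see the idea in code, here's a deliberately naive sketch of separating frontmatter from the body. This is not how gatsby-mdx does it (real tooling uses a proper YAML parser). The splitFrontmatter function is a hypothetical name, and it only understands flat key: value lines:

```javascript
// Naive illustration only: handles flat `key: value` pairs, not full YAML.
function splitFrontmatter(source) {
  // Capture everything between the opening and closing `---` lines,
  // then everything after them.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source)
  if (!match) return { frontmatter: {}, body: source }

  const frontmatter = {}
  match[1].split(/\r?\n/).forEach(line => {
    const separator = line.indexOf(':')
    if (separator === -1) return // skip blank or malformed lines
    const key = line.slice(0, separator).trim()
    frontmatter[key] = line.slice(separator + 1).trim()
  })
  return { frontmatter, body: match[2].trim() }
}

const example = [
  '---',
  'title: First Blog Post',
  'author: Brad Westfall',
  'date: 2019-02-05',
  '---',
  '',
  '# Hello',
  ''
].join('\n')

const { frontmatter, body } = splitFrontmatter(example)
console.log(frontmatter.title, '|', body)
```

The point is just that the metadata and the Markdown body end up as two separate things, which is why the frontmatter never shows up in the rendered post on its own.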

The most common use case for frontmatter is to give data to the post's layout. The gatsby-mdx plugin we set up earlier is what's responsible for providing the MDX file to our layout. It also sends along frontmatter as props. Let's now adjust the layout to receive them:

// src/components/blog-post-layout.js
import React from 'react'
import Helmet from 'react-helmet'
import Layout from './layout'

function BlogPostLayout({ children, pageContext }) {
  const { title, author, date } = pageContext.frontmatter
  return (
    <Layout>
      <Helmet>
        <title>{title}</title>
      </Helmet>
      <article>
        <header>
          <h1>{title}</h1>
          <span>Author: {author}</span>
          <time>Date: {date}</time>
        </header>
        {children}
      </article>
    </Layout>
  )
}

export default BlogPostLayout

Now we can make up any frontmatter we want to use in the layout. At the React Training blog, we use frontmatter to pass an image URL to the layout for the "banner" section at the top. We also pass meta-tag info for Open Graph social sharing -- just to give you some ideas.
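
As a rough idea of what that could look like, here's a sketch that turns frontmatter into Open Graph meta descriptors. The description and banner fields, the siteUrl argument, and the buildOpenGraphMeta name are all assumptions made up for this example. They aren't required by gatsby-mdx and aren't necessarily what we use on our blog:

```javascript
// Hypothetical frontmatter fields: `description` and `banner` are names
// invented for this sketch, not something gatsby-mdx requires.
function buildOpenGraphMeta(frontmatter, siteUrl) {
  const { title, description, banner } = frontmatter
  const meta = [{ property: 'og:type', content: 'article' }]
  if (title) meta.push({ property: 'og:title', content: title })
  if (description) {
    meta.push({ property: 'og:description', content: description })
  }
  // Social crawlers generally want an absolute image URL
  if (banner) {
    meta.push({ property: 'og:image', content: `${siteUrl}${banner}` })
  }
  return meta
}

const meta = buildOpenGraphMeta(
  { title: 'First Blog Post', banner: '/images/first-post.png' },
  'https://example.com'
)
console.log(meta)
```

An array like this can be handed to react-helmet through its meta prop inside the layout, for example <Helmet meta={meta} />, instead of writing each <meta> tag by hand.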

Browsing Blog Posts

This is where things start to get really interesting and it's also where Gatsby's use of GraphQL starts to help us. Even though Gatsby has server-side rendering (SSR), this is not to say it has a production server that is going to dynamically render pages for us on each request. Instead, Gatsby generates a bunch of static pages when we run gatsby build. During the build, Gatsby gathers meta information about our site and makes it available to our pages via GraphQL. This allows us to create a page for browsing all blog posts that feels as if it were a server-side dynamic page, but actually it's just another static page.

Here's what we need to do to make our "Browse Blog Posts" page:

Step 1: Source the blog posts

In gatsby-config.js, add this new section into the plugins array:

plugins: [
  {
    resolve: `gatsby-source-filesystem`,
    options: {
      name: `blog`,
      path: `${__dirname}/src/pages/blog`,
    },
  },
  ...
]

You'll notice that there's already an entry for gatsby-source-filesystem in the plugins which is sourcing images. It's totally fine (and intended) to use this plugin multiple times for sourcing different information from the filesystem for different purposes, so don't change the existing entry. Add the above entry for sourcing files in the pages/blog folder.

This new entry tells Gatsby to read our files to make some of their metadata available to GraphQL in the next steps.

Step 2: GraphQL and our first "dynamic" page

Add a new page at src/pages/blog/index.js that uses GraphQL to query information from the previous step so we can make our listing of posts:

// src/pages/blog/index.js
import React from 'react'
import { graphql, Link } from 'gatsby'
import Layout from '../../components/layout'

function BlogIndex({ data }) {
  const { edges: posts } = data.allMdx
  return (
    <Layout>
      {posts.map(({ node }) => {
        const { title, author } = node.frontmatter
        return (
          <div key={node.id}>
            <header>
              <div>{title}</div>
              <div>Posting By {author}</div>
            </header>
            <p>{node.excerpt}</p>
            <Link to={node.fields.slug}>View Article</Link>
            <hr />
          </div>
        )
      })}
    </Layout>
  )
}

export default BlogIndex

export const pageQuery = graphql`
  query blogIndex {
    allMdx {
      edges {
        node {
          id
          excerpt
          fields {
            slug
          }
          frontmatter {
            title
            author
          }
        }
      }
    }
  }
`

Note: This file will not work until Step 3 below.

You may have noticed there are two exports: a default and a named (pageQuery) export. The named export is how we tell Gatsby what data we need to render this page. In this query we want to single out MDX files. If you're brand new to GraphQL, here is a good resource to learn how to use GraphQL with Gatsby.

By convention the two exports are meant to work together. The query will be parsed first to gather information and then the results are fed into the React component (the default export) as props.

Note that I've only included two frontmatter fields in my query (title and author), so those are the only ones I'll receive in my props.
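
Here's a sketch of the shape the component's data prop should have for this query once the slug field from Step 3 exists. The values in the sample object are hand-written placeholders, not real output, and toPostSummaries is a hypothetical helper that does the same destructuring as the component, minus the JSX:

```javascript
// Hand-written sample of what the query result might look like.
// The id, excerpt, and slug values are placeholders.
const data = {
  allMdx: {
    edges: [
      {
        node: {
          id: 'abc-123',
          excerpt: 'Here is a markdown paragraph.',
          fields: { slug: '/blog/first-blog-post/' },
          frontmatter: { title: 'First Blog Post', author: 'Brad Westfall' }
        }
      }
    ]
  }
}

// Same destructuring the BlogIndex component uses, without the JSX
function toPostSummaries(data) {
  const { edges: posts } = data.allMdx
  return posts.map(({ node }) => ({
    key: node.id,
    title: node.frontmatter.title,
    author: node.frontmatter.author,
    link: node.fields.slug
  }))
}

console.log(toPostSummaries(data))
// `date` was not part of the query, so it is simply absent here
console.log(data.allMdx.edges[0].node.frontmatter.date)
```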

For the purposes of Gatsby and GraphQL, node basically means "page". In node, there are more things we can query for besides id, excerpt, and frontmatter. You can explore all of them with the built-in GraphiQL Dashboard. However, the one thing we queried for which breaks the code so far is the fields { slug }. This is because fields is not given to us by default. If you were to run the code now, you would get this error: Cannot query field "fields" on type "Mdx". Let's fix this.

Step 3: Creating a "Slug"

Since a slug is not provided to us by default when we use gatsby-mdx with gatsby-source-filesystem, we'll have to make one for ourselves.

This gives us a chance to take our first look at gatsby-node.js which is an empty file at this point. This file allows us to get access to Gatsby's API functions at build-time and to make the server do things for us that it doesn't do by default. We can create dynamic pages (which we'll cover later) or add more stuff that can be queried with GraphQL.

Add this code to gatsby-node.js:

const { createFilePath } = require('gatsby-source-filesystem')

// Here we're adding extra stuff to the "node" (like the slug)
// so we can query later for all blogs and get their slug
exports.onCreateNode = ({ node, actions, getNode }) => {
  const { createNodeField } = actions
  if (node.internal.type === 'Mdx') {
    const value = createFilePath({ node, getNode })
    createNodeField({
      // Individual MDX node
      node,
      // Name of the field you are adding
      name: 'slug',
      // Generated value based on filepath with "blog" prefix
      value: `/blog${value}`
    })
  }
}

It might not make total sense at first, but if you read about the gatsby-node.js file you'll quickly learn about APIs like createNodeField, which allows us to add arbitrary fields to our nodes (pages) so GraphQL can query for them later. Then we can use the createFilePath function from gatsby-source-filesystem to make a slug based on the MDX filename.

If you restart Gatsby you should be able to go to localhost:8000/blog to see your new page that browses blog posts. The error goes away because GraphQL can now query for fields of MDX nodes.
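
To get a feel for what ends up in the slug field, here's a simplified stand-in for the path-to-slug step. The filePathToSlug function is a hypothetical helper written for this article. It assumes the file path is relative to the sourced blog folder, and the real createFilePath may differ in the details:

```javascript
const path = require('path')

// Hypothetical stand-in for the createFilePath + prefix step above.
// `relativePath` is assumed to be relative to src/pages/blog.
function filePathToSlug(relativePath, prefix = '/blog') {
  const { dir, name } = path.posix.parse(relativePath)
  // An index file maps to its folder rather than to ".../index/"
  const segments = name === 'index' ? [dir] : [dir, name]
  // Leading and trailing slashes, with empty segments ignored by join()
  const filePath = path.posix.join('/', ...segments, '/')
  return `${prefix}${filePath}`
}

console.log(filePathToSlug('first-blog-post.mdx'))
console.log(filePathToSlug('2019/hello.mdx'))
```

So first-blog-post.mdx gets a slug of /blog/first-blog-post/, which is what the <Link> in the listing points at.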

Pagination

The next step is to create pagination for browsing posts. There are many ways we can do this -- for example we could take our existing BlogIndex we just made and use some URL variable to determine how to iterate over the results. This might feel more like traditional pagination with ?page=1 in the URL, but this technique isn't great for SSR and our static pages because there isn't a "real" page to visit for page 2, 3, etc.

What we want is a bona fide static page for each of the paginated pages as if we had handmade pages/blog/1.js and pages/blog/2.js etc. We can't really get this effect with one file like pages/blog/index.js because anything we put in pages is a 1:1 mapping to a URI. Instead let's remove pages/blog/index.js and tell Gatsby to build pages even though they don't exist in pages. We can do this programmatically with gatsby-node.js and GraphQL.

Even though we're removing the first strategy, I'm glad we covered how to create a basic page that reads from GraphQL, but the new strategy will be better for pagination. At least most of its contents are re-usable in the new strategy.

Open gatsby-node.js and add this to what we already have:

const path = require('path')

// Programmatically create the pages for browsing blog posts
exports.createPages = ({ graphql, actions }) => {
  const { createPage } = actions
  return graphql(`
    query {
      allMdx(sort: { order: DESC, fields: [frontmatter___date] }) {
        edges {
          node {
            id
            excerpt(pruneLength: 250)
            fields {
              slug
            }
            frontmatter {
              title
              author
            }
          }
        }
      }
    }
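  `).then(results => {
    // A .then() callback only receives one argument, so GraphQL
    // errors are read from the result object itself
    if (results.errors) return Promise.reject(results.errors)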
    const posts = results.data.allMdx.edges

    // This little algo takes the array of posts and groups
    // them based on this `size`:
    let size = 10
    let start = 0
    // Premake the grouped array to the correct length. new Array
    // wasn't working with map so don't @ me :)
    let groupedPosts = Array.from(Array(Math.ceil(posts.length / size)))
    groupedPosts = groupedPosts.map(() => {
      const group = posts.slice(start, start + size)
      start += size
      return group
    })

    // Here's the basic idea of what the grouping is doing if the
    // size variable was 2:
    // posts: [post1, post2, post3]
    // groupedPosts: [[post1, post2], [post3]]

    groupedPosts.forEach((group, index) => {
      const page = index + 1
      createPage({
        path: `/blog/${page}`,
        component: path.resolve('./src/components/browse-blog-posts.js'),
        context: { groupedPosts, group, page }
      })
    })
  })
}

That's quite a bit to read and understand. Here's the gist of what's happening:

  1. Gatsby calls createPages() automatically for us (if it exists).
  2. Then we use GraphQL to get all of our MDX posts.
  3. Next, get an array of posts and group them by page.
  4. Then use the createPage API to build a page at /blog/1, /blog/2, etc. Note that context just lets us pass down props into the browse-blog-posts.js component which we'll make next.
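
If the grouping step is the confusing part, here it is pulled out on its own so you can run it without Gatsby. This is a sketch, not a drop-in replacement: groupPosts and describePages are hypothetical names, and this version uses the mapping callback of Array.from instead of a running start counter, but it should produce the same groups:

```javascript
// Split a flat array of posts into pages of `size` posts each
function groupPosts(posts, size) {
  const pageCount = Math.ceil(posts.length / size)
  return Array.from({ length: pageCount }, (_, index) =>
    posts.slice(index * size, (index + 1) * size)
  )
}

// Describe the pages that createPage would be called for
function describePages(posts, size) {
  return groupPosts(posts, size).map((group, index) => ({
    path: `/blog/${index + 1}`,
    page: index + 1,
    group
  }))
}

console.log(groupPosts(['post1', 'post2', 'post3'], 2))
console.log(describePages(['post1', 'post2', 'post3'], 2).map(p => p.path))
```

With three posts and a size of 2, you get two groups and therefore two pages, /blog/1 and /blog/2.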

Create src/components/browse-blog-posts.js

The file is very reminiscent of the previous index file but now it doesn't need its own GraphQL query because it receives the posts data through props passed in from createPage in gatsby-node.js. The new file has the same basic purpose, to make an index page of posts, only now we'll be creating several of them, one for each page of the pagination.

// src/components/browse-blog-posts.js
import React from 'react'
import { Link } from 'gatsby'
import Layout from './layout'

function BrowseBlogPosts({ pageContext }) {
  const { groupedPosts, group, page } = pageContext
  return (
    <Layout>
      {group.map(({ node }) => {
        const { title, author } = node.frontmatter
        return (
          <div key={node.id}>
            <header>
              <div>{title}</div>
              <div>Posting By {author}</div>
            </header>
            <p>{node.excerpt}</p>
            <Link to={node.fields.slug}>View Article</Link>
            <hr />
          </div>
        )
      })}
      <footer>
        Pages:{' '}
        {groupedPosts.map((x, index) => {
          const currentPage = index + 1
          return (
            <Link
              key={index}
              to={`/blog/${currentPage}`}
              className={currentPage === page ? 'active' : null}
            >
              {index + 1}
            </Link>
          )
        })}
      </footer>
    </Layout>
  )
}

export default BrowseBlogPosts

Remember to remove src/pages/blog/index.js and to do a full Gatsby restart anytime you change gatsby-node.js.

That's it. You should be able to browse to /blog/1 or /blog/2 (if you have enough posts to fill up more than one page). Be sure to redirect from /blog to /blog/1. There are lots of different ways and plugins to do this in Gatsby, so we'll let you choose one that suits you.
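
As one possible option for that redirect, Gatsby's actions include createRedirect. Here's a minimal sketch, assuming you call it from the createPages function we already have in gatsby-node.js. The addBlogRedirect name is hypothetical, and whether your host also needs its own redirect configuration is something to verify for your setup:

```javascript
// Sketch: meant to be called from the existing createPages in gatsby-node.js
function addBlogRedirect(actions) {
  const { createRedirect } = actions
  createRedirect({
    fromPath: '/blog',
    toPath: '/blog/1',
    isPermanent: false,
    // Ask Gatsby to also handle the redirect on the client side
    redirectInBrowser: true
  })
}
```

Inside exports.createPages you would then add addBlogRedirect(actions) next to the createPage calls.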

Closing Remarks

You made it to the end! This tutorial is a bit long, but all these details were new to me until I had to dive head-first into Gatsby and MDX, so hopefully it helps to clarify some things.

Speaking of "remark"

MDX uses Remark as the underlying engine for processing Markdown. As such, we can use a variety of Remark plugins like this:

plugins: [
  {
    resolve: `gatsby-mdx`,
    options: {
      extensions: ['.mdx', '.md'],
      defaultLayout: require.resolve('./src/components/blog-post-layout.js'),
      mdPlugins: [
        require('remark-images'),
        require('remark-emoji'),
        require('remark-slug'),
        require('remark-autolink-headings')
      ],
    },
  },
  ...
]

Check out this giant list of plugins.
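
If you're curious what one of these plugins looks like on the inside, a Remark plugin is basically a function that returns a "transformer", which receives the Markdown syntax tree and can change it. Here's a toy sketch that uppercases heading text. remarkUppercaseHeadings is a made-up plugin for illustration, and the sample tree is hand-written rather than produced by Remark:

```javascript
// Toy Remark-style plugin: a function that returns a transformer.
// The transformer receives the syntax tree and mutates it in place.
function remarkUppercaseHeadings() {
  function uppercaseText(node) {
    if (node.type === 'text') {
      node.value = node.value.toUpperCase()
    }
    const children = node.children || []
    children.forEach(uppercaseText)
  }

  function walk(node) {
    if (node.type === 'heading') {
      uppercaseText(node)
      return
    }
    const children = node.children || []
    children.forEach(walk)
  }

  return function transformer(tree) {
    walk(tree)
  }
}

// Hand-written tree for "# Hello" followed by a paragraph
const tree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Hello' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'More content.' }] }
  ]
}

remarkUppercaseHeadings()(tree)
console.log(tree.children[0].children[0].value)
```

In principle a function like this could sit in the mdPlugins array alongside the require(...) entries above, though I'd treat that as something to try out rather than a guarantee.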

Syntax Highlighting with PrismJS

It was a little tricky to get proper syntax highlighting working. If you poke around, there are several ways suggested to get Gatsby working with Prism, but most are for ordinary Remark-based Markdown. For MDX, there is a plugin called gatsby-remark-prismjs, but at the time of this writing I could only get it to work if I installed [email protected] with the plugin set up like this:

{
  resolve: `gatsby-mdx`,
  options: {
    extensions: ['.mdx', '.md'],
    defaultLayout: require.resolve('./src/components/blog-post-layout.js'),
    gatsbyRemarkPlugins: [
      {
        resolve: `gatsby-remark-prismjs`,
        options: {
          classPrefix: 'language-',
          inlineCodeMarker: null
        }
      }
    ]
  }
}

© React Training 2019