In the world of content strategy, spreadsheets are a critical tool for planning and communication. In particular, content types are often defined and refined in spreadsheets before they're committed to code or CMS configuration.
The challenge comes once everyone "agrees" and the content types are implemented. If the model is updated in any way, it's easy for them to fall out of sync. The CMS is tweaked, but the spreadsheet is never updated to match, or decisions are made by the content team and entered in the spreadsheet but they never make it to the CMS configuration. In addition, if developers misunderstand the spreadsheet or make mistakes when implementing the content types, the mismatch can be easily overlooked.
While working on the recently launched redesign of MSNBC.com, we found ourselves in just that situation. The back-and-forth fixes between the content modeling people and the developers turned into an ongoing time-sink and everyone was frustrated.
The Solution
I hate doing work that computers can do better. Faced with this spreadsheet/CMS synchronisation challenge, I built a tool that handles it automatically: the CheckSheet module for Drupal. It takes a spreadsheet describing a site's content types and fields, compares it to a Drupal site's content type settings, and flags any discrepancies for review. It provides an admin screen on the Drupal site for site builders and a Drush command for those who prefer the command line.
In the screenshot above, the Article and Page content types are both out of sync with the spreadsheet. For example, the Page's body
field should be required, and it should have a publish_date
field -- the CheckSheet module spotted that mismatch and alerted us.
How it works
During the MSNBC development process, we used Google Docs to store the "master" spreadsheet: it was treated as the canonical source for information on our content types. If the CMS didn't match the spreadsheet, we assumed the CMS was wrong. We exported this spreadsheet to .ods format, and checked it into the project's source control tree to preserve a historical record of the type definitions.
Whenever someone wanted to verify that the site was "in sync," they ran the Drush command or checked the admin page to spot mismatches. Because it's available as a drush command, it's relatively easy to make it part of an automated testing and continuous integration process. It's still up to the developers to fix the mismatches, but the process of spotting them is unambiguous and easy to document.
Use it, improve it, and share your techniques
The CheckSheet module is a quick-and-dirty tool to simplify our work, not a polished product: it assumes a very specific format for the spreadsheet. It can verify the name, help text, data type, required flag, and "single/multiple value" setting for any given field. Additional columns can be added to the spreadsheet to store more information for documentation purposes, but the module will ignore them.
It also assumes that the spreadsheet is in .ods format, and located in the actual module directory: if you're using Excel, Pages, or Google Docs you'll need to export to .ods before the module can parse it. In the future, we'd like to integrate it with Google Docs directly… for our work on MSNBC.com, though, it served its purpose well.
The CheckSheet module is currently living in Lullabot's GitHub repository, and includes an example .ods format spreadsheet that demonstrates how a few content types can be defined. Give it a spin, add features, and post ways that your team has helped keep strategists, architects, and developers working together smoothly.