Porting to WordPress Part 2: Requirements

In my earlier post about porting from Manila to WordPress, I covered some basics around how and why I decided on the approach I took, and some of the requirements for the new site.

I’ve made a ton of progress—what you’re reading right now is coming from WordPress 4.0, hosted on my own server—but I’ve been remiss on follow-up posts. Fortunately I took lots of notes during this process, since I knew I wanted to write more about it. Probably too many notes in fact. 😉

I also found myself diving into the rabbit hole: I’ve been debugging the WordPress importer plug-in, while slowly and osmotically learning PHP, and discovering the wonders … um … fun that is XDebug, PhpStorm, and MAMP. (PhpStorm is great so far, but still very unfamiliar.) Why? Two reasons, one of which I touched on before:

  • I get to learn about WordPress internals, PHP, and debugging PHP sites—and learning is always a Good Thing™
  • It turns out that Manila, a product developed over the better part of a decade, is quite complicated (duh), and I get to figure out how to re-simplify my legacy websites

I know Manila better than almost(?) anyone, so even years after developing in that environment full-time, its nooks and crannies are mostly familiar to me. Manila is an old friend, and we have a relationship complicated by the history of our mutual growth. Because Manila and I learned the Web organically over the last decade or so, we share, shall we say, breadth. 😉

It’s a valuable trait that makes us both very flexible, but it also means that we’re sometimes hard to understand. And in doing this project, it’s likely I would need to make some difficult trade-offs, or else suffer endless debugging and long-term maintenance complexity, both with rapidly diminishing returns.

I’ll give you a few examples, and will tease out requirements as I talk through them:

Home Pages vs News Items

In its original incarnation, Manila only understood one “Home Page” per day. You could write as much as you want, add as many links as you want, and format however you want. But the content for a given day had no set structure or order.

√ Requirement: Ability to link to a day-archive in my WordPress site, not just to a post

Relatively early in Manila’s product lifetime, Brent Simmons implemented News Items in Manila, which enabled Manila sites to have the same kind of structure we think of today as a Blog—a series of reverse-chronological posts, usually with a title, and sometimes with a link. As I recall, Manila’s News Items were inspired by Slashdot’s format which was essentially blogs+categories—but the reverse-chronological collection of posts was key.

As the platform grew, News Items eventually had other data associated with them too: They supported per-item comments and trackbacks (like WordPress posts), and they separated the concept of “last update” from “published” though differently than WordPress does.

For the purpose of this project, it’s important to understand that a “news-day” post and “news item post” are different things, and needed to be dealt with accordingly.

√ Requirement: Handle both day-post-style sites and item-post-style sites

√ Requirement: Translate News Item departments into WordPress categories

To make matters more complicated, the Managing Editor (admin) of a Manila site could switch between News Items and Home Pages at will. So some days might have a single, monolithic post while other days may have many separate posts.

√ Requirement: Support both per-day and per-post styles within a single site

For content on JakeSavin.com this won’t be much of an issue since it’s always been a News Items (per-post) style site, and I’ve rarely made more than one post per day. But in the long run I also want to bring in content from Jake.EditThisPage.Com (years’ worth of content that I don’t want to lose), and it’s one of these mixed sites, with some day-page-style content, some blog-post-style content, and sometimes many posts in a single day.

⇒ Insight: I don’t need to deal with day-type sites right now, but I shouldn’t design myself into a corner that precludes them.

Permalinks, GUIDs, and IDs

So what the heck are these things? I mean I’ve heard of a permalink but a GUID? I get what an ID is, but why do I need to understand it?

Permalink

The permalink to a post is a URL which doesn’t change over time, which goes straight to the post. It’s important to preserve these links, since every time someone links to a post on your site, the place they’re linking to (ideally) is your post’s permalink. If that URL changes then all of those incoming links will break, and The Web will be just a tiny bit more lonely: On The Web, broken links == sadness.

It turns out that by default, WordPress and Manila format blog post URLs quite differently. Moreover, WordPress pages typically live at only one URL (really two—one by its link [path], and one by its ID), whereas in Manila, “Stories” (Pages in WordPress) and sometimes even individual posts can live at any number of URLs, some of which are generated, and some of which may be added by the user.

For example, a blog post (news item) in Manila is most often accessed via a calendar-style URL off of the root of the site, like http://example.com/2014/10/01#1234, but it may also appear at any of the following (or more) URLs:

  • http://example.com/discuss/msgReader$1234 — note the $ delimiter
  • http://example.com/stories/storyReader$1234 — if promoted to a story [page]
  • http://example.com/my-super-awesome-post — user-entered path
  • http://example.com/awesome/firstPost — another user-entered path
  • http://example.com/newsItems/departments/superAwesome/2014/10/01#1234 — from department (category)

If I want to preserve my site’s existing web presence, then I should do whatever I can to make sure that incoming links continue to work. And while I control all the domains involved, I also don’t want to have to maintain a giant list of redirects…

√ Requirement: Support at least one of Manila’s canonical URLs for transferred content

√ Stretch-goal: Support all URLs for a given bit of content, including user-generated ones
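To make the stretch goal concrete, here is a minimal sketch of how the example URLs above could be resolved back to a single Manila message ID. It is not the importer’s actual code: the function name manila_message_id, the regular expressions, and the user_paths lookup table are all assumptions based only on the URL shapes listed in this post.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns, derived only from the example URLs in this post.
_READER_PATH = re.compile(
    r"/(?:discuss/msgReader|stories/storyReader)\$(\d+)$", re.ASCII
)
_CALENDAR_PATH = re.compile(r"/\d{4}/\d{2}/\d{2}$", re.ASCII)
_DIGITS = re.compile(r"\d+", re.ASCII)


def manila_message_id(url, user_paths=None):
    """Return the Manila message ID a URL refers to, or None if unknown.

    user_paths is an assumed lookup table mapping user-entered paths
    (for example "/my-super-awesome-post") to message IDs.
    """
    parts = urlsplit(url)

    # msgReader$1234 and storyReader$1234 carry the ID after the "$".
    reader = _READER_PATH.search(parts.path)
    if reader:
        return int(reader.group(1))

    # Calendar-style URLs, including department ones, end in /YYYY/MM/DD
    # and carry the ID in the fragment (#1234).
    if _CALENDAR_PATH.search(parts.path) and _DIGITS.fullmatch(parts.fragment):
        return int(parts.fragment)

    # User-entered paths cannot be parsed, so they need a lookup table.
    return (user_paths or {}).get(parts.path)
```

With a resolver like this, every known URL for a post collapses to one ID, so a single redirect rule per ID could stand in for a giant hand-maintained list. One caveat: browsers never send the #1234 fragment to the server, so the calendar-style case only works when processing exported links, not live requests.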

GUID

A post’s GUID is its canonical and unchanging identifier. It signals to feed readers (RSS, Atom, etc.) that if they see this post again, they don’t need to show it to users, who have already seen it.

But if the post’s URL ever changes, a well-behaving content management system should remember the original GUID and not change it, so that folks who subscribe to the site in a feed reader don’t get blasted with a whole lot of repeat posts.

There are other potential uses for a post’s GUID. Some systems might use it to identify a post when accessing it via an API. Some (like Manila) use a combination of the site’s URL and a post’s ID instead for API access.

Sometimes it’s easy to generate a GUID just by reusing the value of the post’s permalink. In this case you could add an attribute called isPermaLink and make its value true to signal to consuming apps that the GUID actually points at a real web resource. (WordPress doesn’t do this, even when the permalink and GUID are the same.) This could be especially useful if the post has a link which is not a link to the post itself.
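As a rough illustration, here is a small sketch that builds an RSS guid element and sets isPermaLink to true only when the GUID is the post’s own URL. The helper name build_guid is made up for this example, and this is not how Manila or WordPress generate their feeds.

```python
import xml.etree.ElementTree as ET


def build_guid(guid, permalink):
    """Build an RSS 2.0 <guid> element for one post (illustrative helper)."""
    element = ET.Element("guid")
    element.text = guid
    # Mark the GUID as a permalink only when it really is the post's URL.
    element.set("isPermaLink", "true" if guid == permalink else "false")
    return element
```

Calling ET.tostring(build_guid(url, url), encoding="unicode") serializes the element as a string, with the attribute set to true because both arguments match.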

Then there’s the ID. Manila and WordPress both have sequential IDs for the super-set of posts and pages. Manila also keeps comments in the same “table” as posts and pages, whereas WordPress treats comments completely separately. Going from Manila to WordPress then shouldn’t create any issues: IDs drawn from Manila’s single sequence stay unique when they’re split between WordPress posts and comments, so there are no inherent ID conflicts.
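Here’s a small sketch of why the split is safe. The ManilaMessage class, its field names, and the "post" / "page" / "comment" labels are hypothetical stand-ins for whatever the real export contains, and whether the importer actually preserves Manila’s IDs is an assumption on my part.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManilaMessage:
    msg_id: int                      # ID from Manila's single shared sequence
    kind: str                        # "post", "page", or "comment" (assumed labels)
    parent_id: Optional[int] = None  # message a comment replies to, if any


def split_for_wordpress(messages):
    """Split one Manila ID sequence into WordPress-style posts and comments."""
    posts = {}
    comments = {}
    for message in messages:
        target = comments if message.kind == "comment" else posts
        # IDs that are unique in the shared sequence stay unique in each
        # subset, so this check should never fire for well-formed input.
        if message.msg_id in target:
            raise ValueError(f"duplicate ID {message.msg_id}")
        target[message.msg_id] = message
    return posts, comments
```

Going the other direction would be harder, since a WordPress post and a WordPress comment can share the same number.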

Data Hierarchy: What’s the same, what’s different?

Among the reasons I picked WordPress instead of some other platform, is that WordPress and Manila actually have a great deal in common:

  • They both separate content from layout by flowing content through a Theme
  • They both use a database to store the content
  • They both have posts, media, and pages (in Manila, News Items, Pictures & Gems, and Stories)
  • Each keeps posts, pages, and media together in a single table (in Manila it’s the site’s Discussion Group)
  • Both systems use the filesystem for blob storage for media files

But there are some differences:

  • In a Manila site, you can have threaded discussions that aren’t attached to a post or page. Not so in WordPress.
    • This could be faked up with private posts/pages in WordPress, but depending on the site this may not be worth the extra development effort.
  • In Manila, comments are stored in the same table as posts, pages, and media objects, but in WordPress, comments are stored separately.
    • In theory this shouldn’t be an issue, since as long as I build the WXR file such that WordPress understands it, comment content will import just fine.

I was thrilled to discover that WordPress supports threaded discussions. Though it’s not an issue for JakeSavin.com since it’s always had flat comment threads, when I get around to porting over my other sites, I will want to preserve threaded discussions.
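To show roughly what that looks like in a WXR file, here is a sketch that builds a single item with threaded comments, where each reply points at its parent through wp:comment_parent. The build_item function and its tuple layout are invented for illustration, and a real WXR file needs much more than this fragment (the surrounding channel, dates, statuses, and so on).

```python
import xml.etree.ElementTree as ET

WP_NS = "http://wordpress.org/export/1.2/"  # namespace used by WXR 1.2 exports
ET.register_namespace("wp", WP_NS)


def wp(tag):
    """Qualify a tag name with the WXR 'wp' namespace."""
    return f"{{{WP_NS}}}{tag}"


def build_item(post_id, title, comments):
    """Build a partial WXR <item> with nested <wp:comment> elements.

    comments: iterable of (comment_id, parent_id, author, content) tuples,
    where parent_id is 0 for a top-level comment.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, wp("post_id")).text = str(post_id)
    ET.SubElement(item, wp("post_type")).text = "post"
    for comment_id, parent_id, author, content in comments:
        comment = ET.SubElement(item, wp("comment"))
        ET.SubElement(comment, wp("comment_id")).text = str(comment_id)
        ET.SubElement(comment, wp("comment_author")).text = author
        ET.SubElement(comment, wp("comment_content")).text = content
        # The parent reference is what preserves the thread structure.
        ET.SubElement(comment, wp("comment_parent")).text = str(parent_id)
    return item
```

For example, build_item(1234, "My post", [(1235, 0, "Alice", "First!"), (1236, 1235, "Bob", "Reply")]) yields one top-level comment and one reply to it. A flat site like JakeSavin.com would simply use 0 for every parent.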


That’s it for this post. In the next post, I’ll talk about the code that I wrote, how I tested and debugged it, and what kind of crazy edge cases I continue to find.
