How we build Ampp3d…

in eight weeks six months and counting…

William Turrell

Last revised 8:19am Tue 10 Jun 2014
Note for Instapaper users: some of the footnotes are missing, you're better off using an ordinary browser.

Preamble

Hello. This is an essay all about Ampp3d, a website we launched on 9 December 2013.

It’s also, despite the time I’ve spent on it, kind of a first draft, but – in the interests of hitting the six month anniversary of the site and actually shipping something – I publish it today with the hope I can improve it as time permits.

As you read, maybe you’ll find you have your own opinion on something I’ve written. If so, why not keep it to yourself? Feel free to email me, especially if you spot anything out of date or that would be bad advice to someone stumbling across this page.

Blogs [1] about site launches or other major design/backend projects can sometimes feel more like a PR exercise; the project is presented as a blissful utopia with colourful photographs and days where the team bonded to formulate ideas; there are post-it notes, user personas and intricately drawn wireframes. The choice of CMS is never in doubt. There are no bugs, arguments or political interference.

I love reading inspiring things like that and I’ve linked to one or two in the footnotes, but I couldn’t write something like that. Even after a project like this, where there genuinely haven’t been any disagreements, and despite my pride in what we’ve made, I still can’t look at the site without noticing all its flaws or thinking about the many compromises we’ve had to make - it would be misleading to leave any of the bad stuff out.

Similarly, a year or so ago, Sarah Parmenter gave a talk at Responsive Day Out 2013 and put up a slide with screenshots of broken media query layouts as an acknowledgement of how frustrating web development can be.

I want to borrow that idea; while what you’ll find here is hopefully detailed enough to include at least one good idea you may not have thought of, I’ve tried to remain totally honest about what we (I) could have done better and why.

It was 32 calendar days from my first meeting about Ampp3d to when we launched (~25 from when I started coding the design) – but we’ve been steadily iterating ever since, with a few “big” new features and a considerable amount of time spent fixing the “hard” problems.

Much of what we did at launch has been quietly thrown away and replaced with better code; a lot was riding on all of us, individually and collectively, hitting the launch date. If you set a deadline, your credibility depends on having something to show for it, so things made it into production which shouldn’t have - we just ran out of time.

I find it helps to put something imperfect into the wild but make a private commitment to yourself to come back and deal with it properly later; my record on that is far from 100% but there’s a definite sense of achievement when resolving a long standing bug or paying off your technical debt.

I tried not to say an outright no to any editorial requests.

Our insistence on trying to make everything work in an infinite scroll environment might make us unique[2].

And we’ve pushed responsive design as far as we can, not always successfully; it’s tough making static graphics – let alone interactive charts – work well at all resolutions.

For me personally, as the sole developer on the project, I’ve had to continually switch hats between design, front and back-end and try and maintain a reasonably high standard of each, with the very real fear I’ll just be mediocre at everything. I’d like to think we’ve set a few good examples though - why not have a read and make up your own mind.

UsVsTh3m, 2013

Ampp3d was originally an experimental Trinity Mirror project in social shareable da… – actually read Martin Belam’s explanation first if you’re new to it.

Martin and I first worked together on the launch of UsVsTh3m in May 2013. Compared to Ampp3d, UsVsTh3m was ever so simple (no longer the case incidentally; I’m no longer involved and they have a much larger team now). We had some simple Photoshop templates from Chris Lam - a great designer at the Mirror - and from these we built a Tumblr template.

Martin has huge experience of working with many talented people at the BBC, Guardian and elsewhere but we’d never worked together before, so on my part, as a chronically insecure developer, there was definitely a degree of wanting to prove myself. I persuaded him to send over the initial designs three days beforehand so I could look them over, and by Sunday night we had some revisions and I felt more relaxed about it.

Our original version was actually multi-column, which both Martin and I liked, but popular opinion was against.

As I recall, neither side articulated their reasons in any great depth at the time, but with hindsight the trend for both UsVsTh3m and Ampp3d has been to write posts with multiple images that can be scrolled through with minimal effort, and creating excerpts[3] might be seen as a barrier to people discovering it[4].

Using Tumblr meant we didn’t have to build our own infrastructure from scratch and we had a captive audience of users ready to share the content (assuming it was any good). Unfortunately it also meant the lack of control that comes with using a third-party platform: if you can imagine the pain of juggling multiple password protected blogs for testing, every design change requiring a manual copy and paste of the entire theme into a web interface and reverse engineering an entirely undocumented infinite scroll mechanism, you’d be on the right lines.

Ampp3d - the beginning

Martin and I met mid-morning at Costa Coffee in Canary Wharf on 6 November 2013. By this point Chris had done some initial Photoshop templates which I saw but didn’t get my hands on until a few days later[5]. We also had a target launch date, Monday 9 December[6] and were pretty sure Wordpress would be our CMS this time.

We were both very keen on Quartz[7] and that inspired Martin’s initial design brief for Chris, as well as several later technical decisions.

I had reservations and limited experience of Wordpress – developers might point to the lack of an MVC framework, some of the database structure and variable quality of plugins – but I hadn’t used Tumblr prior to UsVsTh3m either and it was clear journalists would feel confident enough with Wordpress to get up and running quickly. It would also be easier to justify choosing a widely used platform to Malcolm Coles at The Mirror.[8]

One of the first jobs was to prepare a shortlist of plugins we’d need - I’ll talk in detail about some of them later. I like to think I chose reasonably well but there were some costly decisions which I later ripped out and replaced with better alternatives or code of my own.

Wordpress has been a struggle at times, but I definitely have renewed confidence in it after this project.

Martin and I also chatted about graphs; we wanted fully responsive, interactive graphs that worked on mobile. Martin had been playing with Quartz’s Chartbuilder and our initial plan was to adapt that.

It was a short meeting (we’ve had just the two for Ampp3d) – I’d brought a Macbook Air with me but never took it from my bag; I just went through a list of queries I’d prepared and jotted down a few bits of information in Notational Velocity[9].

Once on the DLR at Canary Wharf[10] I started building a Trello[11] board.

I’ve used it for all my project management for a couple of years now – I find having cards you can (almost) literally drag around between stacks, adding notes, comments, tags and checklists is a much better way to feel in control of complex projects than a linear approach. Also, as Joel Spolsky explains:

Trello works great for a reasonable amount of inventory, but it intentionally starts to get klunky if you have too many cards in one list. And that’s exactly the point: it makes inventory visible so that you know when it’s starting to pile up.

Every day you look at your Trello board and see that there are seventeen completed features that are totally ready to ship but which haven’t shipped for some reason, and you go find the bottleneck and eliminate it.

Every time somebody suggests a crazy feature idea, you look at the Feature Backlog and realize it’s just too long, so you don’t waste any time documenting or designing that crazy idea.

Our Trello lists for Ampp3d:

Handily there’s also an iPhone app, so by the time the train reached Poplar I’d entered 2–3 cards, sent an email invite to Martin and continued to add to it on the journey home.

Later that day, I set up an empty, password-protected Wordpress blog and a demo install of Chartbuilder for us to customise (we never did).

If you think the name ‘ampp3d’[12] is difficult to spell, you’re not alone. I’m always misreading things - maybe it’s mild dyslexia or I just read too quickly - but I wasted over an hour trying to track down why the initial Nginx setup wasn’t working, to eventually discover it was because I’d transposed a couple of letters in the document root. I probably should have just used ‘amp’ as the internal naming. I never did.

Browser support

Although I always do a lot of browser testing, I’m never terribly happy with the quantity of browsers or devices I’ve been able to test on or the compromises I usually end up making.

We use Modernizr for feature detection but you still have to try and handle non-standards compliant browsers. I built Ampp3d with the HTML5 Boilerplate theme. It has conditional comments at the very start of the HTML that add class names for old versions of IE, so we can easily target them specifically via CSS and javascript. I’ve found it easier to keep any IE8 declarations in the same CSS file, if they’re elsewhere it’s easy to forget they exist – SASS will help keep your code tidy. The CSS is slightly larger but I feel maintainability wins here.
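
For anyone who hasn’t seen it, the boilerplate’s conditional comments look roughly like this (these are the stock H5BP class names):

    <!--[if lt IE 7]>      <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
    <!--[if IE 7]>         <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
    <!--[if IE 8]>         <html class="no-js lt-ie9"> <![endif]-->
    <!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->

An IE8-only override is then just a rule scoped to .lt-ie9, which SASS lets you nest right next to the declaration it patches.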

If you’re still unhappy with your testing method, for desktop browsers use virtual machines (VirtualBox is free and reliable). It’ll take you much longer to use websites that provide screenshots, and even trying to operate a virtual machine over the web is pretty frustrating.

Microsoft deserve credit for what they’re doing with Modern.ie - you can now get all Internet Explorer versions on each operating system for free. Much thought has gone into this; they’ve made the instructions friendly for Mac users, provided the most efficient terminal commands to download and install each VM, disabled automatic updates to stop your browser being overwritten and listed the full login and license info (including how to rearm it) right on the desktop.

The files are multi-gigabyte; download them overnight and keep the original so you can reinstall it when Windows Activation kicks in [13].

I have a virtual machine running IE10, because it allows you to emulate IE9, 8 and 7 and switch between them quickly. The only problem is it doesn’t get across just how slow javascript execution is in IE8 and below.

It’s hard to judge if the javascript on a site is too slow because of the variation between devices - and you don’t have much option for progressive enhancement other than disabling features completely.

I’ve attempted to retain decent IE8 support in Ampp3d but I feel I’m fighting a losing battle - as we add more features to the site it gets worse and worse (the performance more so than the rendering, due to the speed of javascript execution). Even something like picturefill, which we’ve just started using and should make the world better for everyone by reducing bandwidth, has a terrible impact on IE8. I’ve turned certain things off to alleviate it a little - we don’t load ads on IE8 and Facebook comments are off by default.

Worse, turning javascript off completely to speed everything up is not an option, as like numerous sites we use HTML5 specific tags that IE8 and below are unable to recognise, so the HTML5 shim is required – without javascript that can’t run and consequently none of your CSS loads.

I feel better about the situation now that Windows XP is no longer officially supported, and given that the latest version of Firefox still works with it and has excellent javascript performance (as well as a vastly superior rendering engine); but it’s annoying when almost 2% of our users still have IE8 and we’ve fixed most of the display issues. Those people can (assuming they’re not in some sort of corporate network hell) get a much better experience with Firefox at no cost to them - there’s no need to upgrade their operating system or buy a new computer - but they either need to know their browser could be faster or need someone else to help them, and I’m not sure how many people that applies to.

Adding a discreet banner to politely encourage IE8 users to switch browsers is a possibility and something I’ve done elsewhere.

It’s very frustrating.

For Ampp3d I abandoned the idea of IE6/7 support early on, though again that’s a site-by-site decision (the problem with Ampp3d is that we allow journalists to embed a lot of third-party stuff which just doesn’t work with old browsers, so even if the core site ran in IE6 it would still only be a partial experience). jQuery have just announced they will be dropping IE6/7 support (but retaining IE8) from version 1.13 onwards.

I’m regularly testing in Chrome, Firefox, Safari and Opera on Mac and Windows.

We did have a few issues with Firefox. You should avoid combining display types in unexpected ways – if behaviour isn’t defined in the spec there won’t be tests for it; browsers may handle it perfectly but there’s no guarantee.

In our case there was this bug concerning a max-width element inside an ‘inline-block’. My workaround was to add width: 100% where you can safely do so (i.e. by applying it to the CSS classes used by your images). However the better solution is to avoid the inline-block there in the first place; it was only there because of another hack I’d chosen to make somewhere else. It’s an example of accumulating technical debt, and it’s best to deal with the initial problem as early as you can.
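
A minimal sketch of the workaround (the class name here is illustrative, not our actual markup):

    /* images that end up inside the inline-block wrapper */
    .post-image img {
        max-width: 100%;
        width: 100%; /* works around the Firefox max-width-inside-inline-block bug */
    }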

Device testing

We test on different devices all the time, but it’s a pretty limited selection. I use an iPhone 5 and an iPad Mini running iOS7 (I also periodically check an old iPod touch running iOS6) and for Android I have a 2 year-old Google Nexus 4. On iOS I’ll look at Safari and Chrome (same rendering engine but they can behave slightly differently[14]) and on Android I’ll use the standard/default browser as well as Chrome and Firefox. Other members of the team have different phones, but remember you can’t rely on users or staff to report bugs: they’re just as likely to not notice, ignore it, not have the time to mention it, assume somebody else will, or worst of all for your potential audience, give up and go elsewhere[15].

All the devices can do remote debugging so I can call up Safari or Chrome developer tools and view/adjust the CSS or look at the javascript console.

Testing with physical devices is (almost) essential[16]. You certainly need it for touch events; you can’t properly test a swipe in a browser window, whatever emulation settings you’re using.

For Ampp3d, regular use on a phone highlighted issues with touch targets - even though our button to open/close the navigation is reasonably big, I was finding that my thumb would narrowly miss it, so I extended the target on the right.

Active and focus styles won’t normally be visible in the browser either[17] - in Android, activating the button added a blue background; in our CSS we use -webkit-tap-highlight-color to get rid of that.

For our ‘swipe to close sidebar’ UX feature I adjusted the swipe_velocity to make it more sensitive - I wouldn’t have known without a proper device.

We also use the FastClick library by FT Labs[18] to eliminate the default 300ms delay when you tap a button on many mobile devices.
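
Wiring it up is a one-liner once the DOM is ready - this is the usage straight from the FastClick documentation:

    if ('addEventListener' in document) {
        document.addEventListener('DOMContentLoaded', function () {
            FastClick.attach(document.body); // removes the ~300ms tap delay on touch devices
        }, false);
    }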

If we had more people, more time, or a larger user base, I’d have tried to fit in a visit to Clearleft’s device lab in Brighton. I wish I could, but realistically it would have taken multiple visits; one to identify and log all the problems, maybe several more to test fixes I’d applied.

Doing your own user testing

Many small UX improvements have come from routine use of the site.

I’ll check Ampp3d frequently (every day or two even when I’m working with other clients) - I try to vary the device. I don’t read everything; I’m mainly scanning for rendering issues – content that’s broken out of its container, incorrect spacing, line wrap, anything that’s taking too long to load or doesn’t work well when embedded. I probably won’t fix it then and there (assuming I know how to) but I’ll usually make a note and come back to it.

Browser stats

As many have already said, make your decisions about what to support based on your site’s own users, not other people’s.

Our recent Google Analytics stats are more encouraging than I was expecting.

Chrome and Safari are joint top at 31%, but desktop Safari (rather than iOS) accounts for a further 11%. 11% of visits use the standard Android browser. Firefox and Internet Explorer are equal on around 7% each. Opera accounts for just over 1%, Blackberry 0.5% and Amazon Silk 0.25%.

For iOS, we have 90% on version 7 or above, 98.5% on version 6 or later. I still test iOS 6 occasionally despite (because of) it being significantly slower[19].

Of people using Android devices, 80% have version v4 or above (higher than I’d expected). No-one uses Android 3 (the tablet one) but there’s a further 16% on v2.3.

Given the leaps in javascript speed, it’s encouraging people (in the UK anyway) are updating their mobiles relatively often.

Version control

I use Git repositories for Wordpress and Datawrapper. This means I can roll back code changes and potentially use multiple branches to work on multiple features simultaneously - in practice I prefer to work on a single branch at a time and “stash” changes if I get interrupted.

I use Sourcetree as a GUI for Git – I’m familiar with the Git command line but I like being able to instantly switch to it and see what files have changed, drag things into the staging area, look at prettier diffs than you get in a terminal and be able to discard files or individual “hunks” quickly. Plus the workflow to commit changes is faster.

Our .gitignore file includes media uploads and the charts. But when I’m developing the site locally, I need to be able to pull down all the latest content so I can look at actual articles, images and charts. I have a script that remotely runs a MySQL dump, creates a compressed timestamped SQL file and copies it back, before running some queries to switch the domain name from ampp3d.mirror.co.uk to my local version[20].
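
The queries are the usual Wordpress ones - roughly as below (the local hostname is made up, and a plain REPLACE won’t fix URLs buried inside serialized option values, so treat this as a sketch):

    -- point the installation at the local development hostname
    UPDATE wp_options SET option_value = 'http://ampp3d.local'
        WHERE option_name IN ('siteurl', 'home');
    UPDATE wp_posts SET guid = REPLACE(guid, 'http://ampp3d.mirror.co.uk', 'http://ampp3d.local');
    UPDATE wp_posts SET post_content = REPLACE(post_content, 'http://ampp3d.mirror.co.uk', 'http://ampp3d.local');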

There’s also another which runs a customised rsync command on the media uploads.

Backups

I keep known working copies of our Nginx, PHP and Varnish configurations separately.

Having copies of everything on your own system means you can take advantage of your local or cloud backups - the ‘versions’ feature in Dropbox is worth remembering if you’ve managed to damage a file you’re yet to commit. Also I use Vim, which has multi-level undo and holds on to everything you cut, copy or paste and Launchbar which has a built in clipboard history.

Power cuts aren’t uncommon where I live and using a desktop machine[21] prepares you for dealing with things like innoDB corruption[22].

We have daily backups on our production server too. Linode have a daily snapshot backup service, but it’s not that flexible - you can only restore a whole drive, not individual files. Bytemark, who now host Ampp3d, set us up with BackupPC writing to a machine elsewhere in the datacentre. Additionally I run backup2l locally as it means I can quickly retrieve specific files without having to ask anyone else. There’s a script that generates a mysqldump of all the databases first. Our server also has RAID1 so if one disk fails (Bytemark scan the serial line output for error messages) they can swap it out with limited downtime, then the array can rebuild in the background while the OS is running as normal.

Wordpress upgrades and improvements

So far, I’ve only upgraded Wordpress when they release security updates (there’s a lot of difference between the speed at which various plugins get updated and several of them are heavily customised for ampp3d). I upgrade using the manual procedure[23].

We have a customised dashboard that contains release notes and instructions I’ve written, including shortcodes, CSS classes, colour swatches and so on (intended as a way of cutting down on email, though turns out there’s no guarantee people will read it).

On the Posts index, I’ve added extra columns; a thumbnail preview of the hero image, article word counts and a notes column – there’s a field in the article you can edit and the notes are only visible within the admin area (not used much though, people still prefer email[24]).

It’s not uncommon to find draft posts with “Whatever you do, don’t publish this” in the title and it is remarkably easy to publish something in Wordpress by mistake, so I added some code that looks for the phrase “do not publish” in the post title when you save a draft and disables the publish button[25] until it’s removed.
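
A sketch of one way to wire that up (the real version checks on save; the #title and #publish selectors are the classic editor’s, the rest is illustrative):

    add_action('admin_footer-post.php', 'ampp3d_block_accidental_publish');
    add_action('admin_footer-post-new.php', 'ampp3d_block_accidental_publish');

    function ampp3d_block_accidental_publish() { ?>
        <script>
        jQuery(function ($) {
            function check() {
                // disable the Publish button while the title contains "do not publish"
                var blocked = /do not publish/i.test($('#title').val() || '');
                $('#publish').prop('disabled', blocked);
            }
            $('#title').on('keyup change', check);
            check();
        });
        </script>
    <?php }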

Great Big Database of Numbers

This was, I think, an idea of Martin’s which I shared genuine enthusiasm for at the time, but in hindsight seems ridiculous. It was to be an intranet site for people to add interesting/important numbers so eventually we could draw interesting or amusing comparisons from them.

You might wonder (a) why we’d need a special place to write our numbers down (b) what sort of person would be dull enough to want to do so and (c) why wouldn’t you just use an established tool like Wolfram Alpha?

I did build it though; a MySQL database running in FuelPHP (my framework of choice[26]) using Bootstrap for the design, DataTables for showing numbers, Mousetrap.js for keyboard shortcuts[27] and select2 which gives you fancy auto-complete select boxes.

For the main site, scaling was a concern from literally day one, so we set up Varnish. For “ndb”, it will be a long time before speed becomes an issue (especially if nobody ever adds a number). It’s a database used (not used) by an incredibly small number (0) of people, so we can simply throw a bit more hardware at it in the first instance.

It’s a refreshing change to work on internal projects because you rarely have to worry about legacy browsers.

I wanted to make it as fast as possible for journalists to use; data entry is boring and if it’s too fiddly nobody will bother. I tried to keep keypresses to a minimum; pressing ‘n’ took you directly to the Add Number page and the instructions for the ‘Units’ dropdown say ‘Just start typing’; it filters all the possible choices as you go and wherever possible I supplied both the full unit name and the abbreviation or symbol people were most likely to type. The ‘date’ field accepts yyyy-mm-dd format or choosing from a datepicker (which pops up immediately the field gets focus). There are keyboard shortcuts for saving too.

The database structure was interesting; normally in a relational db you set the field to the specific type of number (integer, floating point, whether it’s signed/unsigned etc.). Here, we could be dealing with anything including currencies, distances or units of time - that means you can’t index them. I decided to keep things simple and use a single VARCHAR field; if we ever need to typecast certain numbers, we can use MySQL’s CONVERT function. There was an indexed units field to indicate the number type; we match that first to get fewer rows and a faster query.
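
So a query that needs numeric ordering narrows by unit first and only then casts the VARCHAR - something along these lines (table and column names are illustrative):

    SELECT label, value
    FROM numbers
    WHERE unit_id = 42                            -- indexed, so the cast only touches a few rows
    ORDER BY CONVERT(value, DECIMAL(30,6)) DESC;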

But like I say, the access logs show a couple of people logged on and browsed around a bit, and that was that.

Number of numbers entered in the Great Big Database of Numbers: zero.

Ads

Web developers hate ads; you’ve spent ages optimising your page and then suddenly have to include third-party code you have no control over – which could add multiple iframes, block loading of the rest of your content, require Flash and make your site CPU intensive, or in the worst case even change the DOM and break the rest of your page[28].

I wish we could go back to the days of simple 468x60px animated GIFs and find it a bit ironic ads have moved in the other direction - to “rich”, dynamic stuff - when simple cat GIFs and the like are so popular now.

Often your ad-serving platform (e.g. Google/Doubleclick) chooses your banners for you. Nowadays, unless we have a direct relationship with a specific advertiser[29], I tend to take a more “assertive” approach than in previous years: if it’s obvious an ad spot isn’t working or is causing problems with the site then I’ll remove it (selectively if I can, or if the problem only affects certain browsers) without asking anyone for permission first[30].

We’d planned to run three different sizes of ads on Ampp3d, including a 320x50px version for mobile. Despite a lot of searching and testing I could never get them to work properly; either nothing would display at all, we always got full size ads where mobile ones should be, or occasionally Google served the 320x50px version to desktop users by mistake. The errors were also rather random.

So I gave up and we’ve only ever run adverts on desktop – our code contains a shouldWeShowAds function that uses the modernizr media-query method to check the device size. We also block IE9; it would inexplicably stack two or three banner ads on top of each other.
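
The check itself is tiny - something like this (Modernizr.mq is the real API; the breakpoint and the IE9 test are simplified):

    function shouldWeShowAds() {
        if (!Modernizr.mq('only all')) { return false; }            // no media query support (old IE)
        if (!Modernizr.mq('(min-width: 768px)')) { return false; }  // too small - skip mobile ads entirely
        if (document.documentMode === 9) { return false; }          // IE9 stacked banners on top of each other
        return true;
    }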

It’s important to invest some time in getting ads working if that income is contributing to your wages, but if you don’t have much control over the platform and the bugs are serious enough to annoy your users or make your site appear broken then I think you have to take action.

I did make sure if ads get switched off, no Google code will load:
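
In outline it’s the standard async GPT bootstrap wrapped in our condition (simplified here):

    if (shouldWeShowAds()) {
        var gads = document.createElement('script');
        gads.async = true;
        gads.src = '//www.googletagservices.com/tag/js/gpt.js';
        var node = document.getElementsByTagName('script')[0];
        node.parentNode.insertBefore(gads, node);
    }
    // if shouldWeShowAds() returns false, no Google code is ever requested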

I do run an ad-blocker[31] but I usually turn it off for any sites I’m developing; I want to see exactly what the majority of users are seeing. However occasionally it can work the other way around - I spotted a post where an image was missing; it was of a commercial in a newspaper and it turned out the filename was advertisement.jpg, which was immediately blocked.

Non-javascript support

The essence of this (as promoted by Jeremy Keith and others) is that you should progressively enhance your site so it works if someone has javascript off; but also, crucially, even when someone hasn’t disabled javascript and is using a modern browser there’s no guarantee it will execute correctly (there might be an error in your code, a CDN you rely on could be down[32], an advert might be misbehaving… many reasons).

Turning JS off can also provide a new lease of life to older, slower devices. Try it in Safari on an old iPod touch (a CPU with much less oomph compared to modern Apple devices and also restricted to the slower javascript engine in iOS6) - you can certainly use that for web-browsing (providing the sites you’re using are progressively enhanced) and in some ways the no-JS experience can be more pleasurable than the full bells and whistles one because you just get the basic content[33]. It would be terrible if a device like that became ‘too slow’ to access the web.

So building in the ability to cope without javascript is a principle I always try to follow, but Ampp3d certainly isn’t perfect. I’m yet to get the Datawrapper support to work for example (provision for fallback PNG images exists, but…) Also, as I mentioned in Browser support, IE8 and below are a real problem here because if you want to use nice semantic tags like <section> then you need the HTML5 javascript shim, and disabling javascript will break your layout spectacularly.

Nevertheless if you turn javascript off on a standards-compliant browser the site will work pretty well. We have no easy way of telling how many people are actually doing that because we use Google Analytics, which like all third-party analytics is javascript based.

There’s a journalism issue here too I think - from time to time we have interactive stuff that makes no sense at all without JS. Rather than a ‘sorry, your browser does not support this’ message (or no message at all), should we be adding some sort of proper commentary or editorial akin to audio description in our <noscript> tags?

Finally we should probably start asking designers to consider how sites with complex tabbed navigation (such as Ampp3d) and other interactive widgets ought to appear in cases where they’ve fallen back to HTML and CSS only.

Measuring Performance

I use a combination of WebPageTest, Google PageSpeed, YSlow! and Zoompf.

I find WebPageTest the most useful; it’s measuring actual performance in a real browser environment rather than looking at your source code and guessing. Google is handy for its mobile UX scoring (it analyses touch target size, for example) and Zoompf is simply very fussy about optimising every bit of your page to the fullest extent.

The scores we get with all of them vary depending on the complexity of the article you’re reading at the time, so there are one or two specific posts I use as benchmarks in addition to checking the homepage.

Without going into all the details[34], generally speaking I think we do “OK” (advertising and multiple sets of analytics weigh us down a bit).

WebPageTest has a clever concept called ‘SpeedIndex’ where they analyse a video of your page loading and work out how long bits of it take to render. Occasionally we’ve reached the 15th percentile, typically it’s somewhere in the top 20–30.

It’s interesting to compare us to UsVsTh3m given their heavy use of graphics (and the Tumblr limitations I mentioned earlier). When I ran a test recently, their page weight was a huge 20MB with 313 requests (many thumbnails all served from 500px images). The waterfall chart is terrifying.

However most of that’s below the fold, so even though “Document complete” (the point at which absolutely everything is loaded) took about 35 seconds on a 5Mb test link, the important bit (the DOM ready event) was under 2 seconds and they do really well on the speed test. The fact that Ampp3d usually has a huge purple bit where the page is being setup is a reminder of how costly client side code can be - I wish we could cut back a bit more.

I can’t write a section on performance without mentioning The Guardian and the BBC, but also take a look at the ABC and their new mobile site, which really impressed me (they’re using a global CDN now so it’s just as fast in the UK).

Varnish

Ampp3d runs on a single (albeit high-powered) server running Nginx and Varnish Cache.

Varnish is fast, powerful and reliable and I wouldn’t change it for the world.

It was also a steep learning curve, made all the more complicated by i) Wordpress, ii) the fact we’d already revealed our domain name and iii) having to fit it all into the final week before launch.

Here’s what I learnt (some things are obvious, but only in hindsight):

Ban vs Purge:

This is confusing - learn the difference (definitive explanation).

For Ampp3d I always use ban. We also store everything in memory for speed (remember to set that in your config).
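
For reference, the in-memory storage is a daemon option and bans can be issued from the command line - roughly like this (the size and the example ban expression are illustrative):

    # /etc/default/varnish - keep the cache in RAM rather than on disk
    DAEMON_OPTS="-a :80 -T localhost:6082 -s malloc,1G -f /etc/varnish/default.vcl"

    # invalidate everything under /2014/06/ on the live hostname
    varnishadm "ban req.http.host == ampp3d.mirror.co.uk && req.url ~ ^/2014/06/"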

Varnish and Ampp3d

Why do we need Varnish at all? Well, according to ApacheBench, Wordpress would struggle to serve more than around 15 pages per second – indeed that’s crept up to 300ms per request as I write this (largely, I suspect, due to scanning for animated GIFs – see responsive images).

Nginx certainly helps with getting all the static files out as quickly as possible (indeed, for simplicity and to avoid writing extra code, I don’t bother with Varnish for Datawrapper charts[35]) – but it can’t do much about PHP[36] and database bottlenecks.

We use the Wordpress Varnish plugin. It’s worth having a look at the source code so you understand exactly what’s going on. It checks your Varnish version to use the correct purge/ban syntax[37] and uses add_action to hook into post/page edit events, anything that happens with comments and when scheduled posts are published. It’s completely transparent to your users and has never gone wrong yet (although it helps to teach people about the benefits of versioning filenames, especially for more interactive posts).

We have, of course, configured Varnish to detect when journalists are logged into the system and not cache anything for them.

Edge Side Includes

We use ESIs to handle the publication ‘age’ (“X minutes/hours/days/months ago”) displayed in the byline and our navigation (there’s a separate ESI script that generates the entire sidebar). It means it doesn’t matter if generating the article takes slightly longer, as the article itself will be permanently cached and the ESI fragments for the sidebar and timestamp quickly swapped in.

The PHP scripts send TTLs using a Cache-Control: public, max-age=X header, served by Nginx, read by Varnish.
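
The fragment end is tiny - a sketch only (the URL and TTL are illustrative; human_time_diff() is the Wordpress helper that produces the “3 hours” style text, the page containing the <esi:include> tag needs set beresp.do_esi = true; in vcl_fetch, and the script only pulls in the handful of wp-includes files those functions need):

    // esi/time-since.php
    $post_id = (int) $_GET['post'];
    $ttl     = 60;                                    // in reality this varies with the age of the post
    header('Cache-Control: public, max-age=' . $ttl);
    echo human_time_diff(get_post_time('U', true, $post_id)) . ' ago';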

The sidebar has a 59 second TTL.
Timestamps are cached depending on how recent the post is:

Although I could have written it by hand, the timestamp text is generated using Wordpress code. However I wanted timestamps to be generated instantly - it made no sense if the entire Wordpress codebase was needed and they took just as long as a full pageview.

So the only files the PHP script loads in are:

Resulting time per time-since ESI request: 4 milliseconds.

The sidebar is still quite slow. There is a different sidebar per article (so if the current article is one of the recent posts, it is visually highlighted).
If I had time to optimise further, I think I’d manually write the 8 versions (7 posts plus one for nothing selected) to disk every minute in one go and have Varnish retrieve that rather than generating it on the fly.

We also tested, though have never used, ESIs on remote servers. This is handy if we ever want to take a feed from elsewhere; Varnish essentially proxies it and preserves whatever cache-control header you supply. It’s another backend in the Varnish config. We had an article which tracked the Bitcoin price generated on a separate server which used the XML-RPC API to update a blog post, but could equally have used this method.

Generating the navigation fragments at regular intervals solved another potential problem - Wordpress scheduled posts. Wordpress doesn’t have proper scheduling support, so at the time your pre-scheduled post is due to go live, nothing actually happens. It’s only when someone subsequently visits the site that the list of posts is checked (using the ‘wp-cron’ script) and its publication status is updated.

With Varnish, if all your pages are stored in memory, Wordpress isn’t being asked to do anything, so the danger is the post won’t be published. Fortunately the act of generating the sidebar also triggers wp-cron (which in turn fires off the events the Varnish plugin is hooked into; the post is updated and the cache reset).

Six months on

Everything above sounds really complicated, and it was. But I’ve not had to worry about it since.

Varnish has been incredibly reliable. Journalists don’t need to worry about it and I have a series of steps for updating site components which consistently works. We’ve never had any weird error messages and I can think of only one occasion where I’ve had to manually clear a file from the cache because it had been marked as 404 for some reason.

I also fully endorse what Bytemark and others have long argued; that for all but the biggest sites, it’s preferable to have a powerful, reliable single server with the single point of failure that gives you, rather than spending time and money engineering a complex clustering solution which makes you reliant on multiple systems functioning correctly for your site to stay up.

LEMP and LAMP may not be the most fashionable web architecture in 2014, but there are reasons many of us stick with them.

Hosting

We use Bytemark (Martin’s choice). As I’d started building the site on Linode before the server was ordered and provisioned, time was incredibly tight and I was unfamiliar with how Bytemark worked, I persuaded him to let us launch on Linode and move it later.

It wasn’t until January I actually migrated it (at 10pm on a Sunday evening). That involved a significant amount of prep, but it all went as planned and it would have been trivial to roll back.

I speak highly of both Linode and Bytemark. You’ll pay more for the latter (very few can match Linode’s bandwidth prices) but there are benefits to having a dedicated machine and the price is very competitive.

Also, the quality of Bytemark’s status updates and RFOs for even minor incidents is, in the 15 years I’ve been doing this, the best I’ve seen from a hosting provider. Prompt responses, plenty of detail and all in plain English.

Linode are extremely quick at dealing with support tickets (always within a few minutes) but they’re occasionally reluctant to talk about stuff (especially outages) or I have to hassle to get an answer, which irritates me. Managing data centre outages from a different country can be a bit worrying too.

Bytemark have a nice habit of thinking about non-urgent tickets for a while but then responding in a more personal way - it’s quite reassuring to get an email from the MD at twenty-past-one in the morning.

I’d say if you only consider features, it’s neck and neck between the two for virtual server provision at the moment (we’ve been using a virtual machine on Bytemark’s BigV platform as a spare development server for ampp3d and I have multiple Linode servers for other clients); BigV has been very nearly 100% stable and they’ve just improved their GUI to bring it up to Linode’s standard, they also offer a range of disk choices (price vs speed). But Linode have just finished a massive rollout of new hardware including SSD everywhere, fancy new server monitoring and much simpler (but still granular) pricing.

The more curious among you may notice this domain is hosted on C4L. I will be moving it; their virtual hosting isn’t great in all honesty, but to be fair it’s hardly their core business – if you’re looking for co-location they’d make an excellent choice. My main complaint is I can’t get hold of them on IRC and I have to open a ticket through a web interface then phone to ensure they’re aware of it.

Oh and if you’re looking for a broadband provider - Andrews and Arnold.

UX Features

Changing browser URL

A few people have noticed we update the browser URL as you move through stories. An idea I stole from Quartz.

Our initial implementation – hurriedly thrown together – was a disaster, but I replaced it a week after launch.

One issue was the choice of library to detect the scroll events; initially I tried jQuery inview, but jQuery.appear suited our needs much better: it gives you a single list of visible elements for the selector you’ve provided (in our case the article tag) so you can then analyse the order and figure out which is going in and out of view.

When the next story is two-thirds of the way up the page, we update the URL. But we try to preserve the initial URL if people start on the homepage/category pages, scroll through a few articles then go back to the top again.
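
Stripped right down, the mechanism is just history.replaceState driven by the appear/scroll events - a sketch that glosses over the jQuery.appear wiring and the “preserve the starting URL” logic (data-url is an illustrative attribute holding the permalink):

    // called for whichever article is coming into view
    function maybeUpdateUrl(article) {
        var rect = article.getBoundingClientRect();
        // the incoming story is two-thirds of the way up the viewport
        if (rect.top < window.innerHeight / 3 && window.history && history.replaceState) {
            history.replaceState(null, document.title, article.getAttribute('data-url'));
        }
    }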

Mobile navigation

Opening the sidebar in our design requires scrolling the page all the way to the top so all its contents are visible, but if you close it again without picking anything we’ll take you back down to wherever you came from.

I wanted to make it very easy to dismiss:

If you tap the ampp3d logo at the top of the page once, you scroll to the top of the page (intended to replicate common iOS behaviour) but if you’re already at the top, the logo reverts to a standard homepage anchor link that will, by its nature, also refresh.

Newsletter subscription

We wanted to promote the newsletter, but I didn’t want to annoy people.

We added a signup box to the bottom of the sidebar, but also more prominently to the bottom of each post. There’s a “Please don’t mention the newsletter again” link too.

These days I never use cookies for storing preferences, I just do a Modernizr test for web storage (the simple key/value version is widely supported). If you click the link we hide the newsletter stuff and don’t show it again[38].
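
In essence (the key and selectors are illustrative):

    if (Modernizr.localstorage) {
        if (localStorage.getItem('ampp3d-hide-newsletter')) {
            $('.newsletter-signup').hide();                        // they've already opted out
        }
        $('.newsletter-dismiss').on('click', function () {
            localStorage.setItem('ampp3d-hide-newsletter', '1');   // remember the choice
            $('.newsletter-signup').hide();
        });
    }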

We’ve ripped out javascript validation and use the new native HTML5 form validation with type="email" and required.

Mailchimp have sample code you can paste in to add a subscribe form, but it’s way too heavy so you need to pull it apart; all you need is the form and a jQuery AJAX GET (not POST) request to .list-manage.com/subscribe/post-json?u=… - then you handle the error or success response you get back appropriately.
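
The stripped-down version looks something like this (the list URL is a placeholder; the c=? parameter is what makes jQuery treat the GET as JSONP):

    $('#newsletter-form').on('submit', function (e) {
        e.preventDefault();
        $.ajax({
            url: 'http://ampp3d.usX.list-manage.com/subscribe/post-json?u=XXXX&id=YYYY&c=?',
            data: $(this).serialize(),
            dataType: 'jsonp',
            success: function (resp) {
                // resp.result is 'success' or 'error'; resp.msg is a human-readable message
                $('.newsletter-msg').text(resp.msg);
            }
        });
    });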

Facebook comments

The team thought adding comments would boost sharing/reading of articles (and they’ve been proved right so far) but I was concerned about slowing the page down[39] and how much it would annoy people like me who hate comments.

So the plan was to put the comments in, but allow users to turn them off with a single click and without losing their place (we use a jQuery animation to slide them out of the way). If they do turn them off, the Facebook library will never load.

Again we use LocalStorage to save the preference. If comments haven’t been disabled, we load the Facebook SDK once the page is ready, then there’s a function which adds a .fb-comments HTML fragment below the post and calls FB.XFBML.parse on that element; Facebook’s code then converts it into a comments area.
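
Roughly like this (the app id, SDK URL and selectors are placeholders - check Facebook’s current documentation rather than copying it):

    if (!commentsDisabled()) {                        // our LocalStorage preference check (name illustrative)
        $.getScript('//connect.facebook.net/en_GB/sdk.js', function () {
            FB.init({ appId: '000000000000000', xfbml: false });
            var holder = $('<div class="fb-comments" data-href="' + location.href +
                           '" data-numposts="10"></div>').appendTo('.comments-area');
            FB.XFBML.parse(holder.parent()[0]);       // turn the fragment into a comments box
        });
    }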

Facebook’s documentation isn’t, to be polite, as complete and up-to-date as perhaps it could be, so simply getting the SDK to load properly meant taking advantage of other peoples’ experiences on Stackoverflow[40]. It’s all run very smoothly since we turned it on though.

You’ll notice there’s a (relatively) big gap underneath the Facebook comment area for stories that have no comments yet. Don’t waste your time trying to “fix” this (or follow all the advice from 2–3 years ago that’s now out of date) - Facebook make it impossible to reduce the height because the whole thing is inside an iframe and they need sufficient room for the ‘Comment using’ dropdown (that appears when you’re not logged in) to be fully visible when open.

I was hoping to try and show the number of comments that had been left if they had been hidden – e.g. ‘Add/View comments - 3 so far’ – but I didn’t trust the number I was getting from the graph API (which in any case was going to get muddied with the number of general Facebook shares the article might have had) and also I wasn’t sure people who’d turned them off really cared[41].

Email sharing

All we have is a simple mailto: link with prefilled subject and body text.
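
Which is literally just this (the wording and URL are illustrative):

    <a href="mailto:?subject=Seen%20this%20on%20Ampp3d&body=http%3A%2F%2Fampp3d.mirror.co.uk%2Fthe-post-url%2F">
        Email this story
    </a>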

For years people have made ‘Email this story’ functions far more complicated than they ever need to be. No-one wants to have to fill in a form, try and remember the recipient’s address or manually copy and paste it from another application, be asked to complete a CAPTCHA or worry what their friend will actually receive or if they’re going to be added to a mailing list without their permission.

Also, I’m amazed so many of us wasted our time building these things; were mail clients ever really that bad? It’s not as though it made the process any easier on mobile either. Some websites don’t even allow you to add a personal message and of course you have no record of what you sent.

I hope in a year or two everyone will just be using mailto: and passing control to the user.

Keyboard shortcuts

As an experiment I added some simple keyboard shortcuts for switching tabs and search; like the numbers database, I used Mousetrap for this. The obvious next step would be j/k to navigate between posts.
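
Mousetrap makes this sort of thing trivial - for example (the selectors and the helper are illustrative):

    Mousetrap.bind('/', function () {
        $('#search-input').focus();            // jump straight to search
        return false;                          // stop the '/' appearing in the box
    });
    Mousetrap.bind(['left', 'right'], function (e, key) {
        switchTab(key === 'left' ? -1 : 1);    // switchTab() is a hypothetical helper of ours
    });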

URLs

Wordpress gives you a memorable, easy to understand URL structure out of the box. I thought it important you could easily tell when a story was written, so our permalinks contain the full date.

I used the wp_users.user_nicename field (not visible in the admin interface, so the ‘irregularly’ named journalists might not realise it exists) to map a proper /author/firstname-lastname URL to our not-very-consistent Wordpress logins.

Something else worth knowing: if Wordpress is given an incomplete URL or one that would otherwise 404, it will query the database to find the closest match (useful if the address gets truncated or split in two).

Many of our assets are served from http://media.ampp3d.co.uk. This is on the same machine - it’s nothing to do with speed as Varnish does an excellent job, but it means they are on a cookieless domain. There’s a slight speed/performance benefit to your users as their browser doesn’t have to attach many irrelevant advertising/tracking cookies to each HTTP request for an image.

Other presentation touches

scrollToMe is a small function we use to smoothly scroll between different articles when using the sidebar, the back button or pressing the logo to navigate to the top of the page.
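
It’s little more than a wrapper around jQuery’s animate - a minimal sketch:

    function scrollToMe($el, duration) {
        $('html, body').stop().animate(
            { scrollTop: $el.offset().top },
            duration || 400
        );
    }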

We’re using Adobe’s BalanceText – a jQuery plugin for neatening ragged headlines. Centred text with lines of unequal length annoys me[42]. BalanceText adds words to each line until they don’t fit and then adds line breaks in suitable places. It’s lightweight (if you already use jQuery) and ideal for responsive sites; it would be non-semantic to add presentational line-breaks by hand. Adobe have proposed a CSS property so it wouldn’t require javascript or a brief flash of unstyled text prior to rendering (which isn’t, in my opinion, as distracting as when webfonts load). Hopefully browser makers will take notice.

Note you can’t apply BalanceText to anything with other inline child DOM elements amongst the text - they will be lost during the rendering. It needs to be a single paragraph or <a href= tag.

Also I’ve struggled to make it work correctly on the initial page load in situations where the webfont isn’t in the browser cache, despite calling it from the webfont loaded event.

Other polyfills:

Our plugin code also has the (now standard in H5BP) function to prevent console errors on old browsers that don’t support it if you leave console.log in production by mistake.

Bandwidth & page weight - the background

Martin has written about the importance of checking that content works (both technically and editorially) in a mobile browser and the idea that people in newsrooms should work on their phones; I’d argue their connection speed is equally important.

If Martin’s desktop ban were ever enforced, chances are everyone would still insist on connecting by wi-fi; but many of your users may have 2 Mbit/s ADSL broadband or slower[43] and struggle to ever get a stable 3G signal[44] . Connectivity in central London tends to be excellent by comparison - The Mirror office is right in the middle of Docklands and a short DLR ride from many Tier 3/4 datacentres, including the main UK internet backbone at Telehouse North. Everyone has fibre and although some mobile networks are finding it harder than others to handle demand in the Canary Wharf area[45], generally the quality and availability of 3G/4G in London is pretty good.

There’s also no guarantee wi-fi will necessarily be fast. I was in a hotel last week where 500 Kbit/s was the free offering and “fast” access was (over) £15 a day. I declined as my phone was giving me a much better connection and I could tether it.

It’s a real problem if those creating your content have much better bandwidth than the ones viewing it[46], and it may always be this way. You might argue telecoms providers should get a move on with fibre rollout and allow the rest of the country to “catch up”, but maybe they never will, as by the time all rural areas have FTTC connection speeds in urban areas will have risen even more.

Also, mobile data = money. A 3.8MB animated gif of a cat accidentally sitting on a hedgehog may be amusing, but how many would have looked if they’d known beforehand it would cost four pence?[47]

Any solution should be automated too - few people are going to spend time adjusting JPEG quality settings through choice[48].

“Getting primarily journalists to optimise images - losing battle mate, losing battle…”
Martin to me by email, 12 Feb

Loading libraries

I’ve gone to quite a bit of trouble (through editing of wp_enqueue in plugins plus manual coding) to keep the vast majority of javascript requests at the bottom of the page. The exceptions are things like Modernizr and Picturefill which have to go at the top, plus in our case the code for ads and Google Analytics.

We use BWP Minify, which combines, minifies and versions CSS & JS (you could use something like Grunt of course, but for any site with a CMS it’s easier to let the CMS handle it). Varnish complicates things slightly: I have a routine where I have to clear the cache for all HTML files whenever the CSS or JS is updated, otherwise many pages will be running old versions of the code.

When infinite scrolling requires a plugin, we normally use jQuery.getScript.
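
For example (the plugin path and init call are illustrative):

    $.getScript('/js/jquery.someplugin.min.js', function () {
        $('.quiz').somePlugin();   // only initialise once the script has actually loaded
    });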

We’ve used yepnope.js for a couple of things but I wouldn’t recommend it, not because I had any problems, but simply that almost immediately after I implemented it I saw a talk where its author advised against it (his recent recommendations here).

Responsive viewports

Designing in Photoshop means fixed breakpoints, of course, but I’ve done my best to make things work at as many, if not all, sizes as possible in between (it’s not perfect: certain third-party content can still break things and not everything gets reflowed if you resize your browser without refreshing).

For our two column layouts we used simplified CSS from the Responsive Grid System, customised to match our own margins.

I’ve avoided user-scalable=no or maximum-scale in the meta viewport on a point of principle.
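
So the viewport meta stays as plain as possible - something along these lines, with nothing stopping people zooming:

    <meta name="viewport" content="width=device-width, initial-scale=1">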

Responsive images, retina displays and Picturefill

I hadn’t expected that soon after launch, at least a couple of the staff would buy Retina Macbooks (I thought high resolution laptops were a niche product and would only be of interest to photographers, video editors and graphic designers). Overnight, they were uploading 2dppx images, with the potential to be four times the filesize[49].

Anyone in the industry knows how confusing the responsive image situation is, so I’m not going to try to cover it all here[50].

Also, I definitely wouldn’t recommend anyone blindly follow our approach; it’s a series of compromises, held together by string in places, and if the site were to survive in its present form it would certainly need to be changed again as browser support evolves. However I’ll walk you through my decisions:

On the left of the page in our navigation we have small thumbnail images for our recent posts. We have a number of custom thumbnail sizes. Originally we used srcset to swap in a 486x240px image on retina displays instead of the standard 243x120px one. One problem, however, is that whilst we have a single responsive design for all users, the navigation bar isn’t visible by default on mobile devices and indeed it’s perfectly possible a visiting user might not look at it at all. Yet the navigation appears above the page content in the source code, which means all its images are loaded first – effectively blocking the rest of the page content, as a waterfall chart makes clear.

We now have an image placeholder (a 43-byte spacer GIF encoded as a data-URI[51]) and a couple of data attributes with a standard and a retina image. Then, when the DOM is ready, we look at the device resolution and substitute the correct image. The result is the main content loads a bit quicker.
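
In simplified form (attribute and class names are illustrative):

    <!-- the src is really a base64 data-URI of a 1x1 transparent GIF -->
    <img class="sidebar-thumb" src="spacer.gif" alt=""
         data-src="thumb-243x120.jpg" data-src-retina="thumb-486x240.jpg">

    // on DOM ready, swap in the right asset for the display
    $('.sidebar-thumb').each(function () {
        var retina = window.devicePixelRatio && window.devicePixelRatio > 1.5;
        this.src = retina ? $(this).data('src-retina') : $(this).data('src');
    });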

For articles (both any images within them and the Wordpress featured image - or what we call a hero image) until very recently we didn’t have responsive images at all.

We have our own custom code to manage the display of hero images & captions, but for everything else we use the HTML Wordpress adds to the editor when you select an image from the media library. This is a double-edged sword; it gives journalists plenty of flexibility but having actual editable HTML in your CMS can be a bit dangerous. It’s important not to be a total control freak – other people are going to edit your code eventually – but it makes modifying things harder at a later date; if your image data is abstracted (e.g. JSON or YAML) it’s going to be easier to modify than if you have to write regular expressions to find and replace actual markup.

That’s ultimately what we’re doing at the moment to make picturefill work. I’ve written code that hooks into the_content filter with a regexp that matches “full size” images[52].

If an image is 450 pixels or narrower we skip it as the saving is negligible.
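
The shape of it is something like this (the regexp and the helper that builds the picturefill markup are simplified):

    add_filter('the_content', 'ampp3d_responsive_images', 20);

    function ampp3d_responsive_images($html) {
        // find "full size" images inserted from the media library
        return preg_replace_callback(
            '/<img[^>]+class="[^"]*size-full[^"]*"[^>]*>/i',
            function ($m) {
                $img = $m[0];
                if (preg_match('/width="(\d+)"/', $img, $w) && (int) $w[1] <= 450) {
                    return $img;                           // too small to be worth it
                }
                return ampp3d_picturefill_markup($img);    // builds the <span data-picture> block (not shown)
            },
            $html
        );
    }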

Another complication: when Wordpress resizes thumbnails, any animation in an animated GIF is lost. Some of our hero images and many of our body images are animated and we need to retain this. For now this means skipping picturefill and using the full size version [53]. We could ask people to tick a box to indicate if a GIF was animated, but they’d probably forget, so we actually detect it automatically by reading the binary file in PHP and counting the number of frames - which might ordinarily be too processor intensive, but we have a fast server and we’re using Varnish.
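
The detection can be done with the widely circulated PHP snippet that counts graphic control extension blocks - more than one frame header means the GIF is animated (this is the generic version, not our exact code):

    function is_animated_gif($filename) {
        if (!($fh = @fopen($filename, 'rb'))) { return false; }
        $count = 0;
        // each frame starts with a graphic control extension followed by an image descriptor
        while (!feof($fh) && $count < 2) {
            $chunk  = fread($fh, 1024 * 100);
            $count += preg_match_all('#\x00\x21\xF9\x04.{4}\x00[\x2C\x21]#s', $chunk, $m);
        }
        fclose($fh);
        return $count > 1;
    }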

Scott Jehl and Filament Group are doing truly fantastic work (you should buy Scott’s book on Responsible Responsive Design - out later this year).

Picturefill has two versions and although he’s encouraging people to try the newest (v2), which uses the native <picture> tag, frankly it’s not ready yet, for me anyway (and through no fault of Scott’s) – mainly because we’re waiting (and hoping) browser support will catch up[54]

You ought to read the caveats at the bottom of the page first.

Deciding factors against v2.0 for me:

Version 1.2 uses <span> tags and supports <noscript>; as we’re implementing it at CMS level it’s not a big issue to use it, for now at least, as we can replace the code as/when browser support improves and picturefill is further developed.
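
For reference, picturefill 1.x markup follows the documented pattern - image names and breakpoints here are illustrative:

    <span data-picture data-alt="Chart: something interesting">
        <span data-src="chart-450.jpg"></span>
        <span data-src="chart-810.jpg"  data-media="(min-width: 600px)"></span>
        <span data-src="chart-1620.jpg" data-media="(min-width: 600px) and (min-device-pixel-ratio: 2.0)"></span>
        <noscript><img src="chart-450.jpg" alt="Chart: something interesting"></noscript>
    </span>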

Things you should be aware of:

We have two versions of the code, one for hero images and one for those in the post body. They have slightly different media queries (the maximum 1dppx size of the former is 890 pixels, the latter only 810 due to margins). Also you need to remember to exclude any thumbnails that are vertically cropped (like our 486x240px retina sidebar thumbnail - fine for the sidebar, but you don’t want to cut off part of an infographic when it’s shown inline in your article).

We might have too many media queries, the sizes aren’t necessarily the best and the breakpoints may need tweaking a bit (you have to take a decision at which point a 1dppx image should change into a 2dppx image, and choosing the sizes is already hard enough given the variety of mobile devices).

Also if you are doing something like this, it’s important to have a radio button in the CMS to disable it. I checked loads of posts before putting it live, but subsequently have discovered at least one that was broken and was grateful I could just quickly switch it off.

Finally, Jason Grigsby made a good point (some time ago) that you should expect whatever you do to be deprecated - at least with a CMS we don’t have to do it manually every time.

Is it worth it?

I think it’s great on modern browsers and especially on mobile, which gets the smaller images; the site is noticeably snappier.
But I also appear to have wrecked the performance in IE8, so maybe I should consider it a failure, rip it all out and start again. Perhaps custom javascript of our own to fill in image placeholders on the fly, as we do with the sidebar, would make more sense.

I also only last week, unbelievably, read about The Guardian’s modern browser test and their use of the navigation timing API – perhaps we should be doing that too.

The concept of responsive images is sound, clearly:

I’ve not mentioned art direction (cropping etc.) - Wordpress doesn’t have any support for this, you need to upload a separate image.

Should you develop using a retina display?

I’ve chosen not to so far. I can test if retina images are working properly using a phone and tablets, but on my computer I’d rather have the same type of display as most people[55].

Remember to check your site at different zoom levels but also on a desktop-size retina display (or emulation); early on someone from The Guardian’s QA team sent us a screenshot of our navigation tabs wrapping, which I thought I’d fixed, only for Conrad to notice the same problem with his retina laptop. It wasn’t visible on my 24" desktop display.

You can simulate a retina display easily:

Equal height columns and tables for layout

Some of you may be painfully familiar with this page.

Our situation: We’ve a sidebar and main column that need to be equal height on desktop[56]. The design is more of a challenge than many sites - the sidebar has background colour and drop shadows either side and sticky-out arrows[57]. The main column uses infinite scroll, growing taller and taller as we load new articles and the whole thing is centered on larger displays.

We started off using the “One True Layout” method (the one where you mess around with overflow and add huge amounts of padding) but eventually I switched to CSS tables.

Reasons:

So moving to CSS tables was a process of elimination.

I think the main reason I rejected Faux Columns was we needed box-shadow (or there were problems with the background or something)[59].

The Doug Neiner and Nicholas Gallagher methods aren’t intended for non-fluid widths in the first place, and the former doesn’t produce realistic box-shadow either. It also seems some browsers can’t handle the multiple color-stop trick on extremely small gradients.

Drop-shadow may sound an incredibly fussy reason for rejecting multiple techniques - all I can say is I thought it was important. Turn it off and the site looks incredibly flat. I’d compromised Chris’ original design in a couple of other places. I felt this was worth the effort.

Browser compatibility for Flexbox is still limited. However, if the site lasts long enough[60], I’d like to replace CSS tables with flexbox for browsers that support it – mainly to avoid the slightly awkward ‘one column then another’ rendering effect you can get with display: table on certain devices on certain occasions.

We do use flexbox to progressively enhance our two-column grids within articles; large numbers can be vertically aligned with accompanying text for a more visually pleasing, symmetrical feel.

CSS tables aren’t evil.

We’re not talking about adding <table> markup, all your HTML is still semantic. You’re using something that’s in the spec. The W3C aren’t going to withdraw it - it won’t go away. There’s great browser support[61]. The behaviour is predictable. There’s a test suite. It’s recommended as a technique to rearrange your source code[62].

CSS

We wouldn’t have launched on time without SASS. Variables and mixins make the CSS much easier to manage, cut down on repetition and help you keep your design consistent.

That said, I look at my code and generally think “that’s awful”[63]. If starting again, I’d want it to be much more object oriented (you’ll find my early experiments with BEM in the .email-signup form between posts.)

We have too much specificity (everything in article.page, article.post, for example), over-qualification and too many places where we use IDs[64] instead of classes.

As of SASS 3.2 you can include content blocks in mixins, so using that and an idea by Kendall Totten I could use media queries like “@include respond(sidebar)” for situations where the navigation is visible and also to take care of IE8 support.

I was pleased that all the code I wrote avoided !important; however, you often find it littered in plugins (wpPro_Quiz, for example) and it’s not always easy to unpick.

Harry Roberts (whose blog you should read) has an interesting idea; start collecting all your hacks in a shame.css file.

My other lesson from this project: I used jshint loads, csslint not nearly (or early) enough.

Infinite scroll

“A web design technique that prevents the browser scroll bar from scrolling to the bottom of the page, causing the page to grow with additional content instead.”

Brief pros and cons:

But:

The most common complaint we had about UsVsTh3m - not being able to reach navigation at the bottom of the page - is easily solved by moving it somewhere else.

If you’ve dumped it in the footer, that may suggest you don’t consider it important enough. If it is important, put it below every article or promote it to the main navigation; otherwise, consign it to a subpage somewhere.

Ampp3d server side

We use a customised version of the Wordpress infinite scroll plugin. It’s based on work by Paul Irish[65].

Infinite scroll is built to accept a complete page (e.g. a /page/2 style Wordpress URL) and extract the correct story using jQuery. This is inefficient though, so we strip out the header and footer and just send a single article on its own. Title, hero image, standfirst, timestamp and sharing links at top and bottom are included, but newsletter signup and Facebook comments are added client side if the user has enabled them.
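
The details are tied into our theme, but conceptually it’s something like the sketch below – intercept the request early when a query flag (the name here is made up) is present and render a cut-down template containing only the article:

```php
<?php
/**
 * Conceptual sketch, not our exact code: when infinite scroll requests
 * the next post it adds a query flag, and we respond with a minimal
 * template instead of the full page.
 */
add_action('template_redirect', function () {
    // 'ajax_article' is an illustrative flag name.
    if (empty($_GET['ajax_article']) || !is_single()) {
        return;
    }

    while (have_posts()) {
        the_post();
        // single-article.php: hero image, title, standfirst, timestamp,
        // body and sharing links - no header, footer or sidebar.
        get_template_part('single-article');
    }
    exit;
});
```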

All the usual gzipping and Varnish work (caching with edge side includes) happens too.

It’s necessary to correctly handle different types of pages. The homepage is straightforward enough, a simple continuous stream of articles in reverse order of publication. But you might be looking at a category, tag or author page. The original plugin uses Wordpress pagination to do this, which may be enough, but I needed greater control[66].

One problem is Wordpress code doesn’t do posts with multiple categories well - long story short, get_adjacent_post isn’t precise enough. The $in_same_cat parameter is a boolean true/false rather than a category ID and there’s no way of making it stick to the same category if the next post has more than one. You might jump from Politics to Economy to Science to Technology to Entertainment as you scroll.

I wrote a new function with new SQL that takes a category ID.

We add an attribute to the .entry-meta tag containing the permalink of the correct preceding post; this is used when infinite scroll is triggered to determine what to load next.
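
A simplified sketch of that replacement (minus the caching and filter hooks the real get_adjacent_post has); the permalink of the post it returns is what ends up on .entry-meta:

```php
<?php
/**
 * Simplified sketch: find the post published immediately before $post
 * within a specific category, instead of relying on
 * get_adjacent_post()'s true/false $in_same_cat flag.
 */
function ampp3d_previous_post_in_category($post, $category_id) {
    global $wpdb;

    $sql = $wpdb->prepare(
        "SELECT p.ID FROM {$wpdb->posts} p
         INNER JOIN {$wpdb->term_relationships} tr ON p.ID = tr.object_id
         INNER JOIN {$wpdb->term_taxonomy} tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
         WHERE tt.taxonomy = 'category'
           AND tt.term_id = %d
           AND p.post_type = 'post'
           AND p.post_status = 'publish'
           AND p.post_date < %s
         ORDER BY p.post_date DESC
         LIMIT 1",
        $category_id,
        $post->post_date
    );

    $id = $wpdb->get_var($sql);
    return $id ? get_post($id) : null;
}
```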

When you first load a page

We set the main infinite_scroll options in a JSON object. Including:

When you scroll

How we handle “interactive” posts

What else do we do?

We execute any inline javascript in infinite-scroll retrieved posts automatically. Script tags also won’t work if you just add them to the DOM; you need to call $.getScript (well, you can split them into fragments and use document.write, but that’s messy when we have jQuery available).

I’ve been a little more cautious with these (not for security, because they’d be parsed anyway on a static page view) but just so I can feel a little more in control of what’s going on. The infinite scroll callback function scans <script> tags and compares the URLs to a domain whitelist so that, for example, the javascript for embedded tweets will load. Tweets are written as a barebones <blockquote> tag which is progressively enhanced by Twitter’s own javascript. Likewise, any custom javascript someone has uploaded to the Wordpress media library will be loaded (like our bingo game).

We used to preload the hero image for the following post (which the server sends as a data attribute) but because of introducing Picturefill, this is on hold pending a rewrite.

Scroll performance on iOS Safari

One disadvantage is that while a page is being scrolled by a user’s finger, DOM manipulation is suspended: you can’t add, remove or change anything, including inserting the next post at the bottom of the page. The page must come to a complete stop too (if they flicked their finger to scroll rapidly, the page will still have momentum after their finger leaves the screen), so you have to hope people slow down sufficiently to at least attempt to read some of your article.

Not a problem on Android. No sign it’s going to change in iOS8.

Don’t waste your time looking for workarounds for this, there really aren’t any.

Charts

I mentioned in the introduction the requirement for responsive charts.

As I’ll say again to end this section, there are many different ways to do charts and there’s no right or wrong solution. Also, you kind of need to actually use several in production to get a feel for their strengths/weaknesses.

Back in November, as well as us looking at Quartz’s open-source generator, the Mirror were developing their own system - since launched - based on amcharts. They gave us their latest code and we were going to use that (it looked great, though truthfully it would have been extremely hard to have ready in time) until Annabel Church suggested Datawrapper because it already had a web interface for journalists to use.

Datawrapper

Probably Datawrapper’s key benefits are the web interface to live preview your chart and a database so you can modify (or clone) charts later.

There’s a slight learning curve to understand how to install it[68] but it’s well coded underneath: everything is modular; each chart type and most features have their own plugin you can switch on and off.

Our Datawrapper install is customised a little (e.g. stripping out chart titles, adjusting fonts, making the source/attribution links at the bottom more subtle, some concatenation of script tags to make the iframe a bit faster.)

Iframes are bad for performance - it wouldn’t be impossible to get rid of them with Wordpress and Datawrapper running on the same server, but the code for combining multiple charts per page and dealing with infinite scroll would be hard.

DimpleJS charts

Datawrapper doesn’t do scatterplots and we wanted one for the Scottish referendum, so – partly as an experiment – I ended up writing my own using another library entirely - the increasingly popular DimpleJS.

Datawrapper have forked D3 into a simpler visualisation library called D3Light - smaller with fewer features but increased IE compatibility. This means there’s some D3 stuff you don’t get, e.g. axis features such as labelling are a bit limited in comparison.

DimpleJS uses the full D3 library and is great for programmers because you write your own javascript. To make it accessible to journalists I thought I’d make a JSON object that contains all the chart configuration in a human readable form and they only have to edit that, still leaving the door open to writing a proper web interface should there be demand.

It’s frustrating if you can’t tweak chart layouts, so there are several settings for that sort of thing.

As it’s literally only just finished I can’t show you a working chart (we do have one and I’ll update this), but in the meantime here’s the settings file.

Features:

Making charts that are also acceptable at mobile size is hard - it’s mostly a case of trying to simplify everything.

Adaptations for mobile:

I’m yet to find a reliable way to make the touch targets for data points sufficiently large for mobile devices, but still display them as tiny dots.

We’ve slightly modified the DimpleJS core code:

I may have talked it up too much and you’ll think “is that it?” when I show you it.

But I’m extremely grateful for all the help DimpleJS’s author John Kiernander has given me. He’s responsive and patient with questions and bug reports and has kindly added a couple of small improvements I suggested. The library is very much in active development and you should definitely take a look if you’re not familiar with it.

So which charting solution should I use?

“It depends”. I’m still far from sure if we have the right ones. It depends on your circumstances: are you a designer, developer, writer/journalist? How much time do you want to spend setting it up, which type of charts do you want to use, what’s your budget? etc.

Do look at those I’ve mentioned plus amcharts and highcharts. There are hosted services too like the Google Chart API - I’ve stayed away from them (though we allow staff to embed anything they want) as I’m wary of trusting core content to a free service which could be fundamentally changed or even withdrawn.

Internal Search

We launched the site using WP Ultimate Search. This was a bad decision, but I didn’t have any experience of Wordpress search plugins and we needed something quickly. The front end of the search box is extremely difficult to style (many nested divs), it has a complicated “facets” feature we didn’t actually want, and it’s tricky to hook into simple javascript events (despite its heavy payload of JS and CSS files). Maybe that sort of search is useful to someone, but…

Later I replaced it with Relevanssi (the free version has been more than adequate for us so far). The front end is extremely lightweight (so light you’ll need to build your own AJAX search from scratch - I used this solution as a starting point) and the back-end includes a much better index. Plus there’s support for stop words, weighting, exclusions, logging, snippets… It's also easy to use the same index for your admin pages.
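
To give a flavour of how little is needed (this isn’t our exact code – names are illustrative), the AJAX endpoint can be as small as this, because Relevanssi takes over any query with an ‘s’ parameter:

```php
<?php
// Rough sketch of a bare-bones AJAX search endpoint; Relevanssi hooks
// into any WP_Query that has an 's' parameter, so this stays tiny.
add_action('wp_ajax_ampp3d_search', 'ampp3d_ajax_search');
add_action('wp_ajax_nopriv_ampp3d_search', 'ampp3d_ajax_search');

function ampp3d_ajax_search() {
    $query = new WP_Query(array(
        's'              => sanitize_text_field(isset($_GET['q']) ? $_GET['q'] : ''),
        'posts_per_page' => 10,
    ));

    $results = array();
    foreach ($query->posts as $post) {
        $results[] = array(
            'title' => get_the_title($post),
            'url'   => get_permalink($post),
        );
    }

    wp_send_json($results); // returns a JSON array to the front end
}
```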

NB: Because we normally only show one post per page, I had to manually change posts_per_page to 10 for the non-javascript searches to paginate correctly. Also you might run into this issue – the searchform.php file in our theme was being ignored and I had to disable the get_search_form filters.
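
If you hit the same thing, the tidy way (whether or not it’s exactly how we do it) is a pre_get_posts filter rather than hacking the query string:

```php
<?php
// Sketch: restore 10 results per page for front-end search queries only,
// since the rest of the site is configured to show a single post per page.
add_action('pre_get_posts', function ($query) {
    if (!is_admin() && $query->is_main_query() && $query->is_search()) {
        $query->set('posts_per_page', 10);
    }
});
```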

DNS

We host our DNS on Amazon Route 53.

Several years ago a sizeable chunk of the web community were using Zerigo, a cloud-based DNS provider with a lovely UI and plenty of features. Many of us who’d hosted our own DNS (using BIND etc.) got excited about it, and at this stage Route 53 was still in its infancy. Then Zerigo (a single developer startup) got acquired by 8x8 and in July 2012 suffered a disastrous outage; a prolonged DDoS that took thousands of popular sites offline.

By this time Route 53 had a user-friendly web interface and my thinking was that if anyone was going to be able to survive a DDoS attack then it’s Amazon[70]. They’re also the only provider I know with a 100% availability SLA (not that the money’s important).

Route 53 DNS records propagate quickly, there’s support for very low TTLs, simple load balancing and health checks should you be using multiple servers. Most cleverly of all I think, Amazon give you nameservers unique to each domain (to reduce collateral damage during a DDoS) and spread across different TLDs (to add a further bit of resiliency if there’s a root server problem[71].)

It was vital to have direct control of our DNS (I know how easy it is to break and frankly I didn’t want to risk anyone else screwing it up), but at the same time I wanted Martin to have the ability to get in (without sharing passwords) and change stuff in case I did the same or was unavailable in an emergency.

This meant using AWS’s Identity and Access Management (IAM). It can be confusing, so here’s what I learnt about Route 53 IAM policies:

Analytics and Monitoring

Analytics and tracking can be overused on many sites; we have three separate systems on Ampp3d - I’d love it if it was just the one. Everyone mostly quotes stats from Google Analytics but we also have Chartbeat, and the Mirror need us to use Adobe Omniture because it’s what they have for their other properties.

I use New Relic to monitor server performance - besides the typical CPU, RAM and disk utilisation it tracks network traffic and specific PHP/other errors – but the client-side “Real User Monitoring” (page load times etc.) is turned off to avoid yet more javascript.

Pingdom checks our site is still up once a minute and there are five-minutely Nagios checks from a server of my own[72].

Measuring popular posts

One of the tabs in the sidebar shows the “Most Read” items. Initially this was generated with the WordPress Popular Posts plugin - it’s well written and simple for anyone to install, but the problem is scale. The plugin maintains its own database of page views and allows you to count pages ‘normally’ or via some AJAX depending on whether you’re using any sort of caching (Varnish doesn’t - and frankly shouldn’t - cache POST requests). I’d already modified the plugin so that displaying the list of popular items no longer used a live lookup but a cached set of results, to reduce load. During the night before launch, I realised this would be a disaster as my basic benchmarking showed Wordpress could only handle 15 or so POST requests per second, so all our hard work ensuring we could serve pages to as many users as possible would be undone just by trying to log them.

As we were already using Google Analytics, it seemed to make sense to retrieve the “Most Read” list from there if we could. First priority was to immediately turn the Popular Posts plugin off; for a week or so I updated the ‘Most Read’ page entirely by hand, logging into Analytics and copying and pasting the URLs into a static HTML file. Then I worked out how to pull the data in.

Other people might be wanting to do something similar, so I’ll give you our code and walk you through the setup process.

We use the Google API PHP client (which now has a pretty active Github repository).

Here’s our script - you’ll need to fill in all your authentication values. It runs several times a day as a cron, grabs the 30 top URLs, uses url_to_postid to verify each is a valid post and finally writes a simple HTML file with a list, pulled in by Wordpress.
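
In outline it looks something like this – a sketch of the 1.x Google API PHP client with a service account, with the paths, IDs and reporting query simplified rather than copied verbatim from production:

```php
<?php
// Sketch of the 'Most Read' cron job. Fill in your own service account
// email, key file and Analytics profile (view) ID.
require_once 'google-api-php-client/src/Google/autoload.php';
require_once '/path/to/wordpress/wp-load.php'; // for url_to_postid() etc.

$client = new Google_Client();
$client->setAssertionCredentials(new Google_Auth_AssertionCredentials(
    'your-service-account@developer.gserviceaccount.com',
    array('https://www.googleapis.com/auth/analytics.readonly'),
    file_get_contents('/path/to/privatekey.p12')
));

$analytics = new Google_Service_Analytics($client);
$results   = $analytics->data_ga->get(
    'ga:12345678',             // Analytics profile ID
    'yesterday', 'today',
    'ga:pageviews',
    array('dimensions' => 'ga:pagePath', 'sort' => '-ga:pageviews', 'max-results' => 30)
);

$html = "<ul>\n";
foreach ((array) $results->getRows() as $row) {
    list($path, $views) = $row;
    $post_id = url_to_postid($path);   // discard anything that isn't a post
    if ($post_id) {
        $html .= sprintf(
            "<li><a href=\"%s\">%s</a></li>\n",
            esc_url(get_permalink($post_id)),
            esc_html(get_the_title($post_id))
        );
    }
}
$html .= "</ul>\n";

file_put_contents('/path/to/most-read.html', $html); // pulled in by the theme
```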

Setting up the Google API access is confusing. This article was invaluable but Google have changed it all since then. What you need to know:

Security

It’s never advisable to list all the security measures you use (partly as it reveals which you don’t), but I would mention I’ve found the Wordpress two-factor auth plugin useful.

Wordpress users should be aware of a small security risk with draft posts.

Wordpress has two main cookies:

If you’re using public wifi and access an http:// page, anyone else on the network can steal your cookies. Assuming your site is correctly configured people will be redirected to an https:// page (where cookies can’t be stolen) before logging on. However it’s extremely likely they’ll browse the non-https:// site by following public links etc. and normally you don’t redirect blog readers to SSL - Varnish, for example, doesn’t support it[73]. So the less secure cookie will still be sent.

This is what Wordpress uses to decide if you can view a preview of a post, so if someone were to obtain your wordpress_logged_in cookie they could try p=XXXXX&preview=true URLs until they found the post.

Very recently there were reports about potentially more serious security risks with this particular cookie; however these mainly apply (or rather, applied) to a configuration problem with Wordpress.com rather than privately hosted blogs and actually some of it isn’t true - for example having the logged_in cookie won’t allow you to make a new post, moderate comments, access the dashboard or anything else – providing you have SSL setup you’ll simply be redirected to an https:// login screen. Also the standard cookie expiry is 2 weeks, not years, and you can modify this yourself.
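
As an aside on that last point, shortening the cookie lifetime is a one-line filter in Wordpress (the figure here is just an example):

```php
<?php
// Example: shorten the 'remember me' login cookie from the default
// 14 days to 2. The third argument tells us if 'remember me' was ticked.
add_filter('auth_cookie_expiration', function ($seconds, $user_id, $remember) {
    return $remember ? 2 * DAY_IN_SECONDS : $seconds;
}, 10, 3);
```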

But it’s correct that 2-factor auth is no help whatsoever with this sort of problem.

There’s a technique called ‘forward secrecy’ you ought to enable for your SSL certificates. In simple terms it stops someone from recording your encrypted traffic now, then stealing your private key at a later date and using it for decryption.

It’s easy to set up[74] (no need to regenerate certificates or involve your SSL provider, just a few lines in your apache or nginx config) and there’s a great online checker that will look for this and many other certificate issues.

I remain nervous about us getting hacked one day.

SEO

Although it’s traffic from Facebook and Twitter that much of the industry is obsessed with, in fact we get much more when stories are featured on Google News – indeed it’s almost as though people no longer bookmark the homepages of traditional news organisations[75].

We use the BWP Google XML Sitemaps plugin to generate our own sitemaps. I modified it to add support for images.

One thing to be aware of: news sitemaps only show pages created in the last 48 hours.

A big advantage of the site being part of Trinity Mirror was their ability to get us listed on Google News at all - it’s hard for smaller sites to be approved. Also The Mirror’s team have been helpful in noticing a stupid mistake with our robots.txt file before I did…

Unfortunately, although plenty of our stories do get indexed by Google News, I’m not sure if the sitemap has made any difference. Even after changing our sitename to match The Mirror’s, Google Webmaster Tools still (occasionally) protests the publication name doesn’t match. You may also note the image page warns about hosting images on a separate domain to your main site – “it’s very unlikely we’ll be able to crawl them” – so watch out for that.

Alex Norcliffe was kind enough to suggest a way we could improve our organic search results. We’re trialling an extra field for articles: a search term we think users are more likely to search for (perhaps a more informal tone, rather than the typical ‘editorial voice’) which journalists manually add to the article body (so it is genuinely part of the content) and which we also tell Wordpress to insert into the title attribute of headings and internal site links.

(I did want to check if this would be bad for accessibility but it turns out the title attribute is – contrary to received wisdom – rarely read out by screenreaders anyway.)
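
In Wordpress terms, wiring that up might look something like this sketch – the custom field name and the choice of elements are illustrative, not necessarily what we ended up with:

```php
<?php
// Sketch: if an editor fills in a 'search_term' custom field, use it as
// the title attribute on internal links and h2 headings in the body.
add_filter('the_content', function ($content) {
    $term = get_post_meta(get_the_ID(), 'search_term', true);
    if (!$term) {
        return $content;
    }

    $title = ' title="' . esc_attr($term) . '"';

    // Internal links only (anything pointing at our own domain)...
    $content = preg_replace(
        '#<a href="(' . preg_quote(home_url(), '#') . '[^"]*)"#',
        '<a href="$1"' . $title,
        $content
    );

    // ...and h2 headings.
    return preg_replace('#<h2\b#', '<h2' . $title, $content);
});
```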

Since the site launched we’ve supplied custom meta data for Twitter cards, though it’s not enabled at the Twitter end yet. We have custom fields in the CMS which can override the default values.

Software development methodology

We’ve err, never really had one?

So “agile” then.

IDE / Tool recommendations

Warning

Working

Planning / Organising / Writing

Working environment

Acknowledgements

Sorry if you were expecting a big finish – there isn’t one (you could always re-read the introduction, some of my best stuff went in there).

I’d like to thank Martin Belam for choosing to work with me and everyone at Ampp3d and Trinity Mirror who’s been involved with the project.

Now that the site has proven itself editorially, The Mirror want to bring it within their own CMS, but my intention is to archive the present version (and its URLs) for posterity.

Thank you for reading.

william@wturrell.co.uk / wturrell.co.uk / @williamt


  1. As I insist on calling them.  ↩

  2. Well probably not, but I’m keen to hear of other sites doing the same.  ↩

  3. Chris’ design and our Ampp3d Wordpress theme do have a ‘Read More’ button - we’ve just never used it.  ↩

  4. Only a very small proportion (Martin reckons 3%) of Ampp3d’s traffic is direct to the homepage anyway.  ↩

  5. Chris – who I’ve never actually met – saved me some time by providing adjustments in CSS once I’d coded the initial templates; always useful as Photoshop’s typography rarely matches rendering in the browser.  ↩

  6. We launched over 12 hours early - full testing the previous night involved removing password protection and disabling the landing page so I could check Varnish was working how I expected. We asked Malcolm and he said we might as well leave it up.  ↩

  7. Quartz use Wordpress VIP (enterprise hosting/support) - it’s worth reading the case study.  ↩

  8. I did float the possibility of using PyroCMS. It’s developer friendly in the ways I’ve mentioned Wordpress is lacking and I use it for other clients, however in hindsight it would have been a disaster for Ampp3d simply because Wordpress’s multi-user editing, revisions, media library and so on are so strong. I doubt we’d have got it ready in time.  ↩

  9. Notational Velocity is fantastic but you’ll probably like Brett Terpstra’s fork, nvALT, even more. Use it with Simplenote; you then have an iPhone app and, crucially, a ‘history’ function; quickly recover earlier revisions of a note you’ve accidentally deleted or otherwise messed up.  ↩

  10. I will be linking to all the locations.  ↩

  11. Trello is made by Fog Creek software – you may enjoy their account of keeping a flooded New York datacentre operational during Hurricane Sandy. Also described in the StackExchange podcast.  ↩

  12. Someone suggested that due to Martin’s obsession with venn diagrams we call the site UsVsVenn, so he asked me to put that in the source code as a joke. I made it smaller as part of a spring clean to reduce page weight, but it’s still there.  ↩

  13. I’d suggest testing on Windows 7 in preference to Windows 8 because it’s easier to rearm.  ↩

  14. e.g. Safari modifies the DOM if it finds something it thinks is a phone number.  ↩

  15. Same with bugs in the CMS - I installed a plugin which broke a menu option I didn’t know existed and it was weeks before anybody mentioned it.  ↩

  16. Have plenty of spare USB ports; it’s not uncommon for me to have 2 or 3 things connected at once. Also it helps with the charging (remember to shut them down properly to save battery).  ↩

  17. Though you can manually activate them in Devtools.  ↩

  18. Well worth keeping an eye on their work – they’re also responsible for EdgeConf  ↩

  19. 4th generation iPod touch = nearly 3 seconds to run the Sunspider javascript benchmark; iPhone 5 = 4 times faster; Google Nexus 4 (running Chrome) = ~ x 2.5  ↩

  20. One issue with Wordpress is it doesn’t handle multiple domains, it hardcodes URLs in several places in the DB (as well as in posts). This is a problem if you’re using separate development or production servers.  ↩

  21. I recommend the Mac Mini - I selected (or rather bought separately then self-installed) 16GB of RAM - the maximum amount - this makes a big difference when running virtual machines. Also the hybrid HDD/flash “fusion” drive has been flawless. The Mini is small and light (1.22kg); portable enough if you need to work from a different location for a period of time and minimal hassle if you ever need it repaired. You can use whichever display(s) you like.  ↩

  22. If MySQL refuses to start, add innodb_force_recovery = 1 to the [mysqld] section of your config file, start MySQL and it should fix the tables and let you run a SELECT statement. Immediately remove the option and restart again to restore full read/write access.  ↩

  23. NB: Wordpress doesn’t apply automatic updates if it finds a version control system (Git, Subversion etc.) - Details  ↩

  24. If you’re setting up a new team, perhaps consider using a persistent chatroom like HipChat or Campfire to cut down on the number of emails flying around - see Zach Holman on how GitHub staff stay in the loop.  ↩

  25. Using the admin_print_footer_scripts filter similar to here.  ↩

  26. Although a lot of people have jumped to Laravel, FuelPHP is a strong framework, still very much in active development and has powerful ORM and plenty of useful packages like SimpleAuth. Also, Oil lets you quickly install packages, generate complex database scaffolding or create user logins – all from the command line.  ↩

  27. Compared to other plugins, Mousetrap is compact (1.9kb minified) and ever so simple to implement. We have a few simple keyboard shortcuts on the main site - check the source code.  ↩

  28. All these things have happened to me over the years.  ↩

  29. I think sponsored content can work brilliantly for everyone if you can find someone relevant to your users and you’re prepared to put the effort into integrating it and making the campaign work for your advertiser. Also (in previous jobs) I found well produced text ads worked just as effectively as graphics.  ↩

  30. Case study with ampp3d: Rather than put the problem ad spots live, I prepared some barebones test pages to verify they were genuinely broken and it wasn’t our fault (make sure you test these on the same domain) and wrote a quick email about it, ready to follow it up if we heard back from anyone.  ↩

  31. In 2012 Martin wrote about the ethics of ad-blocking. Personal opinion: I would find it incredibly frustrating to use the internet without one (although it’s not too bad on an iPad - whether enough people click on them is another matter). I am more willing than ever to pay small amounts of money either for good content or well-written software. There needs to be respect on both sides too - I’ve lost count of the number of news sites I’ve seen run the “99% of Gmail users don’t know about this trick” story - I’ve no problem with properly labelled sponsored content, but I don’t think it’s ethical to abuse the trust of your users with clickbait masquerading as “Related News”. Things like the ‘reading’ mode in iOS and apps such as Instapaper have become popular for a reason.  ↩

  32. Yes you can use a local fallback, but it’s not the only thing that could go wrong.  ↩

  33. Also, turning javascript off is my trick for extending your battery life in an emergency.  ↩

  34. Why not run a test yourself?  ↩

  35. All the Datawrapper files are static - it’s fully compatible with basic cloud storage like S3.  ↩

  36. PHP 5.4 is also significantly faster than 5.3. You should upgrade.  ↩

  37. It’s doing a ban, to be clear.  ↩

  38. Be aware the test returns false if you’re using private browsing.  ↩

  39. For the same reason our Twitter and Facebook sharing buttons below the post title are just static buttons, to avoid all the extra overhead of the ‘official’ versions.  ↩

  40. For example, you need to pass xfbml=1&appId=YOUR_FACEBOOK_ID&status=0 in the query string when calling the all.js script.  ↩

  41. Also quantity does not necessarily equal quality when it comes to comments, or if I’m being fair, the article you’re currently reading. Plus we don’t show how many people have shared something, so why show the number of comments?  ↩

  42. Also, bad kerning.  ↩

  43. Roughly the same for me, and it ought to be perfectly adequate for web browsing, provided we develop sites responsibly.  ↩

  44. If I want a reliable 3G signal here, I have to walk up a hill and stand in the middle of this field. Otherwise it’s GPRS on all networks, which isn’t exactly quick, as Matt Andrews demonstrates.  ↩

  45. *cough* Vodafone  ↩

  46. The Network Link Conditioner preference pane will help you simulate a slow or unreliable connection.  ↩

  47. A worst-case scenario; an expensive, £3, 100MB Pay as You Go bolt-on on o2. There are many better tariffs. Also in fairness I think we used a smaller version of the cat gif, but I can’t find it now.  ↩

  48. I would have liked to spend a bit of time on automatic conversion to progressive JPEGs, seeing as that’s now become a best practice. Also we did try Smush.it and compress media files on upload, but despite me writing workarounds to correctly handle the varnish cache, it broke something else and I had to turn it off - looking for an alternative is on the to do list.  ↩

  49. It does depend on the image; four times the area isn’t the same as four times the download. High resolution PNG bar charts with a few blocks of solid colour compress well. But there’s also just been a rumour Apple may launch a 3x display which would be nine times the area of the original image. This is terrifying.  ↩

  50. I’d recommend Bruce Lawson’s blog where he’s covered every development on responsive images and links to some of the best tutorials. I supported this too - Blink picture element project - it’s now closed for funding but keep an eye on the outcome.  ↩

  51. Some drawing packages produce spacer GIFs bigger than this, remember to compress them. Data URIs are roughly a third bigger, so not suitable for images larger than a few KB. Being especially pedantic, the src URL for the spacer.gif (which has to be absolute because we want to serve images from a cookieless domain) would have been 74 characters, so only 7 bytes smaller. (Oh and we gzip all our HTML too, so who knows).  ↩

  52. It would be more reliable to use a HTML parser than regular expressions, as it would cope better with attributes listed in a different order.  ↩

  53. There are plugins that will resize animated GIFs, but the one I tried didn’t work and I’ve not had time to investigate others properly (also I’d want to apply it selectively - keeping animations in hero images but not in the sidebar).  ↩

  54. At the time of writing, Chrome Canary v37 has just had <picture> support added.  ↩

  55. Also, they’re expensive and put a lot of strain on the CPU/GPU.  ↩

  56. Making the sidebar scroll like Quartz was something I wanted, but it comes back to the dropshadow and other ‘extending’ elements again – you’ll note their left-hand nav is nice and rectangular. If anyone reading this knows how to do it maybe you’ll put me out of my misery… That said, maybe our design works better with it all visible as you move down the page - maybe not all Quartz users realise you can scroll, or do so routinely, so the lower items on Quartz are liable to get less attention.  ↩

  57. I made the arrows with a neat trick involving borders and :before/:after pseudo elements.  ↩

  58. You could argue the same about Phark Image Replacement I suppose, but that usually goes up to –9999px, not 10 or 20 times the size.  ↩

  59. It was a long time ago…  ↩

  60. In its present location I mean - Ampp3d’s editorial future is secure.  ↩

  61. IE7 support was a step too far for me, given how low the mainstream usage has become. I take the point about trying to do a single column format if you can.  ↩

  62. Another thing we should have done. On that specific technique, I had a problem with it in a previous project: I found use of ‘table-caption’ and ‘table-cell’ restricted the width of the table. But I fixed it by replacing ‘table-caption’ with ‘table-header-group’. There’s also a ‘table-footer-group’ if you have three sections to arrange.  ↩

  63. It is a good sign if you look at anything you’ve written a few months later and can recognise things that are wrong with it, but it doesn’t stop it being awful.  ↩

  64. The conventional wisdom for matching single elements in CSS  ↩

  65. You should definitely be following Paul Irish. Note the infinite scroll documentation is a bit scattered around - have a look at the wiki and the plugin site.  ↩

  66. At launch we had Wordpress configured to show 3 or 4 blog posts at once - that’s reduced as I’ve optimised, it’s now just one.  ↩

  67. If you drag a browser window to resize it, window.resize fires multiple times; inefficient and in some cases with unpredictable results. Use jQuery doTimeout plugin to avoid this.  ↩

  68. It’ll work fine on Nginx, by the way.  ↩

  69. We actually take the lowest and highest X values and discard the rest: if you plot all the points, Safari renders a slightly jagged line rather than a smooth one like other browsers.  ↩

  70. You might pickup some useful sysops tips from this AWS Q&A by the Obama for America team.  ↩

  71. You’ll typically have one nameserver matching your own TLD, e.g. .co.uk, plus a .org, a .com and a .net.  ↩

  72. Whatever alert system you choose, get out of the habit of setting your phone to silent overnight. Definitely don’t do that after working late, then oversleep and wake up to discover the server went down overnight and didn’t come back automatically…  ↩

  73. I refer you to this wonderfully prescient “Why no SSL” FAQ from 2011 where the Varnish author quotes from the OpenSSL sourcecode and remarks “I hope they know what they are doing, but this comment doesn’t exactly carry that point home, does it ?”.  ↩

  74. Of course few banks or corporations are using it yet…  ↩

  75. Martin wrote a frustrated blog post about the New York Times story.  ↩

  76. Related tip: put your .vim directory in Dropbox and symlink to it then you can share your configuration across multiple computers.  ↩

  77. This used to be the ADB plugin, it’s now built into the browser (Tools > Inspect Devices), bringing Chrome into line with Safari.  ↩

  78. The Github app is nice too but I can’t say I use it a lot.  ↩

  79. I wrote all this with Vim and Marked.  ↩

  80. Also a reason I use a Dell 24" display rather than iMac/Thunderbolt display - Apple screens are just too bright even on the lowest setting.  ↩