by William Turrell
First published June 2014. Last revised 10:08am Thu 19 Mar 2015.
Note for Instapaper users: use a browser, some of the footnotes are missing.
Hello. This article is all about the launch of Ampp3d, a data-journalism website for Trinity Mirror that initially ran on WordPress before it merged with mirror.co.uk. You can still access the old site.
We launched in December 2013 and I first wrote this in time for the site's six-month anniversary – it was 17,500 words and incomplete but I wanted to ship something. I've made a few revisions since, including removing some of the more dangerous deprecated advice.
Have an opinion on what I've written?
Then why not keep it to yourself? Please email me.
Blogs about site launches or other major design/backend projects can sometimes feel more like a PR exercise: your project must sound utopian, with colourful photographs and days where the team bonded to formulate ideas. Post-it notes, user personas, intricately drawn wireframes. The choice of CMS is never questioned. There are no bugs, arguments or political interference.
There are exceptions where people are refreshingly honest. At Responsive Day Out 2013, Sarah Parmenter used screenshots of broken media query layouts for one of her slides, as an acknowledgement of the pain and frustration everyone goes through when learning something new (or even doing things they think they ought to already know).
I like reading inspiring things (I’ve linked to one or two in the footnotes). However, even after a project like this where there genuinely haven’t been any disagreements, and despite my pride in what we’ve made, I still can’t look at the site without noticing all its flaws or thinking about the many compromises we’ve had to make – it would be misleading to leave out what I wasn't happy with.
Hopefully what you'll find here includes at least one good idea you might not otherwise have thought of, but you'll also be able to learn from our (my) mistakes.
It was 32 calendar days from my first meeting about Ampp3d to when we launched (about 25 from when I started coding the design) – but we continued to iterate over the following months, with a few “big” new features and a considerable amount of time spent fixing the “hard” problems.
Much of what we did at launch was quietly thrown away and replaced with better code.
A lot was riding on all of us, individually and collectively, hitting the launch date. If you set a deadline your credibility depends on having something to show for it; so things made it into production which shouldn’t – we simply ran out of time.
It helps to put something imperfect into the wild but make a private commitment to yourself to return to it later; though my record on this is far from 100%, there’s a definite sense of achievement when paying off some technical debt.
I tried not to say an outright no to any editorial requests.
Our insistence on trying to make everything work in an infinite scroll environment might make us unique.
I pushed responsive design as far as I could, not always successfully. It's especially hard for journalists producing new content every day; it’s tough making static graphics – let alone interactive charts – work well on all devices.
Personally, as the sole developer on the project, I had to continually switch hats between design, front and back-end and try and maintain a reasonably high standard of each, with the very real fear I’ll just be mediocre at everything. I’d like to think we’ve set a few good examples though - why not read on and make up your own mind.
Martin and I first worked together on the launch of UsVsTh3m in May 2013. Compared to Ampp3d, UsVsTh3m was very simple (no longer the case; their team has grown rapidly.) We had some basic Photoshop designs from Chris Lam - a great designer at the Mirror - which we used to build a Tumblr template.
Martin has huge experience of working with many talented people at the BBC, Guardian and elsewhere but we’d never worked together before, so on my part, as a chronically insecure developer, there was definitely a degree of wanting to prove myself. Usually I do this by trying to get a head start, and I persuaded him to send over the initial designs three days beforehand so I could look them over. By the Sunday night we'd made some revisions and I felt more relaxed about it.
Our original version was actually multi-column, which both Martin and I liked, but popular opinion was against.
As I recall, neither side articulated their reasons in any great depth at the time, but with hindsight the trend for both UsVsTh3m and Ampp3d has been to write full posts with multiple images that can be scrolled through with minimal effort, rather than the traditional back and forth navigation between an index and individual stories; creating excerpts might be seen as a barrier to people discovering the full content.
There are pros and cons to using a third-party product like Tumblr:
But there's a real lack of control; if you're a developer, imagine the pain of juggling multiple password protected blogs for development and staging, every design change requiring a manual copy and paste of the entire theme into a web interface (no Github for you…) and reverse engineering an undocumented infinite scroll mechanism.
When UsVsTh3m started making their own interactive content (such as games) they did host a lot of it on their own server and on Amazon S3 - and they did eventually move the entire thing to WordPress themselves before becoming part of mirror.co.uk.
Martin and I met mid-morning at Costa Coffee in Canary Wharf on 6 November 2013. By this point Chris had done some initial Photoshop templates which I saw but didn’t get my hands on until a few days later. Martin had a target launch date, Monday 9 December 2013 and had chosen WordPress as the CMS.
I had reservations and limited experience of WordPress – developers might point to the lack of an MVC framework, the limitations of the default database structure and variable quality of plugins – but I hadn’t used Tumblr prior to UsVsTh3m either and it was clear journalists would feel confident enough with WordPress to get up and running quickly. It'd also be easier to justify choosing a widely used platform to Malcolm Coles at The Mirror.
One of the first jobs was to prepare a shortlist of plugins we’d need - I’ll talk in detail about some of them later. I like to think I chose reasonably well but there were some costly decisions which I later ripped out and replaced with better alternatives or code of my own.
WordPress was a struggle at times, but I actually went on to specialise in it after this project. It has its drawbacks, but there's a big community and it's very actively developed. The core team take care to release version updates that don't break existing code (though moribund plugins can be problematic).
Recently one of the trends in PHP has been the emergence of lightweight / fat-free frameworks: by keeping features to a minimum (you're encouraged to use Composer packages instead) you significantly increase the stability and make updates less of a challenge. If you're choosing a framework or content management system, I'd encourage you to think about this.
Martin and I also chatted about graphs; we wanted fully responsive, interactive graphs that worked on mobile. Martin had been playing with Quartz’s Chartbuilder and our initial plan was to adapt that.
It was a short meeting (we had just the two for Ampp3d) – I never needed to remove my Macbook Air from my bag – I just went through a list of queries I’d prepared and made some notes in Notational Velocity.
Having boarded the DLR at Canary Wharf I started building a Trello board. It has an iPhone app, so by the time I was at Poplar I’d already entered 2–3 cards and sent Martin an invite to the board.
I’ve used Trello for all my project management for some years now – having cards you can (almost) literally drag around between stacks, adding notes, comments, tags and checklists helps me feel more in control of complex projects than a linear approach. Also, as Joel Spolsky explains:
Trello works great for a reasonable amount of inventory, but it intentionally starts to get klunky if you have too many cards in one list. And that’s exactly the point: it makes inventory visible so that you know when it’s starting to pile up.
Every day you look at your Trello board and see that there are seventeen completed features that are totally ready to ship but which haven’t shipped for some reason, and you go find the bottleneck and eliminate it.
Every time somebody suggests a crazy feature idea, you look at the Feature Backlog and realize it’s just too long, so you don’t waste any time documenting or designing that crazy idea.
Our Trello lists for Ampp3d:
Read Harry Roberts' Trello workflow for how to do it properly (to better suit Agile and bigger teams).
Next, I set up a new WordPress blog and a demo install of Chartbuilder (all password protected) ready to customise – though as it turned out we never used Chartbuilder.
You're not alone if you find ‘Ampp3d’ tricky to spell. I’m always misreading things – maybe it’s mild dyslexia or simply a habit of scanning too quickly – but I wasted over an hour trying to track down why the initial Nginx setup wasn’t working, to eventually find I’d transposed a couple of letters in the document root. In hindsight I'd have used ‘amp’ everywhere as an internal shorthand.
Although I always do a lot of browser testing, I’m rarely happy with the quantity of browsers or devices I’ve been able to test on or the compromises I usually have to make.
If you’re unhappy with your browser testing method, for desktop browsers (like old versions of Internet Explorer) I suggest using a virtual machine (VirtualBox is free and reliable.) It’ll take you much longer to use those websites that show you screenshots (and sometimes the individual browsers on them stop working.) Even trying to use a virtual machine remotely over the web is pretty frustrating because of all the latency.
Typically with VirtualBox, you save the current state of each VM, and there's a delay of around 10-15 seconds to start it up. Crucially you can also configure your hosts file to view the development version on your local machine, so there's no need to make the site publicly available, or mess around with whitelisting IP addresses or configuring password protection.
You do need plenty of disk space – each compressed Windows image file is multi-gigabyte – and time to download. Keep the originals (so allow for 2-3 times as much space), because the Windows activation period will eventually expire and you'll need to reinstall.
modern.ie is a great site by Microsoft where you can download images of all Internet Explorer versions on each version of Windows for free. Much thought has gone into this; they’ve made the instructions friendly for Mac users, provided the most efficient terminal commands to download and install each VM.
When you start it up, there's no installation necessary, automatic updates have been disabled to stop your browser being overwritten by a newer version, and the full login details and license information (including how to rearm it) are printed on the desktop wallpaper.
Remember it's possible to trigger a discreet banner that politely encourages IE8 users to switch browsers, using the CSS classes added to the main HTML tag via conditional comments.
Overall, this was very frustrating.
I abandoned the idea of IE6/7 support early on, but remember this should always be a site-by-site decision. You shouldn't rely on the internet advising you what to do, there's no single answer and the arguments for and against legacy browser support are often polarised.
The problem with Ampp3d is that we allowed journalists to embed a lot of third-party stuff which just doesn’t work with old browsers, so even if the core site ran in IE6 it would still only be a partial experience.
I regularly test in Chrome, Firefox, Safari and Opera on Mac and Windows.
We did have a few issues with Firefox. You should avoid combining display types in unexpected ways – if behaviour isn’t defined in the spec there won’t be tests for it; browsers may handle it perfectly but there’s no guarantee.
In our case there was this bug concerning using a max-width element inside an ‘inline-block’. My workaround was to add width: 100% where you can safely do so (i.e. by applying it to CSS classes used by your images). However, the better solution is to avoid an inline-block there in the first place; it was only there because of another hack I’d chosen to make somewhere else.
That bug continues to attract comments over a year later, but it still hasn't been fixed because it's outside of the CSS spec.
Reviewing my CSS in hindsight, it was overly specific: although there weren't that many IDs used as selectors, even a few can be problematic. Plus there was too much nesting – because preprocessors like SASS make that so simple it's easy to assume it's the right thing to do.
Recommended reading: CSS Guidelines by Harry Roberts (which explains this in much more detail.)
Also, pay your technical debt off early if you can. If code "smells" bad, that's a sign you should be refactoring it.
I test on different devices all the time, but it’s a pretty limited selection:
In iOS I’ll look at Safari and Chrome (which use the same rendering engine, but can behave slightly differently) and on Android I use Chrome, Firefox and the Android standard/default browser (though the latter stopped working in Android 5).
Other members of the team have different phones, but remember you can’t rely on users or staff to report bugs, they’re just as likely to not notice, ignore it, not have the time to mention it, assume somebody else will, or worst of all for your potential audience, give up and go elsewhere.
For Ampp3d, regular use on a phone highlighted issues with touch targets - even though our button to open/close the navigation was reasonably big, I found my thumb would narrowly miss it.
Active and focus styles won’t normally be visible in the browser either. In Android, activating the button added a blue background; in our CSS we use -webkit-tap-highlight-color to get rid of that.
For our ‘swipe to close sidebar’ UX feature I adjusted the swipe_velocity to make it more sensitive - I wouldn’t have known without a proper device.
There aren't many device labs in the UK yet. If we had more people, more time, or a larger user base, I’d have tried to fit in a visit to Clearleft’s device lab. It would have taken multiple visits; one to identify and log all the problems, maybe several more to test fixes I’d applied.
Many small UX improvements have come from routine use of the site.
While the site was live I'd check Ampp3d frequently (every day or two even when I’m working with other clients.) I tried to vary the device.
I don’t read everything, I’m scanning for rendering issues – content that’s broken out of its container, incorrect spacing, line wrap, things that take too long to load or don't embed well.
I probably won’t fix it then and there (assuming I know how), but I’ll make a note so I can come back to it.
Make your decisions about what to support based on your site’s own user base, not someone else's or a national/global average.
Our Google Analytics stats were more encouraging than I was expecting.
Chrome and Safari are joint top at 31%, but desktop Safari (rather than iOS) accounts for a further 11%. 11% of visits use the standard Android browser. Firefox and Internet Explorer are equal on around 7% each. Opera accounts for just over 1%, Blackberry 0.5% and Amazon Silk 0.25%.
For iOS, we had 90% on version 7 or above, 98.5% on version 6 or later. I still tested iOS 6 occasionally despite (because of) it being significantly slower.
80% of Android devices were on version 4 or above (higher than I’d expected). No-one used Android 3 (the tablet one) but 16% had v2.3.
I use Git repositories for WordPress and Datawrapper; I can roll back code changes and potentially use multiple branches to work on multiple features simultaneously. In practice, I prefer to work on a single branch at a time and “stash” changes if I get interrupted.
I use SourceTree as a GUI for Git: whilst I’m familiar with the command line, I like being able to instantly switch to SourceTree, see what files have changed, drag things into the staging area, look at prettier diffs than you get in a terminal and be able to discard files or individual “hunks” of code instantly. Generally, I find the workflow to stage and commit changes much faster.
Our .gitignore file includes media uploads and the charts. But when I’m developing the site locally, I need to be able to pull down all the latest content so I can look at actual articles, images and charts. I have a script that remotely runs a MySQL dump, creates a compressed timestamped SQL file and copies it back, before running some queries to switch the domain name from ampp3d.mirror.co.uk to my local version.
There’s also another which runs a customised rsync command on the media uploads.
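As a rough illustration of the content-sync script described above, here is a minimal Python sketch that only builds the commands and queries rather than running them. It is not the original script: the function names, the local hostname ampp3d.local, streaming the dump back over ssh (instead of writing it on the server and copying it), and the assumption that MySQL credentials live in the server's ~/.my.cnf are all mine. The wp_options and wp_posts table names are the WordPress defaults.

```python
from datetime import datetime


def dump_filename(db_name, now):
    """Timestamped, compressed dump name, e.g. wp-20140601-093000.sql.gz."""
    return f"{db_name}-{now:%Y%m%d-%H%M%S}.sql.gz"


def remote_dump_command(ssh_host, db_name, now):
    """Shell command that runs mysqldump on the server and streams the
    gzipped output back to a local timestamped file (assumed variation)."""
    return (
        f"ssh {ssh_host} 'mysqldump --single-transaction {db_name} | gzip' "
        f"> {dump_filename(db_name, now)}"
    )


def domain_swap_sql(old_domain, new_domain):
    """Queries that point a local copy at the development hostname.

    Inputs are trusted constants here; do not build SQL like this from
    user input. siteurl and home are plain strings, so REPLACE is safe
    for them, but a blanket REPLACE over serialized PHP option values
    can corrupt them because their stored string lengths change.
    """
    return [
        "UPDATE wp_options SET option_value = "
        f"REPLACE(option_value, '{old_domain}', '{new_domain}') "
        "WHERE option_name IN ('siteurl', 'home');",
        "UPDATE wp_posts SET post_content = "
        f"REPLACE(post_content, '{old_domain}', '{new_domain}');",
    ]


if __name__ == "__main__":
    now = datetime.now()
    print(remote_dump_command("deploy@example-server", "wordpress", now))
    for query in domain_swap_sql("ampp3d.mirror.co.uk", "ampp3d.local"):
        print(query)
```

Printing the commands first makes it easy to check them by eye before wiring them into anything that touches a real database.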
Nowadays I'd use Ansible to do a lot of this; it also makes it easy to push changes to production without having to login to the remote server.
Backups are extra work, but my experience has been I need to use one at least once a year. You need to backup your servers and your development environment.
I've never had a total hardware failure, just basic mistakes: e.g. deleting a file (like erm, this one), overwriting a WordPress plugin directory or misunderstanding the consequences of a Git command.
I keep known working copies of our Nginx, PHP and Varnish configurations separately.
Having copies of everything on your own system means you can take advantage of your local or cloud backups - the ‘versions’ feature in Dropbox is worth remembering if you’ve managed to damage a file you’re yet to commit. Also I use Vim, which has multi-level undo and holds on to each block of code you cut, copy or paste (PhpStorm's history features are even more advanced) and Launchbar, which retains everything you copy to your clipboard.
We have daily backups on our production server too. Linode have a daily snapshot backup service, but it’s not that flexible - you can only restore a whole drive, not individual files. Bytemark, who now host Ampp3d, set us up with BackupPC writing to a machine elsewhere in the datacentre. Additionally I run backup2l locally as it means I can quickly retrieve specific files without having to ask anyone else. There’s a script that generates a mysqldump of all the databases first. Our server also has RAID1 so if one disk fails (Bytemark scan the serial line output for error messages) they can swap it out with limited downtime, then the array can rebuild in the background while the OS is running as normal.
For the period the site was live I only upgraded Ampp3d's WordPress installation (by hand) with all the necessary security patches, rather than newer feature releases. There’s a lot of variation in when plugins get necessary fixes and some were forked and heavily customised for Ampp3d.
We had a customised dashboard that contains release notes and instructions I wrote, including shortcodes, CSS classes, colour swatches and so on (intended as a way of cutting down on email, though turns out there’s no guarantee people will read it).
On the Posts index, I added extra columns; a thumbnail preview of the “hero” image, article word counts and a notes column – there’s a field in the article you can edit and the notes are only visible within the admin area (though in practice people mostly continued to send emails).
It’s not uncommon to find draft posts with “Whatever you do, don’t publish this” in the title and it's remarkably easy in WordPress to put something live by mistake. I added some code that looks for the phrase “do not publish” in the post title when you save a draft and disables the publish button until it’s removed.
This was, I think, an idea of Martin’s which I shared genuine enthusiasm for at the time, but in hindsight seems ridiculous. It was to be an intranet site for people to add interesting/important numbers so eventually we could draw interesting or amusing comparisons from them.
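The title check itself is tiny. The real version lived in the WordPress admin (PHP plus some JavaScript to disable the button); the Python below is only a sketch of the matching rule, and the function name publish_allowed and the decision to ignore case and extra whitespace are my assumptions.

```python
import re

# Matches the phrase regardless of case or extra spacing between words.
_DO_NOT_PUBLISH = re.compile(r"do\s+not\s+publish", re.IGNORECASE)


def publish_allowed(post_title):
    """Return False while the title still carries the warning phrase."""
    return _DO_NOT_PUBLISH.search(post_title) is None


if __name__ == "__main__":
    for title in ("DO NOT PUBLISH: budget charts", "Budget charts"):
        print(title, "->", publish_allowed(title))
```

Note that a strict phrase match like this would not catch a title such as “Whatever you do, don’t publish this”, so it only works if the team agrees on the exact wording.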
You might wonder (a) why we’d need a special place to write our numbers down (b) what sort of person would be dull enough to want to do so and (c) why wouldn’t you just use an established tool like Wolfram Alpha?
I did build it though; a MySQL database running in FuelPHP (my framework of choice) using Bootstrap for the design, DataTables for showing numbers, Mousetrap.js for keyboard shortcuts and select2 which gives you fancy auto-complete select boxes.
For the main site, scaling was a concern from literally day one, so we set up Varnish. For “ndb”, it will be a long time before speed becomes an issue (especially if nobody ever adds a number). It’s a database used by an incredibly small number of people, so we can simply throw a bit more hardware at it in the first instance.
It’s a refreshing change to work on internal projects because you rarely have to worry about legacy browsers.
I wanted to make it as fast as possible for journalists to use; data entry is boring and if it’s too fiddly nobody will bother. I tried to keep keypresses to a minimum; pressing ‘n’ took you directly to the Add Number page and the instructions for the ‘Units’ dropdown say ‘Just start typing’; it filters all the possible choices as you go and wherever possible I supplied both the full unit name and the abbreviation or symbol people were most likely to type. The ‘date’ field accepts yyyy-mm-dd format or choosing from a datepicker (which pops up as soon as the field gets focus). There are keyboard shortcuts for saving too.
This seems a good point to mention that Alice Bartlett of GDS recently explained how using <select> tags (aka dropdown fields) actually causes a lot of problems for usability. So bear that in mind when you're designing forms.
The database structure for the numbers database was interesting; normally in a relational db you set the field to the specific type of number (integer, floating point, whether it’s signed/unsigned etc.) Here, we could be dealing with anything including currencies, distances or units of time - that means you can’t index them.
Perhaps that's a red flag that your database is much too broad.
I decided to keep things simple and use a single VARCHAR field; if we ever need to typecast certain numbers, we can use MySQL’s CONVERT function. There was an indexed units field to indicate the number type; we match that first to get fewer rows and a faster query.
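To make the "filter on the indexed unit, then cast" idea concrete, here is a small self-contained sketch. It uses Python's built-in SQLite rather than MySQL so it can run anywhere, which means CAST(value AS REAL) stands in for MySQL's CONVERT; the table name, column names and the rows are placeholders I made up, not the real schema or data.

```python
import sqlite3


def build_demo_db():
    """In-memory stand-in for the numbers table: value is stored as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE numbers ("
        "id INTEGER PRIMARY KEY, label TEXT, value TEXT, unit TEXT)"
    )
    # The unit column is indexed so it can narrow the rows cheaply.
    conn.execute("CREATE INDEX idx_numbers_unit ON numbers (unit)")
    conn.executemany(
        "INSERT INTO numbers (label, value, unit) VALUES (?, ?, ?)",
        [
            ("example A", "12.5", "km"),      # placeholder rows only
            ("example B", "3", "km"),
            ("example C", "1000000", "GBP"),
            ("example D", "250.75", "km"),
        ],
    )
    return conn


def largest_in_unit(conn, unit, limit=3):
    """Match the unit first, then cast the text value for numeric ordering."""
    return conn.execute(
        "SELECT label, CAST(value AS REAL) AS n FROM numbers "
        "WHERE unit = ? ORDER BY n DESC LIMIT ?",
        (unit, limit),
    ).fetchall()


if __name__ == "__main__":
    print(largest_in_unit(build_demo_db(), "km"))
```

The cast matters: sorted as text, "3" would come before "250.75" in descending order, which is exactly the kind of surprise a single VARCHAR column invites.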
So I'm pleased with what I was able to quickly throw together, but the reality is we didn't think it through and it never got off the ground.
Web developers hate ads; you’ve spent ages carefully optimising your page and then suddenly you're injecting third-party code over which you have no control. It might add multiple iframes, block the loading of the rest of your content, require Flash and make your site CPU intensive, or in the worst case even change the DOM and break the rest of your page.
I wish we could go back to the days of simple 468x60px animated GIFs; in an age where animated GIFs are so popular because no-one's prepared to wait for videos to buffer, it seems ironic that ads have become far more complicated.
Often your ad-serving platform (e.g. Google/Doubleclick) chooses your banners for you. With clients generally, unless we have a direct relationship with a specific advertiser, I tend to take a more “assertive” approach than in previous years: if it’s obvious an ad spot isn’t working or is causing problems with the site then I’ll remove it (selectively if I can, should it only affect certain browsers) without asking permission first.
We’d planned to run three different sizes of ads on Ampp3d, including a 320x50px version for mobile. Despite a lot of searching and testing I could never get them to work properly; either nothing would display at all, we always got full size ads where mobile ones should be, or occasionally Google served the 320x50px version to desktop users by mistake. The errors were also rather random.
So I gave up and we only ran adverts on desktop – our code contained a shouldWeShowAds function that used the Modernizr media-query method to check the device size. We also blocked IE9; every so often it would inexplicably stack two or three banner ads on top of each other.
It’s important to invest some time in getting ads working if that income is contributing to your wages, but if you don’t have much control over the platform and the bugs are serious enough to annoy your users or make your site appear broken then I think you have to take action.
I did make sure if ads get switched off, no Google code will load:
I do run an ad-blocker but I usually turn it off for any sites I’m developing; I want to see exactly what the majority of users are seeing. However, occasionally it can work the other way around - I spotted a post where an image was missing; it was of a commercial in a newspaper and it turned out the filename was advertisement.jpg, which was immediately blocked.
There’s a journalism issue here too I think - from time to time we have interactive stuff that makes no-sense at all without JS. Rather than a ‘sorry, your browser does not support this’ message (or no message at all), should we be adding some sort of proper commentary or editorial akin to audio description in our
Finally we should probably start asking designers to consider how sites with complex tabbed navigation (such as Ampp3d) and other interactive widgets ought to appear in cases where they’ve fallen back to HTML and CSS only.
I find WebPageTest the most useful; it’s measuring actual performance in a real browser environment rather than looking at your source code and guessing. Google is handy for its mobile UX scoring (it analyses touch target size, for example) and Zoompf is simply very fussy about optimising every bit of your page to the fullest extent.
The scores we get with all of them vary depending on the complexity of the article you’re reading at the time, so there are one or two specific posts I use as benchmarks in addition to checking the homepage.
Without going into all the details, generally speaking I think we do “OK” (advertising and multiple sets of analytics weigh us down a bit).
WebPageTest has a clever concept called ‘SpeedIndex’ where they analyse a video of your page loading and work out how long bits of it take to render. Occasionally we’ve reached the 15th percentile, typically it’s somewhere in the top 20–30.
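For a feel of what that number means: Speed Index is essentially the area above the "visual completeness over time" curve, so a page that paints most of its content early scores lower (better) than one that stays blank and then appears all at once. The sketch below is my own simplification in Python, not WebPageTest's implementation; the frame samples are invented and the function name is hypothetical.

```python
def speed_index(frames):
    """Approximate Speed Index from video frame samples.

    frames: list of (time_ms, completeness) pairs sorted by time, where
    completeness runs from 0.0 (blank) to 1.0 (visually complete).
    Each interval contributes its duration multiplied by how incomplete
    the page still was at the start of that interval.
    """
    total = 0.0
    for (t0, done), (t1, _next_done) in zip(frames, frames[1:]):
        total += (t1 - t0) * (1.0 - done)
    return total


if __name__ == "__main__":
    # Two invented pages that both finish rendering at 2000 ms.
    early = [(0, 0.0), (500, 0.75), (2000, 1.0)]   # most content at 500 ms
    late = [(0, 0.0), (1500, 0.0), (2000, 1.0)]    # blank until the end
    print(speed_index(early), speed_index(late))
```

Both example pages finish at the same moment, but the one that shows content early gets a much lower score, which is why the metric is more useful than a single "fully loaded" time.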
It’s interesting to compare us to UsVsTh3m given their heavy use of graphics (and the Tumblr limitations I mentioned earlier). When I ran a test recently, their page weight was a huge 20MB with 313 requests (many thumbnails all served from 500px images). The waterfall chart is terrifying.
However most of that’s below the fold, so even though “Document complete” (the point at which absolutely everything is loaded) took about 35 seconds on a 5Mb test link, the important bit (the DOM ready event) was under 2 seconds and they do really well on the speed test. The fact that Ampp3d usually has a huge purple bit where the page is being setup is a reminder of how costly client side code can be - I wish we could cut back a bit more.
I can’t write a section on performance without mentioning The Guardian and the BBC, but also take a look at the ABC and their new mobile site, which really impressed me (they’re using a global CDN now so it’s just as fast in the UK).
Varnish is fast, powerful and reliable and I wouldn’t change it for the world.
It was also a steep learning curve made all the more complicated by i) WordPress, ii) the fact we’d already revealed our domain name and iii) having to fit it all into the final week before launch.
Here’s what I learnt (some things are obvious, but only in hindsight):
sudo /usr/sbin/varnishd -C -f /etc/varnish/default.vcl. It’s compiling your vcl file into actual code, so if it gives you lots of code back, it’s compiled correctly. If you see a syntax error, it hasn’t.
Ban vs Purge:
This is confusing - learn the difference (definitive explanation).
For Ampp3d I always use ban. We also store everything in memory for speed (remember to set that in your config).
Why do we need Varnish at all? Well, according to ApacheBench WordPress would struggle to serve more than around 15 pages per second; indeed that’s crept up to 300ms per request as I write this (largely, I suspect, due to scanning for animated GIFs – see responsive images).
Nginx certainly helps with getting all the static files out as quickly as possible (indeed, for simplicity and to avoid writing extra code, I don’t bother with Varnish for Datawrapper charts) – but it can’t do much about PHP and database bottlenecks.
We use the WordPress Varnish plugin. It’s worth having a look at the source code so you understand exactly what’s going on. It checks your Varnish version to use the correct purge/ban syntax and uses add_action to hook into post/page edit events, anything that happens with comments and when scheduled posts are published. It’s completely transparent to your users and has never gone wrong yet (although it helps to teach people about the benefits of versioning filenames, especially for more interactive posts).
We have of course, configured Varnish to detect when journalists are logged into the system and not cache anything for them.
Edge Side Includes
We use ESIs to handle the publication ‘age’ (“X minutes/hours/days/months ago”) displayed in the byline and our navigation (there’s a separate ESI script that generates the entire sidebar). It means it doesn’t matter if generating the article takes slightly longer, as the article itself will be permanently cached and the ESI fragments for the sidebar and timestamp quickly swapped in.
The PHP scripts send TTLs using a Cache-Control: public, max-age=X header, served by Nginx, read by Varnish.
The sidebar has a 59 second TTL.
Timestamps are cached depending on how recent they are:
Although I could have written it by hand, the timestamp text is generated using WordPress code. However I wanted timestamps to be generated instantly - it didn’t make any sense if the entire WordPress codebase is needed and they took just as long as a full pageview.
So the only files the PHP script loads in are:
Resulting time per time-since ESI request: 4 milliseconds.
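One way to think about the timestamp TTLs is that a fragment only needs to live until its text would next change: "5 minutes ago" is stale within a minute, "3 hours ago" is good for up to an hour. The Python sketch below illustrates that rule; it is not the production PHP (which reused WordPress's own code for the wording), and the exact tiers, function names and rounding are assumptions of mine.

```python
MINUTE, HOUR, DAY = 60, 3600, 86400


def _step_for(age_seconds):
    """Granularity of the displayed text for a post of this age."""
    if age_seconds < HOUR:
        return MINUTE
    if age_seconds < DAY:
        return HOUR
    return DAY


def time_since(age_seconds):
    """Human-readable age, e.g. '5 minutes ago' (assumed wording)."""
    step = _step_for(age_seconds)
    count = max(1, age_seconds // step)
    unit = {MINUTE: "minute", HOUR: "hour", DAY: "day"}[step]
    return f"{count} {unit}{'' if count == 1 else 's'} ago"


def ttl_for_age(age_seconds):
    """Seconds until the displayed text could next change."""
    step = _step_for(age_seconds)
    return step - (age_seconds % step)


def cache_control_header(age_seconds):
    """Header in the same form the article describes."""
    return f"Cache-Control: public, max-age={ttl_for_age(age_seconds)}"


if __name__ == "__main__":
    for age in (90, 3599, 7200, 3 * DAY + 10):
        print(time_since(age), "|", cache_control_header(age))
```

With this rule a brand new post is re-rendered about once a minute, while a post from last week is only touched once a day, which keeps the cheap fragments cheap.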
The sidebar is still quite slow. There is a different sidebar per article (so if the current article is one of the recent posts, it is visually highlighted).
If I had time to optimise further, I think I’d manually write the 8 versions (7 posts plus one for nothing selected) to disk every minute in one go and have Varnish retrieve that rather than generating it on the fly.
We also tested, though have never used, ESIs on remote servers. This is handy if we ever want to take a feed from elsewhere; Varnish essentially proxies it and preserves whatever cache-control header you supply. It’s another backend in the Varnish config. We had an article which tracked the Bitcoin price generated on a separate server which used the XML-RPC API to update a blog post, but could equally have used this method.
Generating the navigation fragments at regular intervals solved another potential problem - WordPress scheduled posts. WordPress doesn’t have proper scheduling support, so at the time your pre-scheduled post is due to go live, nothing actually happens. It’s only when someone subsequently visits the site that the list of posts is checked (using the ‘wp-cron’ script) and its publication status is updated.
With Varnish, if all your pages are stored in memory, WordPress isn’t being asked to do anything, so the danger is the post won’t be published. Fortunately the act of generating the sidebar also triggers wp-cron (which in turn fires off the events the Varnish plugin is hooked into; the post is updated and the cache reset).
Everything above sounds really complicated, and it was. But I’ve not had to worry about it since.
Varnish has been incredibly reliable. Journalists don’t need to worry about it and I have a series of steps for updating site components which consistently works. We’ve never had any weird error messages and I can think of only one occasion where I’ve had to manually clear a file from the cache because it had been marked as 404 for some reason.
I also fully endorse what Bytemark and others have long argued; that for all but the biggest sites, it’s preferable to have a powerful, reliable single server with the single point of failure that gives you, rather than spending time and money engineering a complex clustering solution which makes you reliant on multiple systems functioning correctly for your site to stay up.
LEMP and LAMP may not be the most fashionable web architecture in 2014, but there are reasons many of us stick with them.
We use Bytemark (Martin’s choice). As I’d started building the site on Linode before the server was ordered and provisioned, time was incredibly tight and I was unfamiliar with how Bytemark worked, I persuaded him to let us launch on Linode and move it later.
It wasn’t until January I actually migrated it (at 10pm on a Sunday evening). That involved a significant amount of prep, but it all went as planned and it would have been trivial to roll back.
I speak highly of both Linode and Bytemark. You’ll pay more for the latter (very few can match Linode’s bandwidth prices) but there are benefits to having a dedicated machine and the price is very competitive.
Also, the quality of Bytemark’s status updates and RFOs for even minor incidents is, in the 15 years I’ve been doing this, the best I’ve seen from a hosting provider. Prompt responses, plenty of detail and all in plain English.
Linode are extremely quick at dealing with support tickets (always within a few minutes) but they’re occasionally reluctant to talk about stuff (especially outages) or I have to hassle to get an answer, which irritates me. Managing data centre outages from a different country can be a bit worrying too.
Bytemark have a nice habit of thinking about non-urgent tickets for a while but then responding in a more personal way - it’s quite reassuring to get an email from the MD at twenty-past-one in the morning.
I’d say if you only consider features, it’s neck and neck between the two for virtual server provision at the moment (we’ve been using a virtual machine on Bytemark’s BigV platform as a spare development server for ampp3d and I have multiple Linode servers for other clients); BigV has been very nearly 100% stable and they’ve just improved their GUI to bring it up to Linode’s standard, they also offer a range of disk choices (price vs speed). But Linode have just finished a massive rollout of new hardware including SSD everywhere, fancy new server monitoring and much simpler (but still granular) pricing.
The more curious among you may notice this domain is hosted on C4L. I will be moving it; their virtual hosting isn’t great in all honesty, but to be fair it’s hardly their core business – if you’re looking for co-location they’d make an excellent choice. My main complaint is I can’t get hold of them on IRC and I have to open a ticket through a web interface then phone to ensure they’re aware of it.
Oh and if you’re looking for a broadband provider - Andrews and Arnold.
A few people have noticed we update the browser URL as you move through stories. An idea I stole from Quartz.
Our initial implementation – hurriedly thrown together – was a disaster, but I replaced it a week after launch.
One issue was the choice of library to detect the scroll events. Initially I tried jQuery inview, but jQuery.appear suited our needs much better; it gives you a single list of visible elements for the selector you’ve provided (in our case the article tag) so you can then analyse the order and figure out which is going in and out of view.
When the next story is two-thirds of the way up the page, we update the URL. But we try to preserve the initial URL if people start on the homepage/category pages, scroll through a few articles then go back to the top again.
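A minimal sketch of that decision, assuming each visible article is described by its URL and its distance from the top of the viewport (for example from getBoundingClientRect().top). The function name and data shape are hypothetical, and this is not the site's actual code.

```javascript
// articles: [{ url, top }] in document order; top is in pixels relative to the viewport.
function urlForScrollPosition(articles, viewportHeight, scrollY, initialUrl) {
  // Back at the very top: restore whatever URL the visitor arrived on.
  if (scrollY <= 0) return initialUrl;

  // An article counts as current once its top edge has travelled
  // two-thirds of the way up the screen.
  const threshold = viewportHeight / 3;
  let current = null;
  for (const article of articles) {
    if (article.top <= threshold) current = article; // last one past the line wins
  }
  return current ? current.url : initialUrl;
}
```

In a browser the result would typically be passed to history.replaceState(null, "", url), so the address bar changes without adding an entry to the back-button history for every story.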
Opening the sidebar in our design requires scrolling the page all the way to the top so all its contents are visible, but if you close it again without picking anything we’ll take you back down to wherever you came from.
I wanted to make it very easy to dismiss:
If you tap the ampp3d logo at the top of the page once, you scroll to the top of the page (intended to replicate common iOS behaviour) but if you’re already at the top, the logo reverts to a standard homepage anchor link that will, by its nature, also refresh.
We wanted to promote the newsletter, but I didn’t want to annoy people.
We added a signup box to the bottom of the sidebar, but also more prominently to the bottom of each post. There’s a “Please don’t mention the newsletter again” link too.
Mailchimp have sample code you can paste in to add a subscribe form, but it’s way too heavy so you need to pull it apart; all you need is the form and a jQuery AJAX GET (not POST) request to
.list-manage.com/subscribe/post-json?u=… - then you handle the error or success response you get back appropriately.
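As a rough sketch of the request described above: the helper below turns a normal Mailchimp form action into the post-json variant and appends the form fields. The host, the u and id values and the function names are placeholders, and the c callback parameter and the { result, msg } response shape are assumptions based on how Mailchimp's own embed script behaved at the time, so check them against your own form code.

```javascript
// Turn Mailchimp's standard form action into its JSONP equivalent.
function buildSubscribeUrl(formAction, fields) {
  const base = formAction.replace("/subscribe/post?", "/subscribe/post-json?");
  const extra = Object.keys(fields)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]))
    .join("&");
  // "c=?" lets jQuery substitute a generated JSONP callback name.
  return base + "&" + extra + "&c=?";
}

// Map the (assumed) response object onto something the page can display.
function describeResponse(response) {
  if (response && response.result === "success") {
    return { ok: true, message: "Thanks, please check your inbox to confirm." };
  }
  return {
    ok: false,
    message: (response && response.msg) || "Something went wrong, please try again."
  };
}
```

With jQuery this would be used along the lines of $.ajax({ url: buildSubscribeUrl(action, { EMAIL: address }), dataType: "jsonp" }), passing the result to describeResponse in the success handler.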
The team thought adding comments would boost sharing/reading of articles (and they’ve been proved right so far) but I was concerned about slowing the page down and how much it would annoy people like me who hate comments.
So the plan was to put the comments in, but allow users to turn them off with a single click and without losing their place (we use a jQuery animation to slide them out of the way). If they do turn them off, the Facebook library will never load.
Again we use LocalStorage to save the preference. If comments haven’t been disabled, we load the Facebook SDK once the page is ready, then there’s a function which adds a
.fb-comments HTML fragment below the post and calls FB.XFBML.parse on that element; Facebook’s code then converts it into a comments area.
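The preference and markup handling might look something like this. The storage key and function names are hypothetical; the fb-comments class, its data-href attribute and FB.XFBML.parse are part of Facebook's comments plugin.

```javascript
const COMMENTS_KEY = "hideComments"; // hypothetical key name

// storage is anything with the Web Storage interface, e.g. window.localStorage.
function commentsEnabled(storage) {
  return storage.getItem(COMMENTS_KEY) !== "1";
}

function setCommentsEnabled(storage, enabled) {
  if (enabled) {
    storage.removeItem(COMMENTS_KEY);
  } else {
    storage.setItem(COMMENTS_KEY, "1");
  }
}

// Build the placeholder that Facebook's SDK later converts into a comments area.
function commentsMarkup(permalink) {
  const safe = permalink.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return '<div class="fb-comments" data-href="' + safe + '" data-numposts="5"></div>';
}
```

In the browser you would only load the SDK when commentsEnabled(window.localStorage) is true, insert the markup below the post and then call FB.XFBML.parse on the containing element.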
Facebook’s documentation isn’t, to be polite, as complete and up-to-date as perhaps it could be, so simply getting the SDK to load properly meant taking advantage of other people’s experiences on Stack Overflow. It’s all run very smoothly since we turned it on though.
You’ll notice there’s a (relatively) big gap underneath the Facebook comment area for stories that have no comments yet. Don’t waste your time trying to “fix” this (or follow all the advice from 2–3 years ago that’s now out of date) - Facebook make it impossible to reduce the height because the whole thing is inside an iframe and they need sufficient room for the ‘Comment using’ dropdown (that appears when you’re not logged in) to be fully visible when open.
I was hoping to try and show the number of comments that had been left if they had been hidden – e.g. ‘Add/View comments - 3 so far’ – but I didn’t trust the number I was getting from the graph API (which in any case was going to get muddied with the number of general Facebook shares the article might have had) and also I wasn’t sure people who’d turned them off really cared.
All we have is a simple mailto: link with prefilled subject and body text.
For years people have made ‘Email this story’ functions far more complicated than they ever need be. No-one wants to have to fill in a form, try and remember the recipient’s address or manually copy and paste it from another application, be asked to complete a CAPTCHA or worry what their friend will actually receive or if they’re going to be added to a mailing list without their permission.
Also, I’m amazed so many of us wasted our time building these things; were mail clients ever really that bad? It’s not as though it made the process any easier on mobile either. Some websites don’t even allow you to add a personal message and of course you have no record of what you sent.
I hope in a year or two everyone will just be using mailto: and passing control to the user.
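For anyone who hasn't built one, a prefilled mailto: link is little more than URL encoding. A minimal sketch (the function name is made up for illustration):

```javascript
// Leave the recipient blank so the reader picks it in their own mail client.
function buildMailto(subject, body) {
  return "mailto:?subject=" + encodeURIComponent(subject) +
    "&body=" + encodeURIComponent(body);
}
```

Calling buildMailto with the story headline as the subject and a short message plus the permalink as the body gives you the href for the link; encodeURIComponent takes care of spaces, line breaks and any ampersands in the URL.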
As an experiment I added some simple keyboard shortcuts for switching tabs and search. As with the numbers database, I used Mousetrap for this. The obvious next step would be j/k to navigate between posts.
WordPress gives you a memorable, easy to understand URL structure out of the box. I thought it important you could easily tell when a story was written, so our permalinks contain the full date.
I used the wp_users.user_nicename field (not visible in the admin interface, the ‘irregularly’ named journalists might not realise it exists) to map a proper
/author/firstname-lastname URL to our not-very-consistent WordPress logins.
Something else worth knowing: if WordPress is given an incomplete URL or one that would otherwise 404, it will query the database to find the closest match (useful if the address gets truncated or split in two).
Many of our assets are served from http://media.ampp3d.co.uk. This is on the same machine - it’s nothing to do with speed as Varnish does an excellent job, but it means they are on a cookieless domain. There’s a slight speed/performance benefit to your users as their browser doesn’t have to attach many irrelevant advertising/tracking cookies to each HTTP request for an image.
scrollToMe is a small function we use to smoothly scroll between different articles when using the sidebar, the back button or pressing the logo to navigate to the top of the page.
Note you can’t apply BalanceText to anything with other inline child DOM elements amongst the text - they will be lost during the rendering. It needs to be a single paragraph or
<a href= tag.
Also I’ve struggled to make it work correctly on the initial page load in situations where the webfont isn’t in the browser cache, despite calling it from the webfont loaded event.
box-shadow, gradients etc. to old IE.
Martin has written about the importance of checking that content works (both technically and editorially) in a mobile browser and the idea that people in newsrooms should work on their phones; I’d argue their connection speed is equally important.
If Martin’s desktop ban were ever enforced, chances are everyone would still insist on connecting by wi-fi; but many of your users may have 2 Mbit/s ADSL broadband or slower and struggle to ever get a stable 3G signal. Connectivity in central London tends to be excellent by comparison - The Mirror office is right in the middle of Docklands and a short DLR ride from many Tier 3/4 datacentres, including the main UK internet backbone at Telehouse North. Everyone has fibre and although some mobile networks are finding it harder than others to handle demand in the Canary Wharf area, generally the quality and availability of 3G/4G in London is pretty good.
There’s also no guarantee wi-fi will necessarily be fast. I was in a hotel last week where 500 Kbit/s was the free offering and “fast” access was (over) £15 a day. I declined as my phone was giving me a much better connection and I could tether it.
It’s a real problem if those creating your content have much better bandwidth than the ones viewing it, and it may always be this way. You might argue telecoms providers should get a move on with fibre rollout and allow the rest of the country to “catch up”, but maybe they never will, as by the time all rural areas have FTTC, connection speeds in urban areas will have risen even more.
Also, mobile data = money. A 3.8MB animated gif of a cat accidentally sitting on a hedgehog may be amusing, but how many would have looked if they knew beforehand it would cost four pence?
Any solution should be automated too - few people are going to spend time adjusting JPEG quality settings through choice.
“Getting primarily journalists to optimise images - losing battle mate, losing battle…”
Martin to me by email, 12 Feb
We use BWP Minify which combines, minifies and versions CSS & JS. It's a customisable, reliable and solid plugin I've used for many projects. Remember to turn on versioning in the query string (I usually use a unix timestamp.) You could also use something like Grunt of course, but for any site with a CMS it’s easier to let the CMS itself handle it.
Using Varnish means whenever the CSS or JS changes all existing HTML files need to be purged using varnishadm: they'll have an out of date query string for the combined/minified files:
ban obj.http.content-type ~ "html"
This is something you can automate as part of your deployment process.
Designing in Photoshop means starting with fixed breakpoints, of course, but I’ve done my best to make things work at as many, if not all, sizes as possible in between.
It's not perfect:
For our two column layouts I used this Responsive Grid System and simplified the CSS, customising it to match our own margins.
I hadn’t expected that soon after launch, at least a couple of the staff would buy Retina Macbooks (I thought high resolution laptops were a niche product and would only be of interest to photographers, video editors and graphic designers). Overnight, they were uploading 2dppx images, with the potential to be four times the filesize.
Also, I definitely wouldn’t recommend anyone blindly follow our approach: it’s a series of compromises, held together by string in places, and if the site were to survive in its present form it would certainly need to be changed again as browser support evolves. However I’ll walk you through my decisions:
On the left of the page in our navigation we have small thumbnail images for our recent posts. We have a number of custom thumbnail sizes. Originally we used
srcset to swap in a 486x240px image on retina displays instead of the standard 243x120px one. One problem however is that whilst we have a single responsive design for all users, the navigation bar isn’t visible by default on mobile devices and indeed it’s perfectly possible a visiting user might not look at it at all. However the navigation appears above the page content in the source code, which means all the images are loaded first, which effectively blocks the rest of the page content if you look at a waterfall chart.
We now have an image placeholder (a 43-byte spacer GIF encoded as a data-URI) and a couple of data attributes with a standard and retina image. Then, when the DOM is ready, we look at the device resolution and substitute the correct image. The result is the main content loads a bit quicker.
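The substitution step could be sketched like this. The attribute names (data-src and data-src-retina) and the 1.5 cut-off are assumptions for illustration; the article doesn't say what the real ones are.

```javascript
// dataset mirrors an element's data attributes:
// data-src -> dataset.src, data-src-retina -> dataset.srcRetina (hypothetical names).
function chooseThumbnail(dataset, devicePixelRatio) {
  const wantsRetina = (devicePixelRatio || 1) >= 1.5;
  return wantsRetina && dataset.srcRetina ? dataset.srcRetina : dataset.src;
}
```

Once the DOM is ready you would loop over the placeholder images and set img.src = chooseThumbnail(img.dataset, window.devicePixelRatio). Because the real image URLs only exist in data attributes, the browser has nothing to download from the sidebar while it is parsing the rest of the page.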
We have our own custom code to manage the display of hero images & captions, but for everything else we use the HTML WordPress adds to the editor when you select an image from the media library. This is a double-edged sword; it gives journalists plenty of flexibility but having actual editable HTML in your CMS can be a bit dangerous. It’s important not to be a total control freak – other people are going to edit your code eventually – but it makes modifying things harder at a later date; if your image data is abstracted (e.g. JSON or YAML) it’s going to be easier to modify than if you have to write regular expressions to find and replace actual markup.
If an image is 450 pixels or narrower we skip it as the saving is negligible.
Another complication: when WordPress resizes thumbnails, any animation in an animated GIF is lost. Some of our hero images and many of our body images are animated and we need to retain this. For now this means skipping PictureFill and using the full size version. We could ask people to tick a box to indicate if a GIF was animated, but they’d probably forget, so we actually detect it automatically by reading the binary file in PHP and counting the number of frames - which might ordinarily be too processor intensive, but we have a fast server and we’re using Varnish.
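To give a feel for the frame counting, here is a sketch of one common heuristic, written in JavaScript rather than the PHP used on the site and not necessarily the same approach. It scans the file for Graphic Control Extension blocks, which precede each frame in most animated GIFs. It is pattern matching rather than a full parser, so treat the result as a good guess.

```javascript
// Count frames by looking for Graphic Control Extension blocks:
// 0x00, then 0x21 0xF9 0x04, four data bytes, a 0x00 terminator,
// then 0x2C (image descriptor) or 0x21 (another extension).
function countGifFrames(bytes) {
  let frames = 0;
  for (let i = 0; i + 9 < bytes.length; i++) {
    if (
      bytes[i] === 0x00 &&
      bytes[i + 1] === 0x21 &&
      bytes[i + 2] === 0xf9 &&
      bytes[i + 3] === 0x04 &&
      bytes[i + 8] === 0x00 &&
      (bytes[i + 9] === 0x2c || bytes[i + 9] === 0x21)
    ) {
      frames++;
    }
  }
  return frames;
}

// More than one frame means the GIF should not be resized.
function isAnimatedGif(bytes) {
  return countGifFrames(bytes) > 1;
}
```

bytes can be any indexable byte sequence, such as a Uint8Array or the Buffer returned by reading the file in Node.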
PictureFill has two versions and although its author, Scott Jehl, is encouraging people to try the newest (v2), which uses the native <picture> tag, frankly it’s not ready yet, for me anyway (and through no fault of Scott’s) – mainly because we’re waiting (and hoping) browser support will catch up.
You ought to read the caveats at the bottom of the page first.
Deciding factors against v2.0 for me:
Version 1.2 uses <span> tags and supports <noscript>. As we’re implementing it at CSS level, it’s not a big issue to use it for now at least, as we can replace the code as/when browser support improves and PictureFill is further developed.
Things you should be aware of:
min-resolution; support for resolution is OK on desktop but rubbish on mobile so you’ll need to double up your statements;
We have two versions of the code, one for hero images and one for those in the post body. They have slightly different media queries (the maximum 1dppx size of the former is 890 pixels, the latter only 810 due to margins). Also you need to remember to exclude any thumbnails that are vertically cropped (like our 486x240px retina sidebar thumbnail - fine for the sidebar, but you don’t want to cut off part of an infographic when it’s shown inline in your article).
We might have too many media queries, the sizes aren’t necessarily the best and the breakpoints may need tweaking a bit (you have to take a decision at which point a 1dppx image should change into a 2dppx image, and choosing the sizes is already hard enough given the variety of mobile devices).
Also if you are doing something like this, it’s important to have a radio button in the CMS to disable it. I checked loads of posts before putting it live, but subsequently have discovered at least one that was broken and was grateful I could just quickly switch it off.
Finally, Jason Grigsby made a good point (some time ago) that you should expect whatever you do to be deprecated - at least with a CMS we don’t have to do it manually every time.
I think it’s great on modern browsers and especially on mobile which gets the smaller images, the site is noticeably snappier.
The concept of responsive images is sound, clearly:
I’ve not mentioned art direction (cropping etc.) - WordPress doesn’t have any support for this, you need to upload a separate image.
I’ve chosen not to so far (and reviewing this in early 2015, that's still my advice.) Use a display more of your users are likely to have.
You can still test if retina images are being loaded correctly, and how they look, using a smartphone or tablet.
It's easy to simulate a retina display too:
Remember to check your site at different zoom levels and play around with screen widths and heights in DevTools. Early on, someone from The Guardian’s QA team kindly sent us a screenshot of our navigation tabs wrapping. I thought I'd fixed it, only for Conrad to later notice the same problem with his new retina laptop. Yet it wasn’t visible on my large desktop monitor because it wasn't quite wide enough.
Some of you may be painfully familiar with this.
Our situation: We had a sidebar and main column that needed to be equal height on desktop. The design is more of a challenge than many sites - the sidebar has background colour and drop shadows either side and sticky-out arrows. The main column uses infinite scroll, growing taller and taller as we load new articles and the whole thing is centered on larger displays.
We started off using the “One True Layout” method (the one where you mess around with overflow and add huge amounts of padding) but eventually I switched to CSS tables.
So moving to CSS tables was a process of elimination.
The Doug Neiner and Nicholas Gallagher methods firstly aren’t intended for non-fluid widths, but the first doesn’t produce realistic box-shadow either. It seems some browsers can’t use the multiple color-stop trick on extremely small gradients either.
Drop-shadow may sound an incredibly fussy reason for rejecting multiple techniques - all I can say is I thought it was important. Turn it off and the site looks incredibly flat. I’d compromised Chris’ original design in a couple of other places. I felt this was worth the effort.
Browser compatibility for Flexbox is still limited (however if the site lasts long enough, I’d like to replace CSS tables with flexbox for browsers that support it – mainly to avoid the slightly awkward ‘one column then another’ rendering effect you can get with display: table on certain devices on certain occasions).
We do use flexbox to progressively enhance our two-column grids within articles; large numbers can be vertically aligned with accompanying text for a more visually pleasing, symmetrical feel.
CSS tables aren’t evil.
We’re not talking about adding
<table> markup, all your HTML is still semantic. You’re using something that’s in the spec. The W3C aren’t going to withdraw it - it won’t go away. There’s great browser support. The behaviour is predictable. There’s a test suite. It’s recommended as a technique to rearrange your source code.
We wouldn’t have launched on time without SASS. Variables and mixins make the CSS much easier to manage, cut down on repetition and help you keep your design consistent.
That said, I look at my code and generally think “that’s awful”. If starting again, I’d want it to be much more object oriented (you’ll find my early experiments with BEM in the .email-signup form between posts.)
We have too much specificity (everything in article.page, article.post, for example), over-qualification and too many places where we use IDs instead of classes.
As of SASS 3.2 you can include content blocks in mixins, so using that and an idea by Kendall Totten I could use media queries like “@include respond(sidebar)” for situations where the navigation is visible and also to take care of IE8 support.
I was pleased that all the code I wrote avoided
!important, however you often find it littered in plugins (wpPro_Quiz, for example) and it’s not always easy to unpick.
“A web design technique that prevents the browser scroll bar from scrolling to the bottom of the page, causing the page to grow with additional content instead.”
Brief pros and cons:
The most common complaint we had about UsVsTh3m - not being able to reach navigation at the bottom of the page - is easily solved by moving it somewhere else.
If you’ve dumped it in the footer it may suggest you don’t consider it important enough. If it is important, put it below every article or promote it to the main navigation; otherwise consign it to a subpage somewhere.
Infinite scroll is built to accept a complete page (e.g. a /page/2 style WordPress URL) and extract the correct story using jQuery. This is inefficient though, so we strip out all the header and footer and just send a single article on its own. Title, hero image, standfirst, timestamp and sharing links at top and bottom are included, but newsletter signup and Facebook comments are added client side if the user has enabled them.
All the usual gzipping and Varnish work (caching with edge side includes) happens too.
It’s necessary to correctly handle different types of pages. The homepage is straightforward enough, a simple continuous stream of articles in reverse order of publication. But you might be looking at a category, tag or author page. The original plugin uses WordPress pagination to do this, which may be enough, but I needed greater control.
One problem is WordPress code doesn’t do posts with multiple categories well - long story short, get_adjacent_post isn’t precise enough. The
$in_same_cat parameter is a boolean true/false rather than a category ID and there’s no way of making it stick to the same category if the next post has more than one. You might jump from Politics to Economy to Science to Technology to Entertainment as you scroll.
I wrote a new function with new SQL that takes a category ID.
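The actual function is PHP and SQL, which isn't reproduced here. To illustrate the logic only, here is a sketch over an in-memory list of posts with a hypothetical data shape: given the current post and one specific category ID, find the next-oldest post that is also in that category, regardless of what other categories either post has.

```javascript
// posts: [{ id, date, categories: [categoryIds], permalink }]
// date is a numeric timestamp; all field names are hypothetical.
function previousPostInCategory(posts, currentId, categoryId) {
  const current = posts.find((p) => p.id === currentId);
  if (!current) return null;

  const older = posts
    .filter((p) => p.categories.includes(categoryId) && p.date < current.date)
    .sort((a, b) => b.date - a.date); // newest of the older posts first

  return older.length ? older[0] : null;
}
```

Because the category ID is passed in explicitly rather than inferred from the current post, a reader who starts in Politics stays in Politics as they scroll, even when individual posts are filed under several categories.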
We add an attribute to the
.entry-meta tag containing the permalink of the correct preceding post; this is used when infinite scroll is triggered to determine what to load next.
We set the main infinite_scroll options in a JSON object. Including:
<body>class to figure out what sort of page we’re on (category, tag etc.) and add some query parameters at the end
How we handle “interactive” posts
eval. Then the table loads with the correct settings.
What else do we do?
I’ve been a little more cautious with these (not for security, because they’d be parsed anyway on a static page view) but just so I can feel a little more in control of what’s going on. The infinite scroll callback function scans
We used to preload the hero image for the following post (which the server sends as a data attribute) but because of introducing PictureFill, this is on hold pending a rewrite.
One disadvantage is that all the time a page is being scrolled by a user’s finger, DOM manipulation is suspended, which means you can’t add, remove or change anything, including inserting the next post at the bottom of the page. The page must come to a complete stop too (if they flicked their finger to scroll rapidly, the page will still have momentum when it leaves the screen), so you have to hope that people slow down sufficiently to at least attempt to read some of your article.
This did not change in iOS8.
It's not an issue in Android.
Don’t waste time looking for workarounds for this, as there aren’t any.
I mentioned in the introduction the requirement for responsive charts.
As I’ll say again to end this section, there are many different ways to do charts and there’s no right or wrong solution. Also, you kind of need to actually use several in production to get a feel for their strengths/weaknesses.
Back in November 2013, as well as us looking at Quartz’s open-source generator, the Mirror were developing their own system - since launched - based on amcharts. They gave us their latest code and we were going to use that (it looked great, though truthfully it would have been extremely hard to have ready in time) until Annabel Church suggested Datawrapper because it already had a web interface for journalists to use.
Probably Datawrapper’s key benefits are the web interface to live preview your chart and a database so you can modify (or clone) charts later.
There’s a slight learning curve to understand how to install it but it’s well coded underneath: everything is modular; each chart type and most features have their own plugin you can switch on and off.
Our Datawrapper install is customised a little (e.g. stripping out chart titles, adjusting fonts, making the source/attribution links at the bottom more subtle, some concatenation of script tags to make the iframe a bit faster.)
Iframes are bad for performance - it wouldn’t be impossible to get rid of them with WordPress and Datawrapper running on the same server, but the code for combining multiple charts per page and dealing with infinite scroll would be hard.
Datawrapper doesn’t do scatterplots and we wanted one for the Scottish referendum, so – partly as an experiment – I ended up writing my own using another library entirely - the increasingly popular DimpleJS.
Datawrapper have forked D3 into a simpler visualisation library called D3Light - smaller with fewer features but increased IE compatibility. This means there’s some D3 stuff you don’t get, e.g. axis features such as labelling are a bit limited in comparison.
It’s frustrating if you can’t tweak chart layouts, so there are several settings for that sort of thing.
As it’s literally only just finished I can’t show you a working chart (we do have one and I’ll update this), but in the meantime here’s the settings file.
Making charts that are also acceptable at mobile size is hard - it’s mostly a case of trying to simplify everything.
Adaptations for mobile:
I’m yet to find a reliable way to make the touch targets for data points sufficiently large for mobile devices, but still display them as tiny dots.
We’ve slightly modified the DimpleJS core code:
I may have talked it up too much and you’ll think “is that it?” when I show you it.
But I’m extremely grateful for all the help DimpleJS’s author John Kiernander has given me. He’s responsive and patient with questions and bug reports and has kindly added a couple of small improvements I suggested. The library is very much in active development and you should definitely take a look if you’re not familiar with it.
“It depends”. I’m still far from sure if we have the right ones. It depends on your circumstances: are you a designer, developer, writer/journalist? How much time do you want to spend setting it up, which type of charts do you want to use, what’s your budget? etc.
Do look at those I’ve mentioned plus amcharts and highcharts. There are hosted services too like the Google Chart API - I’ve stayed away from them (though we allow staff to embed anything they want) as I’m wary of trusting core content to a free service which could be fundamentally changed or even withdrawn.
Later I replaced it with Relevanssi (the free version has been more than adequate for us so far). The front end is extremely lightweight (so light you’ll need to build your own AJAX search from scratch - I used this solution as a starting point) and the back-end includes a much better index. Plus there’s support for stop words, weighting, exclusions, logging, snippets… It's also easy to use the same index for your admin pages.
We host our DNS on Amazon Route 53.
Several years ago a sizeable chunk of the web community were using Zerigo, a cloud-based DNS provider with a lovely UI and plenty of features. Many of us who’d hosted our own DNS (using BIND etc.) got excited about it and at this stage Route 53 was still in its infancy. Then Zerigo (a single developer startup) got acquired by 8x8 and in July 2012 suffered a disastrous outage; a prolonged DDoS that took thousands of popular sites offline.
By this time Route 53 had a user-friendly web interface and my thinking was that if anyone was going to be able to survive a DDoS attack then it’s Amazon. They’re also the only provider I know with a 100% availability SLA (not that the money’s important).
Route 53 DNS records propagate quickly, there’s support for very low TTLs, simple load balancing and health checks should you be using multiple servers. Most cleverly of all I think, Amazon give you nameservers unique to each domain (to reduce collateral damage during a DDoS) and spread across different TLDs (to add a further bit of resiliency if there’s a root server problem.)
It was vital to have direct control of our DNS (I know how easy it is to break and frankly I didn’t want to risk anyone else screwing it up), but at the same time I wanted Martin to have the ability to get in (without sharing passwords) and change stuff in case I did the same or was unavailable in an emergency.
This meant using AWS’s Identity and Access Management (IAM). It can be confusing, so here’s what I learnt about Route 53 IAM policies:
Analytics and tracking can be overused on many sites; we have three separate systems on Ampp3d - I’d love it if it was just the one. Everyone mostly quotes stats from Google Analytics but we also have Chartbeat and the Mirror need us to use Adobe Omniture because it’s what they have for their other properties.
One of the tabs in the sidebar shows the “Most Read” items. Initially this was generated with the WordPress Popular Posts plugin - it’s well written and simple for anyone to install, but the problem is scale. The plugin maintains its own database of page views and allows you to count pages ‘normally’ or via some AJAX depending on whether you’re using any sort of caching (Varnish doesn’t - and frankly shouldn’t - cache POST requests). I’d already modified the plugin so that displaying the list of popular items no longer used a live lookup but a cached set of results, to reduce load. (During) the night before launch, I realised this would be a disaster as my basic benchmarking showed WordPress could only handle 15 or so POST requests per second, so all our hard work ensuring we could serve pages to as many users as possible would be undone just by trying to log them.
As we were already using Google Analytics, it seemed to make sense to retrieve the “Most Read” list from there if we could. First priority was to immediately turn the Popular Posts plugin off; for a week or so I updated the ‘Most Read’ page entirely by hand, logging into Analytics and copying and pasting the URLs into a static HTML file. Then I worked out how to pull the data in.
Other people might want to do something similar, so I’ll give you our code and walk you through the setup process.
Here’s our script - you’ll need to fill in all your authentication values. It runs several times a day as a cron job, grabs the 30 top URLs, uses url_to_postid to verify each is a valid post and finally writes a simple HTML file with a list, pulled in by WordPress.
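The general shape of that job can be sketched in a few lines of Python. This is not the original script: it assumes the Analytics rows have already been fetched and sorted by page views, and the names build_most_read_html, write_atomically and is_valid_post are hypothetical (the last one stands in for the url_to_postid check).

```python
import os
import tempfile
from html import escape
from typing import Callable, Iterable, Tuple


def build_most_read_html(
    rows: Iterable[Tuple[str, str]],
    is_valid_post: Callable[[str], bool],
    limit: int = 30,
) -> str:
    """Turn (path, title) rows, already sorted by views, into an HTML list."""
    items = []
    for path, title in rows:
        # Skip anything that is not a real post (tag pages, homepage, 404s).
        if not is_valid_post(path):
            continue
        href = escape(path, quote=True)
        items.append(f'<li><a href="{href}">{escape(title)}</a></li>')
        if len(items) == limit:
            break
    return "<ol>\n" + "\n".join(items) + "\n</ol>\n"


def write_atomically(path: str, html: str) -> None:
    """Write to a temporary file, then rename it over the target.

    The site never sees a half-written list if the cron job
    is interrupted partway through.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(html)
    os.replace(tmp_path, path)
```

Because the page only ever includes a static file, a failed Analytics request costs nothing more than a slightly stale list.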
Setting up the Google API access is confusing. This article was invaluable but Google have changed it all since then. What you need to know:
/src/contrib/Google_AnalyticsService.php is now found in
It’s never advisable to list all the security measures you use (partly as it reveals which you don’t), but I would mention I’ve found the WordPress two-factor auth plugin useful.
WordPress users should be aware of a small security risk with draft posts.
WordPress has two main cookies:
If you’re using public wifi and access an http:// page, anyone else on the network can steal your cookies. Assuming your site is correctly configured people will be redirected to an https:// page (where cookies can’t be stolen) before logging on. However it’s extremely likely they’ll browse the non-https:// site by following public links etc. and normally you don’t redirect blog readers to SSL - Varnish, for example, doesn’t support it. So the less secure cookie will still be sent.
This is what WordPress uses to decide if you can view a preview of a post, so if someone were to obtain your wordpress_logged_in cookie they could try p=XXXXX&preview=true URLs until they found the post.
Very recently there were reports about potentially more serious security risks with this particular cookie; however these mainly apply (or rather, applied) to a configuration problem with WordPress.com rather than privately hosted blogs and actually some of it isn’t true - for example having the logged_in cookie won’t allow you to make a new post, moderate comments, access the dashboard or anything else – providing you have SSL set up you’ll simply be redirected to an https:// login screen. Also the standard cookie expiry is 2 weeks, not years, and you can modify this yourself.
But it’s correct that 2-factor auth is no help whatsoever with this sort of problem.
There’s a technique called ‘forward secrecy’ you ought to enable for your SSL certificates. In simple terms it stops someone from recording your encrypted traffic now, then stealing your private key at a later date and using it for decryption.
It’s easy to set up (no need to regenerate certificates or involve your SSL provider, just a few lines in your Apache or Nginx config) and there’s a great online checker that will look for this and many other certificate issues.
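If you want a quick local check as well, the following Python sketch connects to a host and reports whether the cipher suite negotiated with this particular client provides forward secrecy. It is a heuristic under stated assumptions: the function names are hypothetical, it relies on OpenSSL-style cipher names (ECDHE- or DHE- prefixes), and it only tells you about one handshake, not the server's whole configuration.

```python
import socket
import ssl
from typing import Tuple


def has_forward_secrecy(cipher_name: str, protocol: str) -> bool:
    """Heuristic: does this negotiated suite use an ephemeral key exchange?"""
    # Every TLS 1.3 suite uses ephemeral key exchange.
    if protocol == "TLSv1.3":
        return True
    # For older protocols, look for ephemeral (EC)DH in the OpenSSL name.
    return cipher_name.startswith(("ECDHE-", "DHE-"))


def negotiated_cipher(host: str, port: int = 443, timeout: float = 5.0) -> Tuple[str, str]:
    """Return (cipher name, protocol version) agreed with the server."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            name, protocol, _secret_bits = tls.cipher()
            return name, protocol


if __name__ == "__main__":
    # example.com is a placeholder; substitute your own hostname.
    name, protocol = negotiated_cipher("example.com")
    print(name, protocol, has_forward_secrecy(name, protocol))
```

A modern client will usually pick a forward-secret suite if the server offers one, so a negative result here is a strong hint that the server config needs attention.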
I remain nervous about us getting hacked one day.
Although it’s traffic from Facebook and Twitter that much of the industry is obsessed with, in fact we get much more when stories are featured on Google News – indeed it’s almost as though people no longer bookmark the homepages of traditional news organisations…
One thing to be aware of: news sitemaps only show pages created in the last 48 hours.
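In practice that means whatever generates the sitemap has to filter by publication date. A minimal Python sketch of that filter is below; the function name and the (url, published) tuple shape are assumptions for illustration, and the timestamps are expected to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional, Tuple

# Only articles newer than this belong in a news sitemap.
NEWS_WINDOW = timedelta(hours=48)


def recent_posts(
    posts: Iterable[Tuple[str, datetime]],
    now: Optional[datetime] = None,
) -> List[Tuple[str, datetime]]:
    """Keep only (url, published) pairs from the last 48 hours."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - NEWS_WINDOW
    return [(url, published) for url, published in posts if published >= cutoff]
```

Older stories still belong in the regular sitemap; they simply drop out of the news one.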
A big advantage of the site being part of Trinity Mirror was their ability to get us listed on Google News at all - it’s hard for smaller sites to be approved. Also The Mirror’s team have been helpful in noticing a stupid mistake with our robots.txt file before I did…
Unfortunately, although plenty of our stories do get indexed by Google News, I’m not sure if the sitemap has made any difference. Even after changing our sitename to match The Mirror’s, Google Webmaster Tools still (occasionally) protests the publication name doesn’t match. You may also note the image page warns about hosting images on a separate domain to your main site – “it’s very unlikely we’ll be able to crawl them” – so watch out for that.
Alex Norcliffe was kind enough to suggest a way we could improve our organic search results. We’re trialling an extra field for articles: a search term we think users are more likely to search for (perhaps a more informal tone, rather than the typical ‘editorial voice’) which journalists manually add to the article body (so it is genuinely part of the content) but we also tell WordPress to insert in the title attribute of headings and internal site links.
(I did want to check if this would be bad for accessibility but it turns out the title attribute is – contrary to received wisdom – rarely read out by screenreaders anyway.)
Since the site launched we’ve supplied custom meta data for Twitter cards, though it’s not enabled at the Twitter end yet. We have custom fields in the CMS which can override the default values.
Update: Originally, I flippantly said we were using Agile, just because everyone else claims to.
When you work by yourself all the time it's very hard to adopt something like that, but nowadays I'm less cynical and more receptive to it, having heard plenty of examples of how it's helped people.
We did have a product backlog and set some general goals, but there were no formal sprints. No daily standup meetings either, though I tried to set myself small targets each day. It can be useful to plan your day the night before, so you don't get sidetracked.
Update: I now highly recommend PhpStorm as an IDE (it has full WordPress integration). Also, I'm pleased to say that a year on I still use all the other apps mentioned below.
Thanks for reading this.
I’d like to thank Martin Belam for choosing to work with me and everyone at Ampp3d and Trinity Mirror who’s been involved with the project.
The site ran on WordPress for nine months, by which time it had proved itself editorially, but because it was on a separate domain wasn't getting anything like the traffic levels it could be, so Trinity Mirror decided to move it inside their own CMS. This allowed for cross-promotion on mirror.co.uk and also Trinity Mirror's regional news sites (and since I wrote this UsVsTh3m has also finally moved.)
You should always try not to break the internet by deleting old content, so with that in mind we've kept the old sites running.
As I insist on calling them. ↩
At least in using infinite scroll for everything. I’m keen to hear of other sites that have tried the same thing. ↩
Chris – who I’ve never actually met – saved me some time by providing adjustments in CSS once I’d coded the initial templates; always useful as Photoshop’s typography rarely matches rendering in the browser. ↩
We launched over 12 hours early - full testing the previous night involved removing password protection and disabling the landing page so I could check Varnish was working how I expected. We asked Malcolm and he said we might as well leave it up. ↩
I did float with Martin the possibility of using PyroCMS. It’s developer friendly in the ways I’ve mentioned WordPress is lacking and I use it for other clients, however in hindsight it would have been a disaster for Ampp3d simply because WordPress’s multi-user editing, revisions, media library and so on are so strong. I doubt we’d have got it ready in time. ↩
Notational Velocity is fantastic. Briefly why: it's a collection of plain text notes. You give each note a title, it's blazingly fast and you can search as you type. No distractions, easily syncs across platforms. You’ll probably like Brett Terpstra’s fork, nvALT, even more. Use it with a Simplenote account (aside: Simplenote has been acquired by Automattic, who run WordPress.com); you then have an iPhone app and, crucially, a ‘history’ function; quickly recover earlier revisions of a note you’ve accidentally deleted or otherwise messed up. ↩
I will be linking to all the locations. ↩
Someone suggested that due to Martin’s obsession with venn diagrams we call the site UsVsVenn, so he asked me to put that in the source code as a joke. I made it smaller as part of a spring clean to reduce page weight, but it’s still there. ↩
I’d suggest testing on Windows 7 in preference to Windows 8 because it’s easier to rearm. ↩
e.g. Safari modifies the DOM if it finds something it thinks is a phone number. ↩
Same with bugs in the CMS - I installed a plugin which broke a menu option I didn’t know existed and it was weeks before anybody mentioned it. ↩
Have plenty of spare USB ports; it’s not uncommon for me to have 2 or 3 things connected at once. Also it helps with the charging (remember to shut them down properly to save battery). ↩
One issue with WordPress is it doesn’t handle multiple domains, it hardcodes URLs in several places in the DB (as well as in posts). This is a problem if you’re using separate development or production servers. ↩
I recommend the Mac Mini. The 2012 model is in some ways better than the 2014 model: there's a Quad-Core processor option and you can buy and install your own RAM. I selected the maximum amount – 16GB: memory makes a big difference when using not only an IDE and many browser windows, but various virtual machines. You do still need to manually "flush" (sudo purge) the inactive memory occasionally if the system starts to use swap. Also the hybrid HDD/flash “fusion” drive has been flawless. The Mini is small and light (1.22kg); portable enough if you need to work from a different location for a period of time and minimal hassle if you ever need it repaired. You can also use whichever display(s) you like. ↩
If MySQL refuses to start, add innodb_force_recovery = 1 to the [mysqld] section of your config file, start MySQL and it should fix the tables and let you run a SELECT statement. Immediately remove the option and restart again to restore full read/write access. ↩
If you’re setting up a new team, perhaps consider using a persistent chatroom like HipChat, Campfire or Slack to cut down on the number of emails flying around - see Zach Holman on how GitHub staff stay in the loop. ↩
Although a lot of people have jumped to Laravel, FuelPHP is a strong framework, still very much in active development and has powerful ORM and plenty of useful packages like SimpleAuth. Also, Oil lets you quickly install packages, generate complex database scaffolding or create user logins – all from the command line. ↩
Compared to other plugins, Mousetrap is compact (1.9kb minified) and ever so simple to implement. We have a few simple keyboard shortcuts on the main site - check the source code. ↩
All these things have happened to me over the years. ↩
I think sponsored content can work brilliantly for everyone if you can find someone relevant to your users and you’re prepared to put the effort into integrating it and making the campaign work for your advertiser. Also (in previous jobs) I found well produced text ads worked just as effectively as graphics. ↩
Case study with ampp3d: Rather than put the problem ad spots live, I prepared some barebones test pages to verify they were genuinely broken and it wasn’t our fault (make sure you test these on the same domain) and wrote a quick email about it, ready to follow it up if we heard back from anyone. ↩
In 2012 Martin wrote about the ethics of ad-blocking. Personal opinion: I would find it incredibly frustrating to use the internet without one (although it’s not too bad on an iPad - whether enough people click on them is another matter). I am more willing than ever to pay small amounts of money either for good content or well-written software. There needs to be respect on both sides too - I’ve lost count of the number of news sites I’ve seen run the “99% of Gmail users don’t know about this trick” story - I’ve no problem with properly labelled sponsored content, but I don’t think it’s ethical to abuse the trust of your users with clickbait masquerading as “Related News”. Things like the ‘reading’ mode in iOS and apps such as Instapaper have become popular for a reason. ↩
Yes you can use a local fallback, but it’s not the only thing that could go wrong. ↩
Why not run a test yourself? ↩
All the Datawrapper files are static - it’s fully compatible with basic cloud storage like S3. ↩
PHP 5.4 is also significantly faster than 5.3. You should upgrade. ↩
It’s doing a ban, to be clear. ↩
Be aware the test returns false if you’re using private browsing. ↩
For the same reason our Twitter and Facebook sharing buttons below the post title are just static buttons, to avoid all the extra overhead of the ‘official’ versions. ↩
For example, you need to pass xfbml=1&appId=YOUR_FACEBOOK_ID&status=0 in the query string when calling the all.js script. ↩
Also quantity does not necessarily equal quality when it comes to comments, or if I’m being fair, the article you’re currently reading. Plus we don’t show how many people have shared something, so why show the number of comments? ↩
This is still roughly the same for me – but such speeds ought to be perfectly adequate for web browsing, provided we develop sites responsibly. ↩
Until late 2014, to get a 3G signal I had to walk up a hill and stand in the middle of this field. Otherwise I could only get GPRS on all networks, which isn’t exactly quick, as Matt Andrews demonstrates. Things have improved since then and I now have reliable 3G or 4G, but the latter only because I have a very new phone. Also, in congested city centres, coverage is much more variable. Many more shops and businesses provide free wi-fi, but even if the one you choose is working reliably, there's still a time cost (registration or a login screen) and therefore a degree of effort involved in connecting. ↩
A worst-case scenario; an expensive, £3, 100MB Pay as You Go bolt-on on o2. There are many better tariffs. Also in fairness I think we used a smaller version of the cat gif, but I can’t find it now. ↩
I would have liked to spend a bit of time on automatic conversion to progressive JPEGs, seeing as that’s now become a best practice. Also we did try Smush.it and compress media files on upload, but despite me writing workarounds to correctly handle the Varnish cache, it broke something else and I had to turn it off - looking for an alternative is on the to do list. ↩
It does depend on the image; four times the area isn’t the same as four times the download. High resolution PNG bar charts with a few blocks of solid colour compress well. But there’s also just been a rumour Apple may launch a 3x display which would be nine times the area of the original image. This is terrifying. ↩
I’d recommend Bruce Lawson’s blog where he’s covered every development on responsive images and links to some of the best tutorials. I supported this too - Blink picture element project - it’s now closed for funding but keep an eye on the outcome. ↩
Some drawing packages produce spacer GIFs bigger than this, remember to compress them. Data URIs are roughly a third bigger, so not suitable for images larger than a few KB. Being especially pedantic, the src URL for the spacer.gif (which has to be absolute because we want to serve images from a cookieless domain) would have been 74 characters, so only 7 bytes smaller. (Oh and we gzip all our HTML too, so who knows). ↩
It would be more reliable to use a HTML parser than regular expressions, as it would cope better with attributes listed in a different order. ↩
There are plugins that will resize animated GIFs, but the one I tried didn’t work and I’ve not had time to investigate others properly (also I’d want to apply it selectively - keeping animations in hero images but not in the sidebar). ↩
At the time of writing, Chrome Canary v37 has just had <picture> support added. ↩
Making the sidebar scroll like Quartz was something I wanted, but it comes back to the dropshadow and other ‘extending’ elements again – you’ll note their left-hand nav is nice and rectangular. If anyone reading this knows how to do it maybe you’ll put me out of my misery… That said, maybe our design works better with it all visible as you move down the page - maybe not all Quartz users realise you can scroll, or do so routinely, so the lower items on Quartz are liable to get less attention. ↩
I made the arrows with a neat trick involving borders and :after pseudo elements. ↩
You could argue the same about Phark Image Replacement I suppose, but that usually goes up to –9999px, not 10 or 20 times the size. ↩
It was a long time ago… ↩
In its present location I mean - Ampp3d’s editorial future is secure. ↩
IE7 support was a step too far for me, given how low the mainstream usage has become. I take the point about trying to do a single column format if you can. ↩
Another thing we should have done. On that specific technique, I had a problem with it in a previous project: I found use of ‘table-caption’ and ‘table-cell’ restricted the width of the table. But I fixed it by replacing ‘table-caption’ with ‘table-header-group’. There’s also a ‘table-footer-group’ if you have three sections to arrange. ↩
It is a good sign if you look at anything you’ve written a few months later and can recognise things that are wrong with it, but it doesn’t stop it being awful. ↩
The conventional wisdom for matching single elements in CSS ↩
At launch we had WordPress configured to show 3 or 4 blog posts at once - that’s reduced as I’ve optimised, it’s now just one. ↩
It’ll work fine on Nginx, by the way. ↩
We actually take the lowest and highest X values and discard the rest: if you plot all the points Safari renders a slightly jagged line rather than a smooth one like other browsers. ↩
You’ll typically have one nameserver matching your own TLD, e.g. .co.uk, plus a .org, a .com and a .net. ↩
Whatever alert system you choose, get out of the habit of setting your phone to silent overnight. Definitely don’t do that after working late, then oversleep and wake up to discover the server went down overnight and didn’t come back automatically… ↩
I refer you to this wonderfully prescient “Why no SSL” FAQ from 2011 where the Varnish author quotes from the OpenSSL sourcecode and remarks “I hope they know what they are doing, but this comment doesn’t exactly carry that point home, does it ?”. ↩
Of course few banks or corporations are using it yet… ↩
Related tip: put your .vim directory in Dropbox and symlink to it then you can share your configuration across multiple computers. ↩
This used to be the ADB plugin, it’s now built into the browser (Tools > Inspect Devices), bringing Chrome into line with Safari. ↩
I wrote all this with Vim and Marked. ↩
Also a reason I use a Dell 24" display rather than iMac/Thunderbolt display - Apple screens are just too bright even on the lowest setting. ↩