Never pile stuff on top of a container

The moment you put something on a box with a lid, you’re going to have to shuffle things around every time you want anything inside the box.

And most likely, the stuff inside the box is important, but not important enough that you’ll bother going inside the box if you have to work at it. So now you’ll never go inside the box.

Understanding Terrible Code, Part One: CKEditor is not to be trusted

Understanding Terrible Code, Part One: Writing a CKEditor Plugin

Exposition: Why should you care about a random text editor?

Modding CKEditor is an exercise in extreme self-flagellation.

CKEditor is a text editor plugin that works with rich text (bold, italics, tables, links, etc.). If you’ve ever used Gmail or Hotmail, you’ve used something similar. CKEditor is relatively easy to use, powerful, and full of features, and it’s used everywhere. But while the code is well written and well documented, at a macro level there are terrible design decisions going on, and if you want to modify it, you’ll want to shoot yourself shortly thereafter.

Since I’ve been wrestling with it for quite a while now, I wanted to share my thoughts:

  • In case you happen to be forced to alter CKEditor as well. Hopefully this will be a useful breadcrumb trail.
  • To convince myself I’m not faking it. CKEditor has well written source, so someone could argue “maybe they’re doing it well and you just don’t get it.”
    This is perfectly valid. The best way to figure out if I’m faking it is to actually try and explain why it’s bad, and see if this post turns into “CKEditor is an architectural marvel” in the process.
  • Complaining is whining unless you can explain why, and how to make things better.
  • I need to understand this stuff myself, and nothing helps with understanding like trying to explain.
  • Because I’ve been staring at this for days and need to say something.

Act One: Parallel Universe APIs

Half of CKEditor code essentially replicates JavaScript DOM APIs and jQuery. Let’s use the most radical example of CKEditor manipulation – the dialog architecture.

CKEditor dialogs are used for special properties. If you want to add a link, you would create a dialog to set the URL.

What language do you think dialogs would be designed in? Presumably a language that is used to lay things out in a web browser. Hmmm…what language do we use for that…HT… nope, you’re right, it’s JavaScript:

The new dialog system in CKEditor is to be written from scratch. One of the key things in CKEditor is that it doesn’t rely on HTML pages to run. Everything is created on the fly in JavaScript. In this way, we avoid limitations about cross domain serving of the editor code, like CDN solutions, and enhance the editor performance.

http://docs.cksource.com/FCKeditor_3.x/Design_and_Architecture/Dialog_System#Dialog_Definition

We’ll get back to the benefits CKEditor claims later, but let’s think about this. You’re laying out a mini-web page. You’re a web developer and probably lay out web pages all the time. You read A List Apart obsessively; you understand the difference between content (HTML), presentation (CSS), and behavior (JavaScript), and the benefits of separating them.

And then you see this:

(This is actual dialog code that CKEditor uses to show a link dialog.)

Is that HTML? Yup – it looks like a <select> – there’s a list of options (items), and an id, and even a label.
Is that JavaScript? Yup – it’s got an onChange, a setup, and a commit, whatever that is.
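
To make the hybrid concrete, here is a rough sketch of one field written in the style of a CKEditor dialog definition. It is not copied from CKEditor's source: the option list, the shape of `data`, and the bodies of the handlers are my own assumptions, chosen only to show markup-like properties and behavior living in the same object literal.

```javascript
// Illustrative sketch only, NOT the actual CKEditor link dialog code.
// The items, the `data` shape, and the handler bodies are hypothetical.
const linkTypeField = {
  // The "HTML" half: what would otherwise be a <label> and a <select>.
  type: 'select',
  id: 'linkType',
  label: 'Link Type',
  'default': 'url',
  items: [
    ['URL', 'url'],
    ['Anchor in this page', 'anchor'],
    ['E-mail', 'email']
  ],

  // The "JavaScript" half: behavior attached to the same object.
  onChange: function () {
    // In a real dialog this would show or hide the other fields.
    this.lastSeen = this.getValue();
  },
  setup: function (data) {
    // Copy an existing link's type into the field when the dialog opens.
    if (data.type) {
      this.setValue(data.type);
    }
  },
  commit: function (data) {
    // Copy the field's value back out when the user confirms.
    data.type = this.getValue();
  }
};

// Stand-in for the UI element the editor would bind these handlers to,
// so the handlers can be exercised without loading an editor.
function makeFakeElement(initial) {
  let value = initial;
  return {
    getValue: function () { return value; },
    setValue: function (v) { value = v; }
  };
}
```

Even in this toy version you have to read past the option list to find out what the field does, and this is one field out of dozens.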

This confuzzlement is compounded by the fact that there are about 1000 (literally, see for yourself) lines of this JavaScript/HTML hybrid. If you separated this into HTML and JavaScript, it would be easy to read, not indented six to ten tabs in, and it wouldn’t take you an hour of sorting to figure out what a dialog that’s basically a handful of selects and some text fields is doing.

By the way, there’s CSS too:

		var basicCss =
			'background:url(' + CKEDITOR.getUrl( this.path + 'images/anchor.gif' ) + ') no-repeat ' + side + ' center;' +
			'border:1px dotted #00f;';

Let’s establish some of the downsides of this mixed-content architecture:

  • Wading through a 1,400-line morass of mixed behavior and content every time you want to change things
  • No way to test the layout of your dialog without loading all of CKEditor (as opposed to HTML, which can be tested by anyone with a web browser)
  • You have to relearn all your HTML/JavaScript/CSS ninja skills in a weird JS hybrid
  • Oh and if you miss a brace, it’s not like missing an angle bracket – everything dies. JS is brittle. Did I mention you have to reload all of CKEditor to test your fixes?

So how about the benefits? The CKEditor docs claim that you can do cross domain loading (for content delivery networks, etc.). While there are plenty of things wrong with their solution, the CDN problem is serious and it would be irresponsible to ignore it. Is there another way we can allow CDN loading without replacing all of our HTML with JavaScript?

One option: Write HTML for the dialogs, but convert it into escaped JavaScript internally. Escaped HTML is ugly, but if we compile it, we needn’t use it except in production.
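
Here is a minimal sketch of what that compile step might look like. The function name `htmlToScript` and the `dialogTemplates` registry are made up for illustration, and nothing like this ships with CKEditor as far as I know. The only real trick is that `JSON.stringify` already knows how to escape quotes, backslashes, and newlines into a valid string literal.

```javascript
// Hypothetical build step: turn a dialog's HTML into one line of JavaScript
// that can be served from any domain like any other script.
function htmlToScript(name, html) {
  // JSON.stringify produces a quoted, escaped string literal.
  const literal = JSON.stringify(html);
  // `dialogTemplates` is an assumed global registry, not a CKEditor API.
  return 'dialogTemplates[' + JSON.stringify(name) + '] = ' + literal + ';';
}

// The HTML a developer would actually edit and test in a browser.
const linkDialogHtml =
  '<label for="linkType">Link Type</label>\n' +
  '<select id="linkType">\n' +
  '  <option value="url">URL</option>\n' +
  '</select>';

// What the build step emits for production.
const generatedSource = htmlToScript('link', linkDialogHtml);

// Simulate the browser loading the generated script.
const dialogTemplates = {};
new Function('dialogTemplates', generatedSource)(dialogTemplates);
```

After that last line runs, `dialogTemplates.link` holds the original HTML again, so the ugly escaped version only ever exists in the generated file.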

This rant is just barely getting started, but I’ll stop here for the moment.

“When you have enough data, sometimes, you don’t have to be too clever”

Peter Norvig, famed AI researcher and one of the creators of the Stanford AI online class:

Past, Present, Future Vision of AI – Google and AAAI 2011

“And it was fun looking at the comments, because you’d see things like ‘well, I’m throwing in this naive Bayes now, but I’m gonna come back and fix it up and come up with something better later.’  And the comment would be from 2006. [laughter] And I think what that says is, when you have enough data, sometimes, you don’t have to be too clever about coming up with the best algorithm.”

(Peter’s speaking about search algorithms, but what if you applied this to running a startup? Or life in general?)

Using code coverage to decide what to deprecate

A recent post on building a slimmer jQuery got me thinking about the process of deciding what to cut.

From a code perspective, it’s easier to add new features than remove old ones.

From a user perspective, it’s much, much easier to add new features than remove old ones.

Nobody knows about your new feature, but there are at least three people who love that button you added to the toolbar two years ago. Even if no one else cares, they’ll make sure to post angry requests on your forums asking why you got rid of it. If you’re not careful, they’ll accuse your company of “not caring about its users,” or they might even say you “used to be great, but ever since they got successful they’ve started to go downhill”…ever hear that before? [1]

At the same time, it’s important to cut features, or you end up sliding down this curve:

So how do you decide what to cut?

Why Not Just Count?

One of the great things about analytics is getting real data about who’s using what. Every month, Google Analytics sends me an email that tells me exactly what people are reading on this blog. (Hint: It’s not the minesweeper articles)

That gave me an idea: Why not generate similar analytics for a library like jQuery?

There’s a nice list of sites using jQuery available here:

http://docs.jquery.com/Sites_Using_jQuery

Let’s take a site and see what it’s actually using:

(Speaking of wanting to stick to old things, Match.com uses jQuery 1.2…)

Match.com’s landing page is simple enough that we can do an actual list of what calls they make to jQuery, which I’ve uploaded here.

From this, we can tell that the standard selector function $() gets used a lot, as does readCookie and addClass. From there, the next step is to figure out what they’re actually calling.

In the case of jQuery, this is not too hard in principle. Get a version of jQuery, add a line that increments a counter every time a function is used, and inject it into your favorite website. It’s a little too involved for this post, but perhaps in a future one (earlier if someone decides to jump in and do it!).
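
The counting part on its own does fit in a few lines. Here is a sketch that wraps every function on an object so each call bumps a counter. `countCalls` and `fakeLib` are names I made up, and `fakeLib` is a tiny stand-in, not jQuery. With the real library you would presumably point this at `jQuery.fn`, and getting the instrumented copy onto someone else's site is the involved part.

```javascript
// Wrap every function-valued property of `target` so that each call
// increments counts[name] before running the original function.
function countCalls(target, counts) {
  Object.keys(target).forEach(function (name) {
    const original = target[name];
    if (typeof original !== 'function') {
      return; // leave plain data properties alone
    }
    target[name] = function () {
      counts[name] = (counts[name] || 0) + 1;
      // Preserve `this` and the arguments so behavior is unchanged.
      return original.apply(this, arguments);
    };
  });
  return counts;
}

// Stand-in library (hypothetical), used here instead of jQuery.
const fakeLib = {
  version: '0.0',
  addClass: function (el, cls) { el.classes.push(cls); return el; },
  hide: function (el) { el.hidden = true; return el; }
};

const counts = countCalls(fakeLib, {});

// Simulate a page using the library.
const el = { classes: [], hidden: false };
fakeLib.addClass(el, 'active');
fakeLib.addClass(el, 'big');
fakeLib.hide(el);
```

After this runs, `counts` records two calls to `addClass` and one to `hide`, and anything that was never called simply has no entry, which is exactly the list you would want when deciding what to deprecate.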

[1] Back in the 90s, Microsoft obsessed about making sure everything from the past worked, even if it meant keeping broken things from earlier versions of Windows:

The Windows testing team is huge and one of their most important responsibilities is guaranteeing that everyone can safely upgrade their operating system, no matter what applications they have installed, and those applications will continue to run, even if those applications do bad things or use undocumented functions or rely on buggy behavior that happens to be buggy in Windows n but is no longer buggy in Windows n+1. In fact if you poke around in the AppCompatibility section of your registry you’ll see a whole list of applications that Windows treats specially, emulating various old bugs and quirky behaviors so they’ll continue to work.