You Got Your JavaScript In My Peanut Butter

I have always tried to embrace a minimalist approach to software design. Even early on in my career when I had no idea what I was doing. In those days I rode a horse-drawn carriage to work and used a pointed stick to scratch PHP3 code directly onto the server’s hard-drive platter. Good times. Back then JavaScript was like window dressing – a bit of flair for the UI – definitely not something you relied on for the web application to function.

It was possible some of your users might not have JavaScript support, and at the time we actually cared about those users. The first two major Open Source web applications I built both worked without JavaScript. I’m pretty sure that increased our user-base over the period of a decade by at least 1. Totally worth it.

These days it seems nobody cares about users without JavaScript. Even me. I care only enough to display a noscript tag with a message stating it’s required, but only when I’m feeling particularly ambitious. In my head I picture users with JavaScript disabled as middle-aged versions of Kip from Napoleon Dynamite, all of whom used punch cards in college, started in IT as data processors in the late ’80s, and are still running Windows NT (or wish they were). Now that I think of it, if they are running NT they probably should have JavaScript disabled.

As much as I dislike the language – and I do – it makes sense to offload parts of an application to the client. Let’s face it, you can’t be all super-web 3.0.1 if you don’t make constant AJAX requests in the background the moment a page loads. I may be a neck-bearded curmudgeon, but I have begrudgingly come to accept that JavaScript is the best way available at this point in time to improve web applications past what HTML5 brings to the table. To a point.

As an industry we seem to be lurching towards JavaScripting all the things. JavaScript-only interfaces! JavaScript on the server! JavaScript interpreters written in JavaScript! The older I get the more I try to balance my skepticism of new ideas with a willingness to keep an open mind, but I am having a hard time catching a ride on the all-JavaScript-all-the-time bandwagon. Maybe I am biased by the five years I was forced to use a bloated, unintuitive framework for a former employer, of which the only thing I remember (aside from it being gigantic) was how effin horrible it was to work with. But probably not. Additionally, I seem to have missed the memo to web developers that everyone in the world now has high-speed internet, so we can push gobs of scripts client-side even to perform the simplest of tasks. I’m slightly offended nobody told me. Have I also been walking around with a booger hanging out of my nose? Come on people, throw me a bone here.

Running JavaScript on the server is interesting, but for me it’s interesting in the same way that a squirrel water-skiing behind a toy boat in a backyard pool is interesting. I realize event-driven non-blocking I/O is the new Holy Grail of server-side processing, and buzzword-laden start-ups are required to use it in their stack in order to get funding, but I just can’t seem to let go of a well-tuned web server with a fast scripting language and moderate use of JavaScript in the client as an effective foundation for building web applications. Then again, I don’t have good reading comprehension because my eyes are worn out from years of articles about the latest technology fad revolutionizing web development, so maybe I am just missing the toy boat.

I didn’t start this post with the intent of bashing JavaScript, and the fact that I now rank it slightly above “necessary evil” (but still below “creepy cousin”) is as close to an endorsement of the language as I have ever come. One could even conceivably describe it as being “handy” for limited purposes *grinds teeth*. Maybe someday I will be telling my great-grandkids about the crazy old days when there were other programming languages, way back before every conceivable app was ported to JavaScript and all knowledge of anything else was lost to generations past. Somehow I doubt it.

The IMAP BODYSTRUCTURE command, and a bug in the Gmail IMAP service

Now that is one catchy post title. Who DOESN’T like to discuss the nuances of the IMAP BODYSTRUCTURE command? I guess if there are other “E-mail geeks” out there like me – toiling away in the evening’s waning hours writing E-mail software for no good reason – they might find it mildly interesting. At best. The only saving grace to this entire post is the fact that I found what appears to be a legitimate bug in the Gmail IMAP service. Of course this means I have bested all of Google’s E-mail engineers at their own game. I expect penance in the form of exaltation of my prowess, and perhaps a competitive job offer. I doubt either will materialize, but I am prepared to wait.

In the meantime let’s get on with the boring. It’s no secret I have a love-hate relationship with the IMAP protocol, and one of its painfully wonderful features is the BODYSTRUCTURE command. This command returns a string representation of the structure of a MIME-formatted message. What makes this command wonderful is that it provides access to all the message parts in a bandwidth-limited fashion. What makes it painful is the fact that it’s a disaster to parse, like most IMAP responses, though this one really takes the cake. The BODYSTRUCTURE command gives a mail client enough information to determine the “message part ids” needed to access particular sections of the message. Unlike with more simplistic protocols like POP3, we can use this information to selectively choose the message part we want to display, and fetch only that content without having to download the entire message. Take that, simplistic protocols like POP3!
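
If you are curious what this looks like in practice, here is a stripped-down sketch of the idea. This is not code from my client, and the host, credentials, message number, and part id are all made up for illustration:

```php
<?php
// Simplified illustration only: ask an IMAP server for the structure of
// message 1, then fetch a single MIME part instead of the whole message.
// The host, credentials, message number, and part id are placeholders.
$fp = stream_socket_client('ssl://imap.example.com:993', $errno, $errstr, 30);
fgets($fp); // discard the server greeting

// Send one command and read lines until the tagged completion response
function send_command($fp, $tag, $command) {
    fwrite($fp, "$tag $command\r\n");
    $result = '';
    while ($line = fgets($fp)) {
        $result .= $line;
        if (strpos($line, $tag.' ') === 0) {
            break;
        }
    }
    return $result;
}

send_command($fp, 'a1', 'LOGIN "user" "password"');
send_command($fp, 'a2', 'SELECT "INBOX"');

// Structure only: no message content comes down the wire
echo send_command($fp, 'a3', 'FETCH 1 (BODYSTRUCTURE)');

// Now fetch only the part we actually want to display, e.g. part 2.1
echo send_command($fp, 'a4', 'FETCH 1 (BODY[2.1])');
```

Parsing the BODYSTRUCTURE response that comes back from that third command is where the fun starts.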

MIME message structure can get really complicated, since message parts can be contained inside other message parts that are inside other message parts (etc). It is critical for a client to accurately represent the structure, and to properly assign the “message part ids” so they can be individually viewed or downloaded. This is the point at which I stumbled on a problem with the Gmail IMAP service. The message in question is a digest E-mail from the Bugtraq mailing list. This message is formatted as a “MULTIPART/DIGEST” MIME type. Digests generally have a text part summarizing the included messages, followed by a list of RFC822 parts containing the original E-mails sent to the list (an RFC822 part is like a container for an entire E-mail message). The Bugtraq digest E-mail follows this pattern. The BODYSTRUCTURE response from Gmail’s IMAP interface for these types of messages appears correct; however, the “message part ids” derived from it do not work – they are rejected as invalid.
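
To make the part id business concrete, here is a toy sketch (again, not my client’s actual parser) that assigns ids to a hypothetical digest structure the way RFC 3501 numbers them: the summary is part 1, the first embedded E-mail is part 2, and that E-mail’s text body is part 2.1:

```php
<?php
// Toy illustration of IMAP part numbering for a hypothetical digest message.
// Each node is either a multipart container, an embedded message/rfc822,
// or a leaf part like text/plain.
function assign_part_ids(array $msg, $prefix = '') {
    $ids = array();
    if ($msg['type'] == 'multipart') {
        foreach ($msg['parts'] as $index => $part) {
            $id = $prefix === '' ? (string) ($index + 1) : $prefix.'.'.($index + 1);
            if ($part['type'] == 'message/rfc822') {
                $ids[$id] = 'message/rfc822';
                // the embedded message is numbered relative to this part
                $ids = $ids + assign_part_ids($part['message'], $id);
            }
            elseif ($part['type'] == 'multipart') {
                $ids = $ids + assign_part_ids($part, $id);
            }
            else {
                $ids[$id] = $part['type'];
            }
        }
    }
    else {
        // a message with a single non-multipart body is always part 1
        $ids[$prefix === '' ? '1' : $prefix.'.1'] = $msg['type'];
    }
    return $ids;
}

// Rough shape of a Bugtraq-style digest: a text summary followed by
// embedded E-mails, each with a simple text body
$digest = array('type' => 'multipart', 'parts' => array(
    array('type' => 'text/plain'),
    array('type' => 'message/rfc822', 'message' => array('type' => 'text/plain')),
    array('type' => 'message/rfc822', 'message' => array('type' => 'text/plain')),
));

print_r(assign_part_ids($digest));
// 1 => text/plain, 2 => message/rfc822, 2.1 => text/plain,
// 3 => message/rfc822, 3.1 => text/plain
```

Ids of that shape are exactly what a client hands back to the server in a FETCH to grab an individual part.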

At first I assumed I was doing something wrong, usually a solid assumption when you are working on an IMAP client. Or you are me. The BODYSTRUCTURE response parsing code in my client was ungainly as sin, so I took this opportunity to refactor it into something slightly less ugly. Even with the improved code, the message part ids continued to fail, so I decided to copy the message to a local IMAP account running Dovecot to compare the results. Surprisingly, it appears that Gmail’s IMAP server is doing it wrong. The BODYSTRUCTURE response from both IMAP servers is identical, and my client parses the responses and derives the message part ids in exactly the same way for each. Attempting to access the individual message parts using these ids fails in Gmail, but works in Dovecot. Dun-dun-dun!

After some deep diving, in the form of casually skimming the IMAP and MIME RFCs, I’m convinced this is a bug in Gmail’s IMAP service. Interestingly, the Gmail web interface displays digest messages as one big text blob, dumping all the parts out in a row. This might be simpler for users, but wading through raw message text is cumbersome for large digests. For the I’m-sticking-it-to-Gmail-in-this-post record, it also violates the RFC recommendations for clients displaying complex message structures.

Honestly though, I love Gmail, and it’s great that they allow IMAP access. They have also added some neat extensions to the IMAP protocol, such as a Google-like search command that kicks the crap out of the default IMAP search. Since the BODYSTRUCTURE response from Gmail’s IMAP service is correct, I suspect the problem with message part ids not working is a relatively simple fix to their IMAP implementation. True to form, I suspect this without any clue as to the inner workings of the Gmail IMAP implementation.

The Itch I Can’t Stop Scratching

My first official contribution to an Open Source software project was way back in 2002. I was solving a problem for my employer, and ended up becoming a developer for the venerable SquirrelMail project. It was an exciting time. The community was vibrant, active, and surprisingly welcoming to a near-complete novice willing to get their hands dirty. Looking back at the code I wrote lo those many years ago makes me want to gouge my eyes out with red-hot sporks, but I can’t deny the impact contributing to that project had on both my mindset and my career path. Since then my involvement in Open Source has waxed and waned, but it has always remained. That seemingly innocent interaction sparked a lifelong interest in webmail applications, and I have been tinkering with them ever since.

After a brief 5-year stint writing mostly Python and C++, I started working with PHP full-time again last May when I joined Automattic. I realized pretty soon after starting that my skills were rusty. Like PHP4 rusty. I needed to experiment with the latest and greatest the language had to offer, but in a safe way, and on my own terms. For the third time in my life, I decided to unleash yet another Open Source webmail client on the world. That surge of excitement you are not feeling at this point is totally understandable. Especially considering the code I wrote the first two times would best be stowed away in the “how not to write complex software” file.

I set out with a newly provisioned GitHub repo, the enthusiasm of someone half my age, and some lofty goals:

  1. Build a client with combined views from multiple E-mail accounts, able to speak both IMAP and POP3, and flexible enough to merge other data sources
  2. Turn security up to 11. Perhaps 12
  3. Make it fast, compact, and compliant
  4. Utilize a modular system that all components outside the bare-bones framework use. Like an uber-plugin system the whole app runs on
  5. Do all this while pushing myself to learn what great features new versions of PHP have to offer

To get started, I ferreted out and cleaned up the core IMAP, POP3, and SMTP routines from my last webmail project. While I was at it, I modernized the IMAP library to support some useful protocol extensions, and even built some unit tests *gasp*. These libs have been battle-tested against real-world server idiosyncrasies for over a decade, so while they may not be ideal from a code design standpoint, they have an established record of compatibility. This is important when dealing with complex protocols that have a myriad of server implementations. I’m looking at you, IMAP.

Next I set out to create a simple request and response processing framework – one that uses “modules” to do the real work of building the resulting page. The framework is lightweight (request processing uses on average 2MB of server memory) and leverages some of the nifty features in newer versions of PHP. With a framework in place, the next step was to start cranking out module sets for specific functionality. I started with core requirements like laying out the page content and logging in and out. Next I dove into IMAP, since it would be the primary protocol for E-mail access, and easily the most complicated data source to implement.
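
To give a rough sense of what I mean by modules, here is a heavily simplified sketch. None of these class or method names come from the actual code; it just illustrates the shape of the thing:

```php
<?php
// A toy sketch of the module idea, not the real framework interfaces:
// the framework collects the modules assigned to a page, runs each one,
// and every module contributes a piece of the response that becomes the page.
interface Module {
    public function process(array $response);
}

class DateModule implements Module {
    public function process(array $response) {
        $response['date'] = date('r'); // contribute the current date
        return $response;
    }
}

class MessageListModule implements Module {
    public function process(array $response) {
        $response['messages'] = array(); // an IMAP module would fill this in
        return $response;
    }
}

class Framework {
    private $modules = array();
    public function add(Module $module) {
        $this->modules[] = $module;
    }
    public function dispatch() {
        $response = array();
        foreach ($this->modules as $module) {
            $response = $module->process($response);
        }
        return $response;
    }
}

// Enable whatever modules a page needs, then dispatch the request
$app = new Framework();
$app->add(new DateModule());
$app->add(new MessageListModule());
print_r($app->dispatch());
```

The appeal of this approach is that adding a feature means adding a module, not touching the framework.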

Nine months later, I am happy to say I have a pleasant-to-use E-mail and RSS reader, including preliminary SMTP support for outbound mail (very preliminary). It’s easy on the server and the browser, and has some interesting features for combined content views. It is still very much a work in progress, but here are some highlights:

  • Super small pages with minimal server requests. A single page load only requires 3 HTTP requests with a combined response size of about 30KB (gzipped). E-mail and feed data are populated via one parallel AJAX call per source, with response sizes of ~1KB. All interface icons are served inline with data URLs to keep the request count low (there is a tiny sketch of this idea after the list).
  • Oodles of security features: TLS/STARTTLS support for all protocols; forced HTTPS for browser requests; secure, HTTP-only, session-level cookies; AES-compatible encryption for session and persistent data using unique keys; white-listed and typed user input; built-in HTTP POST nonce enforcement; HTTP header fingerprinting; easy output escaping; a two-factor authentication module; probably more I’m forgetting.
  • Modules for IMAP, POP3, SMTP, RSS, and several other app components with more on the way. Modules can be enabled or disabled independently. The module system is super flexible and lends itself to some interesting customization options. It might even turn out to be too flexible.
  • Easy-to-extend session management, including stock PHP session support and custom DB sessions. The DB session support is not a registered PHP session handler – it is a completely independent implementation.
  • Authentication is also easy to extend and already supports authenticating via IMAP, POP3, or an included PBKDF2-compliant database schema.
  • Database access is not required (unless used for authentication), but can be leveraged for session and persistent data storage with any PDO-supported DB. Table definitions are included for MySQL and PostgreSQL.
  • Validated HTML5 output, including responsive views for mobile devices and HTML5 local session storage for caching.
  • Lots of other boring technical details… I mean, really neat stuff!
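
Since I mentioned the inline icons in the first bullet, here is roughly what that trick looks like. This is just a toy example, not the project’s actual code, and the icon path is hypothetical:

```php
<?php
// Toy example of serving an icon inline as a data URL instead of as a
// separate HTTP request. The icon file path here is made up.
$svg = file_get_contents('images/unread.svg');
$src = 'data:image/svg+xml;base64,'.base64_encode($svg);
printf('<img class="icon" alt="unread" src="%s" />', $src);
```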

I could ramble on about this forever, so I had better stop now before I get carried away. No post about half-done, probably soon-to-be-obsolete software is complete without at least one screenshot. Here is a look at the interface with a combined view of 9 different RSS feeds.

[Screenshot: hm3_feeds]

It’s not only been a great learning experience to work on this code, it’s also been a lot of fun. The repository is at http://github.com/jasonmunro/hm3/ for anyone who wants to take a look. Documentation is scarce and things are changing quickly, so if you do check it out, use caution :).

The Humble Commuter Part 3 – Planes

It’s been over a year since I ditched my daily commute for a remote position at Automattic. These days I rarely drive, and when I do, it’s almost enjoyable again. Almost. I do find myself careening through the atmosphere in a flimsy metal tube a lot more often than ever before. Just as I learned a bevy of invaluable lessons about surviving the soul-crushing daily drive into the office (here and here), I have been carefully observing my fellow airborne travelers to provide you with the best tips for making your in-flight and airport experience as painless as possible.

  • When the flight attendant “suggests” you turn off your cell phone or put it in airplane mode, this does not apply to you. The fact that your brother’s friend’s cousin worked at an airport Cinnabon® and overheard someone say that cell signals don’t interfere with the plane’s operation is enough evidence to risk the safety of your fellow passengers so you can squeak out one more tweet about whatever inane crap you tweet about.
  • From the very first moment you take your seat on a plane, the battle for control over the shared armrest begins. Plant your arm down on that sucker and keep it there. If you move it, you will lose it. You may want to consider adult incontinence gear so you can avoid having to travel to the lavatory or inadvertently wet yourself on longer flights. Persistence pays off here, so don’t let your guard down.
  • Really long trips can take a toll on you. If you have been bouncing around airports and connecting flights for a day or two there is a neat trick you can take advantage of to up your personal comfort level – pop your shoes off. Those dogs have been bottled up long enough, so uncork them and get ready to relax. It’s unlikely anyone in the pressurized cabin of recirculating air will notice your reeking foot odor. Don’t forget to recline the seat the full .1 degrees for the ultimate in relaxation.
  • Wear noise-canceling headphones with music so loud someone would have to basically punch you in the face to get your attention. This way you can avoid any important announcements either in the airport or on the plane. When asked why you missed the fact that the gate personnel have been calling your name for the last hour and the flight is delayed because of you, just stare blankly. This usually works.
  • Gripe constantly. The seats are tiny, the food sucks, the plane is old, the flight attendant is a dick, the airport bathroom floor is sticky, the jerk next to you won’t share the armrest, the in-flight movie stars Adam Sandler – you get the idea. Never ending complaints help everyone around you feel better about their own unvoiced grievances, so really this is like performing a public service.
  • The TSA is a helpful group of friendly professionals there to help make you feel safer. They care about you, and treat all travelers fairly. You will not be singled out because of how you look or dress, and the rules for getting through security are uniformly applied in all airports everywhere. That might be the most off-the-charts satire I have ever written. It was actually physically difficult to type.
  • When getting ready to board the plane, you should start to crowd around the gate area with your bags ready about 30 minutes before any airline personnel show up. The plane will leave without you if you don’t board ahead of everyone in your boarding zone, so try to get to the front, preferably by acting as if other passengers are non-existent. Even better would be to sneak into an earlier boarding group. They probably won’t send you back.
  • It’s fun to see how big of a carry-on bag you can get away with. Avoid airlines that measure your bag for appropriate dimensions. If you end up on one of these, try anyway, and complain like crazy when they insist you check the bag. How can they not realize you don’t have five minutes to wait at the baggage claim with all the other suckers? What the airlines don’t want you to know is that those overhead bins employ a Doctor Who TARDIS-like technology that makes them cavernous on the inside. I have no doubt a full-grown African elephant could fit, never mind an overstuffed duffle bag with three snowboards in it.

So the next time you are mid-takeoff and the guy behind you is tweeting about the lack of legroom while hanging his sweaty socks on your seat back to dry off – turn around and say hi – it might be me!

Rigged Markets?

Before I start ranting about stock markets, first a disclaimer: I worked as a software engineer for BATS Global Markets for five years and left to pursue other interests last May. BATS runs stock and options exchanges in the US and Europe. My views on this subject are based on my experience working closely with the operations team in the trenches (tranches?) of market infrastructure. If you don’t nod off before the end of this post, you can decide for yourself if my position is biased or informed.

There has been a media blitz surrounding a book released a few weeks ago titled “Flash Boys: A Wall Street Revolt”. It claims the markets are rigged, the exchanges enable it, and hyper-speed traders and exchange operators are raking in millions skimming money from your grandmother’s nest egg. How I wish it were so simple! The market is complex and imperfect, but it’s not rigged – technically. Actually, I would argue it’s less rigged now than at any time in history, which I will try to do, assuming I don’t get sidetracked.

For over 200 years the US stock markets ran in geographically independent locations like New York and Chicago. During this time the floor broker ruled the roost. Don’t live near the exchange? Too bad. Think about the arbitrage opportunities when the telephone was invented – it was the ultimate speed advantage! Over the last 30 or so years the conversion to an all electronic market structure in effect democratized stock trading. It made the practice both more affordable and geographically accessible to significantly more people. It also amplified the problems associated with having disparate liquidity sources.

In 2007 the SEC tried to address this messy market fragmentation with Regulation National Market System, or “Reg NMS”. The idea was to duct-tape all the trading venues into one big financial Rube Goldberg machine. Actually, the idea was to provide fair and accurate pricing for participants while fostering competition between trading platforms. Unfortunately, Reg NMS managed to achieve both, but only to a certain degree. Current US market structure is effective, but it’s a complicated web layered with intermediaries. On top of that, the implementation of this regulatory behemoth has flaws (gasp). If a loophole exists on Wall Street, well, you can probably guess what happens.

Speed has always been an advantage in price-time priority based markets. The “rise of the machines” in stock trading has only refined this reality to a very precise degree. Exchanges are required to provide fair access to all participants, but this only extends to the edges of their network. Outside of that zone, firms can use all the resources at their disposal to gain any advantage possible. NEWS FLASH: the ones with the fattest wallets and biggest brains tend to do better. I believe it’s called capitalism.

Not all market participants are created equal – some, known as “market makers”, have special roles that basically sanction skimming in exchange for providing liquidity. Like it or not, they are an integral part of the ecosystem. Electronic market making can be consistently profitable, as evidenced by firms having only 1 losing trading day in 5 years. But the flip side is tremendous risk when things don’t go your way. Knight Capital found this out in 2012, when a software snafu cost them over 400 million bucks in less than an hour. Ouch!

The stock market is not fair, and it never was. In my not-so-humble opinion, it was never designed to be. Suggesting this is a new problem because the big bad machines are pinching grandma’s pennies may be nice for book sales, but it conveniently glosses over a couple hundred years of history that suggest otherwise. I see the market as a work in progress. Keep in mind I’m a software developer, so for me nothing is ever “done” (it’s a job security thing).

There needs to be an ongoing discussion about what works and what doesn’t, followed up with good-faith efforts to fix inefficiencies. This happens between exchange operators and market regulators on a regular basis. It’s a tediously slow process, but it does happen. Using populist rhetoric to vilify a system you don’t completely understand is counterproductive to advancing that discussion.