Recent Posts

  • Two Years at Dropbox

    Disclaimer

    This post is a collection of stories from my time at Dropbox. Inevitably, someone will read too much into it and come away with some overgeneralized lesson, but keep in mind that I was only there for two pre-IPO years and only exposed to a couple of specific corners of the company.

    I certainly don’t regret my time there - my coworkers were amazing and I learned a lot about myself. In fact, this post says more about me than it does about the company.

    And two years is the average Silicon Valley tenure, right?

    The Interview

    I’d been at IMVU almost ten years (an eternity by Silicon Valley standards!) and realized a year would disappear without me noticing. It was time for something new. I knew someone at Dropbox, and I’d met some more smart people at CppCon 2014, so when they reached out to see if it made sense for me to join, I didn’t say no.

    The interview process was confused. After dinner introductions and recruiter phone calls, I was first invited up to San Francisco to meet with a handful of product division leads. Next, they scheduled a proper technical interview. After passing what ended up being a short half-day round of trivial whiteboard problems, I was annoyed to find that I’d have to take another day off and come back for a second round. After passing those, they invited me again to meet with more product leads. By this point I’d met with something like 14 people. Finally, in frustration, I asked “What are we doing? It costs me real money to take days off, so do you want to offer me a job or not?” They did.

    At the time, it seemed like a good offer – relative to IMVU, a doubling in total compensation! In hindsight, though, I should have negotiated higher. More on that later.

    My excitement started to build. IMVU was a great engineering organization, but the product direction was weak and aimless. And from the interview alone, I could tell Dropbox had an amazing product culture; I had to see how it worked from the inside. The fact that the Dropbox client was 10 years old and remained a simple and refined UX said a lot – any other company would have peppered the product with random features. As a random example of what I mean, Excel currently lists “Bing Maps” in the ribbon before “Recommended Charts”, letting its internal turf wars trump the user experience.

    Also, during the interview process, I got a sneak peek of what’s now called Dropbox Paper. Within minutes, I understood the product, what problem it solved, and why people would want it. Not only that, but it was clean and delightful in a way that other online collaboration tools weren’t. I knew then that Dropbox’s product culture had magic.

    Yellow Flags

    Between accepting and starting, three people independently reached out and said “Don’t join Dropbox. You’re making the wrong decision.” One person said the culture wasn’t good – politics and currying favor with an old boys’ network. Another said the commute to San Francisco would do me in, especially since I was having my third child. A third was concerned that I was joining a company too late in its life. Dropbox had already grown substantially and the last round of investment valued the company at $10B. I should buy low instead of high. I decided to join anyway.

    Regarding culture: I’m pleased to say that, while it may have been the case that Dropbox was a frat house in its early years, they had intentionally and decisively solved that problem by the time I joined. Like Facebook, they implemented something like the Rooney Rule in the hiring process. In addition, everyone was encouraged (required?) to take unconscious bias training. It was surprisingly valuable, and it made me notice the dozens of ways that an interview can be biased without the interviewer realizing it. I felt that Dropbox did a good job of actively striving to make the workplace and hiring process as inclusive and bias-free as you can reasonably hope for.

    The last point, that Dropbox was valued too highly, was probably correct. Shortly after I joined, major investors wrote the stock down about 40%. Oof. (I had factored a possible 50% write-down into the evaluation of my offer, but I realize now, given the illiquidity, I should have pushed the equity component of my offer much higher. Again, more on compensation later.)

    Onboarding

    I was so excited to jump into something new and get the clock rolling on the vesting schedule that I took no time off between IMVU and Dropbox. For anyone reading this: please don’t. Take time off, if only to reset your mind.

    In any case, the recruiter had said my chosen start date would be okay. That was a lie - in reality, Dropbox only onboarded new employees every other Tuesday. This left me without health insurance coverage for a week between jobs. Fortunately, none of the kids got hurt that week. Always quit your previous job at the beginning of the month so you are covered until the new job starts.

    The initial few days of hardware, account setup, and security training went smoothly. Given that IMVU did nothing to onboard people into the culture, this was a big improvement. (But it paled in comparison to what I’d later see in Facebook’s onboarding process.)

    My First Team: Sonoma

    During the interview, the pitch was that I could slot in as the tech lead for the Paper backend team. Its current tech lead was leaving Dropbox. Not everyone on the team knew that yet, and I didn’t know it was secret, so there was an awkward moment when I said to the current lead in a group interview “So where are you headed next?” and he said “I don’t know what you’re talking about.” Oops.

    Before I joined, however, the Paper backend team moved to NYC, so I was instead assigned to a product prototyping team. The team’s average age was in the low 20s and my grandboss wanted someone with experience on how projects play out.

    The prototype’s current iteration was a mess. It was built atop other, failed prototypes, which in turn were built on real shipping features, so you never quite knew which code was alive or dead.

    And Dropbox’s deployment cadence was daily, except that half the time the daily push would fail, meaning we weren’t able to test hypotheses very rapidly. Relative to IMVU’s continuous deployment, this was jarring, but it’s also just not a good way to develop new products.

    Iteration Speed

    Coming from IMVU, my expectations around developer velocity were extremely high. If a unit test took a second to run, that was considered a failure on our part. (That doesn’t mean all of our tests took under a second, but we pushed hard.) We also deployed to production with every commit to master.

    So I was shocked to find that running an empty Python unit test at Dropbox took tens of seconds. Worse, when you fixed a bug and landed it, depending on how the daily push went, it might make it to production within a day? Two? Maybe more? Compared to IMVU, this workflow was unacceptable, especially for a prototyping team that was trying to find product market fit as fast as possible. One day, after struggling to get the simplest diff landed, the frustration overflowed; I stood up and exploded “HOW THE FUCK DOES ANYONE GET ANYTHING DONE AROUND HERE?!?”

    The iteration speeds were bad at every scale. Unit tests were slow; uploads to the code review tool were sluggish; the reviewer might look at your diff after two or three days; and then there were the aforementioned deployment issues. The net result was that simple changes took days, so you had to pipeline a lot of work in parallel, resulting in constant context switching. (This was especially painful for someone like me with high context switch overhead. I prefer to dive deep on a problem, move fast, and then come up for air.)

    One thing I don’t understand is why these iteration speeds were tolerated. Is it because the company had a huge number of new graduates who didn’t know better? Maybe the average Dropbox engineer can handle context switches a lot better than I can? Perhaps the situation was better outside of core product engineering? Maybe Dropbox grew so fast that the situation regressed faster than people could fix it? All of the above?

    Greenfield

    Shortly after I joined, the Sonoma team failed. We cancelled the prototype and half the team quit. (Retention was a general issue - I’ll talk more about that later.)

    However, executives decided the initiative was still valuable and needed a fresh look. We rebooted the team with a few people from the old team, some internal hires, an acquisition, and some interns to attack the same problem space.

    To avoid our previous iteration issues, we decoupled from the main Dropbox stack and built our own.

    The new stack was a TypeScript + React + Flux + Webpack frontend deployed directly onto S3 and a Go backend. None of us actually wanted to use Go, but it had momentum at Dropbox and many existing Dropbox systems already had Go APIs. Our iteration speed was great. We could deploy whenever we wanted and were only limited by our ability to think, as it should be.

    This new team, by the way, was the most gelled team I’ve ever worked with. Not only was everyone productive and thoughtful, but our personalities meshed in a way that made coming into the office a pleasure. And talk about cross-functional! It was as if everyone on the team was multiclassing. Some of the engineers on the team would easily have qualified as product managers at IMVU, and our (excellent!) designers regularly wrote code. The level of empathy and thoughtfulness they had about the customer’s emotional state surpassed anything I’d seen.

    I hope someday to work on a team like that again!

    And, in a form only conceptually related to any code I wrote, our project did eventually ship!

    Personal Life Fell Apart

    Shortly after our product team rebooted itself and was in its Sprint Zero, my personal life began to unravel.

    When I joined Dropbox, I knew the 90-minute commute from the south bay to San Francisco would be painful. And my wife was pregnant with our third child, so I’d be taking paternity leave only months after starting. But I did not predict how awful that year would be.

    Days before #3 was born, my beloved grandfather suddenly collapsed and died at 74 years old. One month later, my 54-year-old father was diagnosed with stage 4 lung cancer, and was (accurately) given nine months to live. That fall, my wife’s grandmother passed. I went from being devastated to numb.

    It was easily the worst year of my life, but I could not have asked for Dropbox to provide better support. Even though I’d just joined the company, they let me take all the time I needed to get my personal life back in order.

    Including paternity leave, I took months of paid time off. And when my father passed, my manager gave me all the time I needed to help my mother get her estate in order, and then gradually ease back into a work mindset.

    I’ll forever be grateful for the support Dropbox provided in the worst year of my life.

    Empathy

    You’d think, coming from IMVU – a company built around avatars and self-expression – that a B2B file syncing company would be dry and uninteresting.

    But I was thrilled to discover Dropbox implemented all the practices I only dreamed of at IMVU. My team regularly flew around the country to meet with users, deeply understanding their workflows and bringing that knowledge back.

    It’s a common simplification to say that Dropbox is a file-syncing business. But file syncing is a commodity. Dropbox’s value is broader - it’s more accurate to think of Dropbox as an organizing-and-sharing-your-stuff business. This mindset leads to features like the Dropbox Badge, or the way taking a screenshot automatically uploads it and copies the URL to your clipboard, because most of the time screenshots are going to be shared.

    Employee Retention

    I’d heard about high rates of employee churn in mainstream Silicon Valley, but IMVU had unusually high retention, so I didn’t witness it until Dropbox.

    Maybe it’s a San Francisco culture thing. Maybe the employee base is young and not tied down. Maybe I joined right as the valuation peaked and people wanted to cash out. Whatever the cause, employee turnover was high. It felt like half the people I met would quit a couple months later. I can understand - if you’re in your 20s and sitting on a few million bucks, why not just go live in a cabin on a lake or spend a year climbing mountains?

    It’s hard for companies in Silicon Valley to keep employees - there are so many opportunities and salaries are so high that even new grads can work on almost any project they personally find fulfilling. I suspect the majority of engineers in SV could quit, walk down the street, and get a raise somewhere else with almost no effort.

    That said, longevity is important. If your goal is to become a senior engineer capable of large-scale impact, you have to be able to see the results of decisions you made years prior. If you jump ship every 18 months, you won’t develop that skill.

    I don’t have data on Dropbox’s retention numbers, but anecdotally it felt like they struggled to keep employees, especially senior engineers. I feel like they’d be well-served by doing a win-loss analysis on every regrettable departure and addressing the causes, even if throwing money at the problem is only a short-term fix.

    No matter the reason, it’s a worrying sign when so many of the good people leave. Companies need to retain their senior talent.

    Programming Languages

    I firmly believe that programming languages matter. They shape your thoughts and strongly influence code’s correctness, runtime behavior, and even team dynamics.

    I’ve always been a fan of Python – we used it heavily at IMVU – but I learned something surprising: what’s worse than IMVU’s two million lines of backend PHP? Two million lines of Python! PHP is a shit language, for sure, but Python has all of the same dynamic language problems, plus high startup costs, plus so much dynamism and sugar that people feel compelled to build fancy abstractions like decorators, proxy objects, and other forms of cute magic. In PHP, for example, you’d never even imagine a framework inspecting a function’s arguments’ names to determine which values should be passed in. Yet that’s how our Python code at Dropbox worked. PHP’s expressive weakness leads to obvious, straight-line code, which is a strength at scale.
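
    To make that kind of magic concrete, here’s a toy sketch of injecting values by inspecting parameter names – my own illustration, not Dropbox’s actual framework:

    import inspect

    # Hypothetical providers, keyed by parameter name.
    PROVIDERS = {
        "current_user": lambda: "chad",
        "request_id": lambda: 42,
    }

    def call_with_injection(func, **given):
        # Peek at the function's parameter names and pull any missing
        # arguments out of the provider registry by name alone.
        kwargs = dict(given)
        for name in inspect.signature(func).parameters:
            if name not in kwargs and name in PROVIDERS:
                kwargs[name] = PROVIDERS[name]()
        return func(**kwargs)

    def greet(current_user, request_id):
        return f"[{request_id}] hello, {current_user}"

    print(call_with_injection(greet))  # [42] hello, chad

    Convenient, but opaque: rename a parameter and the behavior silently changes.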

    There was a migration away from Python and towards Go on the backend while I was there. Go is a great language for writing focused network services, but I’m not sure replacing all the complicated business logic with Go would be successful. My experience on our little product team writing an application in Go is that it’s way too easy for someone to introduce a data race on a shared data structure or to implement, for example, the “timeout idiom” incorrectly, leaking goroutines. And no matter how many people say otherwise, the lack of generics really does hurt.

    When I joined the company, the Dropbox.com frontend was written in CoffeeScript. I’ve already written my feelings on that language, and the Dropbox experience didn’t change them. Fortunately, while I was there, the web platform team managed a well-motivated, well-communicated, and well-executed transition to TypeScript. Major props to them - large-scale technology transitions are hard in the best of times, and they did a great job making TypeScript happen.

    Compensation

    I never felt underpaid at IMVU. I joined before the series A and was given a very fair percentage of the company, especially for a new graduate. Sure, my starting salary was pathetic, but it climbed rapidly once we took funding, and my numbers were at the top end of Glassdoor’s range for engineers. And relative to other startups or mid-sized private companies, I probably wasn’t underpaid.

    But Dropbox competes with the FAANGs for talent, and I had yet to realize just how high the top-of-market rate for senior engineers had climbed. Also, levels.fyi wasn’t a thing yet, and since I was poached rather than looking around, I failed to acquire competing offers. So I didn’t know my market worth.

    Now, by any reasonable person’s standards, my Dropbox offer was good. They matched my IMVU salary and gave me an equivalent amount in RSU equity per year (for four years), plus another 20% of my salary as a signing bonus. That should be great, and I was happy with the offer.

    But, in hindsight, I could have gotten the same offer from publicly traded FAANGs, and since Dropbox was private (and possibly overvalued), I should have fought for a 2x equity multiplier at least.

    This would have forced the company to place me into a more impactful role and level, ultimately making my work more satisfying, and perhaps keeping me at the company longer. As it was, I was underleveled.

    Always try to get competing offers. Had I done that, and had Dropbox still been the #1 choice, I might still be there, with both sides happier.

    Credibility

    When I joined IMVU, it was a small team of founders. Eric Ries said to me “For your first week here, you should fix one thing for each person.” This was great advice. Building rapport early is important.

    I’d forgotten that advice by the time I joined Dropbox. I had been in a position of implicit credibility for so long that I assumed it would carry over. It was a splash of cold water to realize nobody cares what you’ve done. Nobody cares about what other people have thought of you. As a new hire, you’re an unknown like everyone else.

    It’s common for people joining from other companies to talk about their experiences. “At Acme Corp, we did this.” But when nobody has any shared context with your time at Acme, that sentence conveys no meaning.

    I spent too much time saying “At IMVU, we did X.” To me, IMVU was a great place that did a lot well. But talking about it wasn’t helpful. Eventually, I learned to rephrase. “I’ve noticed we have X problem. What do you think about trying Y solution?” Nobody cares how you learned the trick, but you’re a wizard if you perform it in front of them.

    Security

    Dropbox cares a lot about security. They’re fully aware that breaches destroy trust, and Dropbox greatly values its customers’ trust.

    As a customer, I’m very happy to know a world-class security team is protecting my data. As a developer, it sometimes was a pain in the ass. :) Culturally, they rounded towards more secure by default, even if it negatively affected velocity. My team had to sign some Windows executables, and because we didn’t have an internal service that handled apps other than the main Dropbox app, I had to be escorted to the vault, where a signing key was briefly plugged into my laptop, and all uses were supervised.

    And the developer laptops were sooo slow. I don’t know how you take a brand new MacBook Pro and turn it into something so sluggish. The week when the virus scanner was broken was glorious. I believe all activity on our machines was monitored too. “You should have zero expectation of privacy on your work hardware.”

    Development gripes aside, as a user, I trust Dropbox to protect my data.

    Diversity and Interviewing

    Dropbox paid a lot of attention to employee diversity. Everyone was required to take unconscious bias training, and the interview process aimed to limit the risk of race or gender or even cultural background clouding a hire decision. For example, it’s common for interviewers to factor in the presence of a GitHub profile as positive signal, but the Dropbox interviewing process cautioned against this, as it biases towards people of a particular background.

    To that end, the interview process for engineers was mechanized. The primary input on your hiring decision was how well you could write correct, complicated, algorithms-and-data-structures code on a given set of whiteboard questions, where questions involving concurrency were considered especially valuable. This process, while attempting to be bias-free, had unintended effects. It resulted in a heavy bias towards new grads from high-end computer science programs, such as the Ivy Leagues.

    And even with the briefest glance around the office, you could tell the employees weren’t a representative slice of society. The company was full of pretty people. I’m sure being headquartered in San Francisco and the relative youth of the employees had an effect. I’m at Facebook now, and it feels a lot more like a normal slice of society. Or at least of suburban Silicon Valley.

    I referred two of IMVU’s best engineers – the kinds of people who have average CS backgrounds, but who have shipped a ton of high-value code and led major projects. One didn’t make it through the screen – the recruiter wrote the note “Declined: we don’t have much signal about IMVU” – and the other failed the interview because they didn’t use a heap to solve a certain problem. The moderator in the debrief told the (junior) interviewer “Now, now, a senior industry product candidate probably hasn’t used a heap in 10 years”, but the result remained the same.

    I appreciate Dropbox’s attempts to create a bias-free interview process, but I worry that it values fresh CS knowledge over experience and get-it-done attitude.

    By the way, when the FAANGs and Dropbox are offering compensation packages twice what startups can afford, this is where startups can compete for talent. There are many people who didn’t graduate college but are focused workers or have a knack for understanding users.

    Creative Talent

    The sheer density of creative talent at Dropbox was amazing. Designers and product managers could hack on the code. Product engineers had an amazing sense of empathy for the customer. There was art on the walls and various creative projects all around the office. It seemed like everyone I met had multiple talents.

    Even the interns were amazing. These kids, barely old enough to drink, had a strong grasp on cryptography and networks and distributed systems and programming language theory, on top of all of the basic CS knowledge. Motivated individuals have access to so much more information than when I grew up, and I’m a bit jealous. :)

    Hack Week

    IMVU began company-wide hack weeks in 2007. Eric Prestemon, one of our tech leads, modeled the idea off of Yahoo’s hack days, but as far as I’ve heard, we might have been the first to make it a week-long quarterly event. (I look forward to hearing from you if your company also ran a hack week.) So when I joined Dropbox, the idea was quite familiar, and the benefits obvious.

    The idea is that, on some regular cadence, you give everyone in the company an entire week to work on whatever they want. The normal backlog is paused, product managers have no direct influence, and shipping to 1% of customers is encouraged. It’s good for the business – risky product ideas can be prototyped, some of which become valued parts of the product. And it’s good for employees – everyone gets a chance to drive what they think is important and underserved. Hack weeks inject a dose of positivity into the work environment.

    But I must say that Dropbox ran its hack week better than IMVU ever did. IMVU’s hack week started off open-ended. As long as it was somewhat related to the business, employees could work on anything. But over time, the product managers put an increasing amount of pressure on people to work on their projects and deliver concrete value that week.

    Dropbox, on the other hand, invested substantially more organizational effort into supporting hack week. The dates were announced in advance, giving people time to write up proposals, merge project ideas, and form small teams to work on them.

    While most people applied their creative energy to unexplored product ideas, there was no pressure to do any particular thing. At Dropbox, it was totally cool to spend hack week learning a new skill, blowing glass, or trying to break a Guinness record. There’s nothing like being surrounded by excited people. Passion is contagious.

    Projects were celebrated during Friday’s expo, where the office was arranged into zones and each zone given a time window for presentations. Then, everyone, including executives, would tour the projects. The most impactful or promising would get a chance to be officially funded. I can’t put into words how amazing some of the projects were. Dropbox hack week was like getting a glimpse into what the future of business collaboration will look like. Of course, it takes time to ship features properly, but these weren’t smoke and mirrors demos. Many projects actually had their core loops implemented.

    Community

    If you care about volunteering your time and giving back to the community, Dropbox is a great place to work. Every quarter, you could take two paid days off to volunteer your time. For example, you could work in a local school or a food pantry. Monetary donations to charity were also matched one-to-one up to a cap.

    Charity and service opportunities were regularly announced by email. Public service was celebrated and part of the culture.

    Code Review

    Dropbox follows a diff-based code review process using Phabricator. I think it was largely copied from Facebook’s. I’ve written before that I don’t think diff-based code reviews are as effective or efficient as project-based, dataflow-based code review.

    And as I expected, the code review process at Dropbox lent itself to bikeshedding. To be fair, code review processes are cultural, so I imagine diff-based review could work well.

    Nonetheless, it was common for me to have diffs blocked for minor things. My diffs were rejected for things like use of tense or capitalization in comments. Meanwhile, important decisions like why I chose a certain hash function or system design would receive no comments at all.

    Also! A common antipattern was for someone to block my diff because, even though it was an improvement over the previous state, it didn’t go far enough. This unnecessary perfectionism slowed progress towards the desired end state.

    The net result was a lot of friction in the development process. At IMVU, we followed a project-structured flow, where a team planned out a body of work, had an informal design review (emphasizing core components over leaf components), implemented the feature with autonomy, and finally, as the project wrapped up, one or more hour-long project-based code review sessions were held. This made sure we got the high-order bits right, while letting the team move quickly during development.

    In contrast, at Dropbox, code review was interleaved throughout development. Coupled with the fact that everyone was busy and team members had different schedules, the turnaround time on diffs was measured in hours or days. In egregious cases, a small diff might have one code review cycle per day, and get bounced back multiple times, resulting in a three-day latency between work starting and the diff landing on master.

    This meant I rarely entered a flow state. I had to keep a handful of diffs in flight at all times, which was tremendously inefficient, at least for me – I am not great at context switches.

    [UPDATE: I wrote the above before joining Facebook, and Facebook’s diff-based code review process is much healthier. It’s a combination of culture and tooling. Maybe I’ll write about that sometime.]

    Testing

    My understanding is that Dropbox didn’t form a testing culture until years after the company started. The result is that the basic processes of effective testing at scale were still being figured out. Coming from IMVU, it felt like stepping back in time about five years. (As an aside, this made me realize that company maturity is an axis orthogonal to revenue and size.)

    Testing maturity is a progression.

    1. No tests
    2. Occasional, slow, unreliable tests
    3. Semi-comprehensive integration tests
    4. Fast, comprehensive unit tests comprise the bulk of testing
      1. Dependency injection
      2. Composable subsystem design
    5. Real-time test feedback (ideally integrated into the editor)
    6. Tests are extremely reliable or guaranteed reliable by the type system
      1. With tooling that tracks the reliability of tests and provides that feedback to authors.
    7. Fuzzing, statistically automated microbenchmarking, rich testing frameworks for every language and every platform, and a company culture of writing the appropriate number of unit tests and high-value integration tests.

    If IMVU was somewhere around 5 or 6, Dropbox, when I joined, was closer to 3. The situation improved while I was there, but this stuff takes time. And good ideas spread more slowly if you have more junior engineers. Also, every flaky test written – or integration test that could have been a unit test – is a recurring cost on future engineering, so it’s valuable to climb this hierarchy early in an engineering team’s life.

    All of that said, the company did important work on this axis while I was there. They’ll probably catch up eventually.

    EPD

    At IMVU, between 2010-ish and 2015, there was a strong divide between product management, design, and engineering roles. IMVU’s executive leadership was a proponent of “The person who makes the decision must be responsible, and since product management must be held responsible, they must also have full control over their decisions.” The implication is that product management has total say, and engineers must do what they’re told. As you might imagine, people don’t like being told their opinion doesn’t matter, which led to conflict and unhealthy team dynamics.

    I personally favor a soft-touch product management style, where product management gathers data, shares context with the team, and guides it to success. (See this excellent interview with Bob Corrigan.) I understand it’s harder to judge a PM’s success without a black-and-white “was your product successful?” measure, but top-down tactical team management is not healthy, and IMVU was frequently guilty of that.

    Thus, when I went to Dropbox, I was thrilled to see that engineering, product management, and design were considered one unit. The teams I was involved with had frequent open communication between all team members, and I did not observe any disagreements that weren’t quickly resolved by sharing additional context. (Though there were a couple times that context led to people quitting or switching teams, haha. But better that than misery.)

    Now, product management still owned the backlog. Unlike Facebook, engineers did not have true autonomy. But the dynamics were so much healthier than IMVU’s. I’ll grant it’s possible that healthy dynamics are easier when revenue and the stock price are growing.

    The Food

    The food at Dropbox is unreal. On my first day, I had the best fried chicken sandwich of my life. For lunch, it was not uncommon to have to decide between duck confit, swordfish, and braised lamb shank. I thought “There’s no way this lasts.” But… it did, at least as long as I was there. The quality did briefly dip a bit as the company moved to a bigger office (with a different kitchen) and expanded the food service to all of the new employees, but it recovered.

    Dropbox’s kitchen – known as The Tuck Shop – never made the same dish twice. (Though common themes would pop up every so often.) At first, this made me sad. With an amazing dish would come knowledge that I would never have it again. This was hard to bear. There were no recipes; the dishes came straight from the minds of Chef Brian and his team. Eventually, the Tuck Shop gave me a kind of zen allegory for life. In life, moments are fleeting and you don’t get redos, so enjoy opportunities when they occur.

    I had a very long commute from the South Bay up to the city, so I ate a large breakfast every day. Breakfast is what made the long commute possible.

    Breakfast was killer. Check out this grilled cheese with tomato soup and sweet potato tots.

    Oh yeah! It took me too long to learn about this, but the Tuck Shop hosted afternoon tea and cookies or cake. Afternoon desserts you’ve never heard of. The coffee shop made crepes. Fresh young coconuts every day. And even wine pairing on Wednesday nights.

    Wednesday night wine pairing, with cocktail, scallop, and pasta.

    It’s funny to hear people at other companies talk about how good the food is. (And I’m sure it is good!) But I’d be shocked if Dropbox doesn’t have the best corporate food in Silicon Valley, or even the whole USA.

    Will it last? Hard to say. Is it egregious? It certainly feels like it, but the cost of food is dominated by labor, and I’ve heard they’ve managed to get costs to a reasonable amount per day, with almost zero food wastage. Hundreds of restaurant-style entrees were prepared and plated en masse, with copious use of sous vide and diced herbs sprinkled on top. I’m sure food costs were still a drop in the bucket compared to the salaries of thousands of engineers.

    Floss

    This might sound silly, but one of my favorite benefits was that every bathroom was stocked with floss and other oral hygiene items. I’ve always at least tried to floss regularly, but when it’s right there at work, it’s so much easier. It was especially important after those amazing lunches.

    Wrong Career Trajectory

    About a year into my employment, the honeymoon was over. While I was enjoying my work and the team, the impact I was having on the company was tiny compared to my potential. For one, I was on a product prototyping team, isolated from the main Dropbox offering. As with a startup, the only way for a new effort to have significant impact is to succeed. And the chances of that are small. While I really enjoy pushing pixels and executing that tight customer-feedback-write-code loop, I wasn’t going to get promoted doing that. In fact, I know a bunch of engineers cheaper than me who write better CSS.

    In hindsight, I probably should have joined an infrastructure team from the beginning. Engineering infrastructure has a lot of visible impact, requires deeper technical leadership than product work, and aligns better with my skillset. That said, I intentionally joined Dropbox to learn about its product culture. I’d also had little exposure to mainstream web stacks (IMVU hand-rolled its own mostly due to unfortunate timing), and no exposure to Electron, iOS, and native macOS development. Plus, again, pushing pixels with world-class collaborative designers and a gelled team is delightful. :)

    I’m conflicted, but I can’t say I regret my time on the prototyping team. The friendships alone were worth it, and I can justify it as getting paid to go back to school after 10 years of IMVU technology and habits.

    Nonetheless, I was unhappy, so I made the transition over to the dropbox.com performance team.

    Web Performance

    Web performance and platform APIs are squarely in my wheelhouse. I led the team that transitioned IMVU from desktop client engineers to web frontend engineers and, with my team, built most of IMVU’s modern web stack (and to this day, a lot of what we did remains better than off-the-shelf open source).

    So when Dropbox kicked off a strike team to reduce its page load times from eight seconds (!) to four, I decided that was a great opportunity, and switched teams.

    This proved to be quite challenging for me. Coming from a prototyping team with its own stack, I had little background on how the core Dropbox.com stack worked. Meanwhile, my new teammates had years of experience with it. And because the effort was a fire drill, it never felt like there was enough time to sit down and properly spin up.

    The big lesson for me here was about setting expectations. While I had a lot of experience in the space, I did a poor job of making it clear that I’d need time to spin up on the team. As a new but senior engineer, I could have managed the dynamics much better. I would have done better with more independence, autonomy, and time.

    Management Churn

    The web platform team also had a lot of management churn. I liked my first manager quite a lot, but we both knew that everything would change once our performance goals were achieved. The company planned to hire a new group manager, who would then hire his own managers for each team.

    In the short term, I reported to the new group manager, but he didn’t last long at the company. So then I reported to the team’s tech lead, who wasn’t planning on being a manager again, but had to.

    The result of all of this was that I had seven managers in two years. I’d heard about Silicon Valley’s high rate of employee turnover, but with so many managers it was hard to build rapport and learn each other’s styles.

    It didn’t help that my new manager’s style was very command-and-control. After our team planning, he laid out my next several months’ worth of work. Given my need to work with autonomy, this was probably the beginning of the end of my time at Dropbox.

    Note that I liked everyone I worked with. There just wasn’t enough time or space to build a strong working relationship.

    Impact

    I’d been at Dropbox for about 20 months before I learned how engineers are leveled. The problem boils down to this: given an organization of a thousand or more, how do you decide how to distribute responsibilities and determine compensation? You’d like to maintain some kind of fairness across disciplines, organizations, teams, and managers. In an ideal world, your team and manager are independent of your compensation.

    Dropbox culture derived largely from Facebook (as many early employees had come from Facebook), and Facebook determines level and compensation by gauging each employee’s impact. During review season, managers from across the company are all shuffled into groups that review random engineers. This prevents a manager from biasing their reports upwards or downwards and adds consistency. This calibration process focuses primarily on a sense of that person’s impact on the company.

    At Facebook, impact is a first-class concept. It’s common to hear “I have some impact for you.” But I’d come from a 150-person company where the pay bands were wide, people did a mix of short-term and long-term work, and managers were primarily responsible for placing their employees in said pay bands based on a variety of factors, such as team cohesiveness, giving high-quality feedback to peers, and writing quality code. IMVU was a team culture.

    Dropbox had copied their system from Facebook. Now, I don’t think IMVU’s system was better, and I don’t think Dropbox’s was bad. Here’s the problem: nobody ever told me how it worked.

    If I had known how my level and compensation were determined, I would have made very different decisions. Facebook, on the other hand, has a class during orientation that explains how your compensation and level are determined. They have countless wiki pages, internal posts, and presentations on the subject. The incentives are very explicit.

    I was too naive and trusting and assumed everything would just work out if I was a good teammate and worked hard. I now see the value in grasping the mechanics of the incentive structure early.

    Obviously you can find problems with any incentive system, but the real issue here is that I somehow never learned that Dropbox’s leveling process required you to keep track of your specific contributions, especially the more nebulous ones that don’t bubble up through typical project management, and provide enough data that your manager can make the case for a level adjustment in the company-wide calibration process.

    I was months away from quitting when I found out how this worked, and suddenly it explained why my manager had always wanted specific examples of work I’d done. At the time, it had seemed like whether I’d done such and such optimization was irrelevant to the annual review. I could have done a much better job of framing my contributions.

    Thinking from First Principles

    I, like several of the early IMVU engineers, am partial to first-principles thinking. What are we trying to do? What are the constraints? What are the options? How do they weigh against the constraints? Decide accordingly. Perhaps this comes from working on games where, at least in the early years, technical constraints had a big influence on the options. Facebook also seems to have a first-principles culture.

    But one thing I found frustrating at Dropbox is a kind of… software architecture as religion. “No, don’t structure your code that way, it’s not very Fluxy.” or “[Facebook or Google] do it this way, we should too.” or “We need to compute this data lazily for performance [even though a brute force solution under realistic data sizes can easily achieve our concrete performance targets].” or arguments like “We should base our build system on webpack because it seems to be the winner [without measuring its suitability for the problem at hand]”.

    Again, I’m conflicted. There’s value in going with the flow and not spending forever shaving all the yaks if something already fits the bill. But often, with a bit of research and some thought, you can come up with a better solution. In hindsight, I’m amazed that IMVU was able to build such great, reliable, and fast infrastructure with a small set of talented people, simply by thinking about the problem from the top and precisely solving it. For example, IMVU’s realtime message queue, IMQ, was better than any of the four that Dropbox had built, and was written by three people.

    (Now is a good time to remind you that I was only exposed to a small slice of Dropbox. I would hope that, for example, the data storage teams thought from first principles.)

    Lessons Learned

    In a lot of ways, Dropbox was a great place to work. I loved my teammates and learned a lot. Even though it ended up not working out, it’s hard to regret the time I spent there.

    Here are some lessons I took away:

    • Always get competing offers. Know your worth.
    • When you’re hired, take the time to understand how you’re judged. This would have prevented a lot of confusion on my part.
    • 90-minute door-to-door commutes are horrible.
    • Unless you’re Guido van Rossum or otherwise have a widespread reputation, building credibility is hard and takes conscious effort for months and possibly years.
    • As fantastic as the food perks are, meaningful work is better.
    • Thank you so much, Dropbox, for taking care of me and my family during a very hard year.
  • New Site!

    Finally! My new website is ready.

    What’s New?

    A new, minimal design! No more WordPress. The layout is responsive and looks good on everything from the 4” iPhones to a desktop.

    The RSS feed and Feedly subscription link work again.

    Discussions on Hacker News and Reddit are automatically discovered, linked from each post, and indexed via hackernews and reddit tags.

    Pages load in under 100 milliseconds from near the data center and half a second from Europe.

    Sadly, a casualty of migrating to a static site is losing the ability to post comments.

    Most importantly, now that I’m proud of the site, I am motivated to start writing again. That’s the theory anyway!

    History

    My first website was hand-authored in Netscape Composer and Notepad. Maroon text on a tiled wood texture, hosted by our local dial-up ISP.

    My second was a cgi-bin program written in C that parsed text files and produced HTML, hosted by a friend’s university server. I don’t recommend making websites in C.

    My third was split. Project pages and release notes were rendered with some custom PHP and hosted from my apartment on the domain aegisknight.org. Essays and personal updates went on my LiveJournal.

    As IMVU and I became more well-known, my posts started to appear on social media and LiveJournal became embarrassing. So I bought chadaustin.me, did a proper site design, and set up WordPress.

    Ten years later with a house and three kids I just can’t deal with the maintenance overhead for WordPress, PHP, and MySQL. Even beyond maintenance, WordPress started to feel stifling. Pages loaded slowly. The layout and style were hard to tweak. The latest themes were meh. The spam detection was flaky. I couldn’t easily batch updates across all posts. It looked like crap on mobile. The RSS feed had stopped updating and I couldn’t figure out why.

    So here we are! Back to a static site. This time powered by Jekyll and a bunch of Python scripts. Time is a river and history repeats.

    The Migration

    I wrote a Python script to dump the WordPress database into a JSON file.

    I then wrote a one-time migration script which took all of the WordPress posts and pages and converted them to Jekyll with appropriate YAML front matter. WordPress comments are placed in a list in the front matter which the Liquid template renders.
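
    Roughly, the conversion looked something like this (an illustrative sketch assuming the standard wp_posts field names, not the real script):

    import json
    import yaml

    # One dict per WordPress post, as dumped from the database by the
    # previous script. Field names follow the standard WordPress schema.
    with open("wordpress-dump.json") as f:
        posts = json.load(f)

    for post in posts:
        front_matter = {
            "layout": "post",
            "title": post["post_title"],
            "date": post["post_date"],
            "comments": post.get("comments", []),
        }
        date = post["post_date"][:10]  # YYYY-MM-DD
        path = f"_posts/{date}-{post['post_name']}.md"
        with open(path, "w") as out:
            out.write("---\n")
            out.write(yaml.safe_dump(front_matter, default_flow_style=False))
            out.write("---\n\n")
            out.write(post["post_content"])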

    Afterwards, on every build of the website, a post-processing script searches Hacker News and Reddit for links and places them into a list in the post’s front matter, which the corresponding Liquid template can render.
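
    A minimal sketch of that kind of lookup, using Hacker News’s Algolia search API and Reddit’s info endpoint (illustrative, not the script itself):

    import requests

    def find_discussions(post_url):
        """Find Hacker News and Reddit submissions linking to post_url."""
        discussions = []

        # Hacker News, via the Algolia search API.
        hn = requests.get(
            "https://hn.algolia.com/api/v1/search",
            params={"query": post_url, "restrictSearchableAttributes": "url"},
        ).json()
        for hit in hn.get("hits", []):
            discussions.append({
                "tag": "hackernews",
                "url": "https://news.ycombinator.com/item?id=" + hit["objectID"],
            })

        # Reddit's info endpoint returns submissions for an exact URL.
        reddit = requests.get(
            "https://www.reddit.com/api/info.json",
            params={"url": post_url},
            headers={"User-Agent": "discussion-discovery-script"},
        ).json()
        for child in reddit["data"]["children"]:
            discussions.append({
                "tag": "reddit",
                "url": "https://www.reddit.com" + child["data"]["permalink"],
            })

        return discussions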

    One challenge with manipulating the front matter is preserving ordering on the keys (or at least avoiding non-determinism). It’s possible to override pyyaml so it uses OrderedDict for mappings.
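
    The usual pyyaml recipe looks something like this:

    from collections import OrderedDict
    import yaml

    _MAPPING_TAG = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG

    def _construct_mapping(loader, node):
        # Load YAML mappings as OrderedDict instead of plain dict.
        return OrderedDict(loader.construct_pairs(node))

    def _represent_ordereddict(dumper, data):
        # Dump an OrderedDict as a normal mapping, preserving key order.
        return dumper.represent_mapping(_MAPPING_TAG, data.items())

    yaml.add_constructor(_MAPPING_TAG, _construct_mapping, Loader=yaml.SafeLoader)
    yaml.add_representer(OrderedDict, _represent_ordereddict, Dumper=yaml.SafeDumper)

    With those registered, yaml.load(text, Loader=yaml.SafeLoader) and yaml.dump(data, Dumper=yaml.SafeDumper) round-trip the front matter without shuffling keys.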

    I wrote a pile of automated tests to verify none of the WordPress links were broken, driven primarily by top hits to the access log and external links known to Google’s Search Console.
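
    The tests themselves are simple – something along these lines, assuming the built site is served locally and the legacy permalinks are collected into a text file:

    import requests

    # Legacy WordPress permalinks gathered from the access log and from
    # Google's Search Console, one URL path per line.
    with open("legacy-urls.txt") as f:
        LEGACY_PATHS = [line.strip() for line in f if line.strip()]

    def test_legacy_urls_still_resolve(base="http://localhost:4000"):
        broken = []
        for path in LEGACY_PATHS:
            response = requests.get(base + path, allow_redirects=True)
            if response.status_code != 200:
                broken.append((path, response.status_code))
        assert not broken, f"broken legacy links: {broken}"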

    Writing these tests was worth it - some links had broken because of inconsistent time zone handling in the source data, moving a post’s permalink to the next or previous day. I told Jekyll to pretend the site is specified in the America/Chicago time zone, which fixed everything. :P

    Finally, sudo apt remove libapache2-mod-php mysql-server php-mysql. 🎉

    I’m sure there are still bugs. If you notice anything, please let me know!

  • Timestamps

    Working on Eden's timestamp storage.

    Hm, I wonder what ext4's possible timestamp range is?

    Time to read https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inode_Timestamps.

    That's a cute way to extend the legacy timestamp format. One extra 32-bit value gives nanosecond precision and extends the range out to year 2446.
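
    Paraphrased in Python (my reading of the layout, not actual kernel code): the legacy field is a signed 32-bit seconds count, the low two bits of the extra field extend the epoch, and the remaining 30 bits hold nanoseconds.

    EXT4_EPOCH_BITS = 2
    EXT4_EPOCH_MASK = (1 << EXT4_EPOCH_BITS) - 1  # 0b11

    def decode_inode_time(sec_raw, extra_raw):
        """Decode an (i_mtime, i_mtime_extra) pair into (seconds, nanoseconds)."""
        # Legacy field: signed 32-bit seconds since the 1970 epoch.
        seconds = sec_raw - (1 << 32) if sec_raw >= (1 << 31) else sec_raw
        # Two epoch bits push the maximum out to roughly the year 2446.
        seconds += (extra_raw & EXT4_EPOCH_MASK) << 32
        # The upper 30 bits of the extra field are nanoseconds.
        nanoseconds = extra_raw >> EXT4_EPOCH_BITS
        return seconds, nanoseconds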

    What happens if we exceed the range?

    dev:~/tmp/timestamps $ df -T .
    Filesystem     Type    ...
    /dev/vda3      ext4    ...
    dev:~/tmp/timestamps $ touch -d '1900-01-01 01:02' a
    dev:~/tmp/timestamps $ ls -l
    total 0
    -rw-r--r--. 1 chadaustin users 0 Jan  1  1900 a
    

    Huh, that's weird. ext4 shouldn't be able to represent 1900.

    dev:~/tmp/timestamps $ touch -d '2800-01-01 01:02' b
    dev:~/tmp/timestamps $ ls -l
    total 0
    -rw-r--r--. 1 chadaustin users 0 Jan  1  1900 a
    -rw-r--r--. 1 chadaustin users 0 Jan  1  2800 b
    

    And it definitely shouldn't be able to represent year 2800...

    dev:~/tmp/timestamps $ sudo bash -c 'echo 2 > /proc/sys/vm/drop_caches'
    dev:~/tmp/timestamps $ ls -l
    total 0
    -rw-r--r--. 1 chadaustin users 0 May 29  2444 a
    -rw-r--r--. 1 chadaustin users 0 Aug  5  2255 b
    

    ... oh.

  • Digitizing Home Videos

    Several years back, my father started a project to digitize our home videos. He purchased an old computer from the IMVU automated builds cluster, bought an AVerMedia C027 capture card, digitized a few tapes… and then his digitization workstation sat there for years, untouched.

    Sadly, he passed away last year, so I picked up the project. There were four types of analog media to digitize: VHS tapes, 8mm tapes, audio cassettes, and old Super 8 film reels.

    Super 8

    The Super 8 film went to a service in Redwood City. I don’t have any relevant equipment to play it and they do a good job – they clean the film, take high-resolution photos of each frame, and then apply color adjustments to correct for any age-related fading, overexposure, or underexposure. The output format is up to you. I selected an MP4 movie file and a 1080p JPEG for every captured frame. (30 GB of 1080p JPEGs for 20 minutes of video!)

    The service worked out pretty well. My only complaint was that I gave them seven individually labeled 3” film reels but, presumably to make it easier to process, they taped six of the reels into one larger 6” reel, so I had to split the files back up. Avidemux made lossless splitting on the I-frame boundaries trivial.

    Audio Cassettes

    The audio was similarly easy. ION makes an inexpensive tape deck that advertises itself as a stereo USB microphone. You can capture the audio straight into Audacity and clip, process, and encode as needed.

    VHS and 8mm

    The bulk of the project was VHS and 8mm: we had two medium-sized moving boxes plus a shoebox of VHS tapes and a medium-sized box of 8mm. Probably close to 100 tapes in all.

    Home videos are not worth much if nobody can watch them, so my primary goal was to make the video conveniently accessible to family. I also wanted to minimize unnecessary re-encodes and quality loss. The film and VHS had already degraded over time. Some quality loss, unfortunately, is inevitable without spending $$$ on dedicated equipment that captures frames from the tape.

    My parents happened to own a very high-quality VCR that’s still in great shape. The capture sequence ended up something like this:

    Video Cassette -> VCR -> Composite Cables -> Capture Card -> MPEG-2

    Since each tape contained a hodgepodge of home videos (sometimes interleaved with TV recordings!), they had to be split up. The excellent, open source dvbcut software is perfect for this: it has a quadratic slider for frame-accurate scrubbing and it only recompresses frames when your splits don’t line up precisely with I-frames. I recommend doing your dvbcut work on an SSD. Scrubbing is painful on a spinny disk.

    Converting the 8mm tapes was similar, except replace the VCR with the (again, still in great shape) Sony camcorder in playback mode. Also, since the 8mm tapes are mono but the capture card always records in stereo, you have two options. You can run a post-split ffmpeg -map_channel step to convert the stereo MPEG-2 files into mono. (This has to happen after splitting because dvbcut can’t read videos after ffmpeg processes them for some reason.) Or you can tell HandBrake to mix the audio down to mono from the right channel only. The latter avoids an extra audio re-encode, but it’s easier to forget when setting up long HandBrake jobs.

    Finally, because the captured MPEG-2 files are large (4 GB per hour of video), I recompressed in HandBrake to H.264. I don’t notice a material quality difference (besides some “free” noise reduction), and the H.264 MP4 files are smaller and have more responsive seeking.

    In the end, the steps that involve quality loss are:

    1. Real-time playback. Tracking glitches, for example, result in a few missed frames. But, like I mentioned, it would take $$$ to do a precise, frame-accurate digitization of each VHS frame.
    2. Composite cables instead of S-Video. I couldn’t find a VCR on Craigslist that supported S-Video output.
    3. Capturing in MPEG-2. I’m not convinced the real-time AVerMedia MPEG-2 encoder is very good - I’d occasionally notice strips of artifacty blocks in high-frequency regions like tree lines.
    4. A few frames of dvbcut’s re-encoding at the beginning and end of every split.
    5. YouTube / HandBrake. Might be slightly better to upload the split MPEG-2 into YouTube and let it recompress, but uploading 2 TB of video to YouTube didn’t seem very fun.

    The bulk of the time in this project went towards capturing the video. It has to play in real time. Each 8mm cassette was 2 hours, and VHS tapes range between 2 and 8 hours.

    The bulk of the effort, on the other hand, went into splitting, labeling, and organizing. I had to rely on clues to figure out when and where some videos were set. There were many duplicate recordings, too, so I had to determine which was higher quality.

    Now that all that’s done, I plan to upload everything to YouTube and make a Google Doc to share with family members, in case anyone wants to write stories about the videos or tag people in them.

  • JSON Never Dies - An Efficient, Queryable Binary Encoding

    A familiar story: Team starts project. Team uses JSON for interchange; it's easy and ubiquitous and debuggable. Project grows. Entity types multiply. Data sizes increase. Finally, team notices hundreds of kilobytes -- or even megabytes -- of JSON is being generated, transferred, and parsed.

    "Just use a binary format!" the Internet cries.

    But by the time a project gets to this stage, it's often a nontrivial amount of work to switch to Protocol Buffers or Cap'n Proto or Thrift or whatever. There might be thousands of lines of code for mapping model objects to and from JSON (that is arrays, objects, numbers, strings, and booleans). And if you're talking about some hand-rolled binary format, it's even worse: you need implementations for all of your languages and to make sure they're fuzzed and secure.

    The fact is, the activation energy required to switch from JSON to something else is high enough that it rarely happens. The JSON data model tends to stick. However, if we could insert a single function into the network layer to represent JSON more efficiently, then that could be an easy, meaningful win.

    "Another binary JSON? We already have CBOR, MsgPack, BSON, and UBSON!" the Internet cries.

    True, but, at first glance, none of those formats actually seem all that great. JSON documents tend to repeat the same strings over and over again, but those formats don't support reusing string values or object shapes.

    This led me to perform some experiments. What might a tighter binary representation of JSON look like?

    What's in a format?

    First, what are some interesting properties of a file format?

    • Flexibility and Evolvability: Well, we're talking about representing JSON here, so they're all going to be similar. However, some of the binary JSON replacements also have support for dates, binary blobs, and 64-bit integers.
    • Size: How efficiently is the information encoded in memory? Uncompressed size matters because it affects peak memory usage and it's how much data the post-decompression parser has to touch.
    • Compressibility: Since you often get some sort of LZ compression "for free" in the network or storage interface, there's value in the representation being amenable to those compression algorithms.
    • Code Size: How simple are the encoders and decoders? Beyond the code size benefits, simple encoders and decoders are easier to audit for correctness, resource consumption bounds, and security vulnerabilities.
    • Decoder Speed: How quickly can the entire file be scanned or processed? For comparison, JSON can be parsed at a rate of hundreds of MB/s.
    • Queryability: Often we only want a subset of the data given to us. Does the format allow O(1) or O(lg N) path queries? Can we read the format without first parsing it into memory?

    Size and parse-less queryability were my primary goals with JND. My hypothesis was that, since many JSON documents have repeating common structures (including string keys), storing strings and object shapes in a table would result in significant size wins.

    Quickly glancing at the mainstream binary JSON encodings...

    MsgPack

    Each value starts with a tag byte followed by its payload. e.g. "0xdc indicates array, followed by a 16-bit length, followed by N values". Big-endian integers.

    Must be parsed before querying.

    BSON

    Per spec, BSON does not actually seem to be a superset of JSON? It disallows NUL bytes in object keys and does not support arrays as root elements, as far as I can tell.

    Otherwise, similar encoding as MsgPack. Each value has a tag byte followed by a payload. At least it uses little-endian integers.

    UBJSON

    Same idea. One-byte tags for each value, followed by a payload. Notably, lists and objects have terminators, and may not have an explicit length. Kind of an odd decision IMO.

    Big-endian again. Weird.

    CBOR

    IETF standard. Has a very intentional specification with documented rationales. Supports arrays of known sizes and arrays that terminate with a special "break" element. Smart 3-bit major tag followed by 5-bit length with special values for "length follows in next 4 bytes", etc.

    Big-endian again... Endianness doesn't matter all that much, but it's kind of weird to see formats using the endianness that's less common these days.

    CBOR does not support string or object shape tables, but at first glance it does not seem like CBOR sucks. I can imagine legitimate technical reasons to use it, though it is a quite complicated specification.

    JND!

    Okay! All of those formats have roughly the same shape: one-byte prefixes on every value, with value payloads inline (and thus values are variable-width).

    Now it's time to look at the format I sketched out.

    The file consists of a simple header marking the locations and sizes of three tables: values, strings, and object shapes.

    The string table consists of raw UTF-8 string data.

    In the value table, every value starts with a tag byte (sound familiar?). The high nibble encodes the type. The low nibble contains two 2-bit values, k and n.

    Consider strings:

    0011 kknn [1<<k bytes] [1<<n bytes] - string: offset, length into string table

    0011 indicates this is a string value. k and n are "size tags" which indicate how many bytes encode the integers. The string offset is 1 << k bytes (little-endian) and the string length is 1 << n bytes. Once the size tags are decoded, the actual offset and length values are read following the tag byte, and the resulting indices are used to retrieve the UTF-8 text at the given offset and length from the string table.
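
    Here's a rough Python sketch of reading a string value (illustrative only, not the actual jnd decoder), where values is the value table and strings is the string table, both bytes:

    def read_uint(data, offset, width):
        # Little-endian unsigned integer of 1, 2, 4, or 8 bytes.
        return int.from_bytes(data[offset:offset + width], "little")

    def read_string(values, strings, index):
        tag = values[index]
        assert tag >> 4 == 0b0011, "not a string value"
        k = (tag >> 2) & 0b11
        n = tag & 0b11
        offset_width, length_width = 1 << k, 1 << n
        str_offset = read_uint(values, index + 1, offset_width)
        str_length = read_uint(values, index + 1 + offset_width, length_width)
        return strings[str_offset:str_offset + str_length].decode("utf-8")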

    Now let's look at objects:

    1000 kknn [1<<k bytes] - object, object index, then <length> values of width n

    The following 1 << k bytes encode the index into the object shape table, which holds the number and sorted list of object keys. Afterwards is a simple list of indices into the value table, each of size 1 << n bytes. The values are matched up with the keys in the object shape table.

    Arrays are similar, except that instead of using an object index, they simply store their length.

    This encoding has the property that lookups into an array are O(1) and lookups into an object are O(lg N), giving efficient support for path-based queries.
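
    And here's how an O(lg N) object key lookup might go, assuming the object shape table has already been decoded into a list of sorted key lists (again, just an illustrative sketch):

    import bisect

    def object_field_value_index(values, shapes, obj_index, key):
        # shapes: decoded object shape table, a list of sorted key lists.
        tag = values[obj_index]
        assert tag >> 4 == 0b1000, "not an object value"
        k = (tag >> 2) & 0b11
        n = tag & 0b11
        shape_width, elem_width = 1 << k, 1 << n

        shape_id = int.from_bytes(
            values[obj_index + 1:obj_index + 1 + shape_width], "little")
        keys = shapes[shape_id]

        # Binary search the shape's sorted key list: O(lg N).
        i = bisect.bisect_left(keys, key)
        if i == len(keys) or keys[i] != key:
            raise KeyError(key)

        # The i-th fixed-width entry is an index into the value table.
        elems_start = obj_index + 1 + shape_width
        elem_offset = elems_start + i * elem_width
        return int.from_bytes(values[elem_offset:elem_offset + elem_width], "little")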

    But there's a pretty big downside relative to MsgPack, CBOR, and the like. The cost of efficient random access is that the elements of arrays and objects must have a known size. Thus, the (variable-width) values themselves cannot be stored directly into the array's element list. Instead, arrays and objects have a list of fixed-width numeric offsets into the value table. This adds a level of indirection, and thus overhead, to JND. The payoff is that once a particular value is written (like a string or double-precision float), its index can be reused and referenced multiple times.

    So how does this play out in practice? Net win or loss?

    Size Benchmarks

    I used sajson's standard corpus of test JSON documents.

    • apache_builds: Root object; largely consists of an array containing many three-element objects.
    • github_events: Mix of objects, arrays, and long URL strings.
    • get_initial_state: I can't share the contents of this document as it's from a project at work, but it's 1.7 MB of various entity definitions, where each entity type is an object with maybe a dozen or two fields.
    • instruments: A huge number of repeated structures and data -- very compressible.
    • mesh: 3D geometry. Basically a giant list of floating point and integer numbers.
    • svg_menu: Only 600 bytes - used to test startup and base overhead costs.
    • twitter: List of fairly large objects, many long strings.
    • update-center: From Jenkins. Mostly consists of an object representing a mapping from plugin name to plugin description.

    Conclusions

    We can draw a few conclusions from these results.

    • As a wire replacement for the JSON data model, there's no apparent reason to use BSON or UBJSON. I'd probably default to MsgPack because it's simple, but CBOR might be okay too.
    • The current crop of binary formats don't compress any better than JSON itself, so if size over the wire is your primary concern, you might as well stick with JSON.
    • Except for small documents, where JND's increased representation overhead dominates, JND is pretty much always smaller uncompressed. As predicted, reusing strings and object shapes is a big win in practice.
    • LZ-style compression algorithms don't like JND much. Not a big surprise, since they don't do a good job with sequences of numeric values. I expect delta coding value offsets in arrays and objects would help a lot, at the cost of needing to do a linear pass from delta to absolute at encode and decode time.

    JND's disadvantage is clear in the above graphs: while it's smaller uncompressed, it does not compress as well as JSON or MsgPack. (Except in cases where its uncompressed size is dramatically smaller because of huge amounts of string or object shape reuse.)

    Where would something like JND be useful? JND's big advantage is that it can be read directly without allocating or parsing into another data structure. In addition, the uncompressed data is relatively compact. So I could imagine it being used if IO is relatively cheap but tight memory bounds are required.

    Another advantage of JND is a bit less obvious. In my experience using sajson from other languages, the parse itself is cheaper than converting the parsed AST into values in the host language. For example, constructing Swift String objects from binary UTF-8 data was an order of magnitude slower than the parse itself. In JND, every unique string would naturally be translated into host language String objects once.

    If you'd like to play with my experiments, they're on GitHub at https://github.com/chadaustin/jnd.

    Future Work?

    It was more work than I could justify for this experiment, but I'd love to see how Thrift or Protocol Buffers compare to JND. JND is a self-describing format, so it's going to lose on small messages, but when data sizes get big I wouldn't be surprised if it at least ties protobufs.

    Update: Forgot to mention decode times. I didn't have time to set up a proper benchmark with high-performance implementations of each of these formats (rather than whatever pip gave me), but I think we can safely assume CBOR, MsgPack, BSON, and UBJSON all have similar parse times, since the main structure of the decoder loop would be the same. The biggest question would be: how does that compare to JSON? And how does JND compare to both?