4 Ways the New York Times and Sentry Simplify their Tech
I recently spoke to Nick Rockwell, CTO at The New York Times (a Sentry customer), and Yonas Beshawred of StackShare about simple solutions, React, and going serverless. Take a listen or read on for the highlights. Because why would we even bother with the lowlights?
Simple solutions to not-so-simple problems
The glory days of print are behind us, leaving many major publications to search for relevancy in the age of digital media. The New York Times, although not altogether different, has actually been in the digital space for over 20 years at this point. In other words, while many of us were reading about the beginning stages of the last presidential impeachment in the pages of the old “Gray Lady,” the New York Times was laying the foundations for the digital presence by which we primarily know it today.
Nick Rockwell points out that, instead of worrying about the brand’s complex relationship with significance and reputation, “simple solutions to our problems are something that we try to remain focused on and not make things harder than they need to be.”
One such challenge is legacy, which can lead to a lack of discipline and inefficient product development. While getting started on a project or improvement is easy enough, following through while making sure outcomes align with business goals often proves difficult. “[I]t’s about pulling the rest of company and the infrastructure and our practices along, so that it all feels more modern, and we can have a good, sort of solid place to really focus on doing the product well, which is what matters.” Executing this plan is easier said than done.
To start, Rockwell implemented a process for making consistent technology decisions: the Architecture Review Board (ARB). The ARB is composed of a cross-functional group of developers who, in addition to their daily engineering work, put technology proposals through a gauntlet of considerations. After several weeks of open and closed discussions, the ARB responds with prescriptive suggestions on how teams should proceed. With a process and clear expectations in place, Rockwell and his team began sorting out the technical debt.
From LAMP to React/Apollo
When Rockwell joined the New York Times, the tech stack was typical: “It was mostly PHP on the back end, MySQL, with some Java services mixed in. And then a little bit of everything.” Again, this was pre-ARB, so teams were choosing tools and languages at their own discretion. Boy, did they take advantage of this freedom; Go, Scala, Hadoop, Redshift, Dynamo, SQS, and many others all had a moment in the sun, with marginal planning around permanence or scale. Physical infrastructure was also scattered, including four data centers and an AWS virtual data center. Even the New York Times’s (Editor’s note: yes, we realize this looks incorrect, but the NYT considers the “Times” in its name to be singular, so the possessive form requires an apostrophe and an extra “s”) front end was a cobbled-together PHP framework. Unsurprisingly, an outsider could recognize the likelihood of silos, incompatibility, and discontinuity in this patchwork approach.
As Rockwell shifted his teams toward a more thoughtful and philosophical decision-making process, internal frameworks also began to shift. Most notably, the New York Times’s front end is now React-based and uses Apollo, although the rollout has been cautious. Specifically, “React is now talking to GraphQL as a primary API,” Rockwell explains. “There’s a Node back end, to the front end, which is mainly for server-side rendering… And then, behind there, the main repository for the GraphQL server is a Bigtable repository.”
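To make that front-end-to-API conversation a little more concrete, here is a minimal sketch of how a client might package a GraphQL query as an HTTP POST body. Everything specific in it is an assumption for illustration: the `article` field, its `headline` and `summary` subfields, and the `/graphql` endpoint are hypothetical, since the interview doesn't describe the Times's actual schema. In practice, a client library like Apollo handles this packaging for you.

```typescript
// Hypothetical query: the field names are illustrative, not the Times's real schema.
const ARTICLE_QUERY = `
  query Article($id: ID!) {
    article(id: $id) {
      headline
      summary
    }
  }
`;

interface GraphQLRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Package a query and its variables the way GraphQL servers conventionally
// accept them over HTTP: a JSON body with "query" and "variables" keys.
function buildGraphQLRequest(
  query: string,
  variables: Record<string, unknown>
): GraphQLRequest {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

// Usage: in a browser, the result could be passed to fetch("/graphql", articleRequest).
const articleRequest = buildGraphQLRequest(ARTICLE_QUERY, { id: "example-id" });
console.log(JSON.parse(articleRequest.body).variables);
```

The client asks for exactly the fields it needs in a single request, which is a large part of GraphQL's appeal as a primary API for a component-based front end.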
Despite these drastic systems upgrades, content is still created in the old system. While Kafka stores and serves persistent data, MySQL acts as the back end to Scoop, the New York Times’s newsroom CMS. Still, Rockwell is dedicated to carefully chipping away at the legacy piece by piece. “Once we[’ve] wired things up to be dumping published content onto Kafka, we could gradually move clients over to read over directly from Kafka, or come through the GraphQL server to get stuff.” These changes, of course, are done in the name of simplicity.
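The gradual migration Rockwell describes leans on a property of log-based systems like Kafka: published records stay in an ordered log, and each reader keeps track of its own position in it. Here is a toy, in-memory sketch of that idea. The `ContentLog` class and the article records are hypothetical stand-ins, not Kafka's client API or the Times's data.

```typescript
// A toy append-only log. This models the idea only; it is not Kafka's API.
class ContentLog<T> {
  private records: T[] = [];

  // Publishing appends to the end and returns the record's offset.
  publish(record: T): number {
    this.records.push(record);
    return this.records.length - 1;
  }

  // Reading never removes anything, so any number of clients can
  // consume the same records independently, each from its own offset.
  readFrom(offset: number): T[] {
    return this.records.slice(offset);
  }
}

interface PublishedArticle {
  id: string;
  headline: string;
}

const contentLog = new ContentLog<PublishedArticle>();
contentLog.publish({ id: "a1", headline: "First story" });
contentLog.publish({ id: "a2", headline: "Second story" });

// An existing client has already processed the first record...
const existingClient = contentLog.readFrom(1);
// ...while a newly migrated client replays everything from the start.
const newClient = contentLog.readFrom(0);

console.log(existingClient.length, newClient.length); // prints: 1 2
```

Because reading doesn't consume anything, a new client can be pointed at the log without disturbing the clients already there, which is what makes a one-at-a-time migration possible.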
React + GraphQL = A more stable front end
Over-complication is a legitimate fear for Rockwell, especially when it comes to the New York Times’s front-end framework. Judging by his description, the front-end analytics pipeline isn’t easing that fear. “So, everything’s going into BigQuery, and almost everything is going straight into Pub/Sub and then doing some processing in Dataflow before ending up in BigQuery. We still do too much processing and augmentation on the front end before it goes into Pub/Sub. And that’s using some kind of stuff we pulled together using Dynamo and so on.” In keeping with his “simple solutions” philosophy, Rockwell wants to eliminate as many elements as possible, so that data goes straight into Pub/Sub on its way to BigQuery.
So, why the move to React, specifically? Rockwell explains, “You’re trying to balance the performance gains doing things server-side versus the development needs of composing things on the client side.” He goes on to mention that, of all front-end frameworks, there was actual excitement around React (and we can assume the ARB gave it a thumbs-up).
Similarly, the decision to implement GraphQL was welcomed by Rockwell’s team, as it was chosen to tackle problems like hydration and filtering. However, the challenge of ownership presented itself immediately. Rockwell’s team had to tackle “how schema changes would work, who was responsible for them, what the workflow would be, and, also, how much we would try to automate it so that you would make a change in one place and have it flow everywhere, versus not try to create an incredibly complicated contraption and just deal with changing things in a few places at the same time.” Ultimately, a choice of expediency was made: a platform team that worked on a similar product — not a front-end team — now owns the project.
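One piece of that workflow, spotting when a schema change would break existing clients, can be sketched in a few lines. This is an illustration under simplifying assumptions, not the Times's tooling: a schema is modeled as a plain map of type names to field names, and the `Article` type and its fields are hypothetical.

```typescript
// Simplified model: each type name maps to the list of fields it exposes.
type SchemaShape = Record<string, string[]>;

// A removed type or field breaks clients that still query it;
// additions are treated as safe.
function findBreakingChanges(before: SchemaShape, after: SchemaShape): string[] {
  const problems: string[] = [];
  for (const typeName of Object.keys(before)) {
    const oldFields = before[typeName] ?? [];
    const newFields = after[typeName];
    if (newFields === undefined) {
      problems.push(`type ${typeName} was removed`);
      continue;
    }
    for (const field of oldFields) {
      if (newFields.indexOf(field) === -1) {
        problems.push(`field ${typeName}.${field} was removed`);
      }
    }
  }
  return problems;
}

// Hypothetical change: "byline" is replaced by "authors".
const currentSchema: SchemaShape = { Article: ["headline", "summary", "byline"] };
const proposedSchema: SchemaShape = { Article: ["headline", "summary", "authors"] };

// Reports one problem: "field Article.byline was removed"
console.log(findBreakingChanges(currentSchema, proposedSchema));
```

A check like this could run wherever the owning team reviews schema changes, which keeps the "change things in a few places" approach workable without building the complicated contraption Rockwell wanted to avoid.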
The future is serverless and open source
As technology at the New York Times continues to evolve, Rockwell envisions a future of purposeful tools, “where, as a developer, you don’t need to think about scaling and architecting for scaling.” Instead, these things are taken care of for you. He includes CDNs in his definition of serverless, like his venture with Akamai in 1999 that replaced entire sections of server infrastructure. Platform-as-a-service would also fit within this definition of serverless, like Google App Engine and Heroku.
Ultimately, Rockwell recognizes that the New York Times does not have to start from scratch with every tool. “[W]ho’s going to do a better job of utilizing the underlying resources? Like, us? Or, you know, Amazon and Google? And the answer is Amazon and Google will. So, over time, they ought to be able to run much more efficiently, and they ought to pass the savings on to us.”
A perhaps less-commonly-argued point is that going serverless also increases developer productivity and subsequent happiness. While engineers may be a bit more constrained, those same engineers have the time and space to focus more on their projects. “Spend less time operating and more time engineering,” as I always say.
However, even a serverless future has its problems. Rockwell’s team initially underestimated the impacts of potential changes, including those regarding security and documentation. The change itself was also disruptive, as it altered the needs, skill sets, and strategy of the New York Times’s ops team.
Rockwell applies the same need for purpose to the open-source movement. “[I]s this going to be useful to somebody? And do we believe in it in the long run? If so, we’ll open-source it without hesitation,” he explains. “I’d be more interested in giving back to the community than protecting some competitive advantage around the tech that we do.”
By streamlining the decision-making process and replacing legacy issues with purposeful tools, Rockwell is cultivating the solutions to extend the New York Times’s relevance and accessibility well into the digital future. Sentry for React will (naturally) help strengthen the solid foundation Rockwell is constructing, as his teams are now able to record usage and environment details and recreate bugs based on parameters specific to the New York Times app. As a purposeful tool itself, Sentry also contributes to the productivity of developers by isolating, contextualizing, and prioritizing React errors (and those of virtually every other platform). Of course, Sentry also helps teams monitor and debug serverless functions, but that’s a story for another time. Meanwhile, here’s a story for right now.