Automating Hundreds of TVs and Other Lessons from Amazon

February 18, 2025 · Natalie Walls

Somewhere in an Amazon operations center, someone's full-time job used to be scrolling through TV menus.

Not just one TV. Dozens. Maybe hundreds. Day after day, navigating those terrible on-screen keyboards, typing in credentials, scrolling through content libraries, making sure the right streams were playing on the right displays.

If you've ever tried to log into Netflix on a smart TV using a remote control, you know exactly how soul-crushing this is. Now imagine that's your actual job.

When someone asked, "Can't we automate this?" it sounded like a simple problem. Spoiler: nothing is simple at scale.

The Device Zoo

The first challenge was the sheer variety of devices: Fire TV, Roku, Android TV, and Apple TV (more on Apple later). Each one had different automation capabilities:

  • MQTT for some devices
  • Swift automation for Apple stuff
  • Android test automation for Android TVs
  • Accessibility automation where nothing else worked
  • An internal Prime Video service for automating Fire TV and Roku

Getting scripts working for all these devices is only half the battle. Once you can control the devices, you still need to orchestrate them. Do you run on-prem machines in the operations centers? Or provision cloud hosts to simplify operations for your customers?
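One way to keep the orchestration layer sane is to hide each automation mechanism behind a single interface. The sketch below is illustrative only, not the code that ran at Amazon: DeviceKind, DeviceDriver, FakeDriver, and DriverRegistry are all made-up names, and the fake driver just records calls where a real one would speak MQTT, Swift automation, or whatever the device needs.

```typescript
// Hypothetical names throughout: DeviceKind, DeviceDriver, FakeDriver, DriverRegistry.
type DeviceKind = "fire-tv" | "roku" | "android-tv" | "apple-tv";

// The one interface the orchestrator talks to, whatever sits underneath
// (MQTT, Swift automation, Android test automation, accessibility, ...).
interface DeviceDriver {
  readonly kind: DeviceKind;
  login(deviceId: string, accountId: string): Promise<void>;
  play(deviceId: string, contentId: string): Promise<void>;
}

// Stand-in driver that records calls instead of touching real hardware.
class FakeDriver implements DeviceDriver {
  readonly kind: DeviceKind;
  readonly calls: string[] = [];

  constructor(kind: DeviceKind) {
    this.kind = kind;
  }

  async login(deviceId: string, accountId: string): Promise<void> {
    this.calls.push(`login:${deviceId}:${accountId}`);
  }

  async play(deviceId: string, contentId: string): Promise<void> {
    this.calls.push(`play:${deviceId}:${contentId}`);
  }
}

// Maps a device kind to its driver so callers never branch on device type.
class DriverRegistry {
  private readonly drivers = new Map<DeviceKind, DeviceDriver>();

  register(driver: DeviceDriver): void {
    this.drivers.set(driver.kind, driver);
  }

  driverFor(kind: DeviceKind): DeviceDriver {
    const driver = this.drivers.get(kind);
    if (driver === undefined) {
      throw new Error(`No driver registered for ${kind}`);
    }
    return driver;
  }
}
```

With this shape, the orchestrator calls registry.driverFor(kind).play(deviceId, contentId) the same way for every device, and the on-prem versus cloud question becomes a decision about where each driver runs, not about how callers are written.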

And then there's Apple. Why is Apple literally the worst? I won't elaborate, but if you've tried to automate anything Apple makes, you already know.

Building the Thing

After solving the device automation problem, we built an async service and web app to handle:

  • Accounts and credentials
  • Content configurations
  • Display layouts
  • Running automations across hundreds of devices
  • Monitoring and alerting when things break
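To make "running automations across hundreds of devices" concrete, here is a minimal sketch of a bounded-concurrency runner. It is an assumption about how such a thing could be written, not a description of the actual service: runOnDevices and Outcome are hypothetical names, and the task is whatever per-device automation you pass in. The point is that one flaky TV should produce a failure record, not sink the whole batch.

```typescript
// Hypothetical helper: run one async task per device with a concurrency cap,
// and record failures per device instead of aborting the batch.
type Outcome<T> =
  | { deviceId: string; ok: true; value: T }
  | { deviceId: string; ok: false; error: string };

async function runOnDevices<T>(
  deviceIds: string[],
  task: (deviceId: string) => Promise<T>,
  concurrency: number,
): Promise<Outcome<T>[]> {
  const outcomes = new Array<Outcome<T>>(deviceIds.length);
  let next = 0; // index of the next device that has not been claimed yet

  // Each worker claims devices one at a time until none are left.
  async function worker(): Promise<void> {
    while (next < deviceIds.length) {
      const index = next++;
      const deviceId = deviceIds[index];
      try {
        outcomes[index] = { deviceId, ok: true, value: await task(deviceId) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        outcomes[index] = { deviceId, ok: false, error };
      }
    }
  }

  // Never start more workers than there are devices, and always at least one.
  const workerCount = Math.max(1, Math.min(concurrency, deviceIds.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return outcomes; // same order as deviceIds
}
```

The failed outcomes are what you would feed into monitoring and alerting: each one carries the device id and an error message, which is enough to point an alarm at a specific TV.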

This is where I learned the most important lesson of my time at Amazon: operational code matters more than clever code.

Observability Is Key

Writing code that works is one thing. Writing code that someone else can debug at 2am when it breaks is another.

Here's what actually matters:

  • Log the start and end of every function with state information
  • Design your cloud infrastructure so it's not spaghetti - log groups shouldn't point to each other like a treasure hunt
  • Write specific alarms with runbooks so simple a hungover engineer can follow them

That last one is crucial. If your runbook requires deep system knowledge or careful reasoning, it's not a good runbook. The person fixing the issue at 2am might be hungover, sleep-deprived, or both. Design accordingly.
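For the first bullet, logging the start and end of every function, a small wrapper saves you from hand-writing log lines everywhere. This is a sketch under assumptions: withLogging, LogEntry, and LogSink are hypothetical names, the default sink just prints JSON to the console, and it only handles synchronous functions.

```typescript
// Hypothetical helper: wrap a function so every call logs start, end, and errors.
type LogEntry = {
  event: "start" | "end" | "error";
  fn: string;
  state: unknown;
  ms?: number; // elapsed time, present on "end" and "error"
};
type LogSink = (entry: LogEntry) => void;

function withLogging<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => R,
  sink: LogSink = (entry) => console.log(JSON.stringify(entry)),
): (...args: A) => R {
  return (...args: A): R => {
    const startedAt = Date.now();
    // Careful: args are logged as-is, so never wrap a function that takes credentials.
    sink({ event: "start", fn: name, state: { args } });
    try {
      const result = fn(...args);
      sink({ event: "end", fn: name, state: { result }, ms: Date.now() - startedAt });
      return result;
    } catch (err) {
      sink({ event: "error", fn: name, state: { args, error: String(err) }, ms: Date.now() - startedAt });
      throw err; // logging must not swallow the failure
    }
  };
}
```

For an async function this version would log "end" as soon as the promise is returned, not when it settles, so an async variant would need to await the result before logging.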

The TypeScript Redemption Arc

We used TypeScript more than JavaScript on this project, and honestly? It won me over.

I used to hate JavaScript. But the functional style TypeScript makes comfortable - the way its types encourage you to think about data flow and transformations - grew on me. It's not perfect, but it's good enough that I stopped complaining about it.
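A tiny example of the "data flow and transformations" style, with made-up names (Display, onlineIdsByKind) and made-up data: typed input goes in, a chain of small transformations runs, and nothing gets mutated along the way.

```typescript
// Hypothetical shape for a display in the fleet.
interface Display {
  id: string;
  kind: string;
  online: boolean;
}

// Group the ids of online displays by device kind without mutating the input.
function onlineIdsByKind(displays: readonly Display[]): Record<string, string[]> {
  return displays
    .filter((d) => d.online)
    .reduce<Record<string, string[]>>(
      (groups, d) => ({
        ...groups,
        [d.kind]: [...(groups[d.kind] ?? []), d.id],
      }),
      {},
    );
}
```

The compiler checks every step of the chain, which is most of what makes this style pleasant: rename a field on Display and each transformation that touches it lights up.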

What I Actually Did (vs. What I Expected)

Going into Amazon, I thought I'd spend most of my time writing code.

Reality: I spent way more time mentoring teammates and reviewing code than actually writing it.

This isn't a complaint - it's just the reality of working on a team at scale. The leverage you get from making your teammates better is often higher than writing more code yourself.

When Breaking Things Feels Real

Working on software used by millions of people sounds abstract until something breaks during "gametime" and you get paged.

Luckily, our software's customers were internal. But Prime Video has a strong technical reputation (largely thanks to Sye, their low-latency streaming acquisition), so the bar was high.

When you're the one getting woken up to fix something, scale feels very, very real.

Leadership Principles: The Good and the Cringe

Amazon's Leadership Principles are... a thing. Some of them are genuinely useful. Some feel like corporate BS.

The one that stuck with me: Invent and Simplify.

The "invent" part is obvious - everyone wants to build new things. But the "simplify" part is often forgotten, and it's actually the harder part.

Simplifying sometimes means building something new so you can deprecate something old. It means taking a step back from the code you just wrote and asking, "Is there a clearer way to do this?"

It's unglamorous work. But it's the work that makes systems maintainable.

The Culture Part

Using company values to shape culture and make consistent decisions across a large organization is genuinely smart. It works.

Until it doesn't.

It breaks down when senior leadership doesn't exhibit the values, or uses them to avoid hard questions. When the principles become a shield instead of a guide, people notice.

Also: reply-all hell is real, and the #aws-memes Slack channel is the only thing that makes it bearable.

What I'd Tell My Younger Self

If I could go back five years and tell myself one thing, it would be this:

The code you write matters less than the systems you build, and the systems you build matter less than the people you work with.

Write operational code that someone else can maintain. Build services that solve real problems, not just technically interesting ones. Simplify relentlessly. And when senior leadership stops living the values they preach, that's your sign.

Oh, and learn to love TypeScript. You'll be using it more than you think.


I spent five years at Amazon Prime Video working on streaming infrastructure, automation, and internal tools. Now I'm exploring AI/ML tooling and developer experience. If you want to talk about any of this, hit me up at nfwalls@pm.me.