<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>log.andvari.net - Writing</title><link href="https://log.andvari.net/" rel="alternate"/><link href="https://log.andvari.net/feeds/writing.atom.xml" rel="self"/><id>https://log.andvari.net/</id><updated>2026-02-02T12:00:00+00:00</updated><entry><title>Disappointing People Early</title><link href="https://log.andvari.net/disappointing-people-early.html" rel="alternate"/><published>2026-02-02T12:00:00+00:00</published><updated>2026-02-02T12:00:00+00:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2026-02-02:/disappointing-people-early.html</id><summary type="html">&lt;p&gt;One of my favourite phrases to remind people of as I'm fumbling about the business of producing reliable systems is &lt;strong&gt;"We Should Disappoint People Early"&lt;/strong&gt;. I've gotten enough funny looks about this that I figure I should explain myself.&lt;/p&gt;
&lt;p&gt;The 'polite' assumption most folks make is that we should aim …&lt;/p&gt;</summary><content type="html">&lt;p&gt;One of my favourite phrases to remind people of as I'm fumbling about the business of producing reliable systems is &lt;strong&gt;"We Should Disappoint People Early"&lt;/strong&gt;. I've gotten enough funny looks about this that I figure I should explain myself.&lt;/p&gt;
&lt;p&gt;The 'polite' assumption most folks make is that we should aim to never disappoint people at all. However, the wisdom is in the timing. If disappointment is coming (and it usually is, somewhere), you want it to arrive early enough that everyone can adjust, rather than late enough to cause real damage.&lt;/p&gt;
&lt;p&gt;This applies everywhere: SLOs, product roadmaps, support response times, vendor relationships, and most critically, to the implicit promises we make when we &lt;em&gt;don't&lt;/em&gt; say anything at all.&lt;/p&gt;
&lt;h3&gt;The Implicit Promise Problem&lt;/h3&gt;
&lt;p&gt;My erstwhile colleague Niall Murphy wrote an excellent piece on &lt;a href="https://blog.relyabilit.ie/implicit-slos-and-their-dangers/"&gt;implicit SLOs and their dangers&lt;/a&gt; that crystallises something I've seen over and over. The short version: your users are already forming expectations about your service's reliability, whether you tell them what to expect or not.&lt;/p&gt;
&lt;p&gt;If your service has been running at eleventy nines for the past year, your customers have noticed. They've built systems that depend on it. They've made architectural decisions predicated on "this thing is always up." They've stopped building fallbacks. You've made an implicit promise, and you probably didn't mean to.&lt;/p&gt;
&lt;p&gt;The problem comes when you need to break that promise. Maybe there's cost-cutting. Maybe there's a re-architecture. Maybe you're just doing some planned maintenance that you've been putting off. Suddenly, you're at three nines instead of eleventy, and your customers are furious -- not because three nines is unreasonable, but because you violated an expectation they'd built up over time.&lt;/p&gt;
&lt;p&gt;As Niall puts it: even if you smoothly manage the transition, if you fail 100x more often than you previously did, people are going to notice.&lt;/p&gt;
&lt;h3&gt;Why Stating an SLO is Better Than Not Stating One&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://sre.google/sre-book/service-level-objectives/"&gt;The SRE book&lt;/a&gt; makes this point clearly: the business must establish what the availability target is for the system. Not the SRE team, not the platform engineers -- the business, informed by what users will actually tolerate and what makes commercial sense.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://sre.google/in-conversation/"&gt;Ben Treynor's observation&lt;/a&gt; that "100% is the wrong reliability target for basically everything" is foundational here. You're not running a pacemaker. You get to decide what's reasonable, and more importantly, you get to tell people.&lt;/p&gt;
&lt;p&gt;The alternative -- not stating an SLO -- is worse in every way. You've still made a promise; you just don't know what it is. Your customers have inferred one from your historical performance, and it's probably higher than you'd have chosen. You now have all the obligations of a promise without any of the negotiation that should have gone into making it.&lt;/p&gt;
&lt;p&gt;You have to make trade-offs. You can't make trade-offs against a number you haven't decided on. When you decide on that number, you then need to staple it to people's faces and let them emote about it early.&lt;/p&gt;
&lt;h3&gt;The Same Pattern in Roadmaps&lt;/h3&gt;
&lt;p&gt;Product roadmaps suffer from exactly the same dynamic. A product (or platform!) team should not make a promise until they've conducted enough discovery work to understand what's truly required, and the only people who can make such a promise are the team responsible for delivering it. The corollary is stark -- if you're making roadmap commitments before you understand what delivery entails, you're setting up a future betrayal.&lt;/p&gt;
&lt;p&gt;This is the implicit promise problem wearing different clothes. If you show a customer a roadmap with specific features and dates, you've made a promise. If the business context changes -- and it will -- you're now in the position of either delivering something suboptimal because you committed to it, or "breaking your promise" by adapting to reality.&lt;/p&gt;
&lt;p&gt;The fix isn't to hide your roadmap. The fix is to disappoint people early by being explicit about the nature of roadmaps: they're current best guesses, subject to change, and the further out you look, the fuzzier they get. Folks who've learned this lesson put "subject to change" disclaimers on everything, use confidence percentages, and replace hard dates with buckets like "Now," "Next," and "Later." Obviously it's possible to hedge too much, but a certain amount of expectation-setting is healthy.&lt;/p&gt;
&lt;p&gt;When roadmap planning is treated as a rigid forecast, it creates pressure and distrust. When it's treated as a dynamic, communication-first process, it builds trust and momentum -- even when timelines shift. The early disappointment of "this might change" is infinitely preferable to the late disappointment of "you promised."&lt;/p&gt;
&lt;p&gt;Just like with outages, good partners and customers will understand that software and the means by which we make software are terrible, complicated, unpredictable beasts. A good customer and partner should be more interested in your response to an outage than they are in beating you over the head with it. Similarly, a good customer or partner will appreciate you keeping them up to date on delivery and being honest about outcomes, rather than doing deadline gymnastics.&lt;/p&gt;
&lt;h3&gt;Vendor Relationships and the Shared Responsibility Trap&lt;/h3&gt;
&lt;p&gt;There's a particularly insidious version of this problem that plays out when companies move from on-premises infrastructure to PaaS or SaaS, whether for security or for the provision of a dependency. The &lt;a href="https://aws.amazon.com/compliance/shared-responsibility-model/"&gt;cloud shared responsibility model&lt;/a&gt; is supposed to be clear: the provider is responsible for security &lt;em&gt;of&lt;/em&gt; the cloud, while the customer is responsible for security &lt;em&gt;in&lt;/em&gt; the cloud. In practice, it's a mess of unstated expectations.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.cio.com/article/416343/the-top-cloud-security-threat-comes-from-within.html"&gt;Gartner famously predicted&lt;/a&gt; that through 2025, 99% of cloud security failures will be the customer's fault. This sounds like finger-pointing, but it reflects a deeper truth: when companies adopt cloud services, they often expect the vendor to take over responsibilities that were never part of the deal. As a customer, you're intentionally abdicating duty of care about the details, in return for paying money for an outcome. It can be difficult to let go of the parts you've farmed out to your vendor.&lt;/p&gt;
&lt;p&gt;When vendors aren't explicit about what they &lt;em&gt;won't&lt;/em&gt; do, customers fill in the gaps with optimism. So, when something breaks, the disappointment isn't just about the outage -- it's about betrayed expectations that were never actually set. &lt;a href="https://media.defense.gov/2024/Mar/07/2003407863/-1/-1/0/CSI-CloudTop10-Shared-Responsibility-Model.PDF"&gt;The NSA and CISA published guidance in March 2024&lt;/a&gt; specifically because so many organisations were accelerating their cloud journeys "without proper planning and an appreciation for shared responsibilities."&lt;/p&gt;
&lt;p&gt;This creates an ugly dynamic. If you're a vendor and you're not telling customers what failures to expect, they're going to want the full list of all things that can go wrong. Their nervousness translates into demands for breadth-first total coverage of all eventualities. But there are always failures you won't be able to enumerate specifically -- the unknown unknowns, the novel combinations, the things that have never happened before.&lt;/p&gt;
&lt;p&gt;The answer is to disappoint early. Be explicit about what's in scope and what isn't. Document the failure modes you &lt;em&gt;do&lt;/em&gt; know about. Make clear that you can't possibly predict everything, and explain what happens when the unexpected occurs, and why that makes you a safe pair of hands. The discomfort of that conversation now is vastly preferable to a customer discovering during an incident that their assumptions about your service were wrong. It also precludes a lot of demands from those same customers that are driven by vibes and nervousness at not feeling they know enough about potential failure modes and outcomes.&lt;/p&gt;
&lt;h3&gt;Support Response Times: The Hidden SLO&lt;/h3&gt;
&lt;p&gt;Every support team has an SLO, whether they've stated one or not. If you typically respond to tickets within 4 hours, your customers have noticed. They've calibrated their workflows around it. The first time you take 24 hours, they'll be furious -- not because 24 hours is unreasonable in isolation, but because you violated the expectation you accidentally created. This also applies to customers who only open the occasional support request - the feigned surprise that they don't get a 15-minute response when the SLA is an hour (or more) will be familiar to anyone who's been on a customer support rotation. Customers who open tickets all the time will be more familiar with the expected performance because they've been told more recently.&lt;/p&gt;
&lt;p&gt;This is where "disappoint early" becomes almost literal. If you know you can't offer priority support between midnight and 5 AM, say so upfront. If critical issues get faster response than routine questions, publish your priority matrix. If your support capacity means occasional delays during high-volume periods, tell people before they need help, not after they've been waiting.&lt;/p&gt;
&lt;h3&gt;Internal Customers Too&lt;/h3&gt;
&lt;p&gt;This isn't just about external customers. The same dynamic plays out inside your organisation, often more acutely.&lt;/p&gt;
&lt;p&gt;If you're a platform team that offers services internally, your internal customers (the product teams) are forming expectations about your services just as external customers would. If you've been accidentally excellent at something -- say, keeping your deployment pipeline at sub-minute latencies -- you've now made an implicit promise. The moment that slips to five minutes because you've added some necessary safety checks, you'll hear about it.&lt;/p&gt;
&lt;p&gt;The conversation is much easier if you've stated up front: "We target 95th percentile deploy times under 5 minutes." Now there's a conversation to be had. Maybe 5 minutes isn't good enough for some use cases. Maybe it's plenty. But at least everyone knows what they're working with.&lt;/p&gt;
&lt;p&gt;Similarly with stakeholders and leadership. If you're asked "when will this be done?" and you say "soon" or "we're working on it," you've allowed them to fill in their own number. That number will be wrong, and it will be too optimistic. The disappointment arrives later, bigger, and with compound interest.&lt;/p&gt;
&lt;p&gt;Saying "this will take four weeks" might disappoint someone today. But that disappointment is manageable. They can plan around it. They can reprioritise. They can have the argument about whether four weeks is acceptable &lt;em&gt;now&lt;/em&gt;, when there's still time to do something about it.&lt;/p&gt;
&lt;h3&gt;The Discomfort is the Point&lt;/h3&gt;
&lt;p&gt;Stating an SLO feels uncomfortable precisely because it's a commitment. The same is true for publishing a support response time matrix, or putting "subject to change" on a roadmap, or listing the failure modes your SaaS platform &lt;em&gt;won't&lt;/em&gt; protect against. Each of these forces you to confront the reality that you can't be all things to all people. It makes explicit something that was previously implicit and negotiable.&lt;/p&gt;
&lt;p&gt;That discomfort is valuable. It's the discomfort of honesty. It's far preferable to the discomfort of a customer discovering during an outage that your service wasn't as reliable as they'd assumed, or finding out during an incident that their cloud vendor doesn't cover what they thought it did, or learning that the feature they were counting on got deprioritised.&lt;/p&gt;
&lt;p&gt;The core of "disappoint early" is this: small disappointments now prevent large disappointments later. An SLO that seems modest compared to your historical performance might disappoint a customer today. A roadmap that says "this might change" feels less confident than one with firm dates. A vendor contract that explicitly lists what's out of scope seems less comprehensive than one that doesn't mention limitations at all.&lt;/p&gt;
&lt;p&gt;But in each case, the explicit version creates space for honest conversations about what they actually need, what you can actually deliver, and what happens when those don't align. The implicit version creates the illusion of agreement that shatters on first contact with reality.&lt;/p&gt;
&lt;h3&gt;So.&lt;/h3&gt;
&lt;p&gt;If you're running a service without a stated SLO, you've already made a promise -- you just don't know what it is. Your customers have inferred one, and it's probably more optimistic than you'd like.&lt;/p&gt;
&lt;p&gt;If you're sharing a roadmap without caveats about uncertainty, you've created expectations you may not be able to meet. You've also put yourself in a position where you've no way to climb down if the situation changes.&lt;/p&gt;
&lt;p&gt;If you're a vendor who hasn't explicitly documented what failures customers should expect, they're filling in the blanks with assumptions that will turn into accusations when something goes wrong.&lt;/p&gt;
&lt;p&gt;If you're working with stakeholders without explicit expectations about timelines, support responsiveness, or capabilities, you've allowed them to assume. Those assumptions will bite you.&lt;/p&gt;
&lt;p&gt;The fix is the same in every case: have the uncomfortable conversation now. State what you can actually commit to. Document what's out of scope. Put confidence levels on your forecasts. Publish your priority matrix. Be boringly, repetitively explicit about constraints and limitations.&lt;/p&gt;
&lt;p&gt;It's often a good idea to overcompensate when communicating constraints or expectations that may be 'disappointing.' The customer who understands your limitations upfront is a customer who can plan around them. The customer who discovers them during a crisis loses trust, a thing that is hard-won and sometimes impossible to repair.&lt;/p&gt;
&lt;p&gt;It's paradoxical, but the path to being a more reliable partner -- to your customers, your stakeholders, your colleagues -- runs directly through being willing to disappoint them sooner.&lt;/p&gt;</content><category term="Writing"/><category term="writing"/><category term="sre"/><category term="leadership"/></entry><entry><title>Chasing Boring at Just the Right Speed</title><link href="https://log.andvari.net/no-mttr.html" rel="alternate"/><published>2026-01-22T12:00:00+00:00</published><updated>2026-01-22T12:00:00+00:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2026-01-22:/no-mttr.html</id><summary type="html">&lt;h3&gt;Asking the Right Questions&lt;/h3&gt;
&lt;p&gt;A while back when I was looking for a fulltime gig (and when I was contracting, of course), I had the opportunity to do a bunch of interviews (including the kind of informal interviews you do to get a new contracting client). One of my boilerplate …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;Asking the Right Questions&lt;/h3&gt;
&lt;p&gt;A while back when I was looking for a fulltime gig (and when I was contracting, of course), I had the opportunity to do a bunch of interviews (including the kind of informal interviews you do to get a new contracting client). One of my boilerplate questions when asked "Any questions for me?" has always been "What does success look like for the person in this role?". I like to ask it of everyone, even folks who'd be reporting to me or folks far away on the org chart. It gives you a good insight into how stitched together the team is on the important bits, I find.&lt;/p&gt;
&lt;p&gt;In the majority of cases, I tended to get at least one answer that boils down to "Fewer Outages, Lower MTTR".&lt;/p&gt;
&lt;p&gt;"Fewer Outages", to me, is a weird one. I tend to respond to a "There are too many outages" complaint by asking "Well, how many is too many?". Just like &lt;a href="https://sre.google/in-conversation/"&gt;100% is the wrong SLO target for basically everything&lt;/a&gt;, "None" isn't the right target for number of outages. You take a reasonable and data-informed swag and aim for that. In &lt;a href="/6reasons.html"&gt;the vast majority of cases&lt;/a&gt;, you're not running a nuclear power plant or a circulatory system, so you get to really think about what number is right.&lt;/p&gt;
&lt;p&gt;Similarly, for MTTR (Mean Time To Recovery, or 'how long in seconds it takes you to mitigate the effect of an outage') the last while has seen a "race to the bottom". There are entire companies whose premise for what they'll do for you (and charge you a handsome fee, to boot) is lower your MTTR. They'll generally do it using an AI chatbot (and strand your systems knowledge and capability inside that same bot), but that's neither here nor there - the main gist is "MTTR is your problem", and it's a tempting one to latch onto. Easy to measure, easy to be seen to be grumpy about.&lt;/p&gt;
&lt;h3&gt;Doing it the Fastestest&lt;/h3&gt;
&lt;p&gt;The problem is that MTTR, as a primary focus, is a red herring. It's measuring the wrong thing; or at least, it's measuring something that's downstream of what you actually want.&lt;/p&gt;
&lt;p&gt;If you're having the same kinds of outages over and over, getting really fast at recovering from them is like getting really good at bailing water out of a leaky boat. Sure, you're staying afloat, but water coming in is a bad sign, and you built the boat in the first place so should know what's up.&lt;/p&gt;
&lt;p&gt;The real purpose of good incident practice isn't to get fast at recovery. It's to feed back what you learn into your software development lifecycle. Every incident is information. Every outage is telling you something about where your systems, processes, or assumptions are wrong. The goal isn't to recover quickly (though you should); it's to ensure you don't have to recover from the same thing twice.&lt;/p&gt;
&lt;p&gt;If you wanted to be a little more philosophical about it (in terms your boss would probably hate), you could say that Outages (like all feedback) are a Gift. You pick yourself up, dust yourself off, and see what there is to learn from the (sometimes pretty shitty) experience.&lt;/p&gt;
&lt;p&gt;I waxed slightly lyrical recently about &lt;a href="https://blog.cloudflare.com/18-november-2025-outage/"&gt;Cloudflare's writeup&lt;/a&gt; on the outage they had in November 2025. The part that was valuable to me was not a power fantasy about how quickly everyone scrambled and dropped everything and mitigated the outage - in fact, I only checked just now how long the outage lasted (3 hours). I'm not concerned about those 3 hours. Less would be nice, but the real outcome is much more valuable. I'm not even a current Cloudflare customer, but the fact that I know what caused the outage, and I know things have changed means more than a much shorter outage, followed by "Sorry customers, it's very technical, won't happen again (probably)". &lt;/p&gt;
&lt;h3&gt;Doing it The Bestest&lt;/h3&gt;
&lt;p&gt;When I talk to teams about the point of good incident management (primarily retrospectives), I generally say that the point of doing them is twofold. That is:&lt;/p&gt;
&lt;h4&gt;Avoiding Repeats&lt;/h4&gt;
&lt;p&gt;If you're seeing repeat incidents -- the same service falling over the same way, the same capacity issues, the same configuration mistakes -- then your incident response process isn't doing its job, no matter how good your MTTR looks. You're optimising for the wrong end of the pipeline.&lt;/p&gt;
&lt;p&gt;When you focus too hard on MTTR, you create incentives that work against learning. Teams get good at quick fixes and workarounds. They get good at restarting services and clearing queues. They don't necessarily get good at asking "why did this happen, and what systemic change prevents it from happening again?"&lt;/p&gt;
&lt;p&gt;When I worked freelance, I ran into at least one shop that used a managed service partner. Their first instinct was to kick things, seemingly at random. Restart the service. Reboot the machine. Reset the network device.&lt;/p&gt;
&lt;p&gt;Their MTTR numbers were &lt;em&gt;amazing&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Generally when a particular service fell over, it was back within the hour. However, it was falling over every couple of weeks. When I probed for more information on contributing factors, I got crickets. No followups, no logs, no learning. Clearly there's some slider between "mitigate" and "understand" that makes them somewhat mutually exclusive at either end of the slider.&lt;/p&gt;
&lt;h4&gt;Raising All Boats&lt;/h4&gt;
&lt;p&gt;Post-incident reviews (or postmortems, or whatever you're calling them this week) should be producing action items that feed into your roadmap. Not just "add better monitoring" (though sure, maybe), but real changes: fix the race condition, add the circuit breaker, change the deployment process. Do a priority swap so we fix something we've identified as real, instead of adding additional whizz-bang that may be more speculative in terms of how happy it makes customers. For bonus points, involve your customer support friends to make sure what you're inserting into the SDLC is actually going to move the needle for customers.&lt;/p&gt;
&lt;p&gt;The measure of a mature incident response process isn't how fast you recover. It's how rarely you see the same incident twice, and how much of the learning from your outages sticks, in the form of enduring shifts to how you write code and design systems. It's how often your outages are genuinely novel -- new and interesting failures that teach you something new about your systems -- and how much real action you take afterwards to find yourself a better class of first-world problems.&lt;/p&gt;
&lt;h3&gt;So&lt;/h3&gt;
&lt;p&gt;So yes, resolve outages quickly. Absolutely. But don't let MTTR become the thing you're optimising for. The goal is to build systems and processes where you're constantly learning and improving, not systems where you're just really efficient at fighting the same fires over and over.&lt;/p&gt;
&lt;p&gt;Your incidents should be novel, and retro outcomes should be real. If they're not, your incident process is failing you, regardless of what your status page says.&lt;/p&gt;</content><category term="Writing"/><category term="sre"/><category term="incidents"/><category term="leadership"/></entry><entry><title>Critical Thinking and DEI</title><link href="https://log.andvari.net/critical-thinking.html" rel="alternate"/><published>2025-02-16T00:00:00+00:00</published><updated>2025-02-16T00:00:00+00:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2025-02-16:/critical-thinking.html</id><summary type="html">&lt;p&gt;Part of my day-job as head of Engineering at Google Ireland was attending talks and events/Q&amp;amp;A for local interest groups like Engineers Ireland. I was reminded today of &lt;a href="https://www.engineersireland.ie/News/revealed-only-one-in-eight-engineers-is-a-woman-engineers-ireland-finds"&gt;a panel I was on around diversity in engineering in Ireland&lt;/a&gt;, and my allegedly 'fascinating' comments at it. That's not …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Part of my day-job as head of Engineering at Google Ireland was attending talks and events/Q&amp;amp;A for local interest groups like Engineers Ireland. I was reminded today of &lt;a href="https://www.engineersireland.ie/News/revealed-only-one-in-eight-engineers-is-a-woman-engineers-ireland-finds"&gt;a panel I was on around diversity in engineering in Ireland&lt;/a&gt;, and my allegedly 'fascinating' comments at it. That's not the part I remember most vividly, though.&lt;/p&gt;
&lt;p&gt;After the presentation of the report, there was a Q&amp;amp;A session, with some good but not unusual questions. At one point, someone up the back raised a hand, and asked:&lt;/p&gt;
&lt;p&gt;"What is Critical Thinking?"&lt;/p&gt;
&lt;p&gt;The panel kind of went a bit quiet, but the cogs and wheels were turning in my head. It was exactly the right question at the right time, and a lot of stuff in that very moment fell into place in my head. Probably one of the best Q&amp;amp;A questions I'd ever seen, let alone had the privilege to answer.&lt;/p&gt;
&lt;p&gt;Critical thinking is the ability to inspect a problem from different perspectives, and most crucially, to be able to reject hypotheses without bias. You need to be able to take your bright idea, and discard it in place of a better one, without attachment. To do it, you need people who have different frames of reference, who have had different 'crucible moments' and who have different contexts culturally that allow us to respectfully disagree and offer a broad set of views.&lt;/p&gt;
&lt;p&gt;Homogeneous teams simply can't do it. We like to imagine we can transcend our personal experiences, biases and ego, but it simply isn't the case. That, to me, is what came to form the core of my thoughts on Diversity, Equity and Inclusion (or at least, the Diversity part). It forms a key part of my philosophy on building good teams. You simply can't build a good team from folks who look, think, and have experienced the same. Homogeneous teams are weak teams.&lt;/p&gt;
&lt;p&gt;Another example: much earlier in my career, I was part of a small group in a leadership training tasked with coming up with a simple means of telling each other the number on a price tag without speaking. One of the team asked "How do we say a comma?". Myself and another Irish person in our group didn't see the point; just spell out the digits, why put in commas?&lt;/p&gt;
&lt;p&gt;Turns out in big parts of continental Europe, price tags have a comma between the euro amount and the cents, whereas in Ireland there's a period. This may seem like a facile example, but these are the sort of small assumptions that get made every day that lead to accessibility issues, bias and poor quality outcomes in products you build.&lt;/p&gt;
&lt;p&gt;There are a lot of things happening right now that make this a hot button issue; I care deeply about the social justice aspects of the naked white supremacy, homophobia and xenophobia that informs the current US administration's policy shifts. My friends are being hurt.&lt;/p&gt;
&lt;p&gt;However, there's another aspect to pushing back against this wrong-headed, unjust and deeply stupid shift, and it reminded me of something I got to say to James Damore in the brief time he was at Google after his stupid-ass memo (this got me put on an external list of 'SJWs', a point of particular pride for me).&lt;/p&gt;
&lt;p&gt;I told him, "You're not just wrong, you're &lt;strong&gt;incorrect&lt;/strong&gt;."&lt;/p&gt;
&lt;p&gt;As well as the plain prejudice, racism and homophobia that informs the current policy shifts, it's also worth pointing out that they simply weaken capability. We're being asked to build weaker teams, with narrower mindsets and much less interesting approaches to work.&lt;/p&gt;
&lt;p&gt;While we're resisting and addressing these changes as leadership, remember that pointing out real, tangible risks to business and capability is part of the arsenal.&lt;/p&gt;</content><category term="Writing"/><category term="writing"/><category term="leadership"/><category term="dei"/></entry><entry><title>Believable AMAs for Genuine Leaders</title><link href="https://log.andvari.net/ama.html" rel="alternate"/><published>2025-01-13T00:00:00+00:00</published><updated>2025-01-13T00:00:00+00:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2025-01-13:/ama.html</id><summary type="html">&lt;h3&gt;Communication Scales with Scale&lt;/h3&gt;
&lt;p&gt;Communicating with your team scales up as your team grows - it's an experience not unlike the individual engineer to senior engineer to &lt;a href="https://www.oreilly.com/library/view/the-staff-engineers/9781098118723/"&gt;staff engineer&lt;/a&gt; path. As well as scaling up in terms of impact, the very method of how you do it changes.&lt;/p&gt;
&lt;p&gt;You &lt;em&gt;very&lt;/em&gt; quickly …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;Communication Scales with Scale&lt;/h3&gt;
&lt;p&gt;Communicating with your team scales up as your team grows - it's an experience not unlike the individual engineer to senior engineer to &lt;a href="https://www.oreilly.com/library/view/the-staff-engineers/9781098118723/"&gt;staff engineer&lt;/a&gt; path. As well as scaling up in terms of impact, the very method of how you do it changes.&lt;/p&gt;
&lt;p&gt;You &lt;em&gt;very&lt;/em&gt; quickly grow out of being able to do a full matrix of 1:1 conversations, likely before you even go beyond single digit numbers of people. Even with a smaller team, this method can get pretty exhausting for all involved quickly. It's also pretty lossy - rather than actively tracking what you've communicated to each person, it's not unlikely that you'll assume you've said something about team direction, strategy, etc to everyone, when you might not have. It's possible to track this more exactly, but it's probably best not to. This is one instance where after you open a new spreadsheet, it's probably time to have a quiet word with yourself and see if you need to change the method.&lt;/p&gt;
&lt;p&gt;Next in the toolbox is usually the ubiquitous stand-up or team meeting - a much better way to make sure you've let the whole team know about something, and ideally to give them the chance to clarify anything. This is probably the most common way to 'filter' down either communications from the broader organisation, or directions just for this team.&lt;/p&gt;
&lt;p&gt;Once you scale beyond a team, things get tricky. You want to filter a message to all of these folks, but also give them the opportunity to ask questions and fill in the blanks they might have, that often you haven't thought of. Google's &lt;a href="https://medium.com/@nareshnavinash/googles-tgif-meetings-50f03a4f0403"&gt;TGIF&lt;/a&gt; meetings used to do this; I attended a number of these in person when I'd visit HQ, and seeing Larry and Sergey being surprised by questions was pretty routine. This isn't a bad thing -- you're not going to think of everything, and often you'll miss things that will be brought up by thoughtful questions. The answers here were usually pretty frank; this changed later on when TGIF answers got pretty routinely leaked, often in real-time.&lt;/p&gt;
&lt;p&gt;When does it become time to do something similar for your own team? In my case, it was when I started managing teams in various different offices - I had teams first in Dublin and California, and later NYC, Sydney and Seattle. While I did do my fair share of aeroplane time, it quickly became unworkable to do local &lt;a href="https://www.indeed.com/hire/c/info/how-to-conduct-town-hall-meetings"&gt;"town hall meetings"&lt;/a&gt;, even virtually. I still did them when I visited offices, but I wanted these to be in addition to folks getting all the information they needed day-to-day.&lt;/p&gt;
&lt;h5&gt;Tip: Perception is Reality&lt;/h5&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If you put a message out to a group of people, and most of them think you meant a certain thing, then that's what you meant.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a common pitfall for leaders of all levels - I've seen everyone from line-level managers to VP+ everywhere make the mistake of communicating carelessly, and not following up when it becomes clear that folks misunderstood what they meant. There's a hard problem here for most leaders. You're a professional communicator, you're good at this, so it's very tempting to think something like "Well, if folks misunderstood then that's not my problem".&lt;/p&gt;
&lt;p&gt;What just happened is that you said words and people parsed those words into meaning, and your words failed to get the meaning across. In fact, they got another meaning across that you didn't &lt;em&gt;mean&lt;/em&gt; to, making that the reality for those people. If a senior person shows up and says something, and most of the team comes away thinking "Dave will fire us if we make mistakes", then that's what you said. You've got work to do, or learning to do, or ideally both. Don't let hubris make you skip this part.&lt;/p&gt;
&lt;h3&gt;Communicating a Thing Starts with Communicating It&lt;/h3&gt;
&lt;p&gt;How you actually make decisions is beyond the scope of this article - let's assume for a moment you've done a good job at having all the right folks do all the right things to make a decision, charter a project, or otherwise determine a path forward.&lt;/p&gt;
&lt;p&gt;Perhaps you've gotten a few leads together for a day or two and determined your roadmap for a year, or decided to make some team changes, or something of that ilk. You've spent hours internalising all the factors involved, and ideally used good judgement and data to come to a decision. Why would you expect folks on your team outside the room to immediately understand why a decision is made, just because you announced it? That's assuming you announce it at all.&lt;/p&gt;
&lt;p&gt;The primary mistakes I've seen people make when it comes to communicating direction and change are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Not communicating the decision, direction or change &lt;em&gt;at all&lt;/em&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is &lt;em&gt;extremely&lt;/em&gt; common. It is a shockingly common assumption by rooms full of smart people that knowledge and context hard-won in that room are somehow magically transplanted to everyone affected by the outcome. I try to dedicate some "How are we communicating this?" time at the end of leadership get-togethers for this exact reason. There is real, difficult work in communicating things to folks; step one is acknowledging this and planning for it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Communicating a decision or change, without providing any means of asking for details, or otherwise questioning it.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Every organisation differs in how decisions are made. On one extreme, there's the autocratic "One person makes all decisions" model. On the other end is the apocryphal "Absolute Consensus on all decisions, always" model. Chances are you're somewhere in the middle. &lt;em&gt;Someone&lt;/em&gt; on the team isn't going to be happy with any given direction or decision. They're going to have some opinions and (usually loaded) questions. You want to put yourself in a position where these questions get answered, ideally in public. Otherwise people will answer these questions themselves. You're probably not going to like those answers. There's really no downside to inviting questions and comment on directions and decisions, yet I've seen many otherwise smart folks think they can quietly implement a decision and maybe nobody will mind or notice.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Not owning the decision&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It's always tempting to defer to some external factor (deadlines, budget, leadership) when saying why a decision was made. This is a tricky balance, because while most good leaders don't like deferring to "Because I/we say so" as the reason a direction is being taken, that's ultimately what's happening. Yes, there are always contributing factors, but you've taken those into account to make a high-quality decision, which you should stand over as your own. If external factors seem to force your hand in all directions you take, then you're not really making decisions at all. Again, perception is reality. People are smart and will spot this right away.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h5&gt;Tip: Timing and Heads-Up(s)&lt;/h5&gt;
&lt;p&gt;As a conscientious leader or manager, you are likely highly practised at the art of delayed gratification. Most people aren't. If you're announcing something, 'fast follow' doesn't mean in a week or two when you can schedule a Q&amp;amp;A/AMA session. Let your ability to schedule this kind of session inform when you announce. Don't announce things on Friday and do the session on Monday; people will spend the weekend inventing answers to their own questions that will be way more compelling than any boring sensible answers you might eventually give.&lt;/p&gt;
&lt;p&gt;If there are folks who need to know about something in advance, because they're key stakeholders, or simply because people will ask them about it first and they don't like surprises, do give them a heads-up. If you're worried about leaks, then that's a good instinct. Do this as late as is practical. Things will leak, accept this as a fact of life and plan accordingly.&lt;/p&gt;
&lt;h3&gt;Making it Real: The AMA&lt;/h3&gt;
&lt;p&gt;This section sets out tips on AMA ("Ask Me Anything") sessions. These can be held in relation to a specific announced thing, or be semi-regular. I've often made a point of doing these regularly (and per-timezone, for folks at an offset time from me).&lt;/p&gt;
&lt;h5&gt;Tip: A note on 'Corp Speak'&lt;/h5&gt;
&lt;p&gt;The jig is up, corp-speakers. Everyone knows you're doing it. You haven't cracked the ability to make people believe you through clever use of words.&lt;/p&gt;
&lt;p&gt;One thing that increased numbers of virtual town-halls have given us is the ability to leave without tripping over chairs or being noticed, and the ability to watch attendee numbers in real-time. I've personally watched viewer numbers drop off a cliff on these kinds of calls when a particularly 'polished' answer comes out. This is a shame, as it drives disengagement from the message coming from leadership, and even if the rest of the content is credible and sincere, this can derail people's engagement with the decisions or directions being communicated. At &lt;em&gt;best&lt;/em&gt;, folks will just think less of the corp-speaker.&lt;/p&gt;
&lt;p&gt;One clue that this is not the True Way is that most CEOs are not corp-speakers when they're at home. In a company-private setting they are generally quite open and communicate as openly as they can. Generally when they defer a question to a corp-speaker, it's because they need protection from Saying the Wrong Thing, or smell a Trap. More on Traps later.&lt;/p&gt;
&lt;h4&gt;Setting Expectations: The '3 Answers' Model&lt;/h4&gt;
&lt;p&gt;I've seen several variants of this model of AMA, so I can't take any credit for it. However, it has worked for me for many years across teams and orgs.&lt;/p&gt;
&lt;p&gt;The basic model (which you should absolutely tell attendees in advance) is that attendees can ask anything. The answer, however, can take one of 3 forms.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;I answer the question to the best of my ability.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;I tell you that I don't know.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;I tell you why I'm not telling you, to the best of my ability.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;First, you'll notice there's no "I will bloviate for a few minutes and hope you look bewildered and stop asking" option. The intention is that the answers are credible and concise, and engage with the question in good faith. &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Option #1&lt;/strong&gt; is straightforward, and ideally is the option you take most of the time. Answer the question. Be prepared for follow-ups. If you've done your homework, this is the easy option. If someone's pointing out something you haven't thought of, say you haven't thought of that, and offer to follow up with them. Congratulations! You've successfully found one of the most tangible outcomes of holding these sessions: a high-value connection with an engaged team member.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Option #2&lt;/strong&gt; is just as straightforward on paper. It's not an answer many leaders like to give, however, as the pageantry innate to most corporate hierarchies generally assumes the senior person in the room to be all-knowing. What you're actually doing here is letting people know that you've really heard the question and you're not going to bullshit them by guessing. The ideal answer usually takes the form of "I don't know, but let me find out and get back to you", or "I don't know, let's talk after and we can find out".&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Option #3&lt;/strong&gt; is the tricky one. The premise of giving this kind of answer is that it's more credible to say you're not answering than to use smoke and mirrors to try to steer people away from the subject. Some common reasons not to answer a question are that it's information private to an individual, it's information that's privileged (i.e. "I could answer but it would make you all &lt;a href="https://www.investopedia.com/terms/i/insidertrading.asp"&gt;Insiders&lt;/a&gt; and you probably don't want that"), or that it's information where answering would be worse for team cohesion than assuaging some folks' curiosity. I've been asked questions in AMAs related to individual performance of folks on the team, which are straightforward "that's not my information to give" answers. I've also had 'escalations' of decisions made at the team level, where it's appropriate to say something like "I'm not here to overrule decisions you don't like, someone else owns this decision and it's been devolved to them entirely".&lt;/p&gt;
&lt;h5&gt;Tip: A Note on Traps&lt;/h5&gt;
&lt;p&gt;You're likely going to run into questions that are designed to make you say something that'll either get you in "trouble", or make you take a firm position on something you don't need to. Always ask clarifying questions if you feel things are a bit vague.&lt;/p&gt;
&lt;p&gt;I once joined a team where in my first regular AMA, I was asked "What is your opinion on &lt;a href="https://en.wikipedia.org/wiki/Monorepo"&gt;monorepos&lt;/a&gt;?". This may seem like a benign question on its face, but the fact that someone chose to ask that means that chances are there's a monorepo holy war happening somewhere on the team and I'm being asked to weigh in. My answer was essentially "I will not referee your nerd fight", and this was Option #3 above. &lt;/p&gt;
&lt;p&gt;On a more serious note, I've also been asked if a new female exec on my team is "able" to take on the teams she was hired to take on. Again, I could very easily (and emphatically) answer such a question, but it was more valuable to take Option #3 and say something along the lines of "I'm not answering that because I know what you're doing, knock it off."&lt;/p&gt;
&lt;h5&gt;Tip: Anonymous Questions&lt;/h5&gt;
&lt;p&gt;I used to take anonymous questions, but don't do that any more. You get more questions that folks would otherwise not ask, but the quality can vary massively (the above question about a female exec was anonymous, in case it wasn't obvious). What worked instead was to nominate some folks in the organisation as proxies for folks who weren't comfortable asking questions themselves, for whatever reason. These proxies weren't necessarily very senior folks, just people on the team who were considered safer options. The reasons for wanting to proxy a question don't really matter; it can be tempting for a leader to say "No anonymous or proxied questions because this is a safe space", but that's not strictly true. Also, perception is reality, as we've established.&lt;/p&gt;
&lt;h3&gt;Answers Aren't just Answers&lt;/h3&gt;
&lt;p&gt;A side-effect of some of these answers you might give (especially Options #2 and #3) is to let people know your style, and your mind on things. The quality and timbre of questions tends to adjust over time. This means that AMA questions are 'messier' than curated FAQs or simple top-down announcements. This can be a little intimidating, but is an investment in your credibility as a leader, and in your team understanding your mind on things. Whether we like it or not, teams tend to quietly imitate the style of their leadership. Over time, a culture of openness and of consequence-free questioning of the status quo more than makes up for any initial (or ongoing!) awkward moments. Like any investment, the key is to stick with it.&lt;/p&gt;</content><category term="Writing"/><category term="writing"/><category term="leadership"/></entry><entry><title>Plus Ça Change Management: 20 Years of SRE</title><link href="https://log.andvari.net/plus-ca-change-management-20-years-of-sre.html" rel="alternate"/><published>2024-07-19T10:54:00+01:00</published><updated>2024-07-19T10:54:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2024-07-19:/plus-ca-change-management-20-years-of-sre.html</id><summary type="html">&lt;p&gt;On July 19th 2004, I spent my first day at Google. I showed up at the Datacenter in Dublin, since that's apparently where SREs and production folks were going to sit. There wasn't a corp network connection, because this was Google in 2004. A couple of days later, I moved …&lt;/p&gt;</summary><content type="html">&lt;p&gt;On July 19th 2004, I spent my first day at Google. I showed up at the Datacenter in Dublin, since that's apparently where SREs and production folks were going to sit. There wasn't a corp network connection, because this was Google in 2004. A couple of days later, I moved my stuff to Barrow Street and commandeered a desk. 
A few weeks after that, someone from facilities asked "Hey, isn't your team based in the DC?". "Nope", I said. They shrugged, and updated some spreadsheets, and then SRE was based in Barrow Street. Again, this was Google in 2004. We were making it up as we went along, in the best possible way.&lt;/p&gt;
&lt;p&gt;I could claim I was doing SRE-adjacent things before this, as I suspect could many people, but I'm going with that day as when I became an SRE. The function was still figuring itself out; in many ways it still is. My work with Busy Teams as a freelancer isn't "SRE work" on paper, but in practice, it very much is. Resilience, defense in depth, common sense, backup plans for backup plans. Across industry, SRE/Reliability/Devops/ProdEng/Whatevsies is many things to many people. So, in the spirit of the core of the function, I've been thinking a little about uncomfortable gaps in capability. What have we not figured out yet?&lt;/p&gt;
&lt;p&gt;Here's my short list of boiling hot takes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Most companies haven't figured out how much they steady-state care about reliability. I expand on this in &lt;a href="https://log.andvari.net/6reasons.html"&gt;"6 Reasons you Don't need an SRE Team"&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Assuming we've figured out that we care, we still can't agree how hard a problem this is. SRE (and traditional ops) begat DevOps, which at its core has the premise that busy developers could side-gig the production bits, and you can just wave your hand and say "You build it, you run it" and that'll cover you. This isn't true today, and was even less true when 'DevOps' started being a thing. If I squint my eyes, I can see a course correction happening with "Production Engineering", which has as its premise a much more sensible acknowledgment that this whole area is hard (as in, requires smartness, innovation, and Real Engineering(tm)) as opposed to difficult (as in, it's boring and I don't want to do it). These are glacial shifts; it's taken more than 20 years for us to go in this circle back to acknowledgment that this is a real and specialised set of problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We actually run less and less of our own infra, and practices need updating. For a brief period before AI was suddenly what all startups are about, there were (and still are!) great startups producing big parts of your tool-chain as SaaS/PaaS. This is great for not having to build in in-house expertise in that particular area, but can leave you dead in the water if you don't have a good strategy around vendor management. I spoke about this a bit at &lt;a href="https://www.usenix.org/conference/srecon23emea/presentation/panel-saas"&gt;SRECon EMEA 2023&lt;/a&gt;, and I do feel like a lot of our practices involve declaring an outsourced part of our tool-chain to be an opaque cuboid, and then not having defense in depth for when it goes away (temporarily or permanently). Many of the SRE practices as set out in various books/articles kind of assume you own your whole stack. This is becoming mostly untrue, and we are currently a Frog of Moderately Troubling Temperature here.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Anyway. It's Friday, it's 22 degrees outside in Dublin, and it's time to go enjoy the next 20. No shortage of things to do.&lt;/p&gt;</content><category term="Writing"/><category term="log"/><category term="work"/><category term="google"/><category term="sre"/></entry><entry><title>Production, SRE and the Architecture of the Built Environment</title><link href="https://log.andvari.net/production-sre-and-the-architecture-of-the-built-environment.html" rel="alternate"/><published>2024-02-15T16:23:00+00:00</published><updated>2024-02-15T16:23:00+00:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2024-02-15:/production-sre-and-the-architecture-of-the-built-environment.html</id><summary type="html">&lt;p&gt;Recycling old talk proposals for fun and...that's it!&lt;/p&gt;
&lt;p&gt;Here's an amalgamation of the proposal text and some notes from a talk I proposed for SRECon EMEA last year -- I'll be talking about this to the Dublin SRE meetup in a couple of weeks, so here it is while the …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Recycling old talk proposals for fun and...that's it!&lt;/p&gt;
&lt;p&gt;Here's an amalgamation of the proposal text and some notes from a talk I proposed for SRECon EMEA last year -- I'll be talking about this to the Dublin SRE meetup in a couple of weeks, so here it is while the dust is dusted off.&lt;/p&gt;
&lt;p&gt;&lt;a href="/pages/sre-and-architecture.html"&gt;Production, SRE and the Architecture of the Built Environment&lt;/a&gt;&lt;/p&gt;</content><category term="Writing"/><category term="log"/><category term="writing"/></entry><entry><title>6 Reasons You Don't Need an SRE Team</title><link href="https://log.andvari.net/6reasons.html" rel="alternate"/><published>2023-06-21T05:40:00+01:00</published><updated>2023-06-21T05:40:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2023-06-21:/6reasons.html</id><summary type="html">&lt;p&gt;The last several years have seen a huge upsurge in the popularity of the DevOps/SRE/Production Engineering model, with companies large and small adopting some of the practices and mindsets. One of the principal lessons many of these organisations (hopefully) learned was that it's close to impossible to adopt …&lt;/p&gt;</summary><content type="html">&lt;p&gt;The last several years have seen a huge upsurge in the popularity of the DevOps/SRE/Production Engineering model, with companies large and small adopting some of the practices and mindsets. One of the principal lessons many of these organisations (hopefully) learned was that it's close to impossible to adopt the SRE model as described by Google in the the first &lt;a href="https://sre.google/sre-book/table-of-contents/"&gt;SRE book&lt;/a&gt;. It's a good approach to take on board the parts of the book that work for you, and to actively triage your time, energy and effort.  &lt;/p&gt;
&lt;p&gt;However, one premise doesn't see a lot of investigation or introspection -- that is, whether you should even have an SRE team at all. The existence of SRE (or "DevOps", or "Production Engineering", or "Platform Trust", or any of the other taxonomic manoeuvres I've seen) is still treated somewhat as a given.&lt;/p&gt;
&lt;p&gt;This can be for a number of reasons -- there's a pre-existing "Operations" team, there's a strong leadership advocate, or the skills exist in the organisation and it's considered good to consolidate them. It can also often be for less good reasons: pure organisational momentum, a desire to silo-ise toil, or reactionary reasons.&lt;/p&gt;
&lt;p&gt;The arguments for having an SRE team can often seem obvious: Everyone likes having reliable systems; it's hard to argue against that. Everyone also likes someone else doing the hard bits. So, let's cover the counterpoint: 6 Reasons you shouldn't have an SRE team.&lt;/p&gt;
&lt;h3&gt;1. You're not Google&lt;/h3&gt;
&lt;p&gt;Even though Google's approaches to parts of the problem space are battle-tested and can help you with your approach, you're still not Google. You're not even Google in 2004.  &lt;/p&gt;
&lt;p&gt;As someone who was at Google in 2004, I will let you in on the secret sauce that made a large SRE team at Google a good idea.&lt;/p&gt;
&lt;h4&gt;Intractable Problems&lt;/h4&gt;
&lt;p&gt;Two factors drove the near-intractability of the situation Google found itself in around this time: Scale and Innovation. These sound like lovely soundbites, but the truth of the matter was that nobody was doing anything even remotely like what Google were doing at this time. Nobody had the same requirements.&lt;/p&gt;
&lt;p&gt;In my last gig before Google, we were using &lt;a href="https://www.nagios.org/"&gt;nagios&lt;/a&gt; and &lt;a href="https://sourceforge.net/projects/mon/"&gt;mon&lt;/a&gt;. Neither of these sharded in any way that would work for us; they were designed for monitoring well-understood things about a few hundred hosts. At the time, Google had a couple of hundred thousand hosts, and were adding thousands more per month. Nothing in the OSS world, or that you could buy, was even going to come close. There was no Prometheus, no Docker, no Terraform, and these wouldn't exist in any meaningful way for another 10+ years. &lt;/p&gt;
&lt;p&gt;We had to build, deploy and somehow keep together things at a scale and complexity (relative to the tools available) that we had never before seen in our lifetimes and likely will never see again. This meant that simply hiring buildings full of operators wasn't going to work at all. This was plainly obvious -- even back of envelope calculations showed a scaling vector that meant we'd run out of humans in pretty short order.&lt;/p&gt;
&lt;h4&gt;Unlimited Money&lt;/h4&gt;
&lt;p&gt;I'm not a finance person, but it's difficult to express just how much Google was able to invest around the software ecosystem. Everything from the hardware and cooling to the monitoring software and job scheduler was built in-house. This was mostly because of the "Intractable Problems" thing, but also because we could afford to. In many cases, we'd be fools not to; the industry in certain areas wasn't moving fast enough, or even in the right direction for us. Compartmentalising the big-picture reliability problems into a group that we could build or hire the best SREs on the planet into was a no-brainer.&lt;/p&gt;
&lt;p&gt;To briefly get off the Google fanboy train and go back to the crux of the point: You're not Google. I don't mean that your product isn't a great product, or that you don't have real problems. I'm saying that the criteria you use to evaluate whether you need a full-blown SRE presence should be specific to you. In your checklist of things to potentially adopt from Google’s model, include an item about having a separate team at all.&lt;/p&gt;
&lt;p&gt;(Also, you're not Google in 2004. If I want a reliable sharded database with global consistency today, I can buy it and have it today, using money. This was not even close to true in 2004.)  &lt;/p&gt;
&lt;h3&gt;2. You don't care that much about reliability.&lt;/h3&gt;
&lt;p&gt;It's very easy to say you care about reliability, and very difficult to figure out and assert how much you actually care. Even if you have a good set of SLIs and SLOs, that's not the whole picture of 'how much' you care. Is reliability so important to you that you need a whole entire team just to care about it? What about testing? Product Research? Customer feedback loops? Why did SRE get to the top of the list?&lt;/p&gt;
&lt;p&gt;It's easy to assert that near-perfect reliability will drive user retention. It's very difficult to be so sure of your product's use case that you can put a number on how much unreliability you can tolerate. Saying "We're not a stock exchange, our users will tolerate 5 minutes of downtime every so often" is a strong thing to be able to assert; stronger and more rigorous than saying you need eleventy nines for your applications at all times.  &lt;/p&gt;
&lt;p&gt;&lt;a href="https://artdiamondblog.com/archives/2013/10/_source_levy_st_19.html"&gt;This quote&lt;/a&gt; from Steven Levy's book on how earlier Google worked epitomises what was heard loud and clear throughout engineering. I once had Eric Schmidt visit my team's weekly meeting (I was on GMail SRE at the time), and the money quote was his response to a question about prioritising reliability vs. product features -- "I will choose reliability, every time". We framed that quote in our team area. Even to the jaded sysadmins and hackers we often saw ourselves as at the time, it was a clear message that reliability was a product feature, all the way from co-founder to CEO to us.&lt;/p&gt;
&lt;p&gt;Is your CEO saying these things? Even behind closed doors? What about when you're under pressure to ship features? &lt;/p&gt;
&lt;p&gt;The extent to which Google really, genuinely cared about reliability and made the company-level investment in it can often be forgotten - think about whether you're there with them.&lt;/p&gt;
&lt;h3&gt;3. You're not sure what the team should do or own.&lt;/h3&gt;
&lt;p&gt;If you’re not able to very succinctly explain and have everyone understand what a team like SRE is there for, then the ambiguity and ability to re-legislate who does what is often quietly seen as an advantage.&lt;/p&gt;
&lt;p&gt;Some places have this right; many places don't. If a team is chartered and has a remit, that remit should be set in stone and the requisite effort put in place to maintain that knowledge. In &lt;a href="https://www.youtube.com/watch?v=zIHVordrtBc"&gt;my talk at SRECon Europe 2022&lt;/a&gt; I go into a bit of detail about stakeholders. It's been my personal experience that stakeholders (i.e. product owners, dev partners, company leadership) either misunderstand or make it their business to misunderstand what an SRE group should be doing.  &lt;/p&gt;
&lt;p&gt;Even at Google (that wonderful flawless bastion of enlightened thought on the matter), I routinely ran into folks at VP level and above who didn't really know where their team's remit ended and SRE began, and really weren't inclined to learn. Free labour is free labour.&lt;/p&gt;
&lt;p&gt;If you're not 100% sure what SRE (or any team, for that matter!) exists for, what it will do, and why it's a separate team, then you should strongly consider whether it being a separate team is a good idea. &lt;/p&gt;
&lt;h3&gt;4. You're doing it to avoid internalising inconvenient truths&lt;/h3&gt;
&lt;h4&gt;Understanding reliability is everyone's job&lt;/h4&gt;
&lt;p&gt;If you, or engineering leads who work in your org don't think so, then hiring a separate team to care about it isn't going to help.&lt;/p&gt;
&lt;p&gt;There is a continuing meme within software that I personally don't understand -- the idea that software can be produced to spec and then disappears into the ether. It existed to an extent in the era of software being shipped on Floppies/CDs/DVDs/tarballs, where the turnaround on bug reports, releases, patches, etc. was measured in development cycles. However, it has somehow survived into the era of there being exactly one running installation of your software that you care about.&lt;/p&gt;
&lt;p&gt;I can understand how a team might shard the effort of day-to-day running of their software; I have less time for the idea that the entire ecosystem of care can be put on another team.&lt;/p&gt;
&lt;h4&gt;Low-value work&lt;/h4&gt;
&lt;p&gt;Reliability engineering is widely viewed as low-value work. Even if leadership is bought in, the meme is extremely prevalent at all levels of product engineering.&lt;/p&gt;
&lt;p&gt;The usual response from many folks in Ops and SRE roles would be to say that it's not -- it's crucial to system reliability and customer trust, and so forth.&lt;/p&gt;
&lt;p&gt;The good news is that you can both be right!&lt;/p&gt;
&lt;p&gt;Further to my &lt;a href="https://www.usenix.org/publications/loginonline/oncall-equal-opportunity-waste-time"&gt;article on oncall being a waste of time&lt;/a&gt;, I'm here to make a humble request that people say this quiet part out loud. If it's the honest assessment of an engineer or leader that work is of low value, that actually forms the basis of a &lt;em&gt;very&lt;/em&gt; interesting conversation to be had about priority and how this toil gets looked after. Some of the most stressed and least valuable SRE engagements I've seen were ones where people were talking past each other, with nobody acknowledging what everyone knew: that nobody wants to do this work, and it should be eliminated. The convenient presence of an SRE team means the conversation can often go around in circles. Speaking of which...&lt;/p&gt;
&lt;h3&gt;5. Your SRE team might be a red herring&lt;/h3&gt;
&lt;p&gt;Further to the above point about low-value work: another anti-pattern which means you should be very sure about your SRE team's remit is that they are a convenient outlet for lack of planning or responsibility on the part of other groups.&lt;/p&gt;
&lt;p&gt;Even in a "you build it, you run it" shop, it's possible to indefinitely delay and defer platform reliability or modernisation work, on the grounds that an SRE team should either be doing this work wholesale, or assisting. It's a very easy position to take, that has the side-effect of deflecting responsibility from a product or service owner onto an SRE group. I've seen cases where the SRE group hasn't even been asked to do the work, this deflecting responsibility onto...nobody.&lt;/p&gt;
&lt;p&gt;Would this happen if you were able to be 100% sure that everyone at your company completely understands SRE's remit and has no issues with it? No!&lt;/p&gt;
&lt;p&gt;Would it happen if you didn't have an SRE team at all? Also No!&lt;/p&gt;
&lt;h4&gt;6. You got a big fright&lt;/h4&gt;
&lt;p&gt;We've established that it's difficult to put your finger on how much you care about reliability. Even if you have done the work and really have a good handle on this, you may be one giant outage away from throwing that all away on optics.&lt;/p&gt;
&lt;p&gt;As a thought experiment: go and corner your nearest C[TE]O and ask them how much they care about reliability. It's probably a lot, right? &lt;/p&gt;
&lt;p&gt;Now, go ask them again the day after a multiple-hour outage. Now it's definitely a lot!&lt;/p&gt;
&lt;p&gt;SRE as a brand is a double-edged sword -- it can be seen as a panacea for reliability problems, when often it's a more intrinsic cultural investment. I have personally seen SRE groups appear almost overnight, or have money/headcount thrown at them because we had some big outages and this is an audacious hail-mary pass of a move that really shows you care about reliability. If you're not careful, you could be telling your customers you care about reliability, while telling your product developers and leadership that they don't have to.&lt;/p&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;The model of large SRE teams covering many services in a vague and nebulous way that's open to repeated re-interpretation is mostly a side-effect of (a) cargo-culting the building of these large groups, or (b) retrofitting SRE/DevOps onto existing groups without the company-wide reliability focus required (or the fortitude to decide you didn't need such a large group to do SRE).&lt;/p&gt;
&lt;p&gt;Most of the reasons Google built a large SRE presence were related to being better-equipped to throw money at all its problems than most companies in the last 100 years, and due to the utter absence of big parts of the software and hardware infrastructure needed to build reliable services.&lt;/p&gt;
&lt;p&gt;The last 20 years have seen enormous advances in the SaaS and infrastructure software space. What was absent is rapidly becoming present; what was esoteric is rapidly becoming generic. To fully realise the SRE model is to reduce complexity, to make rational choices, to buy because you don't need to build.&lt;/p&gt;
&lt;p&gt;The next stage in removing our production training wheels as an industry is to tear down the fence between SRE and Product Engineering, and make rational investments in reliability as a mindset, based on specific needs.&lt;/p&gt;</content><category term="Writing"/><category term="writing"/><category term="sre"/></entry><entry><title>What Would It Take?</title><link href="https://log.andvari.net/what-would-it-take.html" rel="alternate"/><published>2023-05-12T15:48:00+01:00</published><updated>2023-05-12T15:48:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2023-05-12:/what-would-it-take.html</id><summary type="html">&lt;p&gt;One of the most constructive things I've found when facing a difficult work situation is to externalise and write things down -- I do quite well with talking things through with folks, and coaching and mentoring is something I get a lot of energy out of. When it's just me and …&lt;/p&gt;</summary><content type="html">&lt;p&gt;One of the most constructive things I've found when facing a difficult work situation is to externalise and write things down -- I do quite well with talking things through with folks, and coaching and mentoring is something I get a lot of energy out of. When it's just me and my thoughts, the interlocutor is writing, and I often have to push myself to get that done. &lt;/p&gt;
&lt;p&gt;When talking to others in this situation, I'd often frame it in simple "What would it take?" terms. In the case of whether to stick around in a job or stick with a project or idea, this would be "What would it take for you to quit today?", and the vital accompanying question of "What would it take to resolve all this, and to re-commit indefinitely?". In the case of a job, you're either going to quit, or you're not. Spending a lot of time agonising in between does pretty much everyone a disservice, yourself more than anyone. I think it's healthiest to always be going in the direction of one or the other, if you do feel there's work to be done there.&lt;/p&gt;
&lt;p&gt;What I hadn't been realising is that this was a kind of abridged version of the spookily-named &lt;a href="https://www.mindtools.com/adnb7ul/overwhelmed-at-work"&gt;CIA Model&lt;/a&gt; used in formal coaching. Essentially, you're doing a bit of compartmentalising of issues in your own head into "Stuff I can fix", "Stuff I can affect", and "Stuff I just have to put up with". There's a lot to unpack there -- in some cases, there may be bright lines around ethics, capability and sometimes just bad timing that mean you have to self-select out of the situation. You may also decide that something you think you can fix is something you likely shouldn't, or that you need to think about if you should tolerate it long-term.&lt;/p&gt;
&lt;p&gt;Conversely, taking a full checkpoint on what matters to you, and what the real situation is will lead you to clarity on next steps. It'd be a shame to not investigate options for how to move stuff from "I tolerate this" to "I'm okay with it", or better.&lt;/p&gt;
&lt;p&gt;Most people take a little prodding to get there, and I'm no different. If there's no capture of what the sources of stress are and why they matter, we end up having to rely on the Amygdala, the "animal brain". &lt;strong&gt;It is very hard to reason logically when we think we're going to be eaten by a tiger.&lt;/strong&gt; This is why we invented writing.&lt;/p&gt;
&lt;p&gt;So, if you're at a point where you think you're phoning it in, or you're struggling to get the basics done, think about the theory first.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What about this situation do you directly &lt;strong&gt;C&lt;/strong&gt;ontrol?&lt;/li&gt;
&lt;li&gt;What do you not control, but have some &lt;strong&gt;I&lt;/strong&gt;nfluence over the outcome?&lt;/li&gt;
&lt;li&gt;What parts are you going to have to &lt;strong&gt;A&lt;/strong&gt;ccept and deal with?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you're able to successfully explore what the situation on the ground is, the more practical questions that I've worked through with others and for myself are usually the more challenging ones:&lt;/p&gt;
&lt;h6&gt;&lt;strong&gt;What is the issue&lt;/strong&gt; I'm dealing with?&lt;/h6&gt;
&lt;ul&gt;
&lt;li&gt;Do I feel underappreciated, and would a change in my incentives fix it?&lt;/li&gt;
&lt;li&gt;Do I feel like a co-worker is making my life difficult?&lt;/li&gt;
&lt;li&gt;Is my employer doing a kind of business I don't like?&lt;/li&gt;
&lt;li&gt;Am I not seeing enough customer conversions or sales? &lt;/li&gt;
&lt;li&gt;&lt;strong&gt;All of these are valid. They all take up mental energy.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h6&gt;&lt;strong&gt;What would it take&lt;/strong&gt; to resolve the issue and have me re-commit to the idea/project/job/etc. indefinitely? What's missing?&lt;/h6&gt;
&lt;ul&gt;
&lt;li&gt;It costs mental energy to &lt;em&gt;tolerate&lt;/em&gt; a situation, rather than having it be a fact of life that you're truly reconciled with.&lt;/li&gt;
&lt;li&gt;Moving from &lt;em&gt;tolerance&lt;/em&gt; to &lt;em&gt;acceptance&lt;/em&gt; takes change; and it's almost never just rationalising the situation away in your head. Real change involves doing something you can't take back for free.&lt;/li&gt;
&lt;/ul&gt;
&lt;h6&gt;&lt;strong&gt;What would it take&lt;/strong&gt; to give up and change direction completely? What's stopping me?&lt;/h6&gt;
&lt;ul&gt;
&lt;li&gt;This is a challenging rhetorical question, but the actual answer is something you should try to write down. What would need to &lt;em&gt;change&lt;/em&gt; in order for you to say "thus far, and no further" and do something more drastic than putting up with the situation? What's your 'trapdoor'?&lt;/li&gt;
&lt;li&gt;In each of these cases, you should consider making these things known, if you're able to be definite about them. This especially applies to business outcomes, where you might not be the only stakeholder.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This has worked for me in a long career at one employer, and for knowing when I'm able to make changes without compromising on core values, but also to provide an outlet for the Sunday Night Terrors when the Amygdala takes over. The right answer isn't always the most satisfying one, but it's easier once you put (figurative) pen to paper.&lt;/p&gt;</content><category term="Writing"/><category term="work"/><category term="coaching"/></entry><entry><title>Oncall: An Equal-Opportunity Waste of Time</title><link href="https://log.andvari.net/oncall-an-equal-opportunity-waste-of-time.html" rel="alternate"/><published>2022-11-04T21:13:00+00:00</published><updated>2022-11-04T21:13:00+00:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2022-11-04:/oncall-an-equal-opportunity-waste-of-time.html</id><summary type="html">&lt;p&gt;I spent a number of days at &lt;a href="https://www.usenix.org/conference/srecon22emea"&gt;SRECon 2022&lt;/a&gt; in Amsterdam, and gave a talk that I'd had rattling around in my head for a wee while - roughly based on a paper I left at Google when I left.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.twitter.com/lauralifts"&gt;Laura&lt;/a&gt; was good enough to suggest it as a ;login: article …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I spent a number of days at &lt;a href="https://www.usenix.org/conference/srecon22emea"&gt;SRECon 2022&lt;/a&gt; in Amsterdam, and gave a talk that I'd had rattling around in my head for a wee while - roughly based on a paper I left at Google when I left.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.twitter.com/lauralifts"&gt;Laura&lt;/a&gt; was good enough to suggest it as a ;login: article, so here it is: &lt;a href="https://www.usenix.org/publications/loginonline/oncall-equal-opportunity-waste-time"&gt;Oncall: An Equal-Opportunity Waste of Time&lt;/a&gt;.&lt;/p&gt;</content><category term="Writing"/><category term="log"/><category term="writing"/><category term="oncall"/><category term="sre"/></entry><entry><title>Working Deliberately with OKRs</title><link href="https://log.andvari.net/working-deliberately-with-okrs.html" rel="alternate"/><published>2021-06-26T23:40:00+01:00</published><updated>2021-06-26T23:40:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2021-06-26:/working-deliberately-with-okrs.html</id><summary type="html">&lt;p&gt;After being asked by a work colleague to say words about OKRs and if they're a good thing to adopt, I realised I had more words than fit in an email or short doc -- so, here's a slightly longer doc on my view of OKRs, tempered by 15 years or …&lt;/p&gt;</summary><content type="html">&lt;p&gt;After being asked by a work colleague to say words about OKRs and if they're a good thing to adopt, I realised I had more words than fit in an email or short doc -- so, here's a slightly longer doc on my view of OKRs, tempered by 15 years or so of using them fairly successfully, and seeing them used...less successfully. They're one of those "Google does them so let's do them too" type things that seem like a no-brainer, but can often mask the right answer for where your practice is at.&lt;/p&gt;
&lt;p&gt;Anyway, say hello to &lt;a href="/pages/working-deliberately-with-okrs.html"&gt;Working Deliberately with OKRs&lt;/a&gt; (with thanks to &lt;a href="https://www.twitter.com/ahidalgosre"&gt;@ahidalgosre&lt;/a&gt; for casting a critical eye over it). &lt;/p&gt;</content><category term="Writing"/><category term="writing"/><category term="okrs"/></entry><entry><title>Bad Machinery Paper</title><link href="https://log.andvari.net/bad-machinery-paper.html" rel="alternate"/><published>2021-05-19T20:34:00+01:00</published><updated>2021-05-19T20:34:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2021-05-19:/bad-machinery-paper.html</id><summary type="html">&lt;p&gt;Thanks to my accommodating erstwhile colleagues, I'm able to publish the &lt;a href="/pages/bad-machinery.html"&gt;original paper&lt;/a&gt; (with minor redactions so it makes sense for an external audience). A version of this paper later became Chapter 29 of the &lt;a href="https://sre.google/sre-book/table-of-contents/"&gt;SRE Book&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To add a little bit of extra colour, the document came out of …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Thanks to my accommodating erstwhile colleagues, I'm able to publish the &lt;a href="/pages/bad-machinery.html"&gt;original paper&lt;/a&gt; (with minor redactions so it makes sense for an external audience). A version of this paper later became Chapter 29 of the &lt;a href="https://sre.google/sre-book/table-of-contents/"&gt;SRE Book&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To add a little bit of extra colour, the document came out of some lessons learned on Storage SRE (specifically Bigtable/Colossus), which I ran directly back in the early 2010s. The date I have on this document is 2016, but I know it was around long before that.&lt;/p&gt;
&lt;p&gt;Storage SRE back then was something of an anomaly -- there were a number of teams that managed storage services (the above, but also some various other services, some of which were purely internal and never saw the light of day). To amalgamate these services meant that we potentially won a bunch of &lt;strong&gt;OMG synergy&lt;/strong&gt; because of the base assumptions made in design that were unique to services that kept data. Google started out as a search company with a throwaway, re-scrape-able corpus (built on GFS, which wasn't designed to persist data), so it was natural for us to have a couple of goes at stateful services, and see what stuck.&lt;/p&gt;
&lt;p&gt;So, while you might imagine that lumping services of a similar shape together is a good thing to reduce duplication of effort (and you'd be mostly right), other factors include service maturity, generalised attitude toward production, and even come down to the vagaries of the dev/SRE relationship, or of engineering leadership as a whole.&lt;/p&gt;
&lt;p&gt;So around then, we had a Bigtable and Colossus service that generally worked remarkably well. We had challenges in observability, and in developing the tech around load-sharing of users on a shared compute/storage pool. This later became known as running a &lt;a href="https://www.oreilly.com/content/multi-single-tenant-architectures-in-cloud/"&gt;multi-single-tenant&lt;/a&gt; architecture, although at the time we saw it as fairly chaotic - not all the tooling had been done, and some of the assumptions and prior art we have the benefit of today weren't a thing yet.&lt;/p&gt;
&lt;p&gt;While others looked at this as a purely technical problem, myself and a couple of key leads saw this as a problem of focus. Google in its earlier days had faced additional problems of scale like this -- where essentially we could keep digging, or we could put in place a set of attitudes and approaches that enabled us to get out of the hole we were in as far as interrupts were concerned. (Some of these problems fell into the "We will need 5000 people to edit config files" or "We will need more hard drives than humanity is making" variety, which really do need 'creative solutions'.)&lt;/p&gt;
&lt;p&gt;One of the anti-patterns we observed at the time was a tendency among folks on both Dev and SRE teams to concentrate on speedy detection and resolution of issues (the 'Rocks with Eyes' method, as succinctly described at the time). Being able to know when things are busted &lt;strong&gt;is&lt;/strong&gt; important -- but the followup is key. Oncall and interrupts are fundamentally a waste of time and should be minimised, so signing up for extra doesn't seem like a great approach.&lt;/p&gt;
&lt;p&gt;I had a separate paper related to this one called something like 'email alerts are from the past' -- and my real-life experience of that was coaching some key team members through some of the above approaches. We had folks who felt that they needed to keep getting email alerts, as they had trained themselves to spot patterns, and believed they needed to keep doing this or we'd have more outages. You might step back from this a wee bit and recognise that this was a set of people who were inadvertently being profoundly disrespectful of their own time. It's not because they were bad people of course, it's because they were busy people, and firefighting had found its way into their muscle memory.&lt;/p&gt;
&lt;p&gt;In the end (in this example), myself and a senior TL on the team ended up unilaterally turning off email alerts. She sent me the code review, I approved it, done. I'm not a fan of this approach in general, but it had become clear that we weren't going to reach consensus on whether or not it was a good idea (see also the 'consensus is nice' section). In this case, nothing happened. A few folks were briefly sad, but ended up with more time on the clock and more spoons.&lt;/p&gt;
&lt;p&gt;That, to my mind, is the outcome of a successful strategy around interrupt management. The interrupts need to be addressed, and in a timely way -- but you as the team responsible hold the ability to do it in a way that doesn't treat people as machines. In addition, interrupts are not a closed system - the interrupts you're getting a month from now need to be different from the ones you're getting today -- by finding a better set of first world problems.&lt;/p&gt;
&lt;p&gt;But more so, the way you give yourself the ability to step back and really address the root causes in a noisy system is to respect your time, reduce context switches, and give people both the wall-clock time and the spoons to be able to do so.&lt;/p&gt;</content><category term="Writing"/><category term="log"/><category term="writing"/></entry><entry><title>Everybody's Free (To Use Their Best Judgement)</title><link href="https://log.andvari.net/everybodys-free-to-use-their-best-judgement.html" rel="alternate"/><published>2016-09-26T11:22:00+01:00</published><updated>2016-09-26T11:22:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2016-09-26:/everybodys-free-to-use-their-best-judgement.html</id><content type="html">&lt;p&gt;This is something I wrote for new Googlers, drinking from the information hose. It's not formally part of the onboarding process, but enough people email me about it that I suspect it's passed around a lot informally.&lt;/p&gt;
&lt;p&gt;(With Apologies to Mary Schmich)&lt;/p&gt;
&lt;p&gt;&lt;a href="/pages/everybodys-free.html"&gt;Everybody's Free (To Use Their Best Judgment)&lt;/a&gt;&lt;/p&gt;</content><category term="Writing"/><category term="writing"/></entry><entry><title>Conferences for Introverts</title><link href="https://log.andvari.net/conferences-for-introverts.html" rel="alternate"/><published>2016-08-20T17:29:00+01:00</published><updated>2016-08-20T17:29:00+01:00</updated><author><name>Dave O'Connor</name></author><id>tag:log.andvari.net,2016-08-20:/conferences-for-introverts.html</id><content type="html">&lt;p&gt;This is a paper I wrote for consumption of people doing conferences and trainings at Google. I figured it was more generally applicable, so here it is: &lt;a href="/pages/conferences-for-introverts.html"&gt;Conferences for Introverts&lt;/a&gt;&lt;/p&gt;</content><category term="Writing"/><category term="writing"/><category term="conferences"/><category term="introverts"/></entry></feed>