Screaming in the Cloud with Corey Quinn features conversations with domain experts in the world of Cloud Computing. Topics discussed include AWS, GCP, Azure, Oracle Cloud, and the "why" behind how businesses are coming to think about the Cloud.
Evolving Alongside Cloud Technology with Jason McKay
31:35
Jason McKay, Chief Technology Officer at Logicworks, joins Corey on Screaming in the Cloud to discuss how the cloud landscape has changed and what changes are picking up steam. Jason highlights the benefit of working in a consulting role, which provides a constant flow of interesting problems to solve. Corey and Jason also explore why cloud was positioned well for the current economic changes, and how Kubernetes is slowly but surely becoming more standardized. Jason also reveals some of his predictions for the future of cloud-based development.

About Jason
Jason is responsible for leading Logicworks’ technical strategy including its software and DevOps product roadmap. In this capacity, he works directly with Logicworks’ senior engineers and developers, technology vendors and partners, and R&D team to ensure that Logicworks service offerings meet and exceed the performance, compliance, automation, and security requirements of our clients. Prior to joining Logicworks in 2005, Jason worked in technology in the Unix support trenches at Panix (Public Access Networks). Jason graduated Bard College with a Bachelor of Arts and holds several AWS and Azure Professional certifications.

Links Referenced:
Logicworks: https://www.logicworks.com/
LinkedIn: https://www.linkedin.com/in/jasonhmckay/
The Realities of Working in Data with Emily Gorcenski
36:22
Emily Gorcenski, Data & AI Service Line Lead at Thoughtworks, joins Corey on Screaming in the Cloud to discuss how big data is changing our lives - both for the better, and the challenges that come with it. Emily explains how data is only important if you know what to do with it and have a plan to work with it, and why it’s crucial to understand the use-by date on your data. Corey and Emily also discuss how big data problems aren’t universal problems for the rest of the data community, how to address the ethics around AI, and the barriers to entry when pursuing a career in data.

About Emily
Emily Gorcenski is a principal data scientist and the Data & AI Service Line Lead of ThoughtWorks Germany. Her background in computational mathematics and control systems engineering has given her the opportunity to work on data analysis and signal processing problems from a variety of complex and data intensive industries. In addition, she is a renowned data activist and has contributed to award-winning journalism through her use of data to combat extremist violence and terrorism. The opinions expressed are solely her own.

Links Referenced:
ThoughtWorks: https://www.thoughtworks.com/
Personal website: https://emilygorcenski.com
Twitter: https://twitter.com/EmilyGorcenski
Mastodon: https://mastodon.green/@[email protected]

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. My guest today is Emily Gorcenski, who is the Data and AI Service Line Lead over at ThoughtWorks. Emily, thank you so much for joining me today. I appreciate it.

Emily: Thank you for having me. 
I’m happy to be here.

Corey: What is it you do, exactly? Take it away.

Emily: Yeah, so I run the data side of our business at ThoughtWorks, Germany. That means data engineering work, data platform work, data science work. I’m a data scientist by training. And you know, we’re a consulting company, so I’m working with clients and trying to help them through the, sort of, morphing landscape that data is these days. You know, should we be migrating to the cloud with our data? What can we migrate to the cloud with our data? What should we be doing with our data scientists and how do we make our data analysts’ lives easier? So, it’s a lot of questions like that and trying to figure out the strategy and all of those things.

Corey: You might be one of the most perfectly positioned people to ask this question to because one of the challenges that I’ve run into consistently and persistently—because I watch a lot of AWS keynotes—is that they always come up with the same talking point, that data is effectively the modern gold. And data is what unlocks value to your busin—“Every business agrees,” because someone who’s dressed in what they think is a nice suit on stage is saying that. It’s, “Okay, you’re trying to sell me something. What’s the deal here?” Then I check my email and I discover that Amazon has sent me the same email about the same problem for every region I’ve deployed things to in AWS. And, “Oh, you deploy this to one of the Japanese regions. We’re going to send that to you in Japanese as a result.”

And it’s like, okay, for a company that says data is important, they have no idea who any of their customers are at this point, is the takeaway here. How real is, “Data is important,” versus, “We charge by the gigabyte so you should save all of your data and then run expensive things on top of it.”

Emily: I think data is very important, if you know what you’re going to do with it and if you have a plan for how to work with it. 
I think if you look at the history of computing, of technology, if you go back 20 years to maybe the early days of the big data era, right? Everyone’s like, “Oh, we’ve got big data. Data is going to be big.” And for some reason, we never questioned why, like, we were thinking that the ‘big’ in ‘big data’ meant ‘big’ as in volume and not ‘big’ as in ‘big pharma.’

This sort of revolution never really happened for most companies. Sure, some companies got a lot of value from the, sort of, data mining and just gather everything and collect everything and if you hit it with a big computational hammer, insights will come out and somehow those insights will make you money through magic. The reality is much more prosaic. If you want to make money with data, you have to have a plan for what you’re going to do with data. You have to know what you’re looking for and you have to know exactly what you’re going to get when you look at your data and when you try to answer questions with it.

And so, when we see somebody like Amazon not being able to correlate the fact that you’re the account owner for all of these different accounts and that the language should be English and all of these things, that’s part of the operational problem because it’s annoying to try to do joins across multiple tables in multiple regions and all of those things, but it’s also part—you know, nobody has figured out how this adds value for them to do that, right? There’s a part of it where it’s like, this is just professionalism, but there’s a part of it where it’s also like… whatever. You’ve got Google Translate. Figure it out yourself. We’re just going to get through it.

I think that… as time has evolved from the initial waves of the big data era into the data science era, and now we’re in, you know, all sorts of different architectures and principles and all of these things, most companies still haven’t figured out what to do with data, right? 
They’re still investing a ton of money to answer the same analytics questions that they were answering 20 years ago. And for me, I think that’s a disappointment in some regards because we do have better tools now. We can do so many more interesting things if you give people the opportunity.

Corey: One of the things that always seemed a little odd was, back when I wielded root credentials in anger—‘anger,’ of course, being my name for the production environment, as opposed to ‘theory,’ which is what I call staging because it works in theory, but not in production. I digress—it always felt like I was getting constant pushback from folks of, “You can’t delete that data. It’s incredibly important because one day, we’re going to find a way to unlock the magic of it.” And it’s, “These are web server logs that are 15 years old, and 98% of them by volume are load balancer health checks because it turns out that back in those days, baby seals got more hits than our website did, so that’s not really a thing that we wind up—that’s going to add much value to it.” And then from my perspective, at least, given that I tend to live, eat, sleep, breathe cloud these days, AWS did something that was refreshingly customer-obsessed when they came out with Glacier Deep Archive.

Because the economics of that are if you want to store a petabyte of data, with a 12-hour latency on request for things like archival logs and whatnot, it’s $1,000 a month per petabyte, which is okay, you have now hit a price point where it is no longer worth my time to argue with you. We’re just not going to delete anything ever again. Problem solved. Then came GDPR, which is neither here nor there and we actually want to get rid of those things for a variety of excellent legal reasons. And the dance continues.

But my argument against getting rid of data because it’s super expensive no longer holds water in the way that it once did for anything remotely resembling a reasonable amount of data. 
Then again, that’s getting reinvented all the time. I used to be very, I guess we’ll call it, I guess, a data minimalist. I don’t want to store a bunch of data, mostly because I’m not a data person. I am very bad at thinking in that way.

I consider SQL to be the chess of the programming world and I’m not particularly great at it. And I’m also unlucky and have an aura, so if I destroy a bunch of stateless web servers, okay, we can all laugh about that, but let’s keep me the hell away from the data warehouse if we still want a company tomorrow morning. And that was sort of my experience. And I understand my bias in that direction. But I’m starting to see magic get unlocked.

Emily: Yeah, I think, you know, you said earlier, there’s, like, this mindset, like, data is the new gold or data is the new oil or whatever. And I think it’s actually more true that data is the new milk, right? It goes bad if you don’t use it, you know, before a certain point in time. And at a certain point in time, it’s not going to be very offensive if you just leave it locked in the jug, but as soon as you try to open it, you’re going to have a lot of problems. Data is very, very cheap to store these days. It’s very easy to hold data; it’s very expensive to process data.

And I think that’s where the shift has gone, right? There’s sort of this, like, Oracle DBA legacy of, like, “Don’t let the software developers touch the prod database.” And they’ve kind of kept their, like, arcane witchcraft to themselves, and that mindset has persisted. But now it’s sort of shifted into all of these other architectural patterns that are just abstractions on top of this, don’t let the software engineers touch the data store, right? So, we have these, like, streaming-first architectures, which are great. They’re great for software devs. 
And they’re great for data engineers who like to play with big powerful technology.

They’re terrible if you want to answer a question, like, “How many customers did I have yesterday?” And these are the things that I think are some of the central challenges, right? A Kappa architecture—you know, streaming-first architecture—is amazing if you want to improve your application developer throughput. And it’s amazing if you want to build real-time analytics or streaming analytics into your platform. But it’s terrible if you want your data lake to be navigable. It’s terrible if you want to find the right data that makes sense to do the more complex things. And it becomes very expensive to try to process it.

Corey: One of the problems I think I have is that if I take a look at the data volumes that I work with in my day-to-day job, I’m dealing with AWS billing data as spit out by the AWS billing system. And there isn’t really a big data problem here. If you take a look at some of the larger clients, okay, maybe I’m trying to consume a CSV that’s ten gigabytes. Yes, Excel is going to violently scream itself to death if I try to wind up loading it there, and then my computer smells like burning metal all afternoon. But if it fits in RAM, it doesn’t really feel like it’s a big data problem, on some level.

And it just feels that when I look at the landscape of all the different tools you can use for things like this, they just feel like it’s more or less, hmm, “I have a loose thread on my shirt. Could you pass me that chainsaw for a second?” It just seems like stupendous overkill for anything that I’m working with. Counterpoint: the clients I’m working with have massive data farms, and my default response when I meet someone who’s very good at an area that I don’t do a lot of work in is—counterintuitively to what a lot of people apparently do on Twitter—is not the default assumption of oh, “I don’t know anything about that space. 
It must be worthless and they must be dumb.”

No. That is not the default approach to take to anything, from my perspective. So, it’s clear there’s something very much there that I just don’t see slash understand. That is a very roundabout way of saying what could be uncharitably distilled down to, “So, is your entire career bullshit?” But no, it is clearly not.

There is value being extracted from this and it’s powerful. I just think that there’s been an industry-wide, relatively poor job done of explaining that value in ways that don’t come across as contrived or profoundly disturbing.

Emily: Yeah, I think there’s a ton of value in doing things right. It gets very complicated to try to explain the nuances of when and how data can actually be useful, right? Oftentimes, your historical data, you know, it really only tells you about what happened in the past. And you can throw some great mathematics at it and try to use it to predict the future in some sense, but it’s not necessarily great at what happens when you hit really hard changes, right?

For example, when the Coronavirus pandemic hit, purchaser and consumer behavior changed overnight. There was no data in the data set that explained that consumer behavior. And so, what you saw is a lot of these things like supply chain issues, which are very heavily data-driven under normal circumstances, there was nothing in that data that allowed those algorithms to optimize for the reality that we were seeing at that scale, right? Even if you look at advanced logistics companies, they know what to do when there’s a hurricane coming or when there’s been an earthquake or things like that. They have disaster scenarios.

But nobody has ever done anything like this at the global scale, right? And so, what we saw was this hard reset that we’re still feeling the repercussions of today. 
Yes, there were people who couldn’t work and we had lockdowns and all that stuff, but we also have an effect from the impact of the way that we built the systems to work with the data that we need to shuffle around. And so, I think that there is value in being able to process these really, really large datasets, but I think that actually, there’s also a lot of value in being able to solve smaller, simpler problems, right? Not everything is a big data problem, not everything requires a ton of data to solve.

It’s more about the mindset that you use to look at the data, to explore the data, and what you’re doing with it. And I think the challenge here is that, you know, everyone wants to believe that they have a big data problem because it feels like you have to have a big data problem if you—

Corey: All the cool kids are having this kind of problem.

Emily: You have to have big data to sit at the grownups’ table. And so, what’s happened is we’ve optimized a lot of tools around solving big data problems and oftentimes, these tools are really poor at solving normal data problems. And there’s a lot of money being spent on a lot of overkill engineering in the data space.

Corey: On some level, it feels like there has been a dramatic misrepresentation of this. I had an article that went out last year where I called machine-learning selling pickaxes into a digital gold rush. And someone I know at AWS responded to that in probably the best way possible—she works over on their machine-learning group—she sent me a foam Minecraft pickaxe that now is hanging on my office wall. And that gets more commentary than anything, including the customized oil painting I have of Billy the Platypus fighting an AWS Billing Dragon. No, people want to talk about the Minecraft pickaxe.

It’s amazing. It’s, first, where is this creativity in any of the marketing that this department is putting out? But two, it’s clearly not accurate. 
And what it took for me to see that was a couple of things that I built myself. I built a Twitter thread client that would create Twitter threads, back when Twitter was a place that wasn’t overrun by some of the worst people in the world and turned into BirdChan.

But that was great. It would automatically do OCR on images that I uploaded, it would describe the image to you using Azure’s Cognitive Vision API. And that was magic. And now I see things like ChatGPT, and that’s magic. But you take a look at the way that the cloud companies have been describing the power of machine learning and AI, they wind up getting someone with a doctorate whose first language is math getting on stage for 45 minutes and just yelling at you in Star Trek technobabble to the point where you have no idea what the hell they’re saying.

And occasionally other data scientists say, “Yeah, I think he’s just shining everyone on at this point. But yeah, okay.” It still becomes unclear. It takes seeing the value of it for it to finally click. People make fun of it, but the Hot Dog, Not A Hot Dog app is the kind of valuable breakthrough that suddenly makes this intangible thing very real for people.

Emily: I think there’s a lot of impressive stuff and ChatGPT is fantastically impressive. I actually used ChatGPT to write a letter to some German government agency to deal with some bureaucracy. It was amazing. It did it, it was grammatically correct, it got me what I needed, and it saved me a ton of time. I think that these tools are really, really powerful.

Now, the thing is, not every company needs to build its own ChatGPT. Maybe they need to integrate it, maybe there’s an application for it somewhere in their landscape of products, in their landscape of services, in the landscape of their internal tooling. And I would be thrilled actually to see some of that be brought into reality in the next couple of years. 
But you also have to remember that ChatGPT is not something that came because we had, like, a really great breakthrough in AI last year or something like that. It’s stacked upon 40 years of research.

We’ve gone through three new waves of neural networking in that time to get to this point, and it solves one class of problem, which is honestly a fairly narrow class of problem. And so, what I see is a lot of companies that have much more mundane problems, but where data can actually still really help them. Like how do you process Cambodian driver’s licenses with OCR, right? These are the types of things that if you had a training data set that was every Cambodian person’s driver’s license for the last ten years, you’re still not going to get the data volumes that even a day’s worth of Amazon’s marketplace generates, right? And so, you need to be able to solve these problems still with data without resorting to the cudgel that is a big data solution, right?

So, there’s still a niche, a valuable niche, for solving problems with data without having to necessarily resort to, we have to load the entire internet into our stream and throw GPUs at it all day long and spend hundreds of—tens of millions of dollars in training. I don’t know, maybe hundreds of millions; however much ChatGPT just raised. There’s an in-between that I think is vastly underserved by what people are talking about these days.

Corey: There is so much attention being given to this and it feels almost like there has been a concerted and defined effort to almost talk in circles and remove people from the humanity and the human consequences of what it is that they’re doing. When I was younger, in my more reckless years, I was never much of a fan of the idea of government regulation. But now it has become abundantly clear that our industry, regardless of how you want to define industry, how—describe a society—cannot self-regulate when it comes to data that has the potential to ruin people’s lives. 
I mean, I spent a fair bit of my time in my career working in financial services in a bunch of different ways. And at least in those jobs, it was only money.

The scariest thing I ever dealt with, from a data perspective, is when I did a brief stint at Grindr because that was the sort of problem where if that data gets out, people will die. And I have not had to think about things that have that level of import before or since, for which I’m eternally grateful. “It’s only money,” which is a weird thing for a guy who fixes cloud bills for a living to say. And if I say that in a client call, it’s not going to go very well. But it’s the truth. Money is one of those things that can be fixed. It can be addressed in due course. There are always opportunities there. When someone has just been outed to their friends and family and they feel their life is now in shambles around them, you can’t unring that particular bell.

Emily: Yeah. And in some countries, it can lead to imprisonment, or—

Corey: It can lead to death sentences, yes. It’s absolutely not acceptable.

Emily: There’s a lot to say about the ethics of where we are. And I think that as a lot of these high profile, you know, AI tools have come out over the last year or so, so you know, Stable Diffusion and ChatGPT and all of this stuff, there’s been a lot of conversation that is sort of trying to put some counterbalance on what we’re seeing. And I don’t know that it’s going to be successful. I think that, you know, I’ve been speaking about ethics and technology for a long time and I think that we need to mature and get to the next level of actually addressing the ethical problems in technology. Because it’s so far beyond things like, “Oh, you know, there’s a biased training data set and therefore the algorithm is biased,” right?

Everyone knows that by now, right? And the people who don’t know that, don’t care. 
We need to get much beyond where, you know, these conversations about ethics and technology are going because it’s a manifold problem. We have issues where the people labeling this data are paid, you know, pennies per hour to deal with some of the most horrific content you’ve ever seen. I mean, I’m somebody who has immersed myself in a lot of horrific content for some of the work that I have done, and this is, you know, so far beyond what I’ve had to deal with in my life that I can’t even imagine it. You couldn’t pay me enough money to do it and we’re paying people in developing nations, you know, a buck-thirty-five an hour to do this. I think—

Corey: But you must understand, Emily, that given the standard of living where they are, that that is perfectly normal and we wouldn’t want to distort local market dynamics. So, if they make a buck-fifty a day, we are going to be generous gods and pay them a whopping dollar-seventy a day, and now we feel good about ourselves. And no, it’s not about exploitation. It’s about raising up an emerging market. And other happy horseshit lies that people tell themselves.

Emily: Yes, it is. Yes, it is. And we’ve built—you know, the industry has built its back on that. It’s raised itself up on this type of labor. It’s raised itself up on taking texts and images without permission of the creators. And, you know, there’s—I’m not a lawyer and I’m not going to play one, but I do know that derivative use is something that, at least under American law, can be safely done. It would be a bad world if derivative use was not something that we had freely available, I think, on the balance.

But our laws, the thing is, our laws don’t account for the scale. Our laws about things like fair use, derivative use, are for if you see a picture and you want to take your own interpretation, or if you see an image and you want to make a parody, right? It’s a one-to-one thing. 
You can’t make 5 million parody images based on somebody’s art, yourself. These laws were never built for this scale.

And so, I think that where AI is exploiting society is it’s exploiting a set of ethics, a set of laws, and a set of morals that are built around a set of behavior that is designed around normal human interaction scales, you know, one person standing in front of a lecture hall or friends talking with each other or things like that. The world was not meant for a single person to be able to speak to hundreds of thousands of people or to manipulate hundreds of thousands of images per day. It’s actually—I find it terrifying. Like, the fact that me, a normal person, has a Twitter following that, you know, if I wanted to, I can have 50 million impressions in a month. This is not a normal thing for a normal human being to have.

And so, I think that as we build this technology, we have to also say, we’re changing the landscape of human ethics by our ability to act at scale. And yes, you’re right. Regulation is possibly one way that can help this, but I think that we also need to embed cultural values in how we’re using the technology and how we’re shaping our businesses to use the technology. It can be used responsibly. I mean, like I said, ChatGPT helped me with a visa issue, sending an email to the immigration office in Berlin. That’s a fantastic thing. That’s a net positive for me; hopefully, for humanity. I wasn’t about to pay a lawyer to do it. But where’s the balance, right? And it’s a complex topic.

Corey: It is. It absolutely is. There is one last topic that I would like to talk to you about that’s a little less heavy. And I’ve got to be direct with you that I’m not trying to be unkind, but you’ve disappointed me. 
Because you mentioned to me at one point, when I asked how things were going in your AWS universe, you said, “Well, aside from the bank heist, reasonably well.”

And I thought that you were blessed with something I always look for, which is the gift of glorious metaphor. Unfortunately, as I said, you’ve disappointed me. It was not a metaphor; it was the literal truth. What the hell kind of bank heist could possibly affect an AWS account? This sounds like something out of a movie. Hit me with it.

Emily: Yeah, you know, I think in the SRE world, we tell people to focus on the high probability, low impact things because that’s where it’s going to really hurt your business, and let the experts deal with the black swan events because they’re pretty unlikely. You know, a normal business doesn’t have to worry about terrorists breaking into the Google data center or a gang of thieves breaking into a bank vault. Apparently, that is something that I have to worry about because I have some data in my personal life that I needed to protect, like all other people. And I decided, like a reasonable and secure and smart human being who has a little bit of extra spending cash, that I would do the safer thing and take my backup hard drive and my old phones and put them in a safety deposit box at an old private bank that has, you know, a vault that’s behind a meter-and-a-half thick steel door and has two guards all the time, cameras everywhere. And I said, “What is the safest possible thing that you can do to store your backups?” Obviously, you put it in a secure storage location, right? And then, you know, I don’t use my AWS account, my personal AWS account so much anymore. I have work accounts. I have test accounts—

Corey: Oh, yeah. It’s honestly the best way to have an AWS account is just having someone else having a payment instrument attached to it because otherwise oh God, you’re on the hook for that yourself and nobody wants that.

Emily: Absolutely. 
And you know, creating new email addresses for new trial accounts is really just a pain in the ass. So, you know, I have my phone, you know, from five years ago, sitting in this bank vault and I figured that was pretty secure. Until I got an email [laugh] from the Berlin Polizei saying, “There has been a break-in.” And I went and I looked at the news and apparently, a gang of thieves had pulled off the most epic heist in recent European history.

This is barely in the news. Like, unless you speak German, you’re probably not going to find any news about this. But a gang of thieves broke into this bank vault and broke open the safety deposit boxes. And it turns out that this vault was also the location where a luxury watch consigner had been storing his watches. So, they made off with some, like, tens of millions of dollars of luxury watches. And then also the phone that had my 2FA for my Amazon account. So, the total value, you know, potential theft of this was probably somewhere in the $500 million range if they set up a SageMaker instance on my account, perhaps.

Corey: This episode is sponsored in part by Honeycomb. I’m not going to dance around the problem. Your. Engineers. Are. Burned. Out. They’re tired from pagers waking them up at 2 am for something that could have waited until after their morning coffee. Ring Ring, Who’s There? It’s Nagios, the original call of duty! They’re fed up with relying on two or three different “monitoring tools” that still require them to manually trudge through logs to decipher what might be wrong. Simply put, there’s a better way. Observability tools like Honeycomb (and very little else because they do admittedly set the bar) show you the patterns and outliers of how users experience your code in complex and unpredictable environments so you can spend less time firefighting and more time innovating. It’s great for your business, great for your engineers, and, most importantly, great for your customers. 
Try FREE today at honeycomb.io/screaminginthecloud. That’s honeycomb.io/screaminginthecloud.

Corey: The really annoying part that you are going to kick yourself on about this—and I’m not kidding—is, I’ve looked up the news articles on this event and it happened something like two or three days after AWS put out the best release of last year’s, or any other, re:Invent—past, present, future—which is finally allowing multiple MFA devices on root accounts. So finally, we can stop having safes with these things or you can have two devices or you can have multiple people in Covid times in remote parts of the world and still get into the thing. But until then, nope. It’s either no MFA or you have to store it somewhere ridiculous like that and access becomes a freaking problem in the event that the device is lost, or in this case stolen.

Emily: [laugh]. I will just beg the thieves, if you’re out there, if you’re secretly actually a bunch of cloud engineers who needed to break into a luxury watch consignment storage vault so that you can pay your cloud bills, please have mercy on my poor AWS account. But also I’ll tell you that the credit card attached to it is expired so you won’t have any luck.

Corey: Yeah. Really sad part. Having the expired credit card just means that the charge won’t go through. They’re still going to hold you responsible for it. It’s the worst advice I see people—

Emily: [laugh].

Corey: Well-intentioned—giving each other on places like Reddit where the other children hang out. And it’s, “Oh, just use a prepaid gift card so it can only charge you so much.” It’s yeah, and then you get exploited like someone recently was and start accruing $60,000 a day in Lambda charges on an otherwise idle account and Amazon will come after you with a straight face after a week. And, like, “Yes, we’d like our $360,000, please.”

Emily: Yes.

Corey: “We tried to charge the credit card and wouldn’t you know, it expired. 
Could you get on that please? We’d like our money faster if you wouldn’t mind.” And then you wind up in absolute hell. Now, credit where due, they in every case I am aware of that is not looking like fraud’s close cousin, they have made it right, on some level. But it takes three weeks of back and forth and interminable waiting.
And you’re sitting there freaking out, especially if you’re someone who does not have a spare half-million dollars sitting around. Imagine who—“You sound poor. Have you tried not being that?” And I’m firmly convinced that it is a matter of time until someone does something truly tragic because they don’t understand that it takes forever, but it will go away. And from my perspective, there’s no bigger problem that AWS needs to fix than surprise lifelong earnings bills to some poor freaking student who is just trying to stand up a website as part of a class.
Emily: All of the clouds have these missing stairs in them. And it’s really easy because they make it—one of the things that a lot of the cloud providers do is they make it really easy for you to spin up things to test them. And they make it really, really hard to find where it is to shut it all down. Data science is awful at this. As a data scientist, I work with a lot of data science tools, and every cloud has, like, the spin up your magical data science computing environment so that your data scientist can, like, bang on the data with you know, high-performance compute for a while.
And you know, it’s one click of a button and you type in a couple of na—you know, a couple of things: name your service or whatever, name your resource. You click a couple buttons and you spin it up, but behind the scenes, it’s setting up a Kubernetes cluster and it’s setting up some storage bucket and it’s setting up some data pipelines and it’s setting up some monitoring stuff and it’s setting up a VM in order to run all of this stuff.
And the next thing that you know, you’re burning 100, 200 euro a day, just to, like, to figure out if you can load a CSV into pandas using a Jupyter Notebook. And you’re like—when you try to shut it all down, you can’t. It’s you have to figure, oh, there is a networking thing set up. Well, nobody told me there’s a networking thing set up. You know? How do I delete that?
Corey: You didn’t say please, so here you go. But for me, it’s not even the giant bill going from $4 a month in S3 charges to half a million bucks because that is pretty obvious from the outside just what the hell’s been happening. It’s the little stuff. I am still—since last summer—waiting for a refund on $260 of ‘because we said so’ SageMaker credits because of a change of their billing system, for a 45-minute experiment I had done eight months before that.
Emily: Yep.
Corey: Wild stuff. Wild stuff. And I have no tolerance for people saying, “Oh, you should just read the pricing page and understand it better.” Yeah, listen, jackhole. I do this for a living. If I can fall victim to it, anyone can. I promise. It is not that I don’t know how the billing system works and what to do to avoid unexpected charges.
And I’m just luck—because if I hadn’t caught it with my systems three days into the month, it would have been a $2,000 surprise. And yeah, I run a company. I can live with that. I wouldn’t be happy, but whatever. It is immaterial compared to, you know, payroll.
Emily: I think it’s kind of a rite of passage, you know, to have the $150 surprise Redshift bill at the end of the month from your personal test account. And it’s sad, you know? I think that there’s so much better that they can do and that they should do. Sort of as a tangent, one of the challenges that I see in the data space is that it’s so hard to break into data because the tooling is so complex and it requires so much extra knowledge, right?
If you want to become a software developer, you can develop a microservice on your machine, you can build a web app on your machine, you can set up Ruby on Rails, or Flask, or you know, .NET, or whatever you want. And you can do all of that locally.
And you can learn everything you need to know about React, or Terraform, or whatever, running locally. You can’t do that with data stuff. You can’t do that with BigQuery. You can’t do that with Redshift. The only way that you can learn this stuff is if you have an account with that setup and you’re paying the money to execute on it. And that makes it a really high barrier for entry for anyone to get into this space. It makes it really hard to learn. Because if you want to learn anything by doing, like many of us in the industry have done, it’s going to cost you a ton of money just to [BLEEP] around and find out.
Corey: Yes. And no one likes the find out part of those stories.
Emily: Nobody likes to find out when it comes to your bill.
Corey: And to tie it back to the data story of it, it is clearly some form of batch processing because it tries to be an eight-hour consistency model. Yeah, I assume for everything, it’s 72. But what that means is that you are significantly far removed from doing a thing and finding out what that thing costs. And that’s the direct charges. There’s always the oh, I’m going to set things up and it isn’t going to screw you over on the bill. You’re just planting a beautiful landmine you’re going to stumble blindly into in three months when you do something else and didn’t realize what that means.
And the worst part is it feels victim-blamey. I mean, this is my pro—I guess this is one of the reasons I guess I’m so down on data, even now. It’s because I contextualize it in a sense of the AWS bill. No one’s happy dealing with that. You ever met a happy accountant? You have not.
Emily: Nope. Nope [laugh].
Especially when it comes to cloud stuff.
Corey: Oh yeah.
Emily: Especially these days, when we’re all looking to save energy, save money in the cloud.
Corey: Ideally, save the planet. Sustainability and saving money align on the axis of ‘turn that shit off.’ It’s great. We can hope for a brighter tomorrow.
Emily: Yep.
Corey: I really want to thank you for being so generous with your time. If people want to learn more, where can they find you? Apparently filing police reports after bank heists, which you know, it’s a great place to meet people.
Emily: Yeah. You know, the largest criminal act in Berlin is certainly a place you want to go to get your cloud advice. You can find me, I have a website. It’s my name, emilygorcenski.com.
You can find me on Twitter, but I don’t really post there anymore. And I’m on Mastodon at some place because Mastodon is weird and kind of a mess. But if you search me, I’m really not that hard to find. My name is harder to spell, but you’ll see it in the podcast description.
Corey: And we will, of course, put links to all of this in the show notes. Thank you so much for your time. I really appreciate it.
Emily: Thank you for having me.
Corey: Emily Gorcenski, Data and AI Service Line Lead at ThoughtWorks. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insipid, insulting comment, talking about why data doesn’t actually matter at all. And then the comment will disappear into the ether because your podcast platform of choice feels the same way about your crappy comment.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS.
We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.
The Growing Dominion of Cloud Providers with Raj Bala
35:01
Raj Bala, Founder of Perspect, joins Corey on Screaming in the Cloud to chat about his experiences working in the world of cloud and why he made the shift from Gartner Analyst to Founder. Raj asks the question, “Is AWS truly customer-obsessed?” in the face of their business practices, and challenges the common notion that analysts don’t need to have lived experience with a product to criticize it. Raj and Corey also explore the absurdity of Azure naming conventions, how cloud providers are creating roadblocks to multi-cloud, and the response of the greater public as cloud providers become more and more powerful.
About Raj
Raj Bala, formerly a VP, Analyst at Gartner, led the Magic Quadrant for Cloud Infrastructure and Platform Services since its inception and led the Magic Quadrant for IaaS before that. He is deeply in-tune with market dynamics both in the US and Europe, but also extending to China, Africa and Latin America. Raj is also a software developer and is capable of building and deploying scalable services on the cloud providers that he wrote about as a Gartner analyst. As such, Raj is now building Perspect, which is a SaaS offering at the intersection of AI and E-commerce.
Raj's favorite language is Python and he is obsessed with making pizza and ice cream.
Links Referenced: Perspect: https://perspect.com former2.com: https://former2.com Twitter: https://twitter.com/raj
Data Protection the AWS Way with Wayne Duso and Nancy Wang
33:48
Wayne Duso, VP of Storage, Edge and Data Governance Services at AWS and Nancy Wang, GM of AWS Data Protection, both join Corey on Screaming in the Cloud to discuss data protection and analysis at AWS. Wayne and Nancy describe how AWS Backup has scaled to protect over 90% of the data stored on AWS today. Nancy explains how her team specializes in helping AWS customers develop custom solutions for their specific data needs, and the way that AWS has built out new tools and services to accommodate that customization. Wayne also reveals how important data analysis is to the AWS team when it comes to improving services and developing ground-breaking new innovations.
About Wayne
Professionally, Wayne is a Vice President at Amazon Web Services (AWS) where he leads a set of businesses delivering cloud infrastructure services. In 2013, he founded and continues to lead the AWS Boston regional development center. Wayne is an always-curious entrepreneur who is passionate about building innovative teams and businesses that deliver highly disruptive value to customers. He loves engaging people who build and deliver customer-obsessed solutions, as well as customers wanting to realize value from those solutions. Wayne also holds over 40 patents in distributed and highly-available computer systems, digital video processing, and file systems. Personally, Wayne is a proud dad to great people, and loves to cook and grow things, it relaxes and grounds him, and he cherishes finding adventure in the ordinary as well as the extraordinary.
About Nancy
Nancy Wang is a global product and technical leader at Amazon Web Services, where she leads P&L, product, engineering, and design for its data protection and governance businesses. Prior to Amazon, she led SaaS product development at Rubrik, the fastest-growing enterprise software unicorn and built healthdata.gov for the U.S. Department of Health and Human Services. Passionate about advancing more women into technical roles, Nancy is the founder & CEO of Advancing Women in Tech, a global 501(c)(3) nonprofit with 16,000+ members spanning three continents.
Nancy is an angel investor in data security and compliance companies, and an LP with several seed- and growth-stage funds such as Operator Collective and IVP. She earned a degree in computer science from the University of Pennsylvania.
Links Referenced: re:Invent talk with Nancy and Neha: https://www.youtube.com/watch?v=ELSm3WgR8RE
Getting the Basics Right in Cloud Security with Fouad Matin
41:58
Fouad Matin, Co-founder & CEO of Indent, joins Corey on Screaming in the Cloud to discuss how to get data security right without creating unnecessary barriers for your development team. Fouad and Corey discuss how getting admin access as a developer can be time consuming and vague, when it should be efficient and come with an easily defined reason for granting access. Fouad also explains why he feels most breaches are due to not getting the basics right, and why he feels storing customer and sensitive data should be done with the same principles as dealing with hazardous waste.
About Fouad
Fouad Matin is the co-founder and CEO of Indent, a security company that enables teams to perform mission-critical operations faster and more securely. With Indent, organizations like HackerOne, Modern Treasury, Vercel, and PlanetScale are able to grant secure, time-bound user and admin permissions for cloud apps and infrastructure through Slack.
Prior to Indent, Fouad worked as an engineer at Segment, a customer data platform helping companies secure their pipelines for handling customer data. He co-founded a non-partisan non-profit in 2016 to help people register and get out to vote through easy-to-use, privacy preserving tools. In 2018, while validating Indent’s mission, he partnered with Vote.org to build tools for users to find their polling place and preview their ballot using client-side encryption.
Links Referenced: Indent: https://indent.com Nobody Should Have Production Access: https://indent.com/blog/production Fouad on Twitter: https://twitter.com/fouadmatin Indent on Twitter: https://twitter.com/indent Unplanned Maintenance: https://unplannedmaintenance.com Least Privilege in Practice Blog Post from Indent: https://indent.com/blog/least-privilege Additional Links Referenced: Email: mailto:[email protected] Indent LinkedIn: https://www.linkedin.com/company/indentinc/ Fouad LinkedIn: https://www.linkedin.com/in/fouadmatin/
Being Present in the Moment Through Balcony-Hopping with Mai-Lan Tomsen Bukovec
25:09
Mai-Lan Tomsen Bukovec, Vice President of Foundational Data Services at AWS, joins Corey on Screaming in the Cloud to discuss her technique for spending time intentionally and prioritizing work-life balance called balcony-hopping. Mai-Lan explains how she created the concept of balcony-hopping and how it has helped her to be a better leader, mother, wife, and boxer. Corey and Mai-Lan discuss how in today’s age, attention is a form of currency and why it’s so important to be intentional with how and where you spend your attention. Mai-Lan also offers practical insights to anyone seeking to feel more productive, present, and balanced.
About Mai-Lan
Mai-Lan Tomsen Bukovec is Vice President, Foundational Data Services (FDS) at Amazon Web Services (AWS) and leads a number of high-scale AWS cloud services that provide storage and streaming of petabytes or exabytes of data and essential building blocks for modern application architecture like queuing and notifications, monitoring, alarming, logging and reliability validation. Mai-Lan’s teams include some of AWS’ first and largest-scale services like Amazon S3 and Simple Queue Service (SQS) to more recent and fast-growing services like managed open source streaming (Amazon Managed Streaming for Apache Kafka).
Prior to joining Amazon, Mai-Lan spent almost 15 years in engineering and product leadership roles at technology companies including Microsoft and early stage startups. She began her technology career after serving in the U.S. Peace Corps in the Mopti region of Africa as a Forestry volunteer after earning her degree from University of California, San Diego.
At Amazon, Mai-Lan is an advisor to [email protected], creator and sponsor of internal leadership development programs for Amazon employees, and is passionate about AWS initiatives and cloud services that maximize human potential everywhere.
Mai-Lan has three children and lives in Seattle with her family. When she is not working on Amazon cloud services and spending time with her husband and kids, Mai-Lan trains primarily in boxing with additional practice in the martial art Savate.
Links Referenced: LinkedIn post “Live Your Best Life Through Balcony Hopping”: https://www.linkedin.com/pulse/live-your-best-life-through-balcony-hopping-mai-lan-tomsen-bukovec/ LinkedIn: https://www.linkedin.com/in/mailan/
The Complexity and Value of Scaling Reliability with Kannan Solaiappan
31:37
Kannan Solaiappan, Head of Reliability and Data Engineering at Circles Life, joins Corey on Screaming in the Cloud to discuss building a team in a start-up environment and the complexities of balancing reliability and security with scale. Kannan describes the challenges of building a semi-platform multiple instances model and how products like Severalnines have helped identify and optimize potential problems before they affect customers. Kannan and Corey also discuss the impact that major outages had on the world at large when it came to fault-tolerance on entry points, and Kannan explains how guardrails can improve reliability without creating the same resistance from engineers that governance can.
About Kannan
With over 20 years of experience in the technology industry, Kannan Solaiappan is a highly motivated and passionate leader with a track record of driving results. With a background in software development, operations, architecture, security, and Agile transformation, Kannan has served as a Head of DevOps/Reliability/Data Engineering & Architecture, managing budgets of over 10 million dollars. Kannan has successfully led teams of up to 80 members and has a strong background in building and maintaining world-class organizational structures and cultures.
Currently, Kannan is leading a team of SRE, DevOps, and Data Engineering professionals at Circles Life, Asia’s first fully digital telco, where Kannan is working towards building the world’s best Telco SaaS platform with a focus on CI/CD, observability, reliability, resilience, and security.
Kannan has a diverse set of skills including IT Service Management, team management, IT strategy, vendor management, site reliability engineering, Architecture and leadership.
Links Referenced: Severalnines: https://severalnines.com/ Circles.Life: https://circles.life Circles.Life Instagram: https://www.instagram.com/circleslifesg/ Circles.Life Twitter: https://twitter.com/circleslifesg Circles.Life Facebook: https://www.facebook.com/CirclesLifeSG/
Building Community in Open Source with Floor Drees
33:10
Episode Summary
Floor Drees, Staff Developer Advocate at Aiven, joins Corey on Screaming in the Cloud to discuss her journey into the world of open source and the opportunities she sees to improve developer relations. Floor and Corey dive into the pitfalls and opportunities of being a frequent speaker at events, and Floor shares some best practices to help be prepared for those opportunities. Floor also shares why she feels events should include hybrid remote attendance options, and the benefits of hosting local events to breathe life into new relationships within the developer community. Floor and Corey also discuss the complexities of maintaining an open-source project and what goes into keeping an open-source community healthy and thriving.
About Floor
Floor is a Staff Developer Advocate at Aiven, a company that manages your favorite open source data tools for you without exploiting the projects and their maintainers. Previously Floor worked in DevRel at Grafana Labs and Microsoft. She is a Devopsdays Core member, and organizes the Devopsdays Amsterdam and Eindhoven chapters. She is a Microsoft MVP for Developer Technologies, and organizes a bunch of meetups, including-but-not-limited-to contributing.today, DevRel Salon Amsterdam, and the Amsterdam Ruby Meetup. Floor is also an art school graduate, who stumbled into tech face-first.
Links Referenced: Aiven: https://aiven.io floor.dev: https://floor.dev Mastodon: https://mastodon.lol/@floord Twitter: https://twitter.com/floordrees dev.to: https://dev.to/floord
The Ever-Growing Ecosystem of Postgres with Álvaro Hernandez
35:52
Álvaro Hernandez, Founder of OnGres, joins Corey on Screaming in the Cloud to discuss his hobby project Dyna53, the balkanization of AWS services, and all things Postgres. Álvaro and Corey discuss what it means to be an AWS Community Hero these days, and Álvaro shares some of his experiences as being one of the first Heroes to provide feedback on AWS services. Álvaro also shares his thoughts on why people shouldn’t underestimate the importance of selecting the right database, why he feels Postgres and Kubernetes work so well together, and the ever-growing ecosystem of Postgres.
About Álvaro
Álvaro is a passionate database and software developer. Founder of OnGres ("ON postGRES"), he has been dedicated to Postgres and R&D in databases for more than two decades.
Álvaro is at heart an open source advocate and developer. He has created software like StackGres, a Platform for running Postgres on Kubernetes or ToroDB (MongoDB on top of Postgres). As a well-known member of the PostgreSQL Community, Álvaro founded the non-profit Fundación PostgreSQL and the Spanish PostgreSQL User Group. He has contributed, among others, the SCRAM authentication library to the Postgres JDBC driver.
You can find him frequently speaking at PostgreSQL, database, cloud (becoming an AWS Data Hero in 2019), and Java conferences. In the last 10 years, Álvaro has completed more than 120 tech talks (https://aht.es).
Links Referenced: OnGres: https://ongres.com/ Dyna53: https://dyna53.io/ Personal Website: https://aht.es Twitter: https://twitter.com/ahachete LinkedIn: https://www.linkedin.com/in/ahachete/
The 4D Approach to Cloud Sustainability with Catharine Strauss
40:37
About Catharine
Catharine brings more than fifteen years of experience building global networks and large scale data center infrastructure to the challenge of scaling quickly and safely. She loves building engaged and curious teams, providing insightful forecasting tools, and thinking about how to build to scale in a sustainable way to preserve a humane quality of life on this swiftly tilting planet. When not trying to predict the future as a capacity planner, she’s often knitting extremely complicated sweaters and coming up with ridiculous puns.
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Tailscale SSH is a new, and arguably better way to SSH. Once you’ve enabled Tailscale SSH on your server and user devices, Tailscale takes care of the rest. So you don’t need to manage, rotate, or distribute new SSH keys every time someone on your team leaves. Pretty cool, right? Tailscale gives each device in your network a node key to connect to your VPN, and uses that same key for SSH authorization and encryption. So basically you’re SSHing the same way that you’re already managing your network.
So what’s the benefit? Well, built-in key rotation, the ability to manage permissions as code, connectivity between any two devices, and reduced latency. You can even ask users to re-authenticate SSH connections for that extra bit of security to keep the compliance folks happy. Try Tailscale now - it's free forever for personal use.
Corey: Kentik provides Cloud and NetOps teams with complete visibility into hybrid and multi-cloud networks.
Ensure an amazing customer experience, reduce cloud and network costs, and optimize performance at scale — from internet to data center to container to cloud. Learn how you can get control of complex cloud networks at www.kentik.com, and see why companies like Zoom, Twitch, New Relic, Box, eBay, Viasat, GoDaddy, booking.com, and many, many more choose Kentik as their network observability platform.
Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. As a cloud economist, I wind up talking to an awful lot of folks about optimizing their AWS bills. That is what it says on the tent. It’s what I do. Increasingly, I’m having discussions around the idea of sustainability because the number-one rule of cloud economics is also the number-one rule for sustainability. Step one, turn that shit off. If you’re not using it, turn that shit off. If it doesn’t add value commensurate to what it costs, turn that shit off. Because the best way to optimize something is to get rid of it. Today, to go into a bit more depth on that, my guest is Catharine Strauss. Catharine, thank you for joining me.
Catharine: Thank you. I’m excited.
Corey: So, you have a long and storied career of effectively running global-scale network operations in terms of capacity planning, in terms of building out world-spanning networks, and logistics of doing that. You know, the stuff that’s completely invisible to most people, except when it breaks. So, it’s more or less a digital plumbing-type of role. How did you go from there to thinking about sustainability in a networking context?
Catharine: Yeah. Thank you. I got dropped into networking as a career option, completely from the physical side, building out global networks. And all of the constraints that we were dealing with, were largely physical, logistical, or legal.
So, we would do things like ship things through customs and have items stopped because they were miscategorized as munitions because they were lasers, “Pew, pew.” We had things like contract negotiations for data centers to do trenching into them that needed easements with the railroad. Like, just weird stuff that you don’t normally think of as a cloud-project constraint. So, all of these physical constraints made it just more interesting to me because they were just so tactile.
Corey: There’s so much that is out there in the world that is completely divorced from anything that you have to think about in terms of building out networks and software. Until, suddenly, it’s very much there, and you’re learning that there’s an entire universe/industry/ecosystem that you know nothing about that you now need to get into. Railroad easements are a terrific example of that. It’s, “Wait, what, we’re building the cloud here. What the hell does the railroad have to do—is there actually a robber baron I need to go fight somewhere? How does this work?” The old saw about the cloud just being someone else’s computer is not particularly helpful, but it is true. There’s a tremendous amount of work that goes into building out the physical footprint for a data center—let alone a hyper-scale cloud provider’s data center—that does not have to be something the vast majority of us need to think about anymore. And that’s, kind of, glorious and magical. But it does mean that there are people who very much need to think about that.
Increasingly, we’re seeing the sustainability and climate story of cloud extend beyond those folks. There are now carbon-footprint tools and dashboards in all the major cloud providers that I’m aware of. Well, I’d say it’s a good start, but in some cases, it’s barely that. It feels like this is something that people are at least starting to take semi-seriously in the context of cloud.
How have you seen that evolving?
Catharine: So, when I think about a data center, I see it as a factory where you take heavy metals and electricity—
Corey: And turn them into YAML. Sorry, sorry. Go ahead.
Catharine: [laugh] you turn them into spreadsheets, cat videos, and waste heat, right? So, when I’m looking at, you know, this tremendous global network, I started to look into what’s the environmental cost of that. And what I found was, kind of, surprising. Like, three percent of our total global emissions, is coming from computing and the internet, and all of these things that I spent my career building. And I started to have waves of regret. And looking at that in the context of: how can we make things better? How can we make things more efficient, and how can we operate better with the physical constraints of electricity and energy grids, and what they are struggling with doing to provide us with what we need or managing this beast of an internet?
Corey: Right now, it feels like there’s an awful lot of—I don’t know what the term is, greenwashing, cloud washing—basically, making your problem someone else’s problem. I feel like the cloud providers are in a position where they have to walk something of a tightrope. Because on the one hand, yeah, there are choices I can make as a customer that will absolutely improve the carbon footprint of what it is that I’m doing. On the other, they never invite me to have conversations to negotiate with their energy providers around a lot of these things. So, it feels like, “Oh, yeah. Make sure that the cloud you’re using is green enough.” “Wasn’t that what I’m paying you for?” That feels like it’s a really weird dichotomy that I’m still struggling to reconcile exactly how to approach.
Catharine: Yeah. I, you know, I looked at the Amazon Sustainability platform, and they’ve got those two parts of it. They’ve got sustainability in the cloud, sustainability of the cloud.
And, you know, I’ve worked with enough Google SREs to know that they and Amazon Data Center providers and Azure, they all have a vested interest in making it as cheap as possible to operate their data centers. And that goes far beyond individual server performance. It goes to the way that they do cooling. And, like, the innovations there are tremendous.
But they’re not doing that out of the goodness of their heart; they’re doing that because it makes business sense for them. It reduces the cost for them to provide these services. And, you know, in some cases, it really obscures things because they will sign energy contracts and then keep them super-secret. There’s very little transparency because these are industry secrets, and they don’t want to damage their negotiation positions for the next deal that they sign. So, Amazon, you know, will put PR releases out there about all of their solar farms that they are sustaining in Virginia. But they don’t talk about what percent that is of their total energy consumption, and they don’t talk about, you know, what the total footprint is because that is considered either a security risk or an economic risk if people were to find out, you know, exactly how much energy they’re pulling.
Corey: I am, somewhat, sympathetic, but only to the reality that the more carbon transparency that a cloud provider gives around the relative greenness of a given service that they offer in a given region, the closer they get to exposing a significant component on their per-service margins. And they’re, understandably, extraordinarily reluctant to that because then people will do things like figure out exactly how much are they up-charging things like data egress and ongoing per-hour session charges for some SageMaker nonsense.
There’s an awful lot out there that I don’t think they want to have out there just for, on the one hand, the small one that’s easy to deal with is the customer uprising.
But more so, they don’t want to expose this to their competitors.
Catharine: Yeah, I don’t know that I have a ton of sympathy. If the service is cheaper because they’re running off of green energy, as we have increasingly seen in the market that solar and wind are just the cheapest alternative. If it’s cheaper for Amazon and Google, I, kind of, feel like they should convey that, so that people can take advantage of those savings.
We’ve got a demand issue, where, I think, the demand for these renewable energy sources is outstripping supply. But they’re planning for the next five years where that decreasingly becomes an issue. So, why not let people operate according to their values, or even, you know, their own best interests in choosing data centers that are emitting fewer emissions into the world?
Corey: There seems to be a singular focus between all of these providers in what they’re displaying through their tools. And that is on carbon footprint, and it is also suspiciously, tightly bounded to what looks like compute. There are a lot of other climate-impacting effects of large-scale cloud providers. It has significant disruption to local waterways. There are tremendous questions around the sustainability around manufacturing of the various components that get turned into equipment that gets sold to these providers then integrating into other things. There’s an awful lot of downstream effects. And I can’t shake the feeling that focusing on how renewable the energy is to power the compute, focuses on a very small part of the story. How do you land on that one?
Catharine: I would agree with that.
I think people will often say, “Oh, what you should do if you’re managing,” you know, “Your data center resources is for efficiency, you should be updating your hardware once a year or putting out the resources that are the most powerful.” The tipping point might be later than you actually think because what happens to those resources when they go back out into the environment when you decommission them? It’s so hard to resell them, especially, globally. The reuse of gear is becoming harder and harder, and so the lifetime of that gear, that equipment, those servers, routers, whatnot, all of that is becoming harder and harder to do. And the disposal of those materials has a tremendous impact.
So, I do think the energy is a big part of it, and it feels like the thing that we can control the most. But, like, if you really want to change the world, go work on carbon-neutral cement or batteries made out of rust and sand to store solar energy. You know, go work on low-heat steel. Those are the things where you’re really making an impact. 
What we need to do in the market is really transform our notion of the cloud as this infinite nebulous, weightless item into something that is physical and has a physical impact on our lives.
So, when you’re trying to decide what your retention policy is for your data in your company, when you’re trying to decide where to replicate data, how long to hold it in active storage, you’re really thinking about the megawatts that it takes, and the impact of that on the full picture.
Corey: Well, a question that I’ve had as I look across my customer base of large companies doing interesting and exciting things with cloud, is I would love—absolutely love—to see a comparative analysis done by each provider that in very human terms, says what the relative climate impact is of taking all of their different storage services, on a per-petabyte basis, where I say, “Okay, if I want to store this in their object storage, or if I want to put this on disk volumes, or I want to use their deep-archive storage that looks an awful lot like tape, I don’t care so much about the cost of those things, but I want to know what is the climate impact of this,” because I think that would be revelatory on a whole bunch of different levels. But it seems it’s compute where they tend to focus instead.
Catharine: Yeah, it would be really nice if as businesses, we started to look at the fuller impact of our actions. And it isn’t just about the money saved. But my genuine belief is that it will get cheaper to do the right thing. And it is getting cheaper every day to use fewer resources. But the market has not caught up to that, and you can see that in how many companies are still giving away free, unlimited storage, right? 
You know, how many GoPro videos of someone’s backyard, how many hours of that kind of footage is there out there in the world that’s never going to get viewed again, but is sitting out there taking up energy at the same time that we’re having brownouts, and people are suffering and having to turn off their air conditioning?
Corey: I think that we would do well as a society to get rid of a heck of a lot more data just because it sits there; it burns energy; it costs money, and I’m sorry, you’re going to really have to reach to convince me that the web server access logs from 2012 are in any way business valuable or relevant to, basically, anyone out there.
But I want to take it one step further because now that we know that we’re definitely burning the planet to wind up storing a petabyte of data here, I’m very curious as to the climate footprint of then going into your world, taking that data, and throwing it somewhere else across the internet. Because I can tell you, almost to the penny, what that’s going to cost, and it’s an astonishingly large number because yeah, egress fees are what they are, but I couldn’t tell you what the climate footprint of that is.
Catharine: Yeah. When I was working at Fastly, we did a lot of optimizations across our network to avoid peak traffic because that was how we were built. You know, we had to build out to a certain network capacity, and then we could build, essentially, the area under our diurnal curve, we can build that out. But we don’t have to, necessarily, serve it from the absolute closest data center. If we could serve it from a nearby data center or a provider that was three milliseconds farther away, we could potentially use resources that we have elsewhere in the cloud to serve that request more efficiently.
And I think we have an opportunity to do that with data centers scattered around the globe. 
Why aren’t we load balancing so that we’re pushing traffic from the data centers that are off-peak—you know, have energy to spare to accommodate for the data centers that are reaching capacity and don’t have enough energy on the grid—why aren’t we using these resources more efficiently?
Corey: I’ve often lamented, from an economic perspective, that if I want to spend less money and optimize things, I can wind up trading out my instance types. Okay, I have a super-fast, high-end processor that costs a lot of money. I can get shittier compute by spending less. The same story with storage. I can get slower storage for less money that’s a lot less performant, and it has some latencies added, but, “Great,” but I can make that decision.
With networking, it’s all or nothing. There is no option for me to say, “I want to pay half of what the normal data rates are, but in return, I really only care that this data gets to where it’s going by next Tuesday.” I don’t need it done in sub-second latency speeds. There’s no way to turn that off or to make that election. Increasingly, I really am coming around to the idea that cloud economics and sustainability are one and the same.
Catharine: Yeah. For me, it makes a lot of sense. And, you know, when I look at people in their careers, focusing on cloud economics feels like a very, very easy win if you also care about sustainability. 
And it feels like once you have the data and the reporting tools—and, you know, we talked about the big gaps there—but if you’re reporting on both your costs and the carbon footprint, you’re developing a plan for how to optimize on both of them at the same time, and you’re bringing that back to your management, bringing that back to your teammates, and really making sustainability an active value in your organization.
I feel like there’s not only a benefit to you, the finances of your company, and your personal career, but there’s also a social impact where, you know, maybe you can feel a little less guilty about eating that steak. Maybe you can offset some travel that is increasing your carbon footprint; maybe you can do a trade-off; maybe you can do everything in little bursts across a broad scope, instead of us needing, you know, some big solution that’s going to save us. There’s no one solution.
I think that’s the main thing I’ve discovered in my education on sustainability is it has to be 50,000 small things, the ‘magic buckshot’ rather than the ‘magic bullet,’ is the term that I see used a lot. Carbon removal from the sky is coming, but while we wait for it, we got to slow the pace of digging the hole, and really give our solutions a chance to work.
Corey: I despair at times at the lack of corporate will, I suppose, to wind up pursuing cloud sustainability as a customer of one of the cloud providers. I get people reaching out to me, pretty frequently, to help optimize the cost of their AWS bill. That is, definitely, what I do for a living. If I don’t have people reaching out on that, something is going wrong somewhere. And even then, there have been months that have been relatively slow in recent years. Because well, it turns out when money is free, you don’t really care that much about saving money. 
Now, people are tightening their belts and have to think about it a lot more, but that is a direct incentive of if you go ahead and optimize your cloud-spend bill, you will have more money.
That is, sort of, what our capitalist system is supposed to optimize for in many respects. “Great,” you can have more money. But it’s still not exciting for folks, and it’s not what they really wind up chasing after. I despair at getting them to think larger than money because that’s the only thing that companies generally tend to think about in the abstract, and start worrying about the future and climate and to invest significant effort in doing climate optimization. I don’t know that there is a business today in greening your cloud workloads that could be started the way that I have for fixing the AWS bill.
Catharine: Yeah, I don’t think there’s a business in it; I think it’s a movement. It’s like accessibility; it’s like security; it’s like a lot of other movements that have happened recently in tech where it becomes everybody’s job. And it’s important to people. And it becomes part of your company’s brand, and you use it for recruitment; you use it for advancing your own career; you use it for making people feel like they’re making a better decision.
When I look at the three big cloud providers, and I look at the ways that they are marketing their sustainability, it is so slick. You go to their sustainability page and it’s all, you know, beautiful, flashy graphics and information on all these feel-good things. Because they know, if they don’t do it, they’re going to be passed over because somebody is going to bring this up when they’re evaluating their choices. Because we want it; we all want it. We just don’t quite know how to get there. And until recently, it was more expensive, and you did have a green tax that made the sustainable options more expensive. We’re turning the page on that. Solar is cheaper than coal. 
And that’s all you really—all you have to say to justify some of these advancements. It’s all going to flow out of that simple fact.
Corey: Cloud native just means you’ve got more components or microservices than anyone (even a mythical 10x engineer) can keep track of. With OpsLevel, you can build a catalog in minutes and forget needing that mythical 10x engineer. Now, you’ll have a 10x service catalog to accompany your 10x service count. Visit OpsLevel.com to learn how easy it is to build and manage your service catalog. Connect to your git provider and you’re off to the races with service import, repo ownership, tech docs, and more.
Corey: I think that there’s a tremendous opportunity here to think about this. And I think you’re right. It absolutely takes on aspects of looking like a movement to do that. I’m optimistic about that. The counterpoint is that individuals are often not tremendously effective at altering the behavior of trillion-dollar companies, or even the relatively small ‘only’ 50 billion-dollar companies out there.
I can see where it starts, and I can see the outcome that you want there. I just have no idea what it looks like in between. It’s, like, “Step two, we’ll figure this part out later. Step three, climate.”
Catharine: Yeah. If I were going to do it at my company, I would go to HR. And I would say, “I would like to form an employee-resource group around sustainability. Do you know anyone on the executive team who is interested in sustainability?” Get them to sponsor it; talk to that sponsor, and say, “Where are the co-benefits here? What do you see as things that we absolutely need to do from a corporate-strategy standpoint that are aligned with this?”
And then start having meetings—open meetings—where you invite people concerned about climate change, and you start to talk cross-functionally about, “What can we do? Can we change our retention policies? Can we change the way that we bill for services? 
Can we individually delete data on our Wiki that hasn’t been accurate for seven years?” And, you know, start to talk and share successes. Then take it out to the larger industry and start giving talks because people want to be able to do something. Climate despair is real, but we, as cloud technologists, are so powerful in the resources that we have stewardship over. But I have to think that there is a possibility of making real change here.
Corey: There’s a certain point of scale at which point, having a sustainability conversation becomes productive. There are further points of scale where it becomes mandatory, let’s be clear here. But when I’m building something in the off hours—mostly for shit-posting purposes—it generally tends to wind up costing maybe seven cents or so, when all is said and done because I’m using Lambda functions and other things that don’t take a whole lot of compute resources out there. Googling what the most climate-effective way to implement that would be, is one of those exercises where the Google search has a bigger carbon footprint than the entire start-to-finish of what it is that I’m building. It’s not worth me looking into that.
There is some inflection point between that, and we run 500,000 servers around the world or 500,000 instances where, yeah, there’s a definite on-ramp where you need to start thinking about these things. What is that, I guess, that first initial point of, “I should be thinking about this,” for a given workload?
Catharine: So, I’ve been trying to get data on this, and my best calculation is that an average server in a hyperscale data center, where you’re using the whole thing for an entire year, is one to two tons of CO2 per year. So, I think when you start to look at other initiatives that you’re seeing, I think the tipping point is around ten tons per year. And for some people, that’s a lot; that’s a lot of resources that you need to get up to that point.
Corey: That feels directionally right. 
I think that is absolutely around where it starts to make sense. I mean, right now, I’m also in the uncomfortable creeping-awareness position of I’ve run a medium-sized EC2 instance persistently. That is my developer environment. I have it running all the time because having a Linux box is, sort of, handy. And whether I need it or not, it’s there. If I were to turn it off when I go to sleep at night, for example, I do not believe that would have any climate impact whatsoever from the perspective of this is a medium-size instance. There are a bunch of those on any individual server.
Amazon is not going to turn off a rack right now because my instance is there or it’s not. It is well within the margin of error for anything they have as far as provisioning or de-provisioning something. So, then, it sort of distills down to the term you used a few minutes ago, climate despair; that’s what this feels like. It’s one of those, “Well, okay. So, if it makes no actual difference if I were to spend time instrumenting that thing to turn itself off at night and turn itself back on in the morning, it doesn’t change a damn thing. I’m just doing something that is effectively meaningless in order to make myself feel better.”
The enormity of the problem and the task, and doing it at scale, well, I’m not going to convince customers to do that. And for some cases, maybe that’s for the better; maybe it’s not. But I feel like for whatever I do, there’s nothing I can do to make a difference in that sense, in my small-scale personal environment.
Catharine: Yeah, yeah. I definitely appreciate that. This feels to me like the same concept of—I don’t know, a couple of months ago, if you remember, California had a heat wave, and there were rolling brownouts. And we got a text that said, “Energy is at a high right now. Please turn off any unnecessary devices,” trying to avoid additional impact to the energy grid. 
And if you go and you look at the graph, there was an immediate decrease of 1500 megawatts in that moment because enough people got the text and took a small action, and it had the necessary impact. We avoided the brownouts, and the power, generally, kept flowing because it’s such a big system.
You know, if we’re talking about three percent of global emissions, we’re talking about, you know, power that’s the size of the aviation industry. We’re talking about power that’s, roughly, the size of Switzerland just on data centers. You, as an individual, are not going to be able to make an impact; you, as an individual talking about this to as many people as possible—as we’re doing right now—that starts to move the needle. And the thing I like about forming a grassroots group inside of your company is that it’s not just about the data centers. Maybe, it’s also about the service that comes in and brings you food and uses disposable containers; maybe, it’s about people talking about their electric cars; maybe, it’s about installing a heat pump; maybe, it’s about talking about solutions instead of just talking about creeping dread all the time.
Like, my move into sustainability has been largely in response to I can’t keep doom-scrolling. I have to find the people who are making the solutions happen. And I just got out of a program with Climatebase where that is what I did for nine weeks is talk about the solutions. And all of the people in the companies that are actually doing something, they’re so much more optimistic than the people I talk to who are just reading the headlines.
Corey: Doing something absolutely feels better than sitting here helplessly and more or less doom-scrolling about it. I absolutely empathize there. I think the trick is to get people to start taking action on this. I am curious, getting a little bit back to where you come from, something you alluded to at one point, was how energy markets are akin to network throughput. 
And I definitely wanted to dive into that. What do you mean? I’m not disagreeing, but I also have a really hard time seeing that. Help?
Catharine: Yeah. So, I used to do capacity planning for Fastly. And so, we would spend all day staring at the diurnal curve of our network throughput because we had to plan for the peak. Whatever our traffic throughput was, our global network needed to be able to handle it. And every day—maybe we got close to that peak; maybe we didn’t—but every day it would dip down into just the doldrums as people went to sleep and weren’t using the internet.
So, when I moved into looking at energy markets, specifically smart grids, and the way that renewables affect the available supply of electricity, I saw that same electricity curve; it’s called the duck curve in electricity markets where you have this diurnal pattern and a point every day, where the grid has electricity available but no demand.
So, when I was managing costs for our network, we would be trying, as much as possible, to fill that trough every day because it was free for us because we had already built out the infrastructure to fulfill that demand.
And the energy markets are the same way. We have built out the infrastructure. We just need the demand to meet the timing of the day. Put another way, you have to think fourth-dimensionally. It’s like Doc Brown in Back to the Future III. Marty says, “If we continue along this track, the bridge isn’t built yet. We’re going to plunge into the canyon and die.” And Doc Brown says, “No, no, no. You’re not thinking fourth-dimensionally. 
When we travel through time, we will be in the future, and the bridge will be there.” So, if we can shift the load from one region where energy is being consumed at its peak and move the traffic over to a region in the Pacific Northwest or a different time zone where they haven’t yet hit their energy-consumption peak, we can more efficiently use the infrastructure that has already been built out.
Corey: I really wish things were a lot easier to move around in that context. Data transfer fees make that very challenging, even if you can get around the latency challenges—which for many workloads is fine; that is not a prohibitive challenge. It’s the moving things around; moving data to those other regions, especially, in the sense of, “But, okay. You’re making it worse because now you have the data living in two different places instead of only one. You’ve doubled the carbon footprint of it, too.”
For some workloads, it absolutely has significant merit. I just don’t know exactly what that’s going to look like—actually, I take that back—the more I think about that, the more I realize that in some level, that’s what CDNs do already where, “Great, if this has to be built into something; if I hit an AWS endpoint or an API Gateway or something, I want to have an option when I’m building that out to be able to have that do more or less a follow-the-sun style pattern where it’s homed out of wherever energy markets are inexpensive.” And that certainly is going to break things for a lot of workloads, but not all of them, not by far.
Catharine: Yeah, and I think that is where my context is coming from. You know, working at Fastly, that was the notion, you know, “We’re caching your data close to your end-users, so you don’t have to operate resources in that area.” And we have a certain amount of leeway to how we serve that traffic. 
But it is a more globally distributed model and spinning up servers only when you need them is also a model that takes advantage of not having idle services around just in case you need them, actually responding to demand in real-time.
If you look at what the future holds for, you know, smart grids, energy networks, there’s this tremendous ability—and I would be very surprised if the big providers are not working on this—to integrate the two—so that electricity availability and how our network traffic is served, is just built into the big providers.
Corey: I really hope that one of these big providers leads the way on that. That’s the kind of thing that they should really want to see come out of these folks. We are recording this before AWS re:Invent. So, if they did come out with something like this, good for them, and also, I have no idea, at the time of this recording, whether they are or not. So, if I got it right, no, I’m not breaking any confidentiality agreements. I feel I need to call that out explicitly because everyone assumes that I—that I have magic insight into everything they’re going to come out with. Not really; usually it’s all after the fact.
Catharine: What I’m really hoping is that by the time this airs, Amazon has already released version two of their carbon footprint tool, where they have per data center visibility where it’s no longer three months in arrears, so that you can actually do experimentation and see how differences in the way you implement your cloud impact your carbon footprint. Rather than just, like, sort of, the receipt of, “Yep, here’s your carbon footprint.” Like, “No, no, no; I want to make it better. How do I make it better?”
So, I’m very much hoping they make an announcement of that kind, and then I’ll come back.
Corey: You’re welcome to come back if and when there’s anything that any of these providers release that materially changes the trajectory we’re currently on. I want to thank you for being so generous with your time. 
If people want to learn more, where’s the best place for them to find you?
Catharine: Yeah. You can find me on my website, Summerstir.com. And also, I hang out an awful lot with some very smart people on ClimateAction.tech. Their Slack is a great repository for people concerned about exactly these issues.
Corey: And we will, of course, put links to that in the show notes. Thank you so much for being so generous with your time. I appreciate it.
Catharine: This has been delightful. Thank you.
Corey: Catharine Strauss, budding digital sustainability consultant. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that also includes the cloud sustainability metrics for that podcast platform of choice.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.