Episode #70: A Typical Serverless Architecture with Xavier Lefevre

October 12, 2020 • 61 minutes

On this episode, Jeremy chats with Xavier Lefevre about what a typical serverless architecture looks like in AWS, why you need to think more about your total cost of ownership (TCO), and how to use his serverless cost calculator to estimate common serverless workloads.

About Xavier Lefèvre

Xavier Lefevre is currently VP of Engineering at Theodo, a web development and product consulting agency. As part of his role, Xavier manages five technical teams and leads the development of the company’s serverless expertise. He believes that serverless is a major breakthrough that will allow the industry to redirect its focus on core business needs, and his specialization centers on serverless architecture and FinOps challenges. Xavier shares his expertise through articles, such as those published by Serverless Transformation on Medium, and at various speaking events, including the Virtual Serverless London meetup.

Watch this episode on YouTube: https://youtu.be/pKc2f8Q0PQI

Transcript

Jeremy: Hi everyone, I'm Jeremy Daly and this is Serverless Chats. Today I am chatting with Xavier Lefevre, who I am going to have re-pronounce his name afterwards. Hey Xavier, thanks for joining me.

Xavier: Thank you. Thank you for having me. My name is Xavier Lefevre in French, which is not very easy to say.

Jeremy: So you are the VP of engineering at Theodo. I'd love it if you can tell the listeners a little bit about your background and what you do at Theodo.

Xavier: Yes, so I'm going to start with Theodo. Theodo is a product consulting and development agency, so we work with clients of any size, companies of any size, to build websites and complete web applications for them for different kinds of use cases. So, it can be a big company, it can be a small company, it can be eCommerce, it can be a big industry, anything. We are in France, UK and U.S.A., London and New York to be exact. And we're doing several different types of web projects, so we do mobile, we do infrastructure, that's something that could be interesting, like Kubernetes a lot, and stuff like that, so that's Theodo.

And for me, so I'm a VP engineering of Theodo in France, at Paris. And I have a fun background. So, I went to business school when I was younger and I did business school, but I always wanted to work in tech and when I got out I started to work in tech, but as a business role. I realized what it meant to work in tech and the different roles. And I finally realized that I preferred to be in tech myself, so that's why I'm here today.

Jeremy: Awesome. So, you are a bit infamous now, you have this article that you wrote called, the Typical Serverless Architecture, which got a lot of praise and also got a lot of criticism from people who don't quite understand serverless architecture. So, I would love it, just to go through ... and we'll start with this, there's other things I want to get to, but let's start with that, let's start with this typical serverless architecture. Take us back, what does that look like?

Xavier: So, from experience, and I don't have ... I have a year of experience in serverless, so I'm still, compared to you, I'm still young. But from experience, I started to of course dig into serverless and understand a little bit everything that's included in the technology. And I wanted to show this big picture and this big idea of what the typical architecture is. So, what can you find inside? You can find ... So, we go through each box. Okay? I can talk about the origin as well, which can be interesting. Which one do you prefer first?

Jeremy: Well, so I have them listed here, so let's start with the front end, what does the front end look like in a serverless application?

Xavier: So let's do that. So, front end itself, your front end is going to be an SPA, for instance, like React, or it can be Next.js, for instance, with SSR. What you can find there are two things that are specific to serverless. The first is AWS Amplify, which does a lot of stuff, but among which you can find many components that help you work faster. And you can find SDKs that help you communicate more easily with AWS services, like Cognito for instance, and that give you pre-made features. You can authorize your users and handle your users directly from your front end thanks to AWS Amplify, so that's one piece you can find. The other one is ... if I go a little bit further, when you host your front end. So, you have two steps, first the basic one, if it's a static React app you just have to host static files with your JS that's going to be loaded in your browser and that's going to run, we know how it works.

So, there you're going to use S3, it's going to be exposed by CloudFront, and that's it. If you want to go further, which happens a lot lately, even more because it's popular, if you want to go into SSR or SSG or a bit of both, that you can do with Next.js for instance. Here you can use ... you can take for instance Lambda@Edge, which are Lambdas that run inside of CloudFront, close to the users, that are super, super, super fast and that can take care of generating your pages for you, for performance purposes or SEO purposes. So that's one capacity in terms of front end in serverless.

Jeremy: All right. So, now you've got your front end and you've got it hosted either in CloudFront or maybe even Amplify Console, which is different than Amplify, you can host SSGs there as well, and things like that. So, what about for domains and certificates, how do you manage your domain names and your certificates in a typical serverless architecture?

Xavier: Okay. So, there are two services that are quite famous as well in AWS. You're going to have Certificate Manager, which is going to help you deal with your certificates, HTTPS, and you're going to have Route 53, which is going to help you deal with your domains. They play directly, easily with the rest of your serverless architecture and you don't have much else to do. And something interesting as well, we're talking a lot about AWS, because it leads in those services and from experience we found the serverless experience, let's say, more fit and more complete in AWS, but of course you can do this kind of stuff with other providers.

Jeremy: All right. So, now we've got all this stuff set up, this is our front end, now we actually want to be able to process APIs, so how do we build out our business APIs?

Xavier: Okay. So, there you have two choices. First, for the routes you want to expose through the API, you have API Gateway. Even more complicated, you have two API Gateways, the v1 and the v2, which are not named v1 and v2, not easy, and you have AppSync on the other side. Okay? So, API Gateway is about making REST APIs, AppSync is about making GraphQL APIs, mainly, if I have to just recap really fast. Of course, they have a lot of capacities inside. API Gateway has an option to use API Gateway WebSocket, for instance, to do realtime, and AppSync has embedded GraphQL subscriptions, with pros and cons. So, we have those two, depending on what you want to do, REST or GraphQL. Okay, that's your gateways.

Then behind that, you're going to have your business logic. So, you can directly connect those gateways to Lambdas at first, mainly to run your code. So Lambdas, what are they? They are functions, just little pieces of code that you split, that are super micro. You deploy them to AWS and AWS does the rest. It handles the scalability and the uptime of those functions, you don't have to take care of that.

So, you connect your gateway to Lambdas. From there ... What we do, and it's not something new, it's not something specific to serverless where you code differently, we organize our Lambdas in services, we follow again the principle of microservices. Behind that, the idea is mainly to have a well-organized architecture and to have an architecture that's going to last long in terms of complexity. So, we follow domain-driven design as much as possible to have a clear separation of concerns between those microservices.
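
The gateway-to-Lambda connection described above can be sketched roughly as follows. This is a minimal illustration, not Theodo's code: it assumes API Gateway's Lambda proxy integration, and the handler name, the route, and the article data are all hypothetical.

```python
import json


def get_article_handler(event, context):
    """Hypothetical Lambda behind a route such as GET /articles/{id}."""
    # With the Lambda proxy integration, path parameters arrive in the event.
    path_params = event.get("pathParameters") or {}
    article_id = path_params.get("id")

    if article_id is None:
        return {"statusCode": 400, "body": json.dumps({"error": "missing article id"})}

    # Placeholder business logic; a real service would read from a database.
    article = {"id": article_id, "title": "Hello serverless"}

    # The proxy integration expects a status code and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(article),
    }
```

Each function stays small and stateless, which is what lets the platform handle scaling and uptime on the developer's behalf.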

Jeremy: Right. So, then you have all these microservices that are separated. Now, one thing about microservices in general, and certainly with serverless, is it is a distributed system, right? You're communicating with lots of different moving parts. So, you're not going to connect every piece of business logic or everything that you do, that's not all going to happen synchronously, right? So, we want to send an update to Marketo or to Salesforce or something like that, we're going to do that asynchronously, right? So, now you've got these Lambda functions that get processed from your business APIs, then how do we connect all these other services that you have and do that asynchronously?

Xavier: Okay. So, there are several ways of doing that. So, there is this idea of serverless being event-driven by design, that's something that gets mentioned a lot. And you can see it in most services, Lambda reacts, and most services react, to events. So, API Gateway connected to Lambda is an event. But you can also connect it to other services, like SQS, like EventBridge. We use EventBridge to split and communicate between our services in a, let's say, decoupled way. EventBridge is a serverless event bus, it's like a RabbitMQ, but easier, which is extremely comfortable and it scales by design. So one microservice, let's say, is going to be the article microservice, okay, it's going to communicate with the user authentication microservice, connected to Cognito for instance, which is a service made for authentication of users, and they will communicate through this event bus.

So, the article service, for instance, is going to push the fact that there is a new article to, let's say, the whole architecture, and it's not going to know who is going to consume that. And EventBridge is then going to make sure this message is relayed to the services that want to consume this information and react to it. If you want to go further, DynamoDB has a service inside of the service, which is complicated, called DynamoDB Streams. You can use the fact that you're going to push changes to data inside of DynamoDB to trigger other events, decoupled, so events that make side effects, like for instance triggering a Lambda, it could be something like that, and doing something like ...

So, this data changed in the database, I want to send an email, but I can decouple that and ask DynamoDB Streams to trigger this effect. And the effect is that the first Lambda, the one that changed the data synchronously, ends earlier and responds to the front end, which shows the user that the request was accepted. Okay?
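
A rough sketch of the two decoupling mechanisms mentioned here, under stated assumptions. The bus name, source, detail type, and the `email` attribute are hypothetical. `events_client` stands in for a boto3 EventBridge client (its `put_events` call is the real API), and `send_email` is a placeholder callable supplied by the caller.

```python
import json


def build_article_published_entry(article_id, bus_name="app-bus"):
    """Build one entry for EventBridge PutEvents (all names are hypothetical)."""
    return {
        "EventBusName": bus_name,
        "Source": "articles.service",
        "DetailType": "ArticlePublished",
        "Detail": json.dumps({"articleId": article_id}),
    }


def publish(events_client, entry):
    # The publisher does not know which services consume the event;
    # rules on the bus decide where it is routed.
    return events_client.put_events(Entries=[entry])


def handle_stream_event(event, send_email):
    """Lambda-style handler for a batch of DynamoDB Streams records."""
    sent = 0
    for record in event.get("Records", []):
        if record.get("eventName") != "MODIFY":
            continue
        new_image = record["dynamodb"].get("NewImage", {})
        # Stream images use DynamoDB's typed attribute format, e.g. {"S": "value"}.
        email = new_image.get("email", {}).get("S")
        if email:
            send_email(email)
            sent += 1
    return sent
```

In both cases the function that handled the original request can return right away, while the side effect runs later in its own invocation.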

Jeremy: Okay. And so now you have these back-end processes that are running tasks, right? They're asynchronous, so the user's already connected, they can take longer to run if they need to, they can fail, they can retry, you've got all kinds of things like that. What if after one of these processes finishes, you want to push data back to the user somehow, you do that with WebSockets or AppSync, right?

Xavier: Yes, that's exactly what I would do. I can take a look at the API options you could have if you want to push data, if you want to do realtime, so if you want to, in a simpler way, push data from the back end to the front end. The best options are API Gateway WebSocket or AppSync subscriptions. So, with API Gateway WebSocket, from what I understood, and you can correct me on that, you have to call the API once per connection to push a message to the front end. The difference with AppSync is that you can send a batch of messages in one go. So, that's one difference and one pro of AppSync, but in both scenarios, they're amazing services, really reliable. And that's how you push your messages to the front end, with those two services.
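
The per-connection push Xavier describes for API Gateway WebSocket might look like the sketch below. It assumes `management_client` is a boto3 `apigatewaymanagementapi` client, whose `post_to_connection` call addresses a single connection. How connection IDs are stored and looked up is left out, and the function name is hypothetical.

```python
import json


def push_to_connections(management_client, connection_ids, message):
    """Send one message to each open WebSocket connection, one call per connection."""
    payload = json.dumps(message).encode("utf-8")
    delivered = []
    for connection_id in connection_ids:
        # Each call targets exactly one connection, hence the loop.
        management_client.post_to_connection(ConnectionId=connection_id, Data=payload)
        delivered.append(connection_id)
    return delivered
```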

Jeremy: Right. So, then in a serverless architecture we have no servers, right, or at least no servers that we know of, so, if you want to upload a file, which is a common thing that people would want to do, maybe upload a profile picture or something like that into an application, how do you do that in a serverless application?

Xavier: So, of course we use the infamous S3, which is a file storage system. So, one thing you can do, once again in this event-driven manner, is generate file upload URLs with S3 in your back end, signed with a token, that you're going to send back to your front end. The front end is going to directly push the file to S3, and then, in this event-driven manner, S3 is going to be able, if you want it, to push an event to trigger another side effect. Given the fact that this file has been uploaded, you potentially want to mark something somewhere in your database. So, you're going to do it in this way, completely decoupled.
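
A minimal sketch of this upload flow, assuming `s3_client` is a boto3 S3 client (its `generate_presigned_url` method is the real API). The function names are hypothetical, and the bucket, key, and expiry are supplied by the caller.

```python
from urllib.parse import unquote_plus


def create_upload_url(s3_client, bucket, key, expires_in=300):
    """Back end: return a time-limited URL the front end can PUT a file to."""
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def handle_s3_upload_event(event):
    """Lambda-style handler triggered by S3 once the upload has completed."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded
```

The file itself never passes through a Lambda function: the back end only signs the URL and later reacts to the notification, for example by marking the upload in a database.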

Jeremy: Okay. All right. So, now user authentication, you mentioned Cognito, what else do we do in order to authorize users into our system?

Xavier: So, Cognito is the go-to authentication and user management service, so that's where you're going to have your user base, that's where you're going to have your user access and you're going to be able to connect it with other services to make those authorizations. So, for instance in API Gateway, for each route you can attach an authorizer, which is directly connected to Cognito, and say that this route is not going to be accessible to this type of person. That's how you're going to do it.

Jeremy: All right. And so now let's say you have a complex workflow in a serverless application, you mentioned Lambda functions, each one maybe does a discrete piece of business logic, but let's say you've got to connect five or six of them together, or more, because you're doing some sort of checkout process and you need that workflow to finish. So, state machines, how do we build those in serverless applications?

Xavier: Yes, this is interesting, because after some point workflows get complicated in most applications. So, of course you could do it yourself with your own Lambdas, but in the end it's going to be complicated and tricky to understand where the data is going and stuff like that. So, AWS has a service for this, which is called Step Functions. Here you describe your flow in configuration, okay. So, you're going to describe your steps, you're going to describe your states, and how the state is supposed to change step after step. You're going to say that it needs to wait, it needs to come back, it needs to handle errors, and you can do that with Step Functions.

The nice thing with it is that AWS takes care of this flow for you, once again, and makes it super visible as well, so you have a nice user interface to understand what's happening, where you are in your flow when you are debugging or when you are developing, what data went where, and that you also get out of the box. So that's really super comfortable to handle complex workflows after some point.
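
To make "describe your flow in configuration" concrete, here is a small sketch of a state machine definition written in Amazon States Language and held as a Python dictionary. The state names, the account ID, and the Lambda ARNs are hypothetical. The helper is only a local sanity check that every transition points at a state that exists; it is not part of Step Functions.

```python
import json

# Hypothetical workflow: every name and ARN below is illustrative only.
ORDER_WORKFLOW = {
    "Comment": "Sketch of a multi-step workflow",
    "StartAt": "ReserveStock",
    "States": {
        "ReserveStock": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:reserve-stock",
            # Retry transient failures with exponential backoff.
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            # Anything still failing is routed to an error state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "WorkflowFailed"}],
            "Next": "WaitForPayment",
        },
        "WaitForPayment": {"Type": "Wait", "Seconds": 30, "Next": "ChargeCustomer"},
        "ChargeCustomer": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:charge-customer",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "WorkflowFailed"}],
            "End": True,
        },
        "WorkflowFailed": {"Type": "Fail", "Error": "WorkflowError", "Cause": "A step failed"},
    },
}


def undefined_targets(definition):
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        for catcher in state.get("Catch", []):
            targets.add(catcher["Next"])
    return sorted(targets - set(states))


definition_json = json.dumps(ORDER_WORKFLOW, indent=2)
```

Waiting, retrying, and error handling all live in the definition, so the Lambda functions themselves only contain business logic.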

Jeremy: Right. Two more to go. Security?

Xavier: So, security, the two things I put forward inside of this article is, one is called IAM, which is extremely famous, it's identity and access management. It's where you're going to define and configure all your authorizations inside of AWS. So, this user has access to this service, has access to this exact action inside of this service. But it's also where you're going to say that ... you're going to define the security of accesses between services. So, this one can have access to DynamoDB and stuff like that out of the box. The idea with IAM and a good practice, is to ... You're going to help me in finding this term again. Is the fact that no one has access to nothing. And then bit by bit you will access it.

Jeremy: Yeah, principle of least privilege, right?

Xavier: Yeah.

Jeremy: Block everybody by default and then, with the principle of least privilege, open up just what that individual function or user needs.

Xavier: Exactly. So, with IAM you do that out of the box, which is a pretty good security principle. So, that's one service. Another thing I'd put forward is how to take care of secrets, and your API keys for instance. So there are two services for that, one is called Systems Manager and the other one is called Secrets Manager. In my opinion they have some differences, but you can use both for handling your secrets and your API keys, and not having to version them so much.
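
As an illustration of least privilege, the sketch below builds an IAM policy document that grants one function only the DynamoDB actions it needs on one table. The table ARN and the chosen actions are hypothetical. Anything not listed stays denied, since IAM denies by default.

```python
import json


def least_privilege_policy(table_arn, actions=("dynamodb:GetItem", "dynamodb:PutItem")):
    """Return an IAM policy document allowing only the given actions on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": table_arn,
            }
        ],
    }


# Hypothetical table ARN, for illustration only.
policy = least_privilege_policy(
    "arn:aws:dynamodb:eu-west-1:123456789012:table/articles"
)
policy_json = json.dumps(policy, indent=2)
```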

Jeremy: Right. And then the last one, and this is important, monitoring, how do we monitor a typical serverless application?

Xavier: So, this one is extremely important. We are in a distributed system and we are in an asynchronous system, so understanding what's happening, what event triggered what action, is indeed complicated. So, out of the box CloudWatch is the de facto solution, it's connected to all AWS services, at least all the ones we mentioned before. And that's where you're going to have all your monitoring, you're going to find all your logs, you're going to be able to customize the logs and the metrics. If you want to push that, you're going to be able to define, not easily, that's still an issue, but it's powerful, you're going to be able to define your dashboards, your alarms. And you can do a lot of stuff in my opinion with CloudWatch. In my opinion it's almost self-sufficient. But one thing you need as well, in terms of scalability, is to understand what's happening in your whole system and CloudWatch is not enough for that.

CloudWatch is going to be micro, it's going to be a bit too narrow, on some very specific pieces. What I particularly want to understand is, this is the entry to this whole flow, I want to understand why it created this error later in the flow. For this there is a service called X-Ray, which does end-to-end tracing between your services, that's supposed to be the go-to. But, and you can correct me if you think it's not the case, X-Ray is not supported by all services, and EventBridge, which is at the heart of our microservice architecture, is not supported by X-Ray, which is a little ...

Jeremy: All right. You just listed or we just went through 11 different sort of categories within a typical serverless architecture, and we probably mentioned 20 services, and there's even a few in there we probably didn't mention. So, the criticism I think that you got from this article that you wrote, and it's probably valid criticism, nothing on you, the article was great, and I think this is exactly what a typical serverless architecture looks like, but the criticism was, "Wow, this looks really complex, right, you've got a lot of different things." So, what do you say about that complexity and also who does that complexity now fall on, right, because this sounds like the developers in this case are going to be doing a lot of these things that maybe were the Ops jobs in the past?

Xavier: Yeah, that's the key question and the key topic when I released this article. And to a person that criticizes architecture, this article, I think I'm going to show them the video of ... What's his name already, the one that sings with the piano with all services of AWS?

Jeremy: Yeah. Forrest Brazeal, yeah. Forrest. Yeah.

Xavier: That video is amazing. If I send them this video, I think they are going to tell me, "You see, we are right, it's true." There is a crazy amount of services. But it's just the way of thinking, the mindset needs to shift. The complexity and the pieces you find in an architecture at the same, let's say level of application, it's the same, we are not inventing something, it's the same, we're just reorganizing a little bit the way you connect those pieces. So, of course the critics say, "I don't understand why there is so many pieces, you take a monolith, you connect it to AWS and it does everything you mentioned there for you and there is no issue." And that's true.

The thing is that behind this monolith you will have potential scalability problems afterwards, you will have them at some point ... Your company is going to grow and you will have to split it into services and you will have to be able to make them scale. And you're going to, bit by bit, explode this architecture. And here serverless is asking you to think about that ahead of time, right from the start, but it's also helping you to do that properly and to not have to think about scalability anymore later, not at all. So yes, indeed as well it changes a little bit the weight on the developers, the developers have to think from the get-go about this organization and how to split it and how to communicate between those different services.

But once again, it's something ... eventually ... And we are talking about applications that's going to need a little bit of scalability or that are going to have some up and down traffic, that will need eventually this kind of stuff. And we're going to get it from the get-go, which is super comfortable. And from experience, it's positive.

Jeremy: Right. And you mentioned a good point about eventually companies are going to have to split up monoliths, and I know that there are a lot of companies that run on these big monoliths, but have you ever worked on one where there was a new piece of functionality that needed to be added? It's always like, "What am I going to break if I put in some new service?" And I've always told people ... 10 years ago or so, I would say to people, "When you're building a new application, you just want to get it out there, right, like don't worry about scalability, right.

I mean, if you get a thousand users and your servers are overwhelmed, then great, right, then you now have something and maybe you could start re-architecting and getting it where it needs to be. But you don't want to spend all that time building in that scalability on a typical server app or a server-based application right from the get-go, you want to get the project out there." But that changes dramatically with serverless, because you can build these things very easily, you can get them out there very quickly, and if you just think about a few things with the scalability aspect of it, in terms of what you need to do to make sure that your functions are split independently and some other things, you can scale right up and you don't have to rewrite anything.

Xavier: Yeah. And there is a good practice when you develop, when you have your development flow: a good practice is always to cut down your features in very small pieces, to be able to have good, like, mental control of what you're doing, and to make sure that you're not going to generate [inaudible], because it's easier to think through. So, this human thing of splitting things in small pieces to make sense out of them ... It applies as well to the building of an architecture. So, at the time, like ... Serverless started to become popular five years ago, or something like that?

Jeremy: Yeah, about five, it's been five years or so. Yeah.

Xavier: So, a little bit before that. If you wanted to be able to do those microservices, it was a little bit more complicated because of the infrastructure you had to put around it, because of all the complexity and all the tools you had to put and that you had to manage yourself to make them work. Now, the super nice part is that AWS took care of this complexity for you and you can split this complexity in several pieces to make sense out of it from the get-go. And it's just a simple idea. It's extremely powerful. And you can see it in a lot of other fields.

Jeremy: Right. Yeah. And I think that some of these individual services that are available to us help, just by their sheer, I guess, design or the way that they're meant to be used, already have the thought of scalability built in. Like DynamoDB for example, right, like thinking through those access patterns, your database is going to scale for quite some time, you don't have to worry about it. You build something in MySQL or something like that, and that's going to work great until you maybe have a hundred thousand records that need to do some sort of cross-join or something like that, with these complex filters on them and then all of a sudden that's going to slow down.

Xavier: Yeah, that's true. We didn't ... I think I forgot about DynamoDB in the business part of my typical architecture. But yeah, DynamoDB is a serverless database we recommend as well from the beginning. And indeed, it's amazing to see as well the shift in paradigm ... It's always around, mainly around performance and making sure the system you're building is performant and long-lasting at the same time. And DynamoDB is a really great example of that, because when you did SQL, you thought about a clear relational model that was easy to think about. Okay. So, it was normalized, everything made sense, all those models had a link and a clear unity and you saw the direct connection between them. But the issue is, it was growing and at some point your SQL queries got bigger and heavier, and there you had to do some tweaks, and here you had some performance issues, and complex performance issues usually.

So, you had to think about reshaping your data or improving your SQL performance ... request performance. And another thing, you had to think about scaling your database, but scaling a SQL database is much more complicated, because it's a huge blob of data and you don't exactly know when and how you access the data, because that's inside of your SQL queries. So, doing horizontal scaling on that is almost impossible, because you're going to get some data here, some data here, some data here, across this horizontal scale, and it's going to be terrible in terms of performance. So, DynamoDB shifts that. So, think about how you access the data, think about performance and think about how you're going to store this data from the beginning, from the moment you put this data inside of the database. And there it works, after that it works.

I'm not saying it's easy, entering the DynamoDB world is funny, you have to ... Once again, it's a change of mindset, but it's an exciting one, I really enjoy thinking about that, how to structure and how to store data inside DynamoDB. But then it works, after that it works. So, there is also always this criticism: "But do we need that, do we need this change and this, let's say, extra complexity from the beginning?" But there is that ... Of course, you have to learn something new. But it's not ... the slope is not that steep and the reward after is amazing, is just amazing. You don't have to think about that stuff later. And I hope your business is going to be powerful and I hope your application is going to have a lot of users, and you're not going to need to think about that.
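
The "think about how you access the data first" idea can be illustrated without any AWS dependency. The class below is a toy in-memory stand-in, not DynamoDB itself: it only mimics the shape of a table with a partition key and a sort key, where a query always names one partition and can narrow by a sort-key prefix. The `USER#` and `ARTICLE#` key formats are a hypothetical naming convention.

```python
from collections import defaultdict


class TinyTable:
    """Toy in-memory model of a table with a partition key (pk) and sort key (sk)."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put(self, pk, sk, item):
        self._partitions[pk][sk] = item

    def query(self, pk, sk_prefix=""):
        # A query touches a single partition and returns items in sort-key order.
        partition = self._partitions.get(pk, {})
        return [partition[sk] for sk in sorted(partition) if sk.startswith(sk_prefix)]


table = TinyTable()
table.put("USER#42", "PROFILE", {"name": "Ada"})
table.put("USER#42", "ARTICLE#2020-10-01", {"title": "First"})
table.put("USER#42", "ARTICLE#2020-10-12", {"title": "Second"})

# Access pattern decided up front: "all articles for one user, oldest first".
articles = table.query("USER#42", sk_prefix="ARTICLE#")
```

Because every read names its partition, the data can be spread across many machines without a query ever having to gather rows from all of them, which is the contrast with ad hoc SQL joins drawn above.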

Jeremy: Right. Yeah, and I love DynamoDB. And advice I give too is, "If you're thinking about massive scale, DynamoDB ... You're not going to get the performance out of a SQL database, no matter how much ... with MySQL or PostgreSQL, whatever you do." But with DynamoDB too, if you're using very small ... if you have a small set of data, but you need to query it a couple of different ways, throw a couple of extra GSIs on there, it's so easy to do that. But anyways, so I'm going to put the link for this article in the show notes, because I do want people to go and check this out. But it is very complex, and it has to be, right? As things get more advanced, they get more complex. So, how can people start building this out? You're working on a boilerplate for this, right, that can just help people maybe?

Xavier: Yeah, exactly. So, in our company we do develop and we do Greenfield projects a lot. And so we have to have something that's strong and that's easy to use and that has all the quality tools and good practice and the best items from the get-go as well. So yeah, we are making a boilerplate for this purpose and it's like how to handle well a mono repository across microservices. Stuff like that. A lot of little steps that have a lot of value and you're really happy they're there, because it takes some time to set them up. And so we're doing that and we're thinking about sharing it with the community, so it's going to go ... it's going to come super, super well along with this article.

And then we have some specific services that we want to package as examples, which are useful for us, so we think they can be useful to others. It's not that complicated, but there are some specificities around security, like someone being able to push a file anywhere on S3 or being able to just put an index.html at the beginning, just as a hacker, to have some fun. Those are things that could happen if you don't know about some good security practices. So, we want to push some blocks, just pieces of architecture, as examples alongside this boilerplate. The WebSocket one for instance as well is something we are thinking about.

Jeremy: Awesome. Well, that will be a huge help for the community and people building it. So, I am looking forward to that being available. All right, so I want to move on to costs, because that's another thing, you look at this typical serverless architecture, you've got a lot of individual Lambda functions running, you've got CloudFront out front, where obviously the data transfer fees can get expensive. You've got DynamoDB, which can get expensive if you have a lot of transactions and a lot of things happening there.

So, you do have serverless costs, or serverless still costs you money. But the question is, or I guess the question that gets asked quite a bit, and most of it is anecdotal, is this idea of is serverless cheaper? And you have a serverless calculator that you put together, and I want to get to that, but let's talk about what we have to think about, what are the costs involved when we're building a Cloud application, because it's not just about the hosting costs.

Xavier: No, it's not. It's not. Intuitively when you don't know about serverless, that's what you're going to think, you're going to think about your EC2 bill and then you're going to compare it with the corresponding bill in AWS with all the services you have in a serverless product. But it wouldn't really be fair to compare them in this way. We didn't talk about costs along the ... at the beginning of this whole podcast, because we knew we would arrive there, but that's also one of the big advantages of using this technology. So, we talked about those services, we talked about services that are scalable, we talked about services that already have pre-made, let's say prepackaged features, like Cognito. And if you took a step back, that's when you think and you can use the idea of TCO, total cost of ownership, which encompasses infrastructure costs, development costs and maintenance costs.

Xavier: And that's where you can really see the value of serverless. So, we talked about the infrastructure costs, indeed that's your big ... you can compare, that's easy to figure out, the cost of your infrastructure. And then maintenance costs, so it's going to be the time you spend to patch your systems or to make sure that in the middle of the night something ... to fix something in the middle of the night. And the last one, the development costs, is the time you're going to spend to plug an open source authentication bridge inside of your architecture and to make sure that it's working well with the rest of an architecture that was not out of the box made to go with it.

And the idea with serverless and all the services that we went through together, is that they all work on those three aspects of TCO. So, when you look at the serverless bill at the end and all the services, you have to look through it and think that they help you on the TCO itself. But then my question was, and it's the same for the first article about the typical architecture, when I first saw it, it was complicated, it's a new world, it's still a new world even though it's five years old.

And to understand what's serverless architecture and then is it really cheaper, how do I make sure that it's ... how do I understand that it's really cheaper. There are some examples, there are some calculators, there are some articles talking about TCO, some great ones, there are some studies or use cases. But still I was missing something, so I wanted to work on that. That's why I worked on the calculator.

Jeremy: Right. So, I want to get to the calculator in a second, but I do want to clarify a couple of things too, because I think what you see typically when companies try to build out their own component, right, let's say they want to build their own security or authentication component. Part of the problem with building any system that is your own, something that is not unique to your company, and as we always say undifferentiated heavy lifting, right, it doesn't add business value to you for you to own your own authentication system.

The problem with building something like that on your own is not only the development time that it takes, if I can turn on Cognito or even Auth0 or something like that and I can immediately have secure logins and password reset flows and multi-factor authentication, all this stuff's built in for me, and it might take me, I don't know, maybe a half a day to read the documentation and set that up, maybe it takes a day. But it might take me weeks and weeks and weeks of development time to build that myself. Then here's the problem, let's say the person who built that, the lead architect, that person leaves and he or she goes to some other company and now your other devs have to be the ones to maintain a system that they don't necessarily understand fully.

Which means ... what always happens when they do that? Someone wants to rewrite it, make changes or whatever and then you've got more development time, more maintenance costs, things like that. Whereas if you hire somebody who knows how to use Cognito or knows how to use Auth0, then that's just done for you. Yeah, you have to learn how it integrates into your system, but most of that is becoming very standard. So, that's a thing that I think a lot of people don't necessarily factor in as well, is that ongoing maintenance cost. It's also about having the people available that know that, to maintain it. Because you can spend a lot of time having somebody relearn something, trying to figure out what some developer did three years ago, and is now unreachable.

Xavier: Yeah, I couldn't say it better.

Jeremy: All right. So, we talked about TCO, which is great. So, let's get into this cost calculator, into the serverless cost calculator. So, you built this calculator, it's in Google Sheets right now, right, so you can just make a copy of it and you can do some of that stuff. I know you said maybe you'll do it as a web app or something like that, but I think Google Sheets actually is really nice, because you can go in, it's an easy interface. So, tell us about this serverless calculator, what's different between like the AWS calculator, that AWS has on their website?

Xavier: So, the AWS calculator itself is extremely powerful, but you have all the services, the 250 services, I don't know how many. You can find it in the video we were mentioning. So, it's crazy, you don't know where to look. And the thing I wanted is ... my focus is to evangelize a little bit about the added value of serverless to really help people, that's what I did, simply. And I wanted to create something that could help people that don't have crazy, crazy expectations of serverless, to be able to assess the potential cost of a serverless project for their use case. So, I needed to think about ... And the AWS cost calculator is like that, it's complicated, it's difficult, it's micro.

So, I wanted to build something like that, so I thought what can I do for that, what's going to make a difference. There were a lot of articles, there were a lot of videos, but which one do I set, which one do I fix and then I don't think about them anymore. And that's why as well I thought about writing, it was just the first step, I thought about writing what is a typical serverless architecture. Because I wanted to set a fixed picture and start to set some variables for the potential calculator coming behind. And which services, how do they communicate with each other, when are we using them and stuff like that? So, build this image of what a typical serverless architecture is.

And then I went further and I made the calculator. So, the calculator it has a table ... and it's a spreadsheet, it's easier indeed to play around with it, it's easier indeed to ... The idea as well with the spreadsheet is to be transparent, so it's easier to be transparent with a spreadsheet. Everybody can see everything, the calculation and the hypotheses we made. So what are you going to find there? You're going to land on it and you're going to see the main data, that's where you're going to see, the price algorithm.

You have some basic variables on the left, you still need some, because you need variables these days, of course, and then the price output on the right. And below you're going to have the services. And then you have several tabs, one for each of the AWS services we were mentioning before. So, the entire calculation is based on AWS calculator costs and the variables that are necessary for each of them. So, we used a color code to show that there are some that are really set, because they don't have a big impact on the cost at all, because there is no reason to really tweak them, it's okay like that, your architecture is going to be fine on that.

And then there are some with a different color code that are going to be put back at the central dashboard, to be able to play with them and see the impact it has on the cost. So, you can use the calculator in two ways, you don't know ... you just want to play around, you have a broad idea of the website or type of website, an eCommerce, a blog, a work tool and with a certain amount of traffic. So, just that, so we have some drop downs and you can change them and directly see ... play around and directly see the impact it has on the cost per month. And if you want to go further, from there you have the capacity of changing some user variables, I call them like that because I think that's ...

That you don't know about serverless architecture, so you are not going to know how many Lambdas you're going to trigger, you're not going to know how long they're going to be triggered. So, I needed to pre-think and prepare those variables with an extra layer. So, here you're going to be able to play with user variables, like the number of sessions per average user, the percentage of users authenticated, so this is going to have an impact on Cognito for instance. So, average size of ... a product size, stuff like that, which are more product-oriented. So, I don't know, I'm a CTO and it's been a long time since I coded, it happens, and I want to be able to assess the value of this technology for my company. That's my idea, to help him to be able to do that.
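The extra layer Xavier describes, where product-level "user variables" are translated into service-level usage, might look roughly like the sketch below. It is only an illustration in Python, not the spreadsheet's actual formulas: the field names, the requests-per-session variable, and the example numbers are all assumptions, and no AWS prices are included.

```python
from dataclasses import dataclass


@dataclass
class UserVariables:
    """Product-level inputs (hypothetical names, not the spreadsheet's)."""
    monthly_users: int
    sessions_per_user: float      # average sessions per user per month
    authenticated_share: float    # fraction of users who log in, 0..1
    requests_per_session: float   # assumed API calls made in one session


def derive_usage(v: UserVariables) -> dict:
    """Translate product-level variables into service-level quantities."""
    sessions = v.monthly_users * v.sessions_per_user
    return {
        "sessions_per_month": sessions,
        # Would drive the authentication (e.g. Cognito) part of an estimate.
        "authenticated_users": v.monthly_users * v.authenticated_share,
        # Would drive the compute (e.g. Lambda invocations) part.
        "api_requests": sessions * v.requests_per_session,
    }


# Made-up example inputs, purely for illustration.
example = derive_usage(UserVariables(1_000, 1.5, 0.2, 10))
print(example)
```

Someone who has never counted Lambda invocations can still guess at users and sessions, and a layer like this turns those guesses into the quantities a per-service price would then be applied to.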

Jeremy: No, I mean, it's amazing. And I love the calculator, because it basically says, right, here's that typical serverless architecture, here are the individual components that are running. And then not only that, but trying to estimate well how many DynamoDB requests are you going to make or how many Lambda functions or what does it cost for the events. All these kinds of things. If you just look at 30 services, or whatever it is, it just gets really, really hard to estimate that.

And so, like you said, if you're a CTO or you're just somebody who's evaluating serverless, going in not only can you use it to actually give you some good numbers, but it's a really good learning tool to go dig in and say, "Oh okay, all right, so these would be the components that would be running. If I change this, I see how that affects the cost." I love the fact that you can bump up the number of Lambda invocations dramatically and the cost changes by like $3. But anyway, I ...

Xavier: And you can see the impact on the whole ... on the rest of the system, that's something I like as well, because those programs ... That's something I like as well. Which is still cheaper. We're going to talk about that after.

Jeremy: No, but it is, it's really great and those different predefined scenarios are also really helpful as well. Because even if it doesn't meet your use case exactly, it's going to be pretty close. So, let's do this, because you give a couple of examples in an article, you wrote a blog post about the calculator itself, which was a very helpful blog post as well.

So, let's go through some of these scenarios in terms of low, medium and high traffic, because this is one of those things where you did a comparison for just the cost, not including engineering time, not including maintenance costs, this was just straight compute or infrastructure costs. So, let's start with the low traffic one, so this was about 50 sessions a day. If it was an eCommerce site, what would it cost you in a typical serverless application?

Xavier: So, I wanted to start slow, because I wanted to compare easy accessible values, so indeed the cost of EC2, the simplest process we can find, there is the architecture we mentioned. So, if you do 50 sessions per day of an eCommerce app, you're going to pay $1.6 per month, so including the free tier, but the free tier's always free, so it's going to be like that for everybody. And if you do a blog for instance, so a little bit different in terms of user variables, it's going to be a little bit less of course, it's going to be $0.5. Okay.

So, I thought in front of that what ... if we're not going to do serverless, what are we going to do, it's something super simple, but you need to have some compute, you don't have a choice. So, AWS for that has this tool, EC2, the simplest one, the T3a.nano. And how much is the T3a.nano? It's $3.43. So, just this little picture is amazing in my opinion, you can compare, it's two times more expensive than the eCommerce example we mentioned and it comes naked, you don't have anything in this, you have to put your framework, you can put your database in there if you want, but I don't think it's such a good idea. So, let's say you're going to use RDS to have an extra database and you will manage it. So, you need to add the cost of RDS, which I didn't even include there. So, it's ...

Jeremy: Right. Yeah. No, that's what I was going to say, is that what I thought was funny about this example, for $3.43, that is just running that T3a.nano for 720 hours a month or whatever it is. But you have no redundancy, so likely what you need to do is you need to add a load balancer, which I think costs like $25 a month, and have at least two of these things running so that they're there.

You also said, again RDS, like you would want to put RDS in there somewhere, even if you choose the cheapest RDS, just one server, I still think that costs you maybe $15, $16 a month, something like that. So, already you're exploding those costs. Now, again, that's pennies, right, we're not talking about a lot of money there. But what about if we move on to something a little bit bigger, something like a medium traffic, so where would we ... so 2,000 sessions per day?
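The low-traffic comparison can be written out as a quick calculation. This sketch uses only the monthly figures quoted in the conversation; the load balancer and RDS numbers are Jeremy's rough recollections rather than official prices, and the function and constant names are made up for illustration.

```python
# Monthly figures quoted in the conversation (USD).
SERVERLESS_ECOMMERCE = 1.60   # calculator estimate at 50 sessions/day
T3A_NANO = 3.43               # one instance running all month
LOAD_BALANCER = 25.00         # "like $25 a month" (rough recollection)
CHEAPEST_RDS = 15.00          # "maybe $15, $16 a month" (rough recollection)


def redundant_ec2_setup(instances: int = 2) -> float:
    """Instances behind a load balancer plus a single RDS server."""
    return instances * T3A_NANO + LOAD_BALANCER + CHEAPEST_RDS


# A single bare VM against the serverless estimate.
single_vm_ratio = T3A_NANO / SERVERLESS_ECOMMERCE
# The redundant setup Jeremy describes.
full_setup = redundant_ec2_setup()

print(f"One T3a.nano vs serverless: {single_vm_ratio:.1f}x")
print(f"Two instances + load balancer + RDS: ${full_setup:.2f}/month")
```

With these rough inputs, the bare instance is about twice the serverless estimate, and the redundant setup is well over ten times it, while still being small money in absolute terms.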

Xavier: So, 2,000 sessions per day, same type of applications, eCommerce app, a serverless one, $70 per month. Okay? And the blog is going to be $20. Of course eCommerce is a bit more intense. Then I continued the same kind of comparison at this layer, it was if I'm capable of showing that EC2 is more expensive than serverless, I don't have to go further and dig into the TCO topic, because you don't even have to. And there I took ... from experience, and I asked colleagues and I took a M6g.large, it's one VM which is corresponding to this size, let's say of application and traffic.

This VM is $62 per month, so I said the AWS serverless app was $70 per month, the blog was $20. So, we're reaching ... the VM is slightly lower in cost compared to the serverless eCommerce app, but it's super close. And once again I said it was enough, because once again the VM cost is nothing. You still have to put your database, as you said, you still have to add RDS and some background. So, no questions asked.

Jeremy: All right. So, let's move up to the big one, this is the high traffic, 40,000 sessions a day to maybe up to a million sessions a day?

Xavier: Yeah. So, there I reached a certain limit, high traffic and very high traffic, that's how I called it, and there indeed it's more complicated to compare, those systems are a lot more complex, there are many more blocks in there. And if I had to compare serverless projects, like the one we showed, with the same in Kubernetes, with all the crazy amount of pieces you can find there, it's a good one, and that's something maybe I'm going to do one day, but here I did not have this information. So, here I took the cost of course of a serverless eCommerce cost based on the calculator and it's 1.7 K, it's $1,700 per month for high traffic. Okay? And a blog is going to be $475. Okay?

So, we're talking about 40,000 sessions per day, which is starting to be nice per day, starting to be a good thing. And then you can go super far and think about a million sessions per day. A million sessions per day, when you're there you're pretty successful normally. Normally you would say it's pretty successful. I don't think ... I don't know if I put a website that's close to this, an example of ... No, I didn't. Do you have an example of a good website that has a million sessions per day? I don't know, it would be nice to be able to compare.

Jeremy: Yeah. It would be good to compare that. But back to the 40,000 sessions per day, $1,700 USD per month for that, and then you had for, very high though, if you did a million sessions per day, you had $49,000. Is that right?

Xavier: Okay. Yeah. So, it's $49,000, so you are giving AWS $50,000 per month, so it's nothing to blow at, but at the same time your company in front of that is big, you have a million sessions per day, so you do have something that brings ... Generally when you're eCommerce, so you know you sell products or you're a blog, so you have advertisements. You need to reassess how that $50K enters inside of the business, what are the costs. And I don't think for the size ... it can be acceptable. But then to be able to really ... So, I did not have to compare this with EC2, there is no way of comparing this with EC2, so I didn't put a comparison in from here.

I needed to take a step back to think about this idea of TCO, total cost of ownership. So, I dug inside of studies, of articles, I read a lot about that, and a lot of those articles say that the amount of energy you put on non-specialized apps, it's going to lower, it's going to run by ... It depends on the articles, but it will really have an impact and it will give birth to less amount of time spent on that. There are a lot of examples of companies that were serverless that don't have Ops, they don't have Ops at all, they don't. But it doesn't mean that they don't need Ops, it just means that they use it, instead of the rest of their tech people.

But let's say, for the sake of the exercise, that you're not serverless, you have two Ops, okay, you have two Ops inside of your team and you patch servers, so you reduce by two. Let's say you reduce by two the size of the Ops team, because half of the staff are not necessary anymore. If you take, for example if you take the salary for the company, of course it's always ... it costs more for the company than what you see at the end on your payroll, of an engineer, it's $140K per year. Okay?

So, with serverless you're going to go down to one Ops, so it means that you're going to save $12K per month. Okay? So, this team of two Ops, it was for a high traffic website, for the one that had 40K sessions per day. We were talking about an AWS bill of $1,700 per month. So, it means that you saved $12,000 per month and you're paying $1.7K per month straight away. So, there is a saving of $10K in the middle of that somehow. So, if you think about TCO, of course it's always more complex than that, but the idea is here you think about TCO, even when you go further and you go to extreme traffic, it can have ... Once again ...

Of course, we are not talking about Facebook or Twitter, that's an extreme. And if you reach this, it's amazing, you don't even have to read this article if you reach this anyhow, that's for sure. And there of course for a lot of reasons you're going to have your own team to do that and it's going to be huge, it's going to be amazing. But for the most of people, for us, for the ones that read this article, it's showing that indeed serverless can be cheaper and can be amazing for their usage. That's what I wanted to share.
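The Ops-team arithmetic for the 40K-sessions-per-day example can be spelled out as below. This is a back-of-the-envelope sketch using only the figures Xavier quotes; the function and constant names are illustrative, and the cost of whatever non-serverless infrastructure the team would otherwise run is left out because the conversation gives no number for it.

```python
ENGINEER_COST_PER_YEAR = 140_000   # cost to the company quoted by Xavier (USD)
OPS_BEFORE = 2                     # Ops engineers without serverless
OPS_AFTER = 1                      # Ops engineers with serverless
SERVERLESS_BILL_PER_MONTH = 1_700  # calculator estimate at 40K sessions/day


def monthly_staff_saving(before: int, after: int, yearly_cost: float) -> float:
    """Monthly saving from needing fewer Ops engineers."""
    return (before - after) * yearly_cost / 12


saving = monthly_staff_saving(OPS_BEFORE, OPS_AFTER, ENGINEER_COST_PER_YEAR)
net = saving - SERVERLESS_BILL_PER_MONTH

print(f"Staff saving: ${saving:,.0f}/month")
print(f"Net of the serverless bill: ${net:,.0f}/month")
```

The exact results are about $11,667 and $9,967 per month, which Xavier rounds to $12K and $10K in conversation.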

Jeremy: Yeah. No, and I think you're right. And then there was ... you referenced this paper on TCO by Deloitte, I will put the link in the show notes for this, but I think that that's a really interesting point, right, just the idea of reducing the number of things you have to do that are just maintenance. Again, it adds no value to your company to have somebody installing patches on a server somewhere or worrying or being on call. Now, again, you get to Facebook level or maybe you get to a point where you're at that million sessions per day, maybe it ends up being cheaper just to have Ops people that are constantly watching containers and a Kubernetes cluster and things like that.

But until you get to that point, you can save yourself a lot of money by reducing the number of people that you need working on the Ops side of things. And I want to say this, because this is something I know that a lot of people maybe worry about, is to say, "Well, what happens to the Ops jobs?" So, we just say, "We get rid of one of the Ops people." And that may be true, maybe it's better to say, "We just won't hire more Ops people." But what would you do as an Ops person in an architecture like this, right, because there's still more things that can be done?

Xavier: There is still a lot of things that can be done. Of course, you have the challenge of observability that gets bigger, that gets greater, you have a lot of different moving pieces and you need to understand what's happening there. So, there is some more work to put inside of observability, which is a critical field to dig into. And for me, it's also a DevOps task. And then you have a lot of security concerns and there is always going to be some security concerns. And as well you can spend some more time to think it over and improve your security, which is good for everybody, because there are always leaks. Every week we have news of a new leak. And so you spend some more time on those kinds of tasks.

Jeremy: Right. And CI/CD, right, just thinking of getting things into production?

Xavier: Yeah indeed, that's amazing, thank you for that, I forgot about this one, which is a big one. Yeah, CI/CD, CI/CD is still super important and you want your developers to be as fast as possible and to have as few issues as possible during the development and testing and validation phase. Serverless gives you the opportunity, like really easily, to start implementing feature environments, just with CI/CD you are capable of ... with a simple configuration, to create feature environments on the go, like that, for testing new developments in an isolated manner.

And this ... it's amazing, honestly before serverless, I worked on projects where we had that, but it was a hassle and for big projects we have to reach a certain level to be able to ... or to want to spend energy in building them. Now, with serverless, finally you're capable of doing it. And that's amazing. So indeed, your Ops team is going to be able to do this fancy stuff that you dreamed up even more easily.

Jeremy: All right. So, if you're an Ops person, don't worry, just evolve and you can work on things that are much more exciting than just patching servers and worrying about your VPC configurations. All right. So, finally, so next steps on the calculator, right, so this is ... you mentioned at the start that you've got some other things that you want to do with it, so where is this going to go?

Xavier: So, I'm going to continue using it, I'm pushing it internally, but I want to go further. So, I want to be able to challenge it based on some real use cases. So, I have had inputs, I'm working with some fellow community members, which is super nice, to be able to do that. I will challenge the data we have to make our calculator better. And then we can add some of the services, like AppSync for instance, because AppSync is pretty popular, and getting more popular, and I didn't put it inside of the calculator.

So, give a few more options to the table. I need to find the right balance, because the idea of an opinion is to have an opinion. The opinion has value in itself. So, if there are too many options, I'm going to come back to the AWS calculator and ... I need to find the right balance, but some options are good. And then, depending on that, next step could be potentially to build an open source website for it. I think it could be pretty fancy.

Jeremy: So, listen Xavier, thank you for your opinions, because your opinions on this have been great. Excellent tool, the article was great, I think there's a lot of learning, just to put it all out there and having your experience and sharing that with people is amazing. So, I know I appreciate it, I'm sure the community appreciates it. And again, thanks for being on the show and sharing all this knowledge here. If listeners want to find out more about you, how do they do that?

Xavier: So, Twitter, of course, first. You're going to give it after, I guess, it's ...

Jeremy: I will put it in the show notes, yeah.

Xavier: Yeah, you're going to put it ... But I'm going to say it out loud, it's @xavier_lefevre, X A V I E R underscore L E F E V R E.

Jeremy: Okay. Right, I spelled your name right, awesome.

Xavier: On Twitter, on LinkedIn, of course on Medium and I'm planning on writing articles, but even more than that, I'm planning on updating my articles, I want those articles to stay up to date, I want this typical architecture to still be current in six months. So, I'm going to work on ... and I have some ideas already, I'm going to work on an update of my articles, so you can follow it along on Medium.

Jeremy: Awesome. Well, I will put the links that we mentioned in the show notes, as well as all of your contact information, and the serverless cost calculator. Thanks again, Xavier, it was awesome.

Xavier: Thank you very much for inviting me. So it's me that thanks you.


This episode is sponsored by Amazon Web Services!