Designing Events-First Microservices

I'm really excited to be here in New York at QCon, and it's a humbling experience to be part of this excellent microservices track. I'm looking forward to some excellent talks, with a great talk directly after mine as well. I'll try to add to the story by talking about how to design events-first microservices. It will be somewhat theoretical, but hopefully also practical: I'll try to break down some harder concepts and make them digestible. It's more food for thought, and if you find it interesting, you'll know which areas to dive into and explore on your own.

I guess you want to do microservices, and hopefully it's for the right reasons. There's a lot of hype and buzz around microservices, and rightfully so — it's a really nice design pattern, a nice way to implement systems — but it's also easy to get carried away and drink the Kool-Aid. It's very important to take a step back and ask: do I need microservices in the first place? The right reason to do microservices, in my opinion, is to scale the organization: to break it up into multiple autonomous teams that can deliver software faster and more intelligently, where time to market matters — and not necessarily to scale the system. However, if you do it right, as I'll try to show in this talk, you can actually get the best of both worlds: a system you can use to roll out features faster and more predictably, but that also scales much better and is more resilient. We're using autonomous components that can fail in isolation and bounce back in isolation, cutting short the cascading failures we sometimes saw in the past with EJB application servers and other tightly coupled systems.

I think it's important, though — and I see this a lot when I'm out there — that many companies, when they go about microservices, end up with something I call a "microlith." Going from a monolith to a microservices system, they take all the synchronous method calls, the method dispatch, and simply replace them with request-response in a synchronous manner: synchronous RPC. And they still use the almighty Oracle database, or whatever, doing CRUD in this fully synchronous fashion. The problem is that you bring a lot of the problems of the monolith over to microservices, and you maintain the strong coupling that makes it really hard to build systems that scale and are available. You solve the problem of scaling the organization, but you miss out on all the other benefits of microservices, because that strong coupling limits things like scalability, availability, extensibility, maintainability, and the understandability of the system. I think we can do better than this, and I think thinking in events can really help here — domain-driven design from an events-first perspective. That's the topic of this talk: trying to view the world through events, and understanding what that means.

Domain-driven design — I'm sure all of you are familiar with it. The term was coined by Eric Evans; I think it was 2003 when the canonical book came out, and it has really served us well for the last fifteen years. The problem is that it can easily lead us down the wrong path, because it puts the focus on structure too early in the design process. It puts the focus on finding the domain objects — finding all the nouns. I was learning computer science back when object-oriented programming was really popular, and the first step was: go out and find the nouns. It's the same sort of thinking in domain-driven design. There's nothing bad about that per se; it's just that in this new world of distributed systems it's not the ideal way to start, because it focuses on the structure of the system too early in the design process, instead of trying to understand what the system actually does — which is really the important thing. Greg Young once said that when you start modeling in events, it forces you to think about the behavior of the system, as opposed to the structure of the system. That's one of the main benefits of events-first design. So we should not focus on the things, the nouns, the domain objects we've been taught to look for; instead, we should focus on what happens in the system — find the verbs. The verbs are usually the events that flow in the system; they help us understand who is communicating with whom. I'll dive into what this means in more detail soon, but first let's start with the basics: what is an event? I'm sure there are a zillion definitions; I'll give you mine, so you know the context I'm coming from. First: events
represent facts of information. A fact is something in the real world that has already happened, something we take to be the truth. It might not actually be true — we might be wrong — but it's something we have accepted as a fact. The nature of facts is that they are immutable: you cannot change the fact that something has happened, so to speak. This means that facts can only accrue; knowledge can only grow. We can only learn more and more in life — sometimes we forget, of course, but in principle knowledge only grows. So facts can only accrue; you can't delete facts. In this context I'm talking about facts represented as events — events first, so to speak.

Facts can, however, be disregarded. I can choose not to believe a certain fact; I can choose to ignore it if it's in violation of what I already think is true. So I can choose to disregard facts, but once I have accepted them, in this conceptual framework, they cannot be retracted. I can't go back and delete the fact that something I've already understood has happened. And since facts can't be retracted, they also can't be deleted. However, with GDPR and everything, there might be reasons where facts in practice do have to be deleted — for legal reasons like GDPR, or for moral reasons, keeping your customers safe — but in principle they can't be deleted. It's also very important that new facts arriving into the system — or that we as humans accept — can invalidate existing facts. We do that every day: we learn more and realize that what we thought we knew in the past is actually not valid anymore; the world has moved on, or we accepted some fact that was actually false.

So, to sum up: facts are immutable; facts only accrue; knowledge only grows; you can choose to disregard facts, but once accepted you can't delete them; and new facts that arrive can invalidate existing facts in some situations.

So what should we do when we come to the design process? We should ask ourselves: what are the facts? Just like a detective arriving at a crime scene — most people have seen CSI, or read their Sherlock Holmes, right? You should try to mine the facts, try to understand the causalities: what happened, what really went on at this crime scene. Event storming can really help here. Event storming is a fairly new technique that came up in the last five years or so, where you bring all the stakeholders into one single physical room — all the domain experts, all the programmers, and so on — and with Post-it notes you brainstorm, trying to understand what's going on in the system: how data flows, who's communicating with whom, trying to mine the facts and also the commands (I'll get to commands quite soon). We're trying to find the causality between the components and the services, and the information flow — that's actually the business logic. That's usually where the value in your system resides, more than in the nouns; the actual things can be replaced. Data flow is value. We all know data is value nowadays — that's why people gather more and more of it and try to mine more and more value out of it.
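To make the facts-as-immutable-events idea concrete, here is a minimal sketch (the event name and fields are illustrative, not from the talk) of representing facts as immutable values in an append-only log: a fact can never be mutated, but a newer fact can supersede it.

```python
from dataclasses import dataclass, FrozenInstanceError

# A fact, represented as an immutable event (names are illustrative).
@dataclass(frozen=True)
class OrderCreated:
    order_id: str
    amount: float

log = []                           # knowledge only accrues: append-only
log.append(OrderCreated("o-1", 99.0))

# Attempting to change a fact fails -- facts are immutable.
try:
    log[0].amount = 0.0
except FrozenInstanceError:
    pass  # you can disregard or supersede a fact, but never mutate it

# A new fact can invalidate an old one only by arriving later in the log.
log.append(OrderCreated("o-1", 89.0))  # e.g. a correcting fact
```

Note that "deleting" a fact in this model means appending a new fact that invalidates it, which mirrors how knowledge only grows.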
So when we brainstorm, the first thing to understand is the intents. To find the intents we should look for things like communication — who's communicating with whom — conversations, session-type interactions, expectations, contracts between the different parties, and transfer of control. These are the usual hints for intents. We should also look for facts, and the usual hints for facts are state, history, causality (why something happened in relation to X and Y), notifications, and transfer of state — the contrary, in a sense, of transfer of control. Intents are usually represented as commands, while facts are usually represented as events. That's a good model for knowing what to look for when you think about these things.

But let's try to understand the difference between commands and events a little better. Most people have heard of them, but perhaps not everyone knows they are actually different in semantics and meaning. A command is sort of the object form of a method call or an action request. It's a verb, and it should be phrased in the imperative — common names are things like "create order" or "ship product": the intent to do something, wanting someone to do something for me. When a command is accepted by some component or service, it causes some sort of reaction. Commands are all about intent: the intent of a command is to cause side effects, to ask someone else to do something. That reaction, that side effect, usually emits something represented as an event, because an event represents that something has already happened — the fact, so to speak, that the reaction happened and completed. Events are usually phrased in the past tense: "order created" (the order has been created), compared to "create order" (the verb trying to oblige someone to do something for me); or "product shipped," and so on. So they have very different semantic meanings, and very different implications for how you design the system.

Let's dig a little deeper. Commands are really all about intent, while events are totally intent-less: they just represent the fact that something has already happened. Commands are directed — I want to tell someone specifically to do something; it can be one recipient or many, but they are specific in their direction — while events are fully anonymous. Commands have a single addressable destination: I'm sending it to a specific point, asking a specific service to do something. Events just happen, for anyone to observe who might be interested — that can be zero subscribers, or a hundred, or a thousand, each with different intents for subscribing. Commands model personal communication, while events model broadcast — Speakers' Corner, like me standing up here talking to you: I'm just broadcasting events, and you can choose to accept them as facts or ignore them (hopefully the former). Commands have a distributed focus: they move between contexts, very often across the network. Events have a local focus: services just emit events locally, wherever they happen to be — no addressable destinations or anything like that.
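The command/event distinction might be sketched like this (all type and field names here are my own illustrative assumptions): the command is imperative and carries a destination; the event is past tense and carries none.

```python
from dataclasses import dataclass

# A command: imperative, directed at a specific service, carries intent.
@dataclass(frozen=True)
class CreateOrder:
    destination: str        # a single addressable destination
    product: str

# An event: past tense, an anonymous fact for anyone to observe.
@dataclass(frozen=True)
class OrderCreated:
    order_id: str
    product: str

cmd = CreateOrder(destination="order-service", product="book")
# The service that accepts the command reacts, and the completed
# side effect is published as an event -- no destination needed.
evt = OrderCreated(order_id="o-42", product=cmd.product)
```

The asymmetry is the point: the command must know who it is for, while the event has no idea (and no interest in) who is listening.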
If you actually want to transfer events across the network, you have to wrap them in messages and send them over, so they can effectively "happen over there" as well. Are you following me? Commands are really all about command and control: I want to oblige someone to do something for me — "you'd better do as I say" — while events are really all about autonomy. That's one of the keys to why I think events can help us build truly available systems: they break free of the coupling of commands. I'm not saying commands are bad — they have their place — but I personally favor starting by thinking in events, designing the system using events as much as possible, because it gives me a greater path towards autonomy and loose coupling.

So we should let the events define our bounded contexts: which events come in and which events go out. We do that by working hard at finding the right protocols for these boundaries — that's extremely important — because if we do this right, events can actually help us invert the control and put the service itself in charge. I can choose which events I accept; no one can oblige me to do anything. It gives us an inversion of control that can be really helpful when designing these types of systems.

So what are the characteristics of an event-driven service? First, it can receive and react to events — or commands, if you choose. Essentially it reacts to facts. After the reaction, having done something, it can create a new fact representing what happened: it performs some operation and, as a result, creates a new fact, which it can then choose to publish to the world in an asynchronous fashion. As I said, this inverts the control and minimizes coupling, which is really helpful.

There's a lot of talk about immutable events here, and some people might think: what about mutable state? I'm used to thinking in mutable state, in terms of more imperative code. There might be a lot of functional programmers here as well, but most people probably come from an imperative background. I have to say: mutable state is still fine. But the key thing is that it really needs to be contained — not observable to the rest of the world. As long as you contain mutable state, confine it to the service, it's fine. You can use it; it's more intuitive for a lot of people, and you can do performance optimizations in your processing. But it needs to be contained in a safe haven, not observable to the rest of the world. When you have accepted a fact, done your processing, and ended up with a result, you can then choose to create a fact out of that — the fact that something has already happened — and publish it to the outside world. This means that other components can rely on stable values; they can base their reasoning on things that won't change. If you publish mutable things, they can change right under the noses of the components using them, and that makes for very, very brittle protocols. By communicating only through facts, you get really reliable, stable communication protocols.

How do we model this in DDD? In DDD people talk about services, bounded contexts, and aggregates. The aggregate is where you normally put your state, usually persisted down to disk.
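The contained-mutable-state idea above could be sketched like this (a toy service of my own invention, not from the talk): the service mutates freely in private, but the rest of the world only ever sees immutable facts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CounterIncremented:      # an immutable fact, published to the world
    new_value: int

class CounterService:
    """Mutable state is confined to the service -- never observable."""
    def __init__(self):
        self._count = 0        # private, mutable, contained

    def increment(self):       # accept an intent, mutate locally,
        self._count += 1       # then publish the result as a fact
        return CounterIncremented(new_value=self._count)

svc = CounterService()
fact = svc.increment()
# Other components rely on the stable, immutable value in the event,
# never on the service's internal mutable state.
```

Because the published value is frozen, downstream components can reason about it without worrying that it changes under their feet.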
I think the aggregate is the perfect keeper of your events, so to speak. It maintains the integrity and the consistency of your service. It is our unit of consistency, and our unit of failure as well. What I mean by that is that if an aggregate fails, it needs to fail as a whole — it can't fail partially, because then you might end up in a fully inconsistent state internally, which is a really bad thing. It's an either/or thing. If we adhere to that, and make sure that when an aggregate fails it fails as a whole, we can actually ensure strong consistency within the aggregate, which is essential in order to be able to reason sanely across services in a distributed system. It then becomes our unit of determinism: we can ensure that within our service we have full determinism — I'll get back to why that's important later. It also gives us full autonomy, which means entities can come and go, crash or fail, and bounce back, without affecting any other services in the system. That's extremely important.

That was a lot of theory, so let's look at what it can look like in practice when event-driven services communicate. Say we have some sort of user here — the user might be another service, another system, or an actual physical human sending a request. It sends a command: it wants something done. That command ends up in the mailbox of the service, and some action is triggered. Out of that action we create an event, and we publish it to our event stream — event bus, data fabric, call it whatever you like; I call it an event stream. The event stream then relays that event to anyone that might be interested.
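This command-to-event flow might be sketched as a toy in-memory event stream (all names here are assumptions for illustration; a real event stream would of course be a distributed, asynchronous piece of infrastructure):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class CreateOrder:             # the command: intent, directed
    product: str

@dataclass(frozen=True)
class OrderCreated:            # the event: a fact, anonymous
    product: str

class EventStream:
    """Relays each published event to every subscriber -- zero or many."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event):
        for handler in self._subscribers[type(event)]:
            handler(event)

stream = EventStream()
received = []
# Anyone interested subscribes: another service, a database, HDFS...
stream.subscribe(OrderCreated, received.append)

def order_service(command):
    # React to the command, then publish the resulting fact.
    stream.publish(OrderCreated(product=command.product))

order_service(CreateOrder(product="book"))
```

The service emitting `OrderCreated` has no idea who, if anyone, is listening — that anonymity is what keeps the coupling loose.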
Everyone that has subscribed to the event gets it: it's relayed to the other services' mailboxes, triggering actions of some sort that might in turn trigger events, and so on. There might be an event sent back to the user, completing the task, or sent on to some other service to continue performing some function. It's also important to note that it doesn't necessarily need to be services subscribing to these events. It might be some sort of external system — in this case a database, for example, or HDFS, because we have data processing services that run nightly, say, doing data mining on whatever events happened in the system — or other types of external services.

The really interesting thing here is that this model requires a mindset that is at peace with eventual consistency, because all these events and commands flow in a fully asynchronous fashion, which might take some time to wrap your head around if you're used to monolithic systems. But the event stream used in this fashion is great for a lot of things. It can be used as a communication fabric. It can be used as an integration fabric, integrating with other systems. It can be used as a replication fabric, replicating data for availability. It can be used as a consensus fabric, making sense of state across a distributed system. It can also be used as the persistence fabric, actually giving you really available and reliable data persistence, being the actual source of truth — I'll dive into this more later.

As a side note on persistence, some people might say: OK, but what's wrong with CRUD then? I'm actually used to using CRUD. I'd say that CRUD is totally fine for totally isolated data.
However, most services don't have totally isolated data. By totally isolated data I mean data that no other service might ever be interested in. As soon as you don't have that — as soon as you need some sort of cross-service consistency — I think CRUD completely breaks down. Why? Because it's really, really hard to do joins. We can't use any of our old tricks like normalization and joins, all the great things we have in SQL that we just take for granted when we have a single system image. All of that breaks down when you have split your data up across multiple services with a network in between. It gives us ad hoc and very weak consistency guarantees — often so weak they are unusable. You often hear customers complain about this: the reality continues to ruin my life. The problem is that, as Pat Helland said, two-phase commit is the anti-availability protocol. It's extremely hard to build available systems using distributed transactions, two-phase commit, and so on. Maintaining — or really, trying to maintain — the illusion that the world is strongly consistent across the network is an illusion we've been living with for too long. I believe it's time we end that illusion and accept the fact that the world is eventually consistent. There's really no such thing as strong consistency in the real world; it's just something we as developers try to shoehorn into our very limited model of the world, and then we're surprised when it falls apart. We shouldn't be — it doesn't map to how reality actually works.

Strong consistency is really the wrong default. As I said before, there's nothing wrong with CRUD per se, and nothing wrong per se with strong consistency. It can really help us think about hard concepts in computer science; strong consistency is a great thing. But I believe it's the wrong thing to start with when you design your system — it will lead you down the wrong path. It just makes things worse, so to speak. So what can we do? We really have to rely on eventual consistency from the start, and then opt in to strong consistency where we get the chance — instead of the other way around, starting with strong consistency and trying to loosen it up here and there for availability and scale. Because I really think that's how the world works. We need to be better at embracing reality in our design sessions: look at how the world actually works, not the way we want it to work in order to fit the way we've learned to design systems over the last ten or twenty years. We shouldn't fight reality; we should embrace it. We'll actually be better off in this new world of cloud computing and multi-core architectures, where we really do have a distributed system right inside the box, even though we don't usually think about it like that.

It's a fact of life that information travels at the speed of light, and that puts a limit on the speed of information in our systems. This means that information has latency. It's actually contrary to what Newton thought; Einstein later proved that it's all in the eye of the beholder, so to speak. And this is a reality in systems today — it's not just an esoteric thing; this latency really affects systems, and it's something we need to think about. And when you think about it, information is always from the past. That's true for everything we observe in the real world: when we've seen something, or heard something,
or experienced something, it very often happened quite some time ago. There's always this delay. We're always looking into the past; it's always in the eye of the beholder. This means that the present — "now" — is actually relative: we're all experiencing different presents. Sorry for going all philosophical on you here, but I really think we need to fully embrace this view of the world: there is no "now." Everything is relative. This means that when we design a microservices system, we need to think about each service having its own "now," and accept that it's OK not to be fully consistent across services. Because as soon as we exit the boundary of the service, we enter a wild ocean of non-determinism — the world of distributed systems. It's a quite scary world, where systems fail in the most spectacular and intricate ways, where messages get lost never to be found again, where messages get garbled or reordered, and where failure detection is really nothing but a guessing game. You have really no clue for sure whether the service you're talking to is down, or just doing GC, or out for lunch, or whatever. This might sound terrifying, but it's also this world — the space in between the services, the non-determinism between the services — that gives us the tools for resilience, elasticity, and isolation. Without it there's really no isolation at all.

So what I think we need is better tools for modeling the uncertainty between the services. A lot of people say microservices aren't that hard: just generate a microservice — gRPC or whatever — and I'm done. No. The services themselves are the easy part. The hard part is the thing in between the microservices. And in the end, we've learned the hard way not to hide complexity — not to hide the network, for example. We tried that with RPC; it failed. We tried it with application servers. The list is long — the graveyard of distributed systems, things that didn't work. It's always better to embrace the constraints — the constraints of the network, the constraints of reality — and work with them, instead of pretending they're not there. As Pat Helland once said, in a system which cannot count on distributed transactions, the management of uncertainty must be implemented in the business logic — in how you design the protocols, how events flow between the systems. And I really believe that events can lead to greater certainty in the system. Mark Burgess wrote this great book, In Search of Certainty — I really recommend you read it. It talks a lot about autonomy, and it's very applicable to this new world. He says that an autonomous component can only promise its own behavior, and that autonomy makes information local, which leads to greater certainty and stability. Essentially, if you rely only on local information, not coupled to anything else in the world, then you're fully autonomous: you're in charge of your own decisions. And that's a really, really good thing. So events can really help us craft these autonomous islands of determinism that
shield us from that craziness, that crazy ocean of non-determinism — islands where we are safe, where we can use mutable state, where things are strongly consistent and fully deterministic, where we can live happily under the illusion that time is absolute and that there is a single "now," so to speak. In order to do that, we need to craft well-defined protocols: which events we accept and which events we emit, which facts we accept and which facts we emit.

The question then is: how can we work with data across isolated services? How can we ensure consistency and consensus? Pat Helland, again, has a really nice framework for how to think about consistency in this new world. He talks about inside data as "our current present" — the state inside your service. He talks about outside data as "a blast from the past" — the events arriving at you, things that have already happened, which you can choose to accept or ignore. And between services he talks about "hope for the future," which is almost poetic — I love that. Those are the commands: someone sends a command hoping that someone else will care and do something about it. This is a really good mental model for thinking about these types of systems. It's also important to understand that microservices is really a never-ending stream towards convergence. You will never really reach full convergence; you're always trying to catch up. You might actually reach it for a millisecond, but then you're behind again, because the system is in constant motion — data constantly arrives, at faster and faster speeds. There really is no globally consistent "now." This, too, is a nice mental model for understanding systems in this new world.

Another fundamental lesson I've learned the hard way is that resilience is by design. There's a photo of a home in Gilchrist, Texas: it was designed to survive flood waters, and when Hurricane Ike came in 2008 — if you remember that — it was one of the few houses that stood strong. Why? Because it was designed for resilience from the start. And I really think that events can help us manage failure instead of trying to avoid it. I see it way too many times: people building systems with try-catch statements literally everywhere, trying to prevent failure, really scared of failures. We call them exceptions even though they're really not exceptional at all — they're normal, guaranteed to happen. And still we scatter try-catches across our code base everywhere. I really think that's fundamentally the wrong way to think about failure. Failure is nothing to be scared of; it's inevitable. It will happen whether you like it or not, so it's better to think about how to manage failure than to work hard trying to prevent it all the time. Failure is actually a natural state in the application life cycle — in the service's life cycle. You have started, stopped, resumed, failed; if you draw it up as a state machine and you end up in the failed state, that was expected all along, and you know exactly how to get out of there, instead of being scared that everything will blow up. I really believe that events, and the isolation that microservices give us, can really help with building these types of reliable systems. And the requirements I have for this failure model are quite different from what we get with the strong coupling we often have in monoliths.
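The lifecycle state machine just mentioned might be sketched like this (the exact states and the allowed transitions here are my own illustrative choices): failure is simply another expected state, with known ways out.

```python
from enum import Enum

class State(Enum):
    STARTED = "started"
    STOPPED = "stopped"
    FAILED = "failed"
    RESUMED = "resumed"

# Failure is just another expected state with known exits --
# nothing exceptional about it (transitions are illustrative).
TRANSITIONS = {
    State.STARTED: {State.STOPPED, State.FAILED},
    State.FAILED:  {State.RESUMED, State.STOPPED},  # a way out of failure
    State.RESUMED: {State.STOPPED, State.FAILED},
    State.STOPPED: set(),
}

def step(current, nxt):
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {nxt}")
    return nxt

s = step(State.STARTED, State.FAILED)   # failing is expected...
s = step(s, State.RESUMED)              # ...and we know how to get out
```

Drawing the diagram up front is the point: when the failed state is reachable by design, ending up in it is no longer a surprise.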
with one failure blowing the whole call stack all the way up into the user's face, are these: failures need to be contained, fully isolated, to avoid cascading failures. They need to be reified as events. They need to be signaled asynchronously to whoever might be interested in learning that I just died. They need to be observed, by at least one party, but why not by many? And they need to be managed outside of the failed context; here the network is a really good boundary between the failed component, or the failed service, and ourselves.

It's no surprise that this model fits very well with the event-driven services I've been talking about. If you see failure like this, as just being an event, it fits right into the rest of the workflow. It's just event flow; it's just business logic. Instead of emitting an event with the result you just computed to whoever is interested, you emit an event saying "I couldn't complete; I just died, sorry" to whoever might be interested. There's nothing exceptional about the way information flows when it comes to failures either: it fits right in, and you get a really good way of managing failure.

Finally, I want to talk a little bit about event-based persistence, because it's something people talk a lot about, but perhaps not everyone knows exactly what it means. One question I get a lot is: how can we transition from a CRUD-based system to a more event-driven system? As I said, I think it's fine to use CRUD, but if you want to work towards a more event-driven system, a really nice pattern is to combine CRUD with the event stream that I showed you in the illustrations some time ago. And you can use
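The "failure is just another event" idea can be sketched very compactly. This is a minimal illustration under my own naming (StepCompleted/StepFailed are not terms from the talk): success and failure flow through exactly the same channel to the same subscribers.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Event:
    name: str
    payload: dict

def run_step(step: Callable[[], Any], subscribers: list) -> Event:
    """Run a unit of work; reify success *and* failure as events on the same flow."""
    try:
        result = step()
        event = Event("StepCompleted", {"result": result})
    except Exception as exc:
        # Failure is contained here and reified as an ordinary event,
        # signaled to whoever subscribed, instead of blowing up the stack.
        event = Event("StepFailed", {"error": repr(exc)})
    for notify in subscribers:   # observed by at least one, why not many
        notify(event)
    return event

seen = []
run_step(lambda: 21 * 2, subscribers=[seen.append])
run_step(lambda: 1 / 0, subscribers=[seen.append])
print([e.name for e in seen])  # ['StepCompleted', 'StepFailed']
```

In a real system the subscribers would be remote and the signal asynchronous; the in-process list is only standing in for the event bus.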
that to get a sort of internal, eventually consistent materialized view of the world. Let me try to illustrate what I mean. Let's say we have two services, service A and service B, both using CRUD, storing their data in table A and table B, and both hooked up to an event bus. Now, when service A stores a record in table A, it also, in an atomic fashion (that's important), pushes that change out to the event bus: either both of these operations succeed or neither of them does. If we have that in place, we are on a good path. Let's say service B does the same thing, and that a third service, service C, is interested in updates from both service A and service B. It simply subscribes to the events from both of these event streams, and it can then internally join table A and table B; all the records that have been stored there eventually (because this is also eventually consistent) end up in service C's own internal table as a join, and it can then serve the user from this materialized view.

It's very important to note that this is still eventually consistent: we don't solve the problem of having strong consistency across the network; we just loosen the guarantees a little. But we are on a path towards marrying CRUD with a more event-based way of thinking.

I think we can do better, though. Jim Gray observed that update-in-place strikes many system designers as a cardinal sin: it violates traditional accounting practices that have been used for hundreds of years. The main problem with CRUD is update-in-place, the fact that we have destructive updates. Yet again, I think we
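Here is a minimal in-process sketch of that topology, with names of my own choosing. The write-plus-publish step is trivially atomic here only because everything runs in one process; in a real system you would need something like an outbox table written in the same database transaction.

```python
from collections import defaultdict

class EventBus:
    def __init__(self):
        self.subscribers = []
    def publish(self, event):
        for handler in self.subscribers:
            handler(event)

class CrudService:
    """Service A / B: a CRUD table plus a change event for every write.
    The table write and the publish must be atomic in a real system
    (e.g. via a transactional outbox); here they trivially are."""
    def __init__(self, name, bus):
        self.name, self.table, self.bus = name, {}, bus
    def upsert(self, key, row):
        self.table[key] = row                       # 1. CRUD write
        self.bus.publish((self.name, key, row))     # 2. change event

class JoiningService:
    """Service C: subscribes to both streams and maintains an
    eventually consistent, internally joined materialized view."""
    def __init__(self, bus):
        self.view = defaultdict(dict)
        bus.subscribers.append(self.on_event)
    def on_event(self, event):
        source, key, row = event
        self.view[key][source] = row                # join on key

bus = EventBus()
a, b = CrudService("A", bus), CrudService("B", bus)
c = JoiningService(bus)
a.upsert("user-1", {"email": "x@example.com"})
b.upsert("user-1", {"plan": "pro"})
print(c.view["user-1"])  # {'A': {'email': 'x@example.com'}, 'B': {'plan': 'pro'}}
```

Service C's view lags the source tables by whatever the bus delay is, which is exactly the eventual consistency being described.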
need to relearn basic accounting principles from the real world: how these things have been done on pen and paper for hundreds of years. Pat Helland also said that the truth is the log; the database is a cache of a subset of the log. And the question is: why work with the cache of a subset when you could work with the real thing? There is a reason we use update-in-place: disk space used to be extremely expensive, so you actually had to conserve it. But today disk space is incredibly cheap, and there's no reason to throw away historic data. Here the event log can really shine; the log can be the bedrock on which we build a lot of the hard things: consistency, availability, scalability, traceability, and so on.

An event log is all about storing each event, in order as it arrives, on disk, in a durable ledger, just like the transaction log used in Oracle or any SQL database; the difference is that we expose it to ourselves as developers.

A great pattern on top of event logging is something called event sourcing. That's the thing that can cure us of this cardinal sin of destructive updates. The way it works is that you log all state-changing events in the event log, everything that updates the internal state of your component, and we use that log to back the aggregate. This gives us strong consistency within each service, backed by a durable event log, while we can have eventual consistency between the aggregates. I think it really gives us the best of both worlds.

The way it works is this. Let's take the happy path first. We first receive and verify the command; let's say it is "approve payment." We create an event
representing the action we're about to take; let's call it PaymentApproved. We append it to the event log, then we update our internal component state, and finally we run whatever other side effects there are: fire the nukes, or whatever it might be.

On the sad path, when we want to recover from failure, or when we want to replay the log, for example to replicate it for redundancy, or to replay it for debuggability, the only thing we do is rerun stage three: we rehydrate the events from the event log and update the internal component state. We do not run the side effects again; we just bring the component back to where it was when it failed.

This is something that Martin Fowler calls the Memory Image pattern: your service can keep its data just in memory, while it is still durably persisted on disk for full availability. The nice thing here is that we have one single source of truth for all our history, everything that's ever happened in the service, and in all services. It allows for durable in-memory state that can be extremely fast and extremely easy to work with, because it avoids the infamous object-relational impedance mismatch: there's no reason to map your objects, your runtime representation, down to tables anymore, which saves a lot of time and a lot of hassle. It also allows others to subscribe to these state changes; there might be other kinds of systems interested in them, for example Hadoop jobs running later on the full data set, the join of all your services, and so on.

The event log in general also has very good mechanical sympathy, a term coined by Martin Thompson. What it means is that the
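The happy path and the replay path can be sketched together. This is an illustrative toy, not the talk's implementation: the log is an in-memory list standing in for durable storage, and the command and event names are mine. The key property is that replay rehydrates state from the log without re-running side effects.

```python
class PaymentAggregate:
    """Event-sourced component: state is derived from an append-only log."""
    def __init__(self, log=None, side_effect=print):
        self.log = log if log is not None else []
        self.approved = set()
        self.side_effect = side_effect
        for event in self.log:            # sad path / recovery: rehydrate only;
            self.apply(event)             # side effects are NOT re-run

    def approve_payment(self, payment_id):
        # Happy path: 1. verify command, 2. create event, 3. append to log,
        # 4. update in-memory state, 5. run side effects.
        if payment_id in self.approved:
            return                        # reject duplicate command
        event = ("PaymentApproved", payment_id)
        self.log.append(event)            # durable in a real system
        self.apply(event)
        self.side_effect(event)           # e.g. notify a downstream service

    def apply(self, event):
        kind, payment_id = event
        if kind == "PaymentApproved":
            self.approved.add(payment_id)

effects = []
a = PaymentAggregate(side_effect=effects.append)
a.approve_payment("p-1")
a.approve_payment("p-2")

# Replay the same log into a fresh instance: same state, no new side effects.
replayed = PaymentAggregate(log=list(a.log), side_effect=effects.append)
print(replayed.approved == a.approved, len(effects))  # True 2
```

The replayed instance is exactly Fowler's memory image: fast in-memory state that can always be rebuilt from the durable log.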
pattern of the software you design matches very well with how modern hardware works. The event log gives us a single writer, writing to disk in order, fully uncontended. And if you've been around the block building concurrent or distributed systems for a while, you know that contention is the biggest scalability killer, and that contention can also really bring down availability.

Another great pattern, on top of, or rather in tandem with, event sourcing, is CQRS. CQRS gives us a great tool for separating reads and writes. The reads in your system might have very different characteristics from the writes, in terms of consistency, performance, availability, and so on. CQRS allows you to decouple these worlds and store both the read side and the write side in the optimal format for each purpose. For example, if you have a read-heavy system, you can scale out the read side independently of the write side, instead of having it all lumped together, which makes the system really hard to tune because you are forced to scale both at the same time.

And if you combine this with event sourcing, you can use the event log as the write side. We've talked a lot about how it's a really good fit for storing events in an event-driven system. However, an event log makes it really hard to do queries, to do joins, and so on, and this is where CQRS comes in very handy: it allows you to store the read model in an optimal format for reading, and that's often an RDBMS, a relational
database, or a NoSQL database, or a graph database, or something like that. So let's say we have a user here who sends commands to the write-side model. The write-side instance just writes them down to the event log, in order: very fast, very convenient. Once it has written an event down, it also emits it, saying "I just stored this in my event log; who's interested?" Then there might be another subsystem, the read-side model, that is interested in this update. It can subscribe to that, choose to store it in the optimal format for the read side, often a SQL database, or Cassandra, or something like that, and then serve that model to the user. So you can do queries using traditional SQL, for example, while still reaping all the benefits of the event-driven design for the durable data storage.

Once again, it's important to understand that this is eventually consistent between the two subsystems. There might be a delay for queries; you might not even be able to read your own writes. There are of course ways to add a layer of reliability to that, using things like Raft, but the bare-bones view of this is that it's eventually consistent between the sides.

Events can also really help us manage time, which is quite interesting. Greg Young once said that modeling events forces you to have a temporal focus on what's going on in the system, where time becomes a crucial factor. What I mean by that is that working with events really allows us to model time nicely: an event is a snapshot in time (the latest event is where we are right now), an event ID can be an index of time, a cursor for how time flows through the system, and the event log is our full history, with everything that's ever happened.
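The CQRS flow just described can be sketched like this. Again an in-process toy with my own names: the write side appends commands to the log as events and announces them; the read side is a projection in a query-friendly shape, which in a real system would live in SQL or Cassandra and lag slightly behind the write side.

```python
class WriteSide:
    """Command side: validates commands and appends events to the log."""
    def __init__(self):
        self.log = []          # the event log IS the write-side store
        self.listeners = []
    def handle(self, command):
        event = {"seq": len(self.log), **command}   # seq acts as a cursor of time
        self.log.append(event)                      # 1. write down, in order
        for listener in self.listeners:             # 2. "who's interested?"
            listener(event)

class ReadSide:
    """Query side: a projection kept in the optimal format for reading."""
    def __init__(self, write_side):
        self.by_customer = {}
        write_side.listeners.append(self.project)
    def project(self, event):
        self.by_customer.setdefault(event["customer"], []).append(event["amount"])
    def total_for(self, customer):
        return sum(self.by_customer.get(customer, []))

writes = WriteSide()
reads = ReadSide(writes)
writes.handle({"customer": "alice", "amount": 30})
writes.handle({"customer": "alice", "amount": 12})
print(reads.total_for("alice"))  # 42
```

Because the projection is fed by the event stream, it can be dropped and rebuilt from the log at any time, and several differently shaped read models can hang off the same write side.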
It is the database of the past as well as the database of the present, while regular SQL databases are just the database of the present, discarding the past.

Finally, one interesting thing is that this really allows for time travel. You can replay the log for historic debugging, for auditing, for full traceability, to understand what went wrong and why. You can replay the event log on failure to bring a component back up to the state it was in; you can replay it slowly; you can replay it for replication, so you can have all kinds of replicas. All of that is essentially free, because all you need is subscribers to the event log that choose to replay it at any point.

So I just want to give you the key takeaways. Events-first design helps you reduce risk when modernizing applications; I really believe you will be able to move faster into this new world, towards a resilient and scalable architecture by design. Autonomous services usually invert the control flow compared to regular systems. This approach lets you balance certainty and uncertainty, avoid things like lock-in, ORMs, and the impedance mismatch, take control of your system's history, and balance strong and eventual consistency, which is really the hardest thing to do in a distributed system in general.

I personally come from the experience of learning this the hard way using Akka. Akka is a great toolkit if you want to build autonomous, resilient services based on the actor model; you can go and check it out at akka.io. I won't give you any more plugs than that. If you want to learn more about this, I wrote a mini-book that is freely available from O'Reilly and goes into a little more detail, so
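Replay-based time travel boils down to a fold over the log, with the event index acting as the cursor of time. A minimal sketch, with function names of my own choosing:

```python
def replay(log, apply_event, up_to=None):
    """Time travel: fold the event log into state, optionally stopping
    at a given event index (the 'cursor of time')."""
    state = {}
    for index, event in enumerate(log):
        if up_to is not None and index >= up_to:
            break                  # stop the clock at this point in history
        apply_event(state, event)
    return state

def apply_event(state, event):
    key, delta = event
    state[key] = state.get(key, 0) + delta

log = [("balance", 100), ("balance", -30), ("balance", 5)]
print(replay(log, apply_event))             # {'balance': 75}  the present
print(replay(log, apply_event, up_to=1))    # {'balance': 100} state after event 0
```

Historic debugging, replication, and audit all reduce to calling this same fold with different `up_to` cursors or different `apply_event` functions, which is why they come essentially for free.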
you can download it via the bit.ly "reactive microsystems" link. That's all I had. Sorry for running a minute over time, and thanks for your attention. [Applause]


13 thoughts on “Designing Events-First Microservices”

  • הוنᄀバコᄂᄃɸฅจـهـ母表จาøนحöܠʢܢחق한국어י says:

    How is the event snapshot implemented? In a database or log file? Do all microservices see it? How is it all wired up?

  • Anthony Mastrean says:

    This is yet another example of "draw the rest of the fucking owl" software architecture. There's three minutes left in the presentation. You finally show some kind of idealized diagram and don't explain a lick of how to do it (and, believe me, implementation is the gnarly part here).

  • There has to be an exception made for deletions: holding on to a request to delete a fact, and then performing the delete once we find the fact to be deleted, knowing the fact can't be recreated (because of unique IDs). GDPR is a great example.

  • Marcus Nielsen says:

    Jonas says that the DB table and the event bus both succeed or fail atomically.
    Couldn't that be "eventually succeeds or fails" instead of atomically? If I use Google Pub/Sub and PostgreSQL, can't I log that I'm inserting the event into the DB, and then, when I'm done sending it over Pub/Sub, log that I'm done with the event?

  • Generally, Jonas is a good speaker, but for this one he didn't seem to be well prepared…
    I believe he knows all of this stuff, because I have seen him talk about these things before, but did he swap in a new set of slides the night before this talk?
