I know how to model a state machine with a flow chart, it make sense.

But I’m designing an API to control websockets and manage a web websocket service via HTTP similar to pushpin. So I have at least 4 state machines. But then there’s the events and commands that get sent around that change state and I’m at a complete loss of how to model that.

I can muddle through in my head but, I want to prove it out somehow. Anyone have any experience modeling state in a distributed system?

I have at least 4 types of actors.
- API Gateway and it's connection state
- The lambda’s it executes when connections have activity (open, message, disconnect)
- Your application getting a http request to notify you of events and get commands
- and the commands you send to the control api anytime

The last three could have any number of them running at a time

And then I got quirks of api gateway which suck for application developers (one reason I'm not using it directly and building this service)

You have an open callback, and you can reject the connection or accept it, by the return of the callback. So you can't actually do any work (eg, sending messages) in this callback and you have to do the work after it returns. But how do you know if it's returned?

So you have to queue the work and poll to see if the connection exists and then do the work. Its possible the connection has been closed too, so you need to poll for a timeout before giving up and throwing away the work.

So the open callback needs an application response to know if it should accept the connection and now I'm in this weird state where without a queue I can't do reasonable things like on the application side;

```
accecpt()
subscribeToChannel(foo)
sendMessageToChannel(foo, 'new connection!')
```
And have the new connection actually get the message

I can do silly things like always accept the connection and queue the callback to the application, that isn't horrible tbh but breaks access control and now needs locking or queueing if any messages come in before the the now non-blocking open event happens.

I don't like it.

Or I can force the API to not do any operations but accept the connection on open events (maybe set metadata), but in this distributed system world, I can't actually guarantee that a call to the control api would happen after the connection would happen after the connection was accepted.

```
accept() // return back to the open callback
subscribeToChannel() // new event to the control api
sendmessageToChannel() // I hope api gateway knows I accepted it
```

Latencies inside and outside of datacenters would mean it should probably work. Maybe I can paper over it with retries? or a polling job? πŸ€”
I think marking a connection in my db as "recently opened" and giving it a few seconds grace to stop 404ing in api gateway might be enough here. Will report back.
And of course I now have TLAᐩ open in a tab http://lamport.azurewebsites.net/tla/tla.html
The TLA+ Home Page

I convinced myself I knew how to fix the immediate problem while ignoring my initial query which was to better understand the entire problem. So back to TLA+

This is so retro but also somehow on the cutting edge. I'm reading help docs in a framed layout with linked mp4s with pdf transcripts and zip files! It's .. helpful!?

Lol Professor Lamport just scolded me for not following his instruction to hit "how to play" in the web player, I notice that's not in the transcript πŸ˜‚

Link here if you're interested https://lamport.azurewebsites.net/video/intro.html

This is kind of refreshing from marketing landing pages. I guess this is comp sci academia at it's finest?

It's the "or some other method" that gets me here though. I also loled at "I don’t know exactly what else TLA+ is good for."

Besides #tlaplus it seems Lamport invented Paxos!? I feel like an idiot not realizing who he was sooner. https://en.wikipedia.org/wiki/Paxos_(computer_science)
Paxos (computer science) - Wikipedia

And I'm being told I should stop the video and read his Time Clock Systems paper http://lamport.azurewebsites.net/pubs/time-clocks.pdf

I think this video is a masterclass in a good straightforward video that makes clear points and communicates value. I'm enjoying the hell out of this class. (It's a class, no other way to do describe it)

I might as well drop this video here, but you should go to the lectures page for full context https://lamport.azurewebsites.net/video/videos.html

TLA+ Video Course

He wears a lot of hats

So I solved lecture 4 which is a program to solve the Die Hard 3 Jug riddle in TLA+

I stuck big != 4 (which is written "big # 4" or "big /= 4" because math) as an invariant and it failed on a path that got us to 4 gallons!

This is really neat!

and look a pretty pdf!

I did a lesson this morning before work. This escalated quickly. The amount of math concepts I never learned is astounding.

Programming looks like math and uses a lot of the same words but they mean all sorts of different things.

I knew "discrete math" had a lot more too it when I took it in school and oh boy it does. I think what I'm learning falls under that.

I don't know actual notation so bear with me.

In programming you have a `Map<key,value>` (Lamport kept calling it a set or array but my model of that doesn't fit) and in math you have a `function(key)=value` which sort of makes sense, but you treat the function like an array, and the key space is called a domain, and the domain doesn't have to be materialized, like the key could be "0 - infinity" or any range you can conceptualize.

Doing math on infinite Functions (did I use that right?) is commonplace, since it's not programming you don't have to compute forever, you're working with definitions not values.

When you do a model check (run every valid combination of state changes) on your spec (the mathematical definition of states in your program) you do need to define possible values of your sets.. so far anyway I'm not sure if that's always true.

I defined a set of 3 resource managers doing a commit transaction that requires knowing the state of every other resource manager. It introduced checking or changing the value of every other item in a set, or a single item.

I keep wondering if a simplified version of the TLA+ language would make it more accessible and I keep learning about the power of the math and maybe I should just accept math is a different language.

Still though tools to generate specs for programmers might go far, I'm sure there's prior art in this arena. I'll hold off on looking for it until I get really stuck or finish the lessons and do a few specs and model them.

@reconbot whatcha working on?
@sixwing I was making a prototype api for a websocket service using api gateway and couldn't model an interaction in my head or with truth tables or flow charts, so I knew there was more in this area of modeling and thus tla+ and frankly it as a tool and a way to change my thinking is much more interesting than my prototype.