Architecting for user (and engineer) happiness during Stadia Controller setup

In this new series of blog posts, we'll be detailing some of the more technically intricate aspects of working on a platform like Stadia and how we've worked to achieve simplicity in a world of complexities for our end users. In this first set of posts we're tackling some challenges specific to Android and iOS, with more to come in the future.

And now, here’s part two in our series on building Stadia Controller setup in our mobile app with Flutter.

In the first post of our series, we discussed why hardware setup is an engineering and UX challenge, why maximizing user success becomes increasingly difficult, and how making smart tradeoffs with the available time and manpower is necessary to reach a high level of success. In this post, we’ll describe how early architecture choices reduced the setup flow’s implementation complexity, and allowed the team to execute in a more efficient way.

We made architecture decisions that paid off

The Stadia app uses an MVVM-inspired, reactive architecture that’s an evolution of the BLoC pattern. With few exceptions, data in the app flows in a single direction, towards the UI layer. When a user taps on a button, for instance, the code that handles the event has been provided in advance to the UI by a piece of business logic. In most of the app, user interactions (or events) like button taps, swipes and the like are the most likely causes for the app state to change, or for the UI to update in some way.

Setting up the Stadia Controller has all of the same considerations, plus several more. In addition to events that can come from the UI, controller setup has events such as:

  • Messages or status updates received from the controller
  • Bluetooth events, like the controller connecting or disconnecting
  • OS-level events, like the Bluetooth antenna turning off unexpectedly or the user losing connectivity
  • Asynchronous network calls, like checking if security requirements are satisfied, or if software updates are available for the controller

We need code to handle each of these event categories, but without a structured way of determining what, how and when events should be handled, the flow would quickly become a tangle of confusing, interwoven logic.
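To make the idea of "one structured way to handle many event sources" concrete, here is a minimal Python sketch. It is not the Stadia app's code (which is written with Flutter); the Source categories mirror the list above, while the names Event, EventQueue, and the individual event names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto


class Source(Enum):
    """Where an event came from (categories taken from the list above)."""
    UI = auto()
    CONTROLLER = auto()
    BLUETOOTH = auto()
    OS = auto()
    NETWORK = auto()


@dataclass(frozen=True)
class Event:
    source: Source
    name: str  # hypothetical event names, for illustration only


class EventQueue:
    """Single entry point: every source posts here, in arrival order."""

    def __init__(self):
        self._pending = deque()

    def post(self, event):
        self._pending.append(event)

    def drain(self):
        while self._pending:
            yield self._pending.popleft()


queue = EventQueue()
queue.post(Event(Source.UI, "next_tapped"))
queue.post(Event(Source.BLUETOOTH, "controller_disconnected"))
queue.post(Event(Source.NETWORK, "update_available"))

# One consumer sees all events in a uniform shape, whatever their origin.
handled = [f"{e.source.name}:{e.name}" for e in queue.drain()]
```

Funnelling every source through one queue means the consumer only ever deals with one kind of input, regardless of where the signal originated.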

Workflow and Hierarchical State Machine (HSM)

To manage all these event sources, we extended the Stadia app’s architecture by combining it with a Hierarchical State Machine (HSM). These came together in a single Flutter StatefulWidget that we called a Workflow. This is the backbone of the controller setup flow.

We first broke down the flow into a set of states. Each state represents a position in the flow, and it may or may not be associated with UI on screen. Using a HSM affords us the ability to nest states within one another, and we used this to have superstates for large chunks of the flow.

Stadia Controller setup flow superstates

Next, we mapped incoming signals that might cause a state transition to events that could be handled by the HSM. In a state machine, each state has an explicit set of events that it handles, and in a HSM, if a child state doesn’t handle an event, it bubbles up to a parent state. This means that we can use super-states to handle certain events in a similar way across portions of the flow. Sticking to events that the state machine understands helps maintain that unidirectional dataflow model we use consistently throughout the app, and makes our code behave more predictably. The only signals or events that can affect the state machine are those that are explicitly handled by the current state or one of its parents.
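The bubbling behavior can be sketched in a few lines of Python. This is an illustrative toy, not our implementation: the State and Machine classes, and the state and event names, are assumptions made up for this example.

```python
class State:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}  # event name -> name of the target state

    def on(self, event, target):
        self.handlers[event] = target
        return self


class Machine:
    def __init__(self, states, initial):
        self.states = {s.name: s for s in states}
        self.current = self.states[initial]

    def dispatch(self, event):
        """Look for a handler on the active state, then on each ancestor."""
        state = self.current
        while state is not None:
            if event in state.handlers:
                self.current = self.states[state.handlers[event]]
                return True
            state = state.parent  # bubble up to the parent state
        return False  # nobody in the chain handles it: the event is ignored


# Hypothetical slice of a flow: setup > permissions > {wifi, location}
setup = State("setup").on("cancel", "exited")
permissions = State("permissions", parent=setup)
wifi = State("wifi", parent=permissions).on("wifi_connected", "location")
location = State("location", parent=permissions)
exited = State("exited")

machine = Machine([setup, permissions, wifi, location, exited], initial="wifi")

ignored = machine.dispatch("bluetooth_on")  # not handled anywhere in the chain
after_ignored = machine.current.name
machine.dispatch("wifi_connected")          # handled by the active state
after_wifi = machine.current.name
machine.dispatch("cancel")                  # bubbles up to the "setup" superstate
after_cancel = machine.current.name
```

Note that "cancel" is declared once on the superstate and still works from any descendant, while an event nobody declares simply has no effect.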

The Workflow allows us to tie user-visible UI changes to the HSM. It maintains a map between HSM states, navigation routes, and UI screens. During every state transition, the Workflow’s route table is checked. If navigation to a new screen is required, the new screen is rendered with the appropriate animation, such as a forward or backward transition. It’s important to emphasize that a state transition inside the HSM triggers a UI transition, not the other way around.
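As a rough illustration of the route table idea, the Python sketch below reacts to state transitions and decides whether navigation is needed. The Workflow class here is a simplified stand-in, and the state names, screen names, and the push/pop rule for choosing an animation are assumptions for the example.

```python
# Hypothetical route table: states with no UI simply have no entry.
ROUTES = {
    "wifi_main": "ConnectToWifiScreen",
    "location_main": "LocationAccessScreen",
}


class Workflow:
    def __init__(self, routes):
        self.routes = routes
        self.nav_stack = []
        self.animations = []

    def on_transition(self, new_state):
        """Called by the state machine after every transition."""
        screen = self.routes.get(new_state)
        if screen is None or (self.nav_stack and self.nav_stack[-1] == screen):
            return None  # no UI for this state, or the screen is already showing
        if screen in self.nav_stack:
            # Returning to an earlier screen: pop back to it.
            while self.nav_stack[-1] != screen:
                self.nav_stack.pop()
            animation = "backward"
        else:
            self.nav_stack.append(screen)
            animation = "forward"
        self.animations.append(animation)
        return animation


workflow = Workflow(ROUTES)
for state in ["wifi_checking", "wifi_main", "location_checking",
              "location_main", "wifi_main"]:
    workflow.on_transition(state)
```

The direction of control is the point: the Workflow never tells the state machine what to do, it only responds to transitions it is told about.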

A good example of the Workflow, state machine, and multiple event sources all coming together is the section where we prompt the user for system permissions. There are three screens in this section, Wi-Fi, Location Access, and Bluetooth, and corresponding states in the state machine for each. These three states are children of a ‘permissions’ superstate.

When the state machine first enters the superstate, the state machine checks to see whether it has an initial child state. It does, and it’s the Wi-Fi state, which then becomes active.

Inside the Wi-Fi state, a similar check for child states occurs. The Wi-Fi state follows a pattern that the team has found useful, where a state that may or may not need to show a screen to the user has two children: an initial ‘checking’ state and a main state. We adopted this pattern because it separates concerns better and declutters the branching logic in our state machine. The Wi-Fi state contains these checking and main child states, and because the checking state is the initial one, it becomes active. When this state transition happens, the Workflow checks the route table to determine whether anything needs to be added to the navigation stack. The Wi-Fi checking state has no associated UI, so there is no user-visible change.

Upon entering the Wi-Fi checking state, a piece of business logic invokes our connectivity plugin to determine whether the device is connected to a Wi-Fi network. Depending on the outcome, this business logic will fire one of two events. When either of these events is received by the state machine, it inspects the Wi-Fi checking state’s event handlers to see whether it handles the event. The checking state handles the Wi-Fi disconnected event, so if that’s what was fired, the state machine will execute the handler, transitioning to the main Wi-Fi state. During the transition, the Workflow’s route table is referenced, and the UI navigates to the appropriate screen.
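The checking state's business logic can be thought of as a small function that asks a question once and fires exactly one of two events. The Python sketch below assumes a hypothetical is_on_wifi callable standing in for the connectivity plugin and a fire callback standing in for the state machine's event input; the event names are also made up.

```python
from typing import Callable, List


def run_wifi_check(is_on_wifi: Callable[[], bool],
                   fire: Callable[[str], None]) -> None:
    """Entry logic for the checking state: query once, fire exactly one event."""
    fire("wifi_connected" if is_on_wifi() else "wifi_disconnected")


# Stand-ins for the plugin and the state machine, for illustration only.
fired: List[str] = []
run_wifi_check(lambda: False, fired.append)  # device is offline
run_wifi_check(lambda: True, fired.append)   # device is already on Wi-Fi
```

Because the function only knows how to fire events, it has no opinion about which screen comes next; that decision stays with the state machine's handlers.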

Connect to Wi-Fi screen during Stadia Controller setup

Here, the user is prompted to connect to a Wi-Fi network. The same business logic as before is still running in the background, firing events if the user’s device connects or disconnects from a network. Once the device is on a network, the event is received by the state machine. Just like the previous example, the state machine searches for an event handler for the incoming Wi-Fi connected event, starting with the active state. When the event is handled, the state machine transitions to the next permission state, Location Access.

But what if the user is already connected to Wi-Fi when they get to the checking state? One approach to dealing with this is to explicitly handle the ‘connected’ event in both the checking state and the main state. However, recall this facet of the HSM: if a child state does not handle an event, the event is bubbled up to the next parent state. Instead of using two handlers for the same event, the parent Wi-Fi state handles the ‘connected’ event for both of its children. This ensures the states will handle the event in the same manner, and means that code to handle the event only needs to be written and tested once. If a third child state needed to handle the event differently, it could explicitly handle it and override its parent’s handler.
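Here is a small Python sketch of the checking / main pattern with the shared ‘connected’ handler on the parent. It is a simplified model under stated assumptions: the State and Machine classes and all state and event names are hypothetical, and the Location state is left without children to keep the example short.

```python
class State:
    def __init__(self, name, parent=None, initial=None):
        self.name, self.parent, self.initial = name, parent, initial
        self.handlers = {}  # event name -> name of the target state


class Machine:
    def __init__(self, states):
        self.states = {s.name: s for s in states}
        self.current = None
        self.trace = []  # leaf states that became active, in order

    def enter(self, name):
        state = self.states[name]
        while state.initial is not None:  # descend into initial child states
            state = self.states[state.initial]
        self.current = state
        self.trace.append(state.name)

    def dispatch(self, event):
        state = self.current
        while state is not None:
            target = state.handlers.get(event)
            if target is not None:
                self.enter(target)
                return True
            state = state.parent  # unhandled here: bubble up
        return False


def build():
    permissions = State("permissions", initial="wifi")
    wifi = State("wifi", parent=permissions, initial="wifi_checking")
    wifi_checking = State("wifi_checking", parent=wifi)
    wifi_main = State("wifi_main", parent=wifi)
    location = State("location", parent=permissions)
    wifi_checking.handlers["wifi_disconnected"] = "wifi_main"
    wifi.handlers["wifi_connected"] = "location"  # one handler for both children
    machine = Machine([permissions, wifi, wifi_checking, wifi_main, location])
    machine.enter("permissions")
    return machine


already_connected = build()
already_connected.dispatch("wifi_connected")

connects_later = build()
connects_later.dispatch("wifi_disconnected")
connects_later.dispatch("wifi_connected")
```

Both paths end in the same place through the same handler; the only difference is whether the main state (and therefore its screen) was ever visited.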

The Location Access state follows the same checking / main state pattern as the Wi-Fi state. Upon entering the location checking state, some business logic is called to check the status of the location access permission. Again, two possible events can be fired. If the user has already granted the permission, the checking state’s handler for that event causes the state machine to transition to the Bluetooth state. If the user does need to grant it, a different event is fired and the location access permission screen is pushed onto the navigation stack.

Location Access screen during Stadia Controller setup

This screen explains why the permission is needed. When the ‘Next’ button is tapped, the permission dialog is presented. If the user grants the permission, the app is notified in the same manner as in the checking state, an event is triggered, and the state machine transitions to the Bluetooth state. If the user doesn’t grant the permission, the screen updates to explain that they need to grant permission in their device’s system settings to proceed.

Alternate version of Location Access screen during Stadia Controller setup

On this version of the screen, if the ‘Next’ button is tapped, the app opens an appropriate OS-level settings page. If the user updates the permission and goes back to the Stadia app, they are able to proceed to the Bluetooth state.

Turn on Bluetooth screen during Stadia Controller setup

The Bluetooth state follows the same checking / main state pattern as the other states. Upon entering the Bluetooth checking state, business logic is executed to see whether Bluetooth is enabled. On iOS, there is a Bluetooth permission dialog that appears when CBCentralManager is initialized, so the checking state is our best opportunity to do that – it will be presented to the user in an appropriate context. This dialog appears above our Flutter UI and can cause some jank, so placing Bluetooth last in this part of the flow ensures that if the dialog needs to be presented, it isn’t appearing over the top of a UI transition, for example.

Similar to before, if the ‘Bluetooth is off’ event is fired, the state machine’s handler will take the user to a screen where they are prompted to turn it on. When the ‘Bluetooth is on’ event is fired, we have met all the permission requirements to complete the rest of the setup flow. The handler transitions to the controller selection menu, which begins scanning for Stadia Controllers.

These three screens are outwardly quite simple, but there are many paths between them that need to be considered. This part of the setup flow is illustrative of how the state machine, UI, and various data and event sources can come together in a Workflow, and how branching logic might be defined in an elegant way.

Consider a counterexample

Imagine if the setup flow did not use a HSM or an event-driven design. The implementation becomes more complicated from the very first line of code. Some of the first logic in the setup flow determines what permission screen to initially show to the user. In our real implementation, this is in a non-UI checking state. Without a state machine, there isn’t an easy way of representing a state without associated UI, and deciding where this logic should live becomes surprisingly difficult. The logic might hang off of the button handler that launches the setup flow, could exist inside of a new Welcome screen, or could be called as the “Connect to Wi-Fi” page is presented to the user. These all have downsides, ranging from leaking setup flow logic into the rest of the app, to presenting extraneous information to the user. (Recall the “masking complexity” concept from the first post.)

Similarly, the logic for traversing the flow might be decentralized, with each screen handling navigation on its own. This approach would make it more difficult for someone not familiar with the flow to get a complete picture of how the screens fit together (contrast this with the HSM, where connections between states are centrally, explicitly declared). This increases the complexity of every screen’s business logic, adds to the flow's maintenance cost, and multiplies the surface over which bugs can be introduced.

With the navigation logic spread out, moving screens around becomes much more challenging as well. Swapping two screens in the non-HSM flow means touching the two screens in question, plus every screen that can navigate to or from those screens, whereas with a HSM, in many cases, the swap can occur without touching the business logic of any screen.
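One way to picture why a swap can be cheap: if the order of the flow is declared in a single place, the transitions can be derived from it. The Python sketch below is only an illustration of that idea; the transitions helper, the state names, and the "controller_selection" target are hypothetical and do not describe how our transitions are actually declared.

```python
def transitions(flow, done="controller_selection"):
    """Derive 'requirement satisfied' transitions from one central ordering."""
    targets = flow[1:] + [done]
    return dict(zip(flow, targets))


original = transitions(["wifi", "location", "bluetooth"])
# Swapping two steps is a one-line change to the central declaration;
# no screen's own logic is involved.
swapped = transitions(["location", "wifi", "bluetooth"])
```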

Lastly, without a HSM, it can become harder to gracefully handle interruptions. In the setup flow there are multiple background services running at all times, and occasionally one may need to stop user progression, or otherwise show a message to the user. In many cases this can mean that other tasks need to stop running. Without the ability to fire an event when this happens, some logic would have to maintain references to all the services that need to stop for every interruption, and declare how each of them reacts. There is almost certainly some overlap between interruptions, but the need to explicitly declare each of these means that there’s a high risk of introducing bugs. When using a state machine instead, these reactions can be executed when states are entered and exited, which means that they can be handled much more consistently, and with less explicit declaration for each scenario.
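A minimal Python sketch of entry and exit actions follows. The Service, State, and Machine classes and the service and state names are invented for illustration; the point is only that each state owns the start and stop of what it needs, so an interruption does not have to know which services are running.

```python
class Service:
    def __init__(self, name):
        self.name = name
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class State:
    def __init__(self, name, services=()):
        self.name = name
        self.services = list(services)

    def on_enter(self):
        for service in self.services:
            service.start()

    def on_exit(self):
        for service in self.services:
            service.stop()


class Machine:
    def __init__(self, initial):
        self.current = initial
        initial.on_enter()

    def transition(self, target):
        self.current.on_exit()  # cleanup is tied to leaving the state
        self.current = target
        target.on_enter()


scanner = Service("ble_scanner")        # hypothetical background services
updater = Service("firmware_updater")
scanning = State("scanning", [scanner])
updating = State("updating", [updater])
interrupted = State("bluetooth_off_message")  # shows a message, runs nothing

machine = Machine(scanning)
machine.transition(updating)
while_updating = (scanner.running, updater.running)
machine.transition(interrupted)  # the interruption never references the services
while_interrupted = (scanner.running, updater.running)
```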

Further benefits of the HSM approach

As a coda to the architecture discussion, it’s worth noting that using a HSM and the Workflow pattern makes the setup flow highly testable and debuggable.

All of our event sources are piped through the HSM, and all handling of those events happens inside the HSM. Testing the sending side of the event stream involves making sure that the correct event is fired at the end of whatever logic is running. When testing on the receiving side, it’s trivial to inject events to simulate them being sent from elsewhere.

Additionally, there are a finite number of events, and each state explicitly declares what events it handles, meaning states have known inputs and outputs. All of this isolates potential problems, and makes fixing bugs often a matter of determining what input or output event is incorrect, and addressing the issue at the source.
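To show what injecting events in a test can look like, here is a small, self-contained Python sketch using the standard unittest module. The handler table, the step function, and the state and event names are hypothetical and flattened for brevity; they are not our real test code.

```python
import unittest

# Each state explicitly declares the events it handles and where they lead.
HANDLERS = {
    "bluetooth_checking": {"bluetooth_off": "bluetooth_main",
                           "bluetooth_on": "controller_selection"},
    "bluetooth_main": {"bluetooth_on": "controller_selection"},
    "controller_selection": {},
}


def step(state, event):
    """Return the next state, or the same state if the event is not handled."""
    return HANDLERS[state].get(event, state)


class BluetoothStateTest(unittest.TestCase):
    def test_off_then_on_reaches_controller_selection(self):
        state = "bluetooth_checking"
        for event in ["bluetooth_off", "bluetooth_on"]:  # injected events
            state = step(state, event)
        self.assertEqual(state, "controller_selection")

    def test_unhandled_event_is_ignored(self):
        self.assertEqual(step("bluetooth_main", "wifi_connected"), "bluetooth_main")

    def test_every_transition_targets_a_declared_state(self):
        for handlers in HANDLERS.values():
            for target in handlers.values():
                self.assertIn(target, HANDLERS)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(BluetoothStateTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

No Bluetooth hardware, plugin, or UI is needed to exercise these paths; the test only supplies event names and checks the resulting state.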

Lastly, using the HSM as an extension of our existing architecture meant that multiple team members would all naturally write and structure their code in a standardized way that’s similar to that of the rest of the app. One specific benefit of this fact is that the setup flow’s UI uses our app-wide patterns. Because of this, it’s much easier for engineers to jump in and help on any part of the setup flow, and it means that the flow can benefit more from app-wide refactors and improvements.

There’s much more to say about the Stadia mobile app’s architecture, but we’ll pause here for the time being. Using a HSM and the Workflow pattern has given the team a great set of tools to work with when building and improving the flow, and has cleared up the mystique within the team around working with hardware. In the final post in this series, we’ll discuss how our use of Flutter greatly boosted our development speed, and some of the ways the team improved the reliability and approachability of the setup flow.

--Nick Sparks, Software Engineer