Transitions
Building on the Overall Modelβ
This document focuses on Transitions (T), the third component of the State Structure Ξ© = (E, S, T). Before diving into transitions, ensure you understand:
- Overall Model - The complete formal model (Ξ, Ξ©, a, M, Ο, Β§)
- States - The State Structure (Ξ©) and state management (M)
Transitions complete the State Structure by defining how to move between states. While states represent what exists (static), transitions represent what happens (dynamic).
What is a Transition?β
The Dynamic Counterpart to Statesβ
If states are the "nouns" of GUI automation (what exists), transitions are the "verbs" (what happens).
The Key Distinction:
- States (S): Static configurations of the GUI - what screens/configurations exist
- Transitions (T): Dynamic sequences of actions - how to change from one configuration to another
Analogy: Think of navigating a city:
- States are destinations (cities on a map)
- Elements are landmarks that identify each city
- Transitions are the roads and driving instructions to get from one city to another
Just as a road has both a physical pathway and turn-by-turn directions, a transition has both the actions to execute and knowledge about what should change when you arrive.
Why Transitions Matterβ
In process-based GUI automation, you write sequential scripts:
1. Click button X
2. Wait 2 seconds
3. Type "username"
4. Click submit
5. ...
This approach has a critical flaw: it doesn't know where it is. If step 3 fails, the script is lost.
In model-based automation, transitions are part of a graph structure:
LoginState --[login transition]--> DashboardState
The framework knows:
- Where it is (LoginState is active)
- Where it wants to go (DashboardState)
- What paths exist (transitions in the graph)
- How to recover (if one path fails, find another)
From the Paper: "The Transition Model defines how sequences of actions lead to transitions between different GUI states."
Formal Definitionβ
The Transition Tupleβ
A transition is formally defined as a tuple:
t = (A, S_t^def)
where:
- A = (aΒΉ, aΒ², ..., aβΏ) is a process or sequence of actions executed as part of the transition
- S_t^def β P(S Γ
{True, False}) is the intended state information if the transition succeeds
What This Means:
-
A (Action Sequence): The concrete steps that manipulate the GUI
- Examples: click login button, type username, press Enter
- These are the "how" - the physical actions performed
-
S_t^def (Defined State Information): Explicit knowledge about state changes
- Examples: "activate Dashboard", "deactivate LoginPage"
- These are the "what" - the expected state transitions
- Type: P(S Γ
{True, False}) means "set of (state, boolean) pairs"- (s,
True) means "activate state s" - (s,
False) means "deactivate state s"
- (s,
Example Transition:
Login Transition:
A = [click(loginButton), type(username, "user"), type(password, "pass"), click(submit)]
S_t^def = {(LoginPage, `False`), (Dashboard, `True`)}
Translation: "Execute these actions, then deactivate LoginPage and activate Dashboard"
The Transition Functionβ
The transition function processes the transition tuple:
f_Ο: (A, S_t^def) β (Ξ', r_t)
where:
- Ξ' is the resulting GUI state after executing the action process
- r_t is the transition result, containing:
- success or failure status: Boolean indicating if transition succeeded
- state information S_t:
- S_t = S_t^def if transition succeeds
- S_t = β (empty set) if transition fails
Why This Matters: The transition result provides explicit feedback about what states should be active. The State Management system (M) uses this information to update the active states (S_Ξ).
The Process Functionβ
Before the transition function applies state changes, the process function executes the action sequence:
f_A: (A) β (Ξ')
This function builds on the atomic action function f_a by iteratively applying each action in the sequence:
- Action aΒΉ operates on initial screen Ξ β produces Ξβ
- Action aΒ² operates on Ξβ β produces Ξβ
- ...
- Action aβΏ operates on Ξβββ β produces final Ξ' (resulting GUI)
The final output Ξ' represents the cumulative effect of all actions.
The Transition Relationβ
The transition relation defines which transitions are accessible from which states:
Ξ΄ β S Γ T
where:
- S is the set of all possible states
- T is the set of all transitions
- (s_i, t_j) β Ξ΄ means "transition t_j is accessible from state s_i"
Important: Ξ΄ is a relation (β), not a function. This means:
- Multiple transitions can be accessible from the same state
- A transition can be accessible from multiple states
Example:
States: S = {Login, Dashboard, Settings}
Transitions: T = {t_login, t_logout, t_toSettings, t_fromSettings}
Transition Relation Ξ΄:
(Login, t_login) β Ξ΄ β Can login from Login state
(Dashboard, t_logout) β Ξ΄ β Can logout from Dashboard
(Dashboard, t_toSettings) β Ξ΄ β Can go to Settings from Dashboard
(Settings, t_fromSettings) β Ξ΄ β Can leave Settings
From the Paper: "A transition is a tuple t = (A, S_t^def) comprising: A, a process or sequence of actions A = (aΒΉ, aΒ², ..., aβΏ) executed as part of the transition; S_t^def β P(S Γ
{True, False}), the intended state information if the transition succeeds."
The Critical Role of S_t^def: Optimization Through Knowledgeβ
One of the most important features of the Transition Model is S_t^def (defined state information). This component enables significant optimization.
Why S_t^def Existsβ
From the Paper: "The state information S_t provides explicit state change details that complement the information inferred from actions. Although not strictly necessary for state management, transitions can significantly streamline the process through S_t by containing known relationships between states. For example, if state A always disappears when state B is activated, S_t can include this rule directly in the transition, eliminating the need for additional actions to verify the absence of state A."
Without S_t^def (Process-Based Approach)β
In traditional automation:
// Navigate to Settings
action.click(settingsButton);
Thread.sleep(2000); // Wait for transition
// Must verify Dashboard closed
if (findImage("dashboard.png") != null) {
throw new Exception("Dashboard still visible!");
}
// Must verify Settings opened
if (findImage("settings.png") == null) {
throw new Exception("Settings not visible!");
}
Problems:
- Extra verification actions required
- Hard-coded delays
- Manual error checking
- No explicit knowledge about state relationships
With S_t^def (Model-Based Approach)β
// Transition encodes state knowledge
Transition toSettings = new Transition(
actions: [click(settingsButton)],
S_t^def: {(Dashboard, `False`), (Settings, `True`)}
);
// Framework automatically:
// 1. Executes actions
// 2. Applies state changes from S_t^def
// 3. Verifies expected states are active
// 4. Returns success/failure
Benefits:
- No extra verification steps - S_t^def encodes expected changes
- Automatic state management - Framework updates S_Ξ using S_t
- Explicit knowledge - State relationships are declared, not implicit
- Error recovery - Framework knows what states should be active
Practical Example: The DoT Applicationβ
In the DoT (Dawn of Titans) application from the paper, transitions encode state relationships:
World β Island Transition:
t_goToIsland = (
A: [click(searchButton)],
S_t^def: {(World, `True`), (Island, `True`)}
)
Note: Both World and Island remain active (World, True) because the World map stays visible when viewing an Island. This knowledge is explicitly encoded in S_t^def, so the framework knows not to deactivate World.
Without S_t^def: The automation would need extra actions to verify World remains visible while Island becomes active. This would add complexity and potential for error.
With S_t^def: The transition explicitly states "keep World active, also activate Island." The framework handles this automatically.
Transitions in Brobotβ
Implementation Architecture: FromTransitions and ToTransitionsβ
Brobot implements the transition model by separating transitions into two complementary parts:
- FromTransition: State-specific initial actions required to leave the current state
- ToTransition: Universal finalization actions that always occur when entering the target state
From the Paper (Section 6.3): "In Brobot, transitions are separated into two parts: FromTransitions and ToTransitions. A FromTransition handles the initial actions for moving from the current state to another state, and a ToTransition completes the transition."
Visual Representation:
State 1 (active) β [FromTransition 1β6] β [ToTransition β6] β State 6 (active)
β β
State-specific actions Universal entry actions
Why Split Transitions?β
Reason 1: Reusability
Multiple states might transition TO the same target:
State 1 β State 6: From(1β6) + To(β6)
State 2 β State 6: From(2β6) + To(β6)
State 5 β State 6: From(5β6) + To(β6)
The To(β6) actions are reused across all transitions! This eliminates duplication.
Reason 2: Composability
The "always-execute" nature of ToTransitions ensures that no matter how you reach a state, certain actions always run:
- Initialize state-specific resources
- Verify critical elements are visible
- Set up state-specific context
Reason 3: Separation of Concerns
- FromTransition: "How do I leave my current location?"
- ToTransition: "What always happens when arriving at the destination?"
This separation mirrors natural decomposition of navigation tasks.
Example:
Going from Login β Dashboard might involve:
- FromTransition (LoginβDashboard): Type credentials, click submit
- ToTransition (βDashboard): Wait for logo, initialize dashboard widgets
Going from Settings β Dashboard would have:
- FromTransition (SettingsβDashboard): Click back button
- ToTransition (βDashboard): Same as above - wait for logo, initialize widgets
The ToTransition is shared, avoiding code duplication.
Code Example: The DoT Applicationβ
Note: The following example is based on the paper's DoT application (Appendix II). Brobot also supports annotation-based transitions (see "Modern Annotation Pattern" below).
// From the DoT test application in the paper (Appendix II)
public class WorldTransitions {
// Injected dependencies
private final Action action;
private final WorldState world;
private final StateTransitionsRepository stateTransitionsRepository;
public WorldTransitions(Action action, WorldState world,
StateTransitionsRepository repo) {
this.action = action;
this.world = world;
this.stateTransitionsRepository = repo;
// Define transitions FROM the WORLD state
StateTransitions transitions =
new StateTransitions.Builder("WORLD") // State name as string
.addTransitionFinish(this::finishTransition) // ToTransition
.addTransition(new JavaStateTransition.Builder() // Note: JavaStateTransition, not StateTransition
.addToActivate("ISLAND") // Activate ISLAND state (state name as string)
.setFunction(this::goToIsland) // The action sequence
.setStaysVisibleAfterTransition(StateTransition.StaysVisible.TRUE) // WORLD remains visible
.build())
.build();
stateTransitionsRepository.add(transitions);
}
// ToTransition: always executed when entering WORLD state
private boolean finishTransition() {
ActionOptions find = new ActionOptions.Builder()
.setMaxWait(10)
.build();
ObjectCollection worldImages = new ObjectCollection.Builder()
.withAllStateImages(world.getState())
.build();
return action.perform(find, worldImages).isSuccess();
}
// FromTransition: WORLD β ISLAND action sequence
public boolean goToIsland() {
// The action sequence for this transition is a single click
return action.perform(
ActionType.CLICK, world.getSearchButton()
).isSuccess();
}
}
Mapping to Formal Model:
This code implements the transition tuple t = (A, S_t^def):
- A =
goToIsland()method (action sequence: click search button) - S_t^def =
{(ISLAND, True), (WORLD, True)}(encoded via.addToActivate("ISLAND")and.setStaysVisibleAfterTransition(TRUE))
The ToTransition finishTransition() always verifies World state images are visible, ensuring the state is properly activated.
Modern Annotation Pattern (Alternative)β
Brobot also supports annotation-based transitions for cleaner code:
@TransitionSet(state = WorldState.class)
@Component
public class WorldTransitions {
@Autowired
private Action action;
@Autowired
private WorldState world;
@OutgoingTransition(
activate = {IslandState.class},
staysVisible = true
)
public boolean goToIsland() {
return action.click(world.getSearchButton()).isSuccess();
}
@IncomingTransition
public boolean finishTransition() {
return action.find(world.getAllImages()).isSuccess();
}
}
Advantages:
- Less boilerplate code
- Declarative state relationships
- Automatic registration with Spring context
- Type-safe state references
Transitions in Path Traversalβ
Transitions are the fundamental building blocks of paths in the state graph. Understanding how transitions compose into paths is essential for understanding navigation.
Paths as Transition Sequencesβ
A path is defined as Ο = (S_Ο, T_Ο) where:
- S_Ο = [sβ, sβ, ..., sβ] is the sequence of states
- T_Ο = [tβ, tβ, ..., tβββ] is the sequence of transitions
Path Validity Conditions:
- sβ β S_Ξ - First state must be currently active
- sβ = s_target - Last state must be the target
- (s_i, t_i) β Ξ΄ - Each transition must be accessible from its source state
- (si+1,
True) β S_ti^def - Each transition must activate the next state
Example Path:
Active States: {Login}
Target: Settings
Valid Path:
S_Ο = [Login, Dashboard, Settings]
T_Ο = [t_login, t_toSettings]
Where:
- (Login, t_login) β Ξ΄ β
- (Dashboard, `True`) β S_{t_login}^def β
- (Dashboard, t_toSettings) β Ξ΄ β
- (Settings, `True`) β S_{t_toSettings}^def β
Dynamic Path Selection with Transition Failuresβ
When a transition fails, the Path Traversal Model (Β§) dynamically recalculates paths:
Scenario:
Initial Path: Start β [t1] β State2 β [t2] β State5 β [t3] β End
β β
What Happens:
- Transition t1 succeeds β State2 becomes active
- Transition t2 fails β State5 not activated
- Framework queries State Management: "What states are active?"
- Answer:
{Start, State2}(State2 is active, State5 is not) - Framework calculates new paths from
{Start, State2}to End - Framework selects alternative path:
State2 β [t4] β State6 β [t5] β End
This is only possible because:
- Transitions are part of an explicit graph structure (Ξ΄ β S Γ T)
- State Management tracks active states (S_Ξ)
- Path Traversal can recalculate from current position
In process-based automation, there is no graph, no state tracking, and no recalculation. A failed step means total failure.
Transition Costs and Path Selectionβ
The framework can assign costs to transitions to guide path selection:
Transition Cost Function: c_T: T β β
Path Cost:
c(Ο) = Ξ£[c_T(t_i) + c_S(s_{i+1})]
Path Selection (heuristic-based):
Ο* = arg min_{Ο β P}[c(Ο)]
What This Means: Among all valid paths P, select the path Ο* with minimum total cost.
Example Transition Costs:
- Fast, reliable transition: cost = 1.0
- Slow transition (requires waiting): cost = 5.0
- Unreliable transition (sometimes fails): cost = 10.0
The framework will prefer paths with lower total cost, avoiding slow or unreliable transitions when alternatives exist.
Formal Properties of Transitionsβ
Understanding these formal properties helps reason about transition behavior:
1. Transition Composabilityβ
Property: Executing transitions in sequence composes their effects.
For path Ο with transitions T_Ο = [tβ, tβ, ..., tβββ]:
Result of path = f_Ο(tβββ) β f_Ο(tβββ) β ... β f_Ο(tβ)
Each transition builds on the GUI state produced by the previous transition.
2. Conditional Executionβ
Property: Transition t_i executes only if all previous transitions succeeded.
t_i executes βΊ βj < i: r_{t_j}.success = True
If any transition in a path fails, subsequent transitions do not execute.
3. State Information Consistencyβ
Property: Successful transitions guarantee their defined state changes.
If r_t.success = True, then S_t = S_t^def
The State Management system (M) applies S_t to update active states S_Ξ.
4. Transition Accessibilityβ
Property: A transition can only be executed if accessible from an active state.
t can be executed βΊ βs β S_Ξ: (s, t) β Ξ΄
5. State Activation Guaranteeβ
Property: If (s, True) β S_t^def and transition succeeds, then s β S_Ξ after transition.
Proof:
- Transition succeeds β r_t.success =
True - By Property 3 β S_t = S_t^def
- State Management applies S_t β s becomes active
- Therefore β s β S_Ξ
6. Process Function Determinismβ
Property: For fixed action sequence A and fixed starting GUI Ξ, f_A produces deterministic result.
f_A(A)(Ξ) is deterministic (ignoring environmental stochasticity Ξ)
However, in practice, environmental stochasticity (network delays, rendering variations, etc.) can affect results.
Practical Guidance for Defining Transitionsβ
When to Create a Transitionβ
Create a transition when:
- State-to-state navigation exists: There's a clear path from one GUI configuration to another
- Actions are repeatable: The action sequence can be executed multiple times
- State relationship is known: You know which states should activate/deactivate
- Reusability matters: The transition might be part of multiple paths
How Granular Should Transitions Be?β
Decision Tree:
Is this a single atomic action?
ββ YES: Consider making it a transition if it changes states
ββ NO: Is it a coherent sequence with clear start/end states?
ββ YES: Make it a transition
ββ NO: Break into smaller transitions
Examples:
Too Granular (anti-pattern):
// Each click is a separate transition
transition1: click(menu)
transition2: click(submenu)
transition3: click(item)
Good Granularity:
// Coherent sequence as one transition
transitionToItem: [click(menu), click(submenu), click(item)]
Too Coarse (anti-pattern):
// Entire workflow as one transition
transitionCompleteWorkflow: [login, navigate, processData, logout]
Good Granularity:
// Logical steps as separate transitions
transitionLogin: [enterCredentials, clickSubmit]
transitionNavigate: [clickMenu, selectOption]
transitionProcess: [loadData, transform, save]
transitionLogout: [clickLogout]
FromTransition vs ToTransition: When to Use Eachβ
Use FromTransition when:
- Actions depend on the source state
- Different source states require different actions to reach the same target
- The action is about "leaving" the current state
Use ToTransition when:
- Actions are always required when entering the target state
- The action is independent of where you came from
- The action is about "entering" or "initializing" the target state
Example:
Entering Dashboard from multiple states:
// FromTransition (Login β Dashboard): credentials required
fromLogin() {
type(username);
type(password);
click(submit);
}
// FromTransition (Settings β Dashboard): just click back
fromSettings() {
click(backButton);
}
// ToTransition (β Dashboard): always verify dashboard loaded
toDashboard() {
waitFor(dashboardLogo);
verifyWidget(summaryWidget);
}
Common Mistakes and How to Avoid Themβ
Mistake 1: Not Using S_t^defβ
Wrong:
// Only actions, no state information
new JavaStateTransition.Builder()
.setFunction(this::navigate)
.build();
Why It's Wrong: Framework doesn't know which states should activate/deactivate. State management becomes implicit and error-prone.
Correct:
new JavaStateTransition.Builder()
.setFunction(this::navigate)
.addToActivate("TargetState")
.addToDeactivate("SourceState")
.build();
Mistake 2: Including Verification in Action Sequenceβ
Wrong:
public boolean navigate() {
action.click(button);
// Verification shouldn't be in action sequence
if (!action.find(targetElement).isSuccess()) {
return false;
}
return true;
}
Why It's Wrong: Verification is the framework's job via State Management. Including it here duplicates logic.
Correct:
public boolean navigate() {
// Just perform the action
return action.click(button).isSuccess();
}
// Let S_t^def and State Management handle verification
Mistake 3: Forgetting setStaysVisibleAfterTransitionβ
Wrong:
// World stays visible when Island opens, but not specified
new JavaStateTransition.Builder()
.addToActivate("ISLAND")
.setFunction(this::goToIsland)
.build();
// Default behavior: World deactivates (WRONG)
Why It's Wrong: Framework will deactivate the source state by default, even though World remains visible.
Correct:
new JavaStateTransition.Builder()
.addToActivate("ISLAND")
.setFunction(this::goToIsland)
.setStaysVisibleAfterTransition(StateTransition.StaysVisible.TRUE)
.build();
Mistake 4: Using StateTransition.Builder Instead of JavaStateTransition.Builderβ
Wrong:
new StateTransition.Builder() // StateTransition is an interface!
.addToActivate("Target")
.build();
Why It's Wrong: StateTransition is an interface, not a concrete class. This won't compile.
Correct:
new JavaStateTransition.Builder() // JavaStateTransition is the implementation
.addToActivate("Target")
.build();
Mistake 5: Passing State Objects Instead of State Namesβ
Wrong:
.addToActivate(DashboardState.class) // Wrong type
Why It's Wrong: The method expects state names (strings), not class objects.
Correct:
.addToActivate("Dashboard") // State name as string
Testing Transitionsβ
Unit Testing a Transitionβ
@ExtendWith(MockitoExtension.class)
class WorldTransitionsTest {
@Mock
private Action action;
@Mock
private WorldState world;
@InjectMocks
private WorldTransitions transitions;
@Test
void testGoToIsland_success() {
// Arrange
StateImage searchButton = mock(StateImage.class);
when(world.getSearchButton()).thenReturn(searchButton);
ActionResult successResult = new ActionResult();
successResult.setSuccess(true);
when(action.perform(ActionType.CLICK, searchButton))
.thenReturn(successResult);
// Act
boolean result = transitions.goToIsland();
// Assert
assertTrue(result);
verify(action).perform(ActionType.CLICK, searchButton);
}
}
Integration Testing Path Traversalβ
@SpringBootTest
class PathTraversalIntegrationTest {
@Autowired
private StateNavigator navigator;
@Test
void testNavigateWorldToIsland() {
// Arrange: Start at World state
stateMemory.setActiveStates("WORLD");
// Act: Navigate to Island
boolean success = navigator.openState("ISLAND");
// Assert: Both states should be active
assertTrue(success);
assertTrue(stateMemory.isActive("WORLD"));
assertTrue(stateMemory.isActive("ISLAND"));
}
}
Implementation Detailsβ
Key Files in Brobot:
-
StateTransition.java (interface)
- Location:
/library/src/main/java/io/github/jspinak/brobot/model/transition/StateTransition.java - Defines transition contract
- Location:
-
JavaStateTransition.java (implementation)
- Location:
/library/src/main/java/io/github/jspinak/brobot/navigation/transition/JavaStateTransition.java - Line 208:
.addToActivate(String... stateNames)method - Line 180:
.setStaysVisibleAfterTransition(StaysVisible)method
- Location:
-
StateTransitions.java (container)
- Location:
/library/src/main/java/io/github/jspinak/brobot/navigation/transition/StateTransitions.java - Manages collection of transitions for a state
- Location:
-
TransitionFunction.java (executor)
- Location:
/library/src/main/java/io/github/jspinak/brobot/actions/methods/basicactions/transitions/TransitionFunction.java - Implements f_Ο function
- Location:
Key Theorems and Properties Summaryβ
| Property | Formal Statement | Intuitive Meaning |
|---|---|---|
| Transition Definition | t = (A, S_t^def) | Transition = actions + state info |
| Transition Function | f_Ο: (A, S_t^def) β (Ξ', r_t) | Executes transition, returns result |
| Process Function | f_A: (A) β (Ξ') | Executes action sequence |
| Transition Relation | Ξ΄ β S Γ T | Maps states to accessible transitions |
| State Information Type | S_t^def β P(S Γ {True, False}) | Set of (state, activate/deactivate) pairs |
| Transition Result | r_t = (success, S_t) | Contains success flag and state info |
| State Application | S_t = S_t^def if success, β if fail | State changes applied only on success |
| Path Validity | (s_i, t_i) β Ξ΄ β§ (si+1, True) β S_ti^def | Transition accessible and activates next state |
Comparison with Finite State Machines (FSMs)β
Model-based GUI automation shares concepts with FSMs but has critical differences:
| Aspect | Traditional FSM | Model-Based GUI Automation |
|---|---|---|
| States | Single active state | Multiple simultaneous active states |
| Transitions | State β State | State β {States} with explicit S_t^def |
| Determinism | Deterministic transitions | Non-deterministic (stochasticity Ξ) |
| Observation | Perfect state knowledge | Probabilistic pattern matching |
| Recovery | No error recovery | Dynamic path recalculation |
| State Information | Implicit in transition | Explicit in S_t^def |
The key innovation is S_t^def, which allows transitions to explicitly encode state relationships that FSMs leave implicit.
Further Readingβ
Academic Foundationβ
- Academic Foundation - Research background and citations
- Section 5.6: The Transition Model (Ο)
- Section 6.3: The Transition Implementation
- Figure 8: FromTransition and ToTransition architecture
Related Modelsβ
- Overall Model - Complete formal model (Ξ, Ξ©, a, M, Ο, Β§)
- States - State Structure and state management
- Testing the Automation - Testing transitions and paths
Practical Implementationβ
- AI Brobot Project Creation Guide - Complete API reference
- Getting Started - Hands-on tutorials
Appendix: Mathematical Notation Quick Referenceβ
- t = A single transition
- A = Action sequence (aΒΉ, aΒ², ..., aβΏ)
- S_t^def = Defined state information for transition t
- f_Ο = Transition function
- f_A = Process function (executes action sequence)
- Ξ΄ = Transition relation (which transitions are accessible from which states)
- r_t = Transition result
- Ξ' = Resulting GUI state after transition
- Ο = Path (sequence of states and transitions)
- S_Ο = State sequence in path
- T_Ο = Transition sequence in path
- c_T(t) = Cost of transition t
- P(S Γ
{True, False}) = Power set of (state, boolean) pairs