
Labeling Images with Brobot

Brobot uses SikuliX for image recognition, a deterministic, rule-based system built on pixel-level pattern matching. Although this method does not use machine learning itself, it can automatically generate labeled image datasets that are well suited to training modern vision models.

This guide explains how Brobot can help create high-quality labeled data, and how those datasets can now be used more effectively thanks to advances in computer vision.


🎯 Why Use Brobot to Label Images?​

Many machine learning models require large sets of labeled data to learn visual concepts. But manually labeling images is time-consuming, error-prone, and expensive.

Brobot automates software through visual feedback, capturing screenshots of UI elements and states. Because the automation is driven by precise, known targets, every captured image carries an inherent label, which makes it a natural candidate for automated dataset generation.

Use cases include:

  • Labeling UI elements (buttons, icons, dialogs)
  • Capturing application states (logged in, error, success)
  • Creating classification datasets for vision models
  • Bootstrapping segmentation datasets using regions of interest
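
One way to make the "inherent label" concrete is a manifest with one record per capture. The sketch below is a minimal illustration in Python and is not part of Brobot: the `Capture` record, its field names, and the JSON-lines layout are all assumptions chosen for the example. It also converts a pixel region into a normalized, center-based box (the layout used by YOLO-style detectors), which is one possible way to bootstrap detection or segmentation data from regions of interest.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Capture:
    """One labeled capture. Field names are hypothetical, not a Brobot schema."""
    image_path: str  # screenshot file the region was found in
    label: str       # name of the automation target, e.g. "login_button"
    x: int           # left edge of the matched region, in pixels
    y: int           # top edge of the matched region, in pixels
    w: int           # region width, in pixels
    h: int           # region height, in pixels
    screen_w: int    # full screenshot width, in pixels
    screen_h: int    # full screenshot height, in pixels


def to_normalized_box(c: Capture) -> tuple[float, float, float, float]:
    """Return (x_center, y_center, width, height), each scaled to [0, 1]."""
    return (
        (c.x + c.w / 2) / c.screen_w,
        (c.y + c.h / 2) / c.screen_h,
        c.w / c.screen_w,
        c.h / c.screen_h,
    )


def to_manifest_line(c: Capture) -> str:
    """Serialize one capture as a single JSON line, including its normalized box."""
    record = asdict(c)
    record["box"] = to_normalized_box(c)
    return json.dumps(record, sort_keys=True)


# Illustrative values only: a 200x50 button on a 1000x500 screenshot.
example = Capture("shots/0001.png", "login_button", 100, 200, 200, 50, 1000, 500)
line = to_manifest_line(example)
```

Appending one such line per capture during an automation run would give a dataset file that most training scripts can read with a few lines of parsing.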

📦 Using the Labeled Data

The labeled data generated by Brobot can now be used more effectively than ever, thanks to:

✅ Transfer Learning and Fine-Tuning

You no longer need thousands of examples to train a useful classifier. Instead, you can:

  • Start from a model pretrained on a large public dataset
  • Fine-tune it using your Brobot-labeled images

Even 50–100 examples per class can yield strong results, particularly when the classes are visually distinct.
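
With so few examples per class, it helps to hold out a validation set that still contains every class. The following is a minimal sketch of a stratified train/validation split, assuming the dataset is a list of `(image_path, label)` pairs; the function name, the 20% validation fraction, and the file names are illustrative assumptions. It prepares data for fine-tuning but does not perform the fine-tuning itself.

```python
import random
from collections import defaultdict


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split (path, label) pairs into train/val, keeping every class in both.

    Each class contributes at least one validation example as long as it
    has two or more samples.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        n_val = max(1, round(len(paths) * val_fraction)) if len(paths) > 1 else 0
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val


# Hypothetical dataset: 50 captures for each of three application states.
samples = [
    (f"shots/{label}_{i:03d}.png", label)
    for label in ("logged_in", "error", "success")
    for i in range(50)
]
train, val = stratified_split(samples)
```

Splitting per class rather than over the whole list avoids a validation set that happens to miss a rare state such as an error dialog.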

✅ Self-Supervised Pretraining

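With self-supervised methods such as DINOv2 and MAE, you can pretrain vision models on Brobot screenshots with minimal or no labels, and then fine-tune with your labeled examples. Pretrained encoders such as CLIP, which learns from image-text pairs rather than from images alone, can be used in the same way as feature extractors.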

This gives you strong representations, even in low-data regimes.
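
A quick way to check whether a representation is useful in a low-data regime is a nearest-centroid classifier over embeddings. The sketch below is illustrative only: it assumes the embeddings have already been computed by some pretrained encoder, uses toy three-dimensional vectors in place of real encoder outputs, and all function names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def class_centroids(embeddings, labels):
    """Average the normalized embeddings of each class."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        vec = l2_normalize(vec)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def predict(vec, centroids):
    """Return the label whose centroid has the highest cosine similarity."""
    vec = l2_normalize(vec)

    def cosine(centroid):
        centroid = l2_normalize(centroid)
        return sum(a * b for a, b in zip(vec, centroid))

    return max(centroids, key=lambda label: cosine(centroids[label]))


# Toy vectors standing in for real encoder outputs.
train_vecs = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1], [0.0, 0.9, 0.2], [0.1, 1.0, 0.0]]
train_labels = ["button", "button", "dialog", "dialog"]
centroids = class_centroids(train_vecs, train_labels)
prediction = predict([0.8, 0.2, 0.1], centroids)
```

If a classifier this simple already separates the Brobot-labeled classes well, the encoder's features are probably good enough that full fine-tuning can be deferred.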

🛠️ Brobot Labeling Workflow

Here's a simple example of a modern labeling pipeline using Brobot:

| Step | Action |
|------|--------|
| 1. Automate UI flows with Brobot | Run scripts that exercise key app behaviors |
| 2. Capture labeled images | Save individual elements using Brobot's knowledge of the environment to create descriptive labels |
| 3. Augment data (optional) | Add variations: resized, blurred, color-shifted, etc. |
| 4. Use for training | Fine-tune a model or evaluate embeddings |

This process can produce hundreds or thousands of labeled images with minimal human effort.
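
The optional augmentation step can be sketched without any imaging library by treating a grayscale capture as a nested list of pixel values. This is a simplified illustration under that assumption, with hypothetical function names; a real pipeline would more likely use an image library for resizing, blurring, and color shifts. The point it shows is that the label carries over unchanged to every variant.

```python
def shift_brightness(image, delta):
    """Add `delta` to every pixel, clamping to the 0-255 range."""
    return [[min(255, max(0, px + delta)) for px in row] for row in image]


def resize_nearest(image, new_w, new_h):
    """Resize with nearest-neighbor sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def augment(image, label):
    """Yield (variant, label) pairs; the label is unchanged by each transform."""
    yield image, label
    for delta in (-40, 40):
        yield shift_brightness(image, delta), label
    h, w = len(image), len(image[0])
    yield resize_nearest(image, w * 2, h * 2), label


# A 2x2 grayscale patch standing in for a captured UI element.
patch = [[0, 100], [200, 255]]
variants = list(augment(patch, "ok_button"))
```

Here one capture becomes four labeled examples: the original, a darker copy, a brighter copy, and a copy at twice the size.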


🧠 Why It Matters​

While this process alone is not intelligent, its deterministic behavior makes it ideal for collecting clean training data. Brobot turns every automation run into an opportunity to:

  • Capture variation (resolutions, themes, localization)
  • Identify failures and mismatches
  • Build a dataset that reflects real usage patterns

If your goal is to eventually replace deterministic matching with robust, ML-based perception, this is how to start.


📎 Summary

Brobot can help bridge the gap between rule-based automation and intelligent visual understanding by generating datasets for modern ML training. In today's ecosystem of foundation models, transfer learning, and self-supervised learning, even small, Brobot-labeled datasets can power:

  • Classification of UI elements
  • Detection of specific screen states
  • Embedding-based visual search
  • Improved generalization across devices and themes