
Theoretical Concepts

Before diving into implementation, it’s essential to understand the theoretical foundations that underpin robotic arm systems with computer vision. This section covers the key concepts you’ll apply throughout the course.
A comprehensive PDF covering these concepts in detail is available in the course materials: theorical_concepts.pdf

Learning Objectives

After completing this section, you will understand:
  • Core robotics principles and coordinate systems
  • Computer vision fundamentals and object detection
  • Control system architectures
  • Communication protocol design principles
  • Real-time system considerations

1. Robotics Fundamentals

Coordinate Systems and Kinematics

Robotic arms operate in three-dimensional space using several coordinate systems:
Cartesian Coordinates (X, Y, Z)
  • World frame: Fixed reference point in the environment
  • Robot base frame: Origin at the robot’s mounting point
  • End-effector frame: Tool center point (TCP)
Joint Space
  • Each motor/actuator has an angular position
  • Robot configuration defined by joint angles: θ₁, θ₂, θ₃…
Forward Kinematics: Calculate the end-effector position from joint angles
Inverse Kinematics: Calculate the joint angles required to reach a target position
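For a planar two-link arm, forward kinematics reduces to two lines of trigonometry. The sketch below uses hypothetical link lengths `l1` and `l2` in metres for illustration; your robot's actual geometry will differ:

```python
import math

def forward_kinematics(theta1, theta2, l1=0.10, l2=0.08):
    """End-effector (x, y) of a planar 2-link arm.

    theta1, theta2 are joint angles in radians; l1, l2 are
    link lengths in metres (made-up values for illustration).
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# Arm fully extended along X: both joint angles at 0 radians
x, y = forward_kinematics(0.0, 0.0)
print(round(x, 3), round(y, 3))  # 0.18 0.0
```

Inverse kinematics runs the other way (target position in, joint angles out) and generally has zero, one, or multiple solutions, which is why it is the harder of the two problems.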

Degrees of Freedom (DOF)

The number of independent movements a robot can make:
  • 3-DOF: Position in 3D space (X, Y, Z)
  • 6-DOF: Position + orientation (roll, pitch, yaw)
  • VEX IQ robots typically have 3-4 DOF

Robot Control Architecture

In this course, the Vision System runs on Raspberry Pi, while Joint Controllers execute on VEX Brain. Communication between them is critical!

2. Computer Vision Basics

Image Representation

Digital images are arrays of pixels:
  • Grayscale: Single channel (0-255 intensity)
  • RGB Color: Three channels (Red, Green, Blue)
  • Resolution: Width × Height pixels
# Example: Image dimensions
import cv2
image = cv2.imread('object.jpg')  # returns None if the file cannot be read
height, width, channels = image.shape  # e.g., (480, 640, 3)

Object Detection Fundamentals

Object detection identifies and locates objects in images.
Key Components:
  1. Classification: What is the object? (apple, orange, bottle)
  2. Localization: Where is it? (bounding box coordinates)
  3. Confidence Score: How certain is the detection? (0.0 to 1.0)
Bounding Boxes:
  • Rectangular regions containing detected objects
  • Represented as: (x1, y1, x2, y2) or (x, y, width, height)
  • x1, y1: Top-left corner
  • x2, y2: Bottom-right corner
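A small sketch of converting between the two box conventions, plus the box centre, which is ultimately what the arm aims at. The coordinate values are made up for illustration:

```python
def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) -> (x, y, width, height), top-left origin."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)

def box_center(box):
    """Pixel centre of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

print(xyxy_to_xywh((100, 50, 300, 200)))  # (100, 50, 200, 150)
print(box_center((100, 50, 300, 200)))    # (200.0, 125.0)
```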
In this course, we use YOLO (You Only Look Once) for real-time object detection. YOLO processes the entire image in a single forward pass, making it fast enough for robotics applications.

Deep Learning for Vision

Neural Networks:
  • Learn features automatically from training data
  • Hierarchical layers: edges → shapes → objects
  • Trained using labeled datasets
Convolutional Neural Networks (CNNs):
  • Specialized for image processing
  • Convolutional layers detect spatial patterns
  • Pooling layers reduce dimensionality
YOLO Architecture:
  • Single-stage detector (fast inference)
  • Divides image into grid cells
  • Each cell predicts bounding boxes and class probabilities
  • Output: List of detections with class, confidence, and location

Model Inference Pipeline

# Conceptual inference flow
image = capture_frame()           # Get image from camera
preprocessed = resize_and_normalize(image)  # Prepare for model
results = model.predict(preprocessed)       # Run inference
detections = parse_results(results)         # Extract bounding boxes
best = select_highest_confidence(detections)  # Choose target
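The final step of that pipeline can be sketched in plain Python. The detection dict layout used here (`class`, `confidence`, `box` keys) is an assumption for illustration, not the model's actual output format:

```python
def select_highest_confidence(detections, min_conf=0.5):
    """Pick the most confident detection above a threshold, or None.

    Each detection is assumed (for this sketch) to be a dict like
    {'class': 'apple', 'confidence': 0.91, 'box': (x1, y1, x2, y2)}.
    """
    candidates = [d for d in detections if d['confidence'] >= min_conf]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d['confidence'])

detections = [
    {'class': 'apple',  'confidence': 0.91, 'box': (120, 80, 220, 180)},
    {'class': 'orange', 'confidence': 0.64, 'box': (300, 90, 380, 170)},
    {'class': 'bottle', 'confidence': 0.32, 'box': (40, 40, 90, 200)},
]
print(select_highest_confidence(detections)['class'])  # apple
```

The `min_conf` threshold discards low-confidence noise so the arm never chases a detection the model barely believes in.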

3. Control Systems

Feedback Control Loop

Robotic systems use feedback to achieve accurate positioning.
Components:
  • Setpoint: Desired target position
  • Controller: Calculates corrective actions
  • Plant: The physical robot
  • Feedback: Sensor measurements of actual position
  • Error: Difference between target and actual position
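These components can be wired together in a toy closed loop. The plant here is deliberately simplified to a position that moves exactly as commanded; a real arm has dynamics, delay, and noise:

```python
def proportional_step(setpoint, measured, kp=0.5):
    """One closed-loop update: command proportional to the error."""
    error = setpoint - measured
    return kp * error

# Toy plant: each command moves the position by exactly that amount
position = 0.0
setpoint = 10.0
for _ in range(20):
    position += proportional_step(setpoint, position)
print(round(position, 3))  # 10.0 -- the error halves on every iteration
```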

Control Strategies

Open-Loop Control:
  • Send commands without feedback
  • Simple but less accurate
  • Example: “Move motor forward for 2 seconds”
Closed-Loop Control:
  • Use sensor feedback to correct errors
  • More accurate and robust
  • Example: “Move arm until camera detects object centered”
PID Control (Proportional-Integral-Derivative):
  • P: Correct based on current error
  • I: Correct based on accumulated past errors
  • D: Correct based on rate of error change
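A minimal discrete PID sketch showing how the three terms combine. The gains and the 20 ms timestep are illustrative values, not tuned settings for any real motor:

```python
class PID:
    """Discrete PID controller (illustrative sketch, not the VEX firmware)."""

    def __init__(self, kp, ki, kd, dt=0.02):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt                   # I: accumulated error
        derivative = (error - self.prev_error) / self.dt   # D: rate of change
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID(kp=1.2, ki=0.1, kd=0.05)
print(round(pid.update(setpoint=90.0, measured=75.0), 2))  # 55.53
```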
VEX Brain controllers often include built-in PID control for motor positioning. In this course, you’ll focus on high-level vision-based control logic.

4. Serial Communication Concepts

Why Serial Communication?

Serial communication sends data one bit at a time over a single wire:
  • Simple hardware: Fewer wires than parallel communication
  • Long distances: More reliable over extended cables
  • Universal: Supported by most embedded systems

UART Protocol

UART (Universal Asynchronous Receiver-Transmitter) is the standard protocol for serial communication.
Key Parameters:
  • Baud Rate: Bits per second (common: 9600, 115200)
  • Data Bits: Usually 8 bits per byte
  • Stop Bits: Signal end of byte (usually 1)
  • Parity: Error checking (often none)
Both devices must use the same baud rate to communicate successfully. In this course, we use 115200 baud for fast communication.

Communication Patterns

  • Simplex: One-way (sensor → computer)
  • Half-Duplex: Two-way, but not simultaneous
  • Full-Duplex: Simultaneous bidirectional (Raspberry Pi ↔ VEX Brain)

Message Framing

How to identify where one message ends and another begins:
Delimiter-Based:
Hello\n    # Newline character marks message end
Length-Prefixed:
5:Hello     # Number indicates message length
Fixed-Length:
HELLO       # Always exactly 5 characters
In this course, we use newline-delimited JSON messages for clear structure and easy debugging.
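Newline-delimited framing can be sketched in a few lines of pure Python: encode each message as one JSON line, and split the receive buffer on newlines, keeping any incomplete tail for the next read. The `cmd` field names are hypothetical, not the course's actual message schema:

```python
import json

def frame_message(payload: dict) -> bytes:
    """Serialize a dict as one newline-delimited JSON message."""
    return (json.dumps(payload) + '\n').encode('utf-8')

def split_messages(buffer: bytes):
    """Split a receive buffer on newlines; return (messages, leftover bytes)."""
    *lines, leftover = buffer.split(b'\n')
    return [json.loads(line) for line in lines if line], leftover

wire = frame_message({'cmd': 'pick', 'x': 320, 'y': 240})
msgs, rest = split_messages(wire + b'{"cmd": "sta')  # second message incomplete
print(msgs)  # [{'cmd': 'pick', 'x': 320, 'y': 240}]
print(rest)  # b'{"cmd": "sta'
```

Keeping the leftover bytes matters: serial reads can return a message in fragments, so the tail is prepended to the next chunk before splitting again.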

5. Real-Time System Considerations

Latency and Timing

Latency Sources:
  • Camera capture: 30-60ms (30 FPS)
  • Model inference: 50-200ms (depends on model size)
  • Serial transmission: 1-5ms
  • Motor response: 10-100ms
Total System Latency: 100-400ms from detection to action
For pick-and-place tasks, 200-300ms latency is acceptable since objects are typically stationary. For tracking moving objects, optimization is critical.

Threading and Concurrency

Robotic systems often need to do multiple things simultaneously:
  • Main Thread: Capture frames and process vision
  • Communication Thread: Listen for incoming serial messages
  • Control Thread: Send commands and monitor status
from threading import Thread

# Separate threads for reading and writing; the two loop functions
# are defined elsewhere. Daemon threads exit with the main program.
read_thread = Thread(target=serial_read_loop, daemon=True)
write_thread = Thread(target=vision_processing_loop, daemon=True)

read_thread.start()
write_thread.start()

Error Handling

Robust systems must handle failures gracefully:
  • Connection Loss: Retry serial connection automatically
  • Invalid Data: Validate and discard malformed messages
  • Timeout: Don’t wait forever for responses
  • Sensor Failure: Continue operating with degraded performance
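Two of these patterns sketched in plain Python: a generic retry helper for flaky connections and a validator that discards malformed incoming lines. Function names are illustrative, not part of any library:

```python
import json
import time

def retry(connect, attempts=3, delay=0.5):
    """Call a connect function, retrying on OSError with a fixed delay."""
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == attempts:
                raise          # out of attempts: propagate the failure
            time.sleep(delay)

def parse_message(line: str):
    """Validate one incoming line; return a dict, or None if malformed."""
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None
    return msg if isinstance(msg, dict) else None

print(parse_message('{"status": "done"}'))  # {'status': 'done'}
print(parse_message('garbage'))             # None
```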

6. System Integration

Component Interaction

Our complete system architecture:
Data Flow:
  1. Camera captures frame → Raspberry Pi
  2. YOLO model processes image → detections
  3. Best detection formatted as JSON → serial TX
  4. VEX Brain receives and parses message
  5. Motion planning executes arm movement
  6. Status message sent back → Raspberry Pi

Design Principles

Modularity: Each component has a single responsibility
class ModelLoader: ...    # Only loads models
class DetectionModel: ... # Only runs inference
class SerialComm: ...     # Only handles communication
Abstraction: Hide implementation details
from abc import ABC, abstractmethod
import numpy as np

class DetectionModelInterface(ABC):
    @abstractmethod
    def inference(self, image: np.ndarray):
        ...
Robustness: Handle errors without crashing
import logging
log = logging.getLogger(__name__)

def safe_predict(model, image, default_result=None):
    try:
        return model.predict(image)
    except Exception as e:
        log.error('Inference failed: %s', e)
        return default_result

Knowledge Check

Before moving to implementation, ensure you can answer:
  1. What is the difference between forward and inverse kinematics?
  2. How does YOLO differ from traditional object detection methods?
  3. Why is full-duplex communication important for robotics?
  4. What causes latency in vision-based robot control?
  5. Why do we use threading in serial communication?
Review the theorical_concepts.pdf in the course materials for detailed diagrams, equations, and additional examples.

Next Steps

Now that you understand the theoretical foundations, you’re ready to start building! Begin with the Serial Protocol lesson to establish communication between your devices.
