COS301-SE-2026/Gesture-Based-Drone-Control




A real-time computer vision system that replaces the physical drone controller. Hand gestures are detected through a live camera feed, classified by a dual rule-based and ML recognition engine, and translated directly into drone flight commands, with fail-safes built in.
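
As a rough sketch of how a dual engine of this kind might arbitrate between its two recognizers, the Python below tries a deterministic rule first and falls back to a model prediction only when it clears a confidence threshold. Every name here (`rule_based`, `recognize`, `fake_model`, the finger-state input, the gesture labels, and the 0.8 threshold) is hypothetical and is not taken from the repository.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical input: one boolean per finger (thumb..pinky), True = extended.
FingerState = Sequence[bool]


def rule_based(fingers: FingerState) -> Optional[str]:
    """Deterministic rules; returns None when no rule matches."""
    if all(fingers):
        return "open_palm"
    if not any(fingers):
        return "fist"
    return None


def recognize(
    fingers: FingerState,
    ml_based: Callable[[FingerState], Tuple[str, float]],
    min_confidence: float = 0.8,  # assumed threshold, tune per model
) -> Optional[str]:
    """Prefer the rule-based result; fall back to ML above a threshold."""
    label = rule_based(fingers)
    if label is not None:
        return label
    ml_label, confidence = ml_based(fingers)
    return ml_label if confidence >= min_confidence else None


def fake_model(fingers: FingerState) -> Tuple[str, float]:
    """Stand-in for a TFLite classifier; always returns the same guess."""
    return ("point", 0.9)
```

Returning `None` for an unrecognised or low-confidence gesture leaves the decision about what to do next (for example, hovering) to the caller.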



◈ Documentation

| Document | Link |
| --- | --- |
| ‣ Software Requirements Specification (SRS) | View SRS → |
| ‣ Architecture & Design | View Design → |
| ‣ API Reference | View API → |
| ‣ Testing Strategy | View Testing → |
| ‣ CI/CD Pipeline | View CI/CD → |
| ‣ Git Conventions | View Git Guide → |
| ‣ GitHub Project Board | View Board → |

◈ Team — Codex Merchants

| Name | Student Number | GitHub | LinkedIn |
| --- | --- | --- | --- |
| Ayush Beekum | u23596351 | GitHub | LinkedIn |
| Jaitin Moodally | u23621372 | GitHub | |
| Shavir Vallabh | u23718146 | GitHub | LinkedIn |
| Diya Narotam | u23533596 | GitHub | |
| Chinmayi Santhosh | u24585671 | GitHub | |

Team Email: codexmerchants@gmail.com


◈ System Architecture

The system is structured across six integrated tiers, from raw hardware input through to deployed application and CI/CD automation.

┌─────────────────────────────────────────────────────────────────────┐
│  Tier 1  ·  Hardware Input                                          │
│           OpenCV  ·  xFly SDK  ·  DJI Tello SDK  ·  AirSim         │
├─────────────────────────────────────────────────────────────────────┤
│  Tier 2  ·  CV & Gesture Pipeline          [ Core Intelligence ]    │
│           MediaPipe Hands  ·  NumPy  ·  TFLite  ·  Asyncio          │
├─────────────────────────────────────────────────────────────────────┤
│  Tier 3  ·  Backend API                    [ Orchestration ]        │
│                   FastAPI  ·  SQLite  ·  Pytest                     │
├─────────────────────────────────────────────────────────────────────┤
│  Tier 4  ·  Frontend Dashboard             [ Command & Control ]    │
│               React + TypeScript  ·  Recharts  ·  Jest              │
├─────────────────────────────────────────────────────────────────────┤
│  Tier 5  ·  Application Deployment                                  │
│            Electron  ·  PWA / Capacitor  ·  Docker                  │
├─────────────────────────────────────────────────────────────────────┤
│  Tier 6  ·  DevOps & Docs                                           │
│            Git  ·  GitHub Actions  ·  SwaggerDocs  ·  Overleaf      │
└─────────────────────────────────────────────────────────────────────┘
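
Tier 2 pairs asyncio with a bounded queue so that a slow consumer cannot cause a growing backlog of stale frames. The sketch below shows one way this could work, assuming a drop-oldest policy; the policy, the helper names `put_latest` and `run_demo`, and the use of integers as stand-ins for camera frames are all assumptions, not the project's actual implementation.

```python
import asyncio
from typing import List


def put_latest(queue: asyncio.Queue, frame: int) -> None:
    """Enqueue without blocking; drop the oldest frame when the queue is full."""
    if queue.full():
        queue.get_nowait()  # discard the stalest frame
    queue.put_nowait(frame)


async def run_demo(total_frames: int = 10, maxsize: int = 3) -> List[int]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    # The producer runs ahead of the consumer, as a camera would.
    for frame in range(total_frames):
        put_latest(queue, frame)
    # The consumer only ever sees the most recent `maxsize` frames.
    processed = []
    while not queue.empty():
        processed.append(await queue.get())
    return processed


result = asyncio.run(run_demo())
```

With ten frames and a queue of size three, only the last three frames survive, which keeps the gesture engine working on current input rather than on a backlog.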

◈ Repository Structure

gesture-drone-control/
├── apps/
│   ├── backend/                   # FastAPI backend (REST + WebSocket)
│   │   ├── app/
│   │   │   ├── api/               # Route handlers & WebSocket gateway
│   │   │   ├── core/              # Config, startup, app factory
│   │   │   └── dependencies/      # Dependency injection (auth, db, etc.)
│   │   └── tests/                 # Pytest unit & integration tests
│   ├── desktop/                   # Electron wrapper (Win/Linux/macOS)
│   ├── frontend/                  # React + TypeScript dashboard
│   │   ├── public/                # Static assets (favicon, icons)
│   │   ├── src/                   # Components, pages, hooks
│   │   └── tests/                 # Playwright E2E tests
│   └── mobile/                    # Capacitor shell (iOS + Android)
│       ├── android/
│       └── ios/
│
├── services/                      # Core Python service layer
│   ├── cv_pipeline/               # Computer vision & gesture recognition
│   │   ├── camera/                # Camera feed capture (OpenCV)
│   │   ├── gestures/              # Gesture engine
│   │   │   └── recognizers/       # Rule-based & ML recognizers
│   │   ├── hand-detection/        # MediaPipe landmark detection
│   │   └── processing/            # Async queue & pipeline
│   ├── drone_control/             # Flight controller
│   │   └── adapters/              # AirSim, Gazebo, xFly adapters
│   ├── input/                     # Input source abstraction
│   │   └── sources/               # Gesture, keyboard & generic adapters
│   ├── commands/                  # Command model & dispatch logic
│   ├── telemetry/                 # Observer, manager & storage
│   │   └── storage/               # SQLite & PostgreSQL repos
│   └── tests/
│       └── cv_pipeline_testing/
│
├── packages/                      # Shared code across apps & services
│   ├── contracts/
│   │   ├── python/                # Pydantic schemas
│   │   └── typescript/            # Shared type definitions
│   ├── domain/                    # Domain models
│   └── utils/                     # Shared utility helpers
│
├── infrastructure/
│   ├── docker/                    # Per-service Dockerfiles
│   │   └── airsim/
│   └── scripts/                   # Setup & utility scripts
│
├── docs/                          # All project documentation
│   ├── api/
│   ├── assets/
│   │   ├── Sequence Diagrams/
│   │   ├── UC Diagrams/
│   │   └── UI/
│   ├── demo/
│   ├── diagrams/
│   ├── reports/
│   └── testing/
│
├── sandbox/                       # Manual testing & throwaway scripts
├── tests/
│   └── integration/
├── docker-compose.yml
├── makefile
└── README.md

◈ Technology Stack

▸ Computer Vision & ML Backend
| Technology | Purpose | Version |
| --- | --- | --- |
| Python | Primary language for CV, ML, and drone SDK code | 3.11.x |
| OpenCV | Live camera feed & image preprocessing | 4.8+ |
| MediaPipe | Real-time hand landmark detection & tracking | 0.10+ |
| NumPy | Vector math: angles, distances, gesture ID | 1.26+ |
| TFLite | ML-based gesture recognition (secondary) | 2.x |
| Rule-Based Engine | Deterministic, low-latency gesture mapping | Custom |
| Asyncio + Bounded Queue | Non-blocking real-time pipeline | Built-in |
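
To illustrate the kind of vector math NumPy handles here, the sketch below computes the angle at a finger joint from three landmark positions and uses it to decide whether the finger is extended. MediaPipe Hands reports 21 landmarks per hand (for the index finger, MCP, PIP, and TIP are landmarks 5, 6, and 8); the function names and the 160-degree threshold are assumptions made for this example.

```python
import numpy as np


def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle in degrees at point b, between segments b->a and b->c.

    Assumes the three points are distinct (non-zero segment lengths).
    """
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against rounding pushing the cosine outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


def is_extended(mcp, pip, tip, threshold_deg: float = 160.0) -> bool:
    """Treat a finger as extended when the PIP joint is nearly straight."""
    angle = joint_angle(
        np.asarray(mcp, dtype=float),
        np.asarray(pip, dtype=float),
        np.asarray(tip, dtype=float),
    )
    return angle >= threshold_deg
```

A rule-based recognizer could combine one such check per finger into a gesture label without needing a trained model.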
▸ Drone Integration
| Technology | Purpose | Version |
| --- | --- | --- |
| xFly SDK | Primary physical drone communication | Latest |
| DJI Tello SDK | Alternative drone hardware interface | Latest |
| AirSim | Primary simulation environment (lightweight) | Latest |
| Gazebo + ArduPilot | High-fidelity physics simulation | 4.x |
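
Supporting several drones and simulators behind one flight controller usually means an adapter layer. The sketch below shows the general idea with a `typing.Protocol`; the names `DroneAdapter`, `RecordingAdapter`, and `FlightController`, and the single `send` method, are hypothetical and do not reflect the real interfaces in `services/drone_control/adapters/`.

```python
from typing import List, Protocol


class DroneAdapter(Protocol):
    """Minimal interface each backend (hardware or simulator) would satisfy."""

    def send(self, command: str) -> None: ...


class RecordingAdapter:
    """Hypothetical in-memory adapter, useful for tests without a drone."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, command: str) -> None:
        self.sent.append(command)


class FlightController:
    """Depends only on the adapter interface, not on a specific SDK."""

    def __init__(self, adapter: DroneAdapter) -> None:
        self._adapter = adapter

    def execute(self, command: str) -> None:
        self._adapter.send(command)


adapter = RecordingAdapter()
FlightController(adapter).execute("takeoff")
```

Swapping a simulator for physical hardware then only requires passing a different adapter to the controller.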
▸ Backend & Database
| Technology | Purpose | Version |
| --- | --- | --- |
| FastAPI | REST + WebSocket API (ASGI) | 0.110+ |
| SQLite | Gesture logs, telemetry, command history | 3.x |
| PostgreSQL | Scale-out alternative database | 15+ |
| Pytest | Backend unit, integration & API testing | Latest |
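
As an illustration of how command history could be kept in SQLite, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table name `command_log`, its columns, and the helper functions are assumptions for this example, not the project's actual schema.

```python
import sqlite3
import time


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) log table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS command_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts REAL NOT NULL, "
        "gesture TEXT NOT NULL, "
        "command TEXT NOT NULL)"
    )
    return conn


def log_command(conn: sqlite3.Connection, gesture: str, command: str) -> None:
    """Insert one row; the context manager commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO command_log (ts, gesture, command) VALUES (?, ?, ?)",
            (time.time(), gesture, command),
        )


conn = open_log()
log_command(conn, "open_palm", "hover")
rows = conn.execute("SELECT gesture, command FROM command_log").fetchall()
```

Parameterised queries (the `?` placeholders) keep gesture and command strings from being interpreted as SQL.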
▸ Frontend
| Technology | Purpose | Version |
| --- | --- | --- |
| React | Component-based UI framework | 18 |
| TypeScript | Static typing across the frontend | 5.x |
| Recharts | Real-time telemetry & data visualisation | 2.x |
| Jest | Frontend unit & component testing | Latest |
| Playwright | End-to-end browser testing | Latest |
▸ Deployment
| Technology | Purpose | Version |
| --- | --- | --- |
| Electron | Native desktop packaging (Win/Linux/macOS) | Latest |
| PWA + Capacitor | Cross-platform mobile delivery (iOS/Android) | Latest |
| Docker | Containerisation for consistent environments | Latest |
▸ DevOps & Documentation
| Technology | Purpose |
| --- | --- |
| GitHub Actions | CI/CD: automated testing, linting, quality gates |
| Git | Version control: trunk-based, main / dev / feature/* |
| SwaggerDocs | Auto-generated interactive API documentation |
| Overleaf | Formal written documentation (LaTeX) |

◈ Use Cases

| # | Use Case | Status |
| --- | --- | --- |
| UC-01 | User Registration & Login | ◆ Core |
| UC-02 | Live Hand Gesture Detection & Tracking | ◆ Core |
| UC-03 | Gesture → Drone Command Mapping | ◆ Core |
| UC-04 | Real-Time Dashboard (telemetry, status, feed) | ◆ Core |
| UC-05 | Safety Logic (hover on tracking loss, emergency stop) | ◆ Core |
| UC-06 | Manual Override (gesture → keyboard/controller) | ◇ Optional |
| UC-07 | Gesture Recording & Automated Playback | ◇ Optional |
| UC-08 | Idle Detection Auto-Land | ◇ Optional |
| UC-09 | Gesture-Activated Follow Mode | ◇ Optional |
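
The safety logic in UC-05 (hover on tracking loss, emergency stop) can be pictured as a small gate between the recognizer and the drone. The sketch below is one possible shape for it; the `SafetyGate` class, the 0.5-second timeout, and the command strings are hypothetical and not taken from the codebase.

```python
from typing import Optional


class SafetyGate:
    """Filters commands so that tracking loss and emergency stop take priority."""

    def __init__(self, timeout_s: float = 0.5) -> None:
        self.timeout_s = timeout_s  # assumed grace period for brief dropouts
        self.last_seen: Optional[float] = None
        self.stopped = False

    def emergency_stop(self) -> None:
        """Latch the stop state; nothing else is forwarded afterwards."""
        self.stopped = True

    def filter(self, command: Optional[str], now: float) -> Optional[str]:
        """Return the command to forward at time `now` (seconds), if any.

        `command` is None when no hand was tracked in the current frame.
        """
        if self.stopped:
            return "emergency_stop"
        if command is not None:
            self.last_seen = now
            return command
        if self.last_seen is None or now - self.last_seen >= self.timeout_s:
            return "hover"  # tracking lost for too long
        return None  # brief dropout: send nothing new this frame
```

Passing the current time in as an argument, rather than reading a clock inside the class, makes the timeout behaviour straightforward to unit test.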

◈ Getting Started

Prerequisites

Python, Node.js, and Docker

Clone & install

git clone https://github.com/codex-merchants/gesture-drone-control.git
cd gesture-drone-control

# Backend
cd apps/backend && pip install -r requirements.txt

# Services layer
cd ../../services && pip install -e .

# Frontend
cd ../apps/frontend && yarn install

Run locally

# Terminal A — backend API
cd apps/backend && uvicorn app.main:app --reload

# Terminal B — frontend dashboard
cd apps/frontend && yarn dev

Or with Docker (recommended)

docker-compose up --build

Run tests

# Backend (from apps/backend)
pytest

# Services layer (from services/)
pytest

# Frontend (from apps/frontend)
yarn test # Jest unit tests
yarn playwright # E2E tests

◈ Branching Strategy

Trunk-based development. The main branch must always be in a deployable state: completed work is merged from dev into main before each demo.

| Branch | Purpose |
| --- | --- |
| main | Production. Protected. Receives merges from dev before every demo. |
| dev | Integration. Completed features land here before main. |
| feature/<name> | Short-lived. One branch per feature or fix, branched from dev. |

◈ CI/CD & Code Quality

| Tool | Badge |
| --- | --- |
| Build (GitHub Actions) | Build |
| Code Coverage (Coveralls) | Coverage |
| Issues | Issues |
| Uptime | Uptime |

◈ Constraints

| Constraint | Detail |
| --- | --- |
| ▸ Single-User Operation | One operator tracked at a time, which reduces tracking complexity |
| ▸ Indoor Only | Designed and tested for indoor, controlled-lighting environments |
| ▸ Fixed Gesture Set | Predefined commands only (take-off, land, move, hover) |
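
A fixed gesture set lends itself to a closed enumeration and a lookup table. The sketch below uses the four commands named in the constraint; the gesture labels on the left-hand side of the mapping and the `to_command` helper are hypothetical, chosen only to illustrate the idea.

```python
from enum import Enum
from typing import Dict, Optional


class Command(Enum):
    """The predefined commands named in the constraint."""

    TAKE_OFF = "take-off"
    LAND = "land"
    MOVE = "move"
    HOVER = "hover"


# Hypothetical gesture labels; the real mapping may differ.
GESTURE_TO_COMMAND: Dict[str, Command] = {
    "thumbs_up": Command.TAKE_OFF,
    "thumbs_down": Command.LAND,
    "point": Command.MOVE,
    "open_palm": Command.HOVER,
}


def to_command(gesture: str) -> Optional[Command]:
    """Return the mapped command, or None for gestures outside the fixed set."""
    return GESTURE_TO_COMMAND.get(gesture)
```

Because unknown gestures map to `None` rather than to a default action, anything outside the predefined set is ignored instead of being guessed at.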

◈ Contact

| Role | Name | Email |
| --- | --- | --- |
| ▸ Project Owner | Bryan Janse Van Vuuren | bryan.janse.van.vuuren@epiuse.com |
| ▸ Project Mentor | Cameron Taberer | cameron.taberer@epiuse.com |
| ▸ Team | Codex Merchants | codexmerchants@gmail.com |

COS 301 Software Engineering · University of Pretoria · 2026 · EPI-USE
