Getting Started Guide

Whether you want to run inference right now or build on top of the codebase — pick your path below.

App User Guide

No installation needed — everything runs in the browser.

Requirements

  • Browser: Chrome 113+ / Edge 113+
  • Best with: WebGPU-capable GPU
  • Fallback: WASM (any modern browser)

WebGPU delivers the best performance. If your browser or hardware doesn't support it, the app automatically falls back to multi-threaded WASM.

1. Open the App

Navigate to the Workspace page. The app will instantly begin loading the default model (yolo11n-seg).

First load downloads ~13MB. After that, the model is cached in your browser's IndexedDB — no re-download needed.

2. Select Model & Device

Use the toolbar at the top of the workspace to pick your model and execution provider:

  • Model: YOLO11 Nano (D+S)
  • Model: YOLO11 Nano Pose (D+P)
  • Model: YOLO11 Small (D+Q)
  • Device: WebGPU
  • Device: WASM

3. Upload an Image

Click Upload Image or drag-and-drop any image onto the workspace. You can also click one of the example thumbnails at the bottom to try a pre-loaded sample.

Inference runs automatically once the image is loaded. You'll see bounding boxes, segmentation masks, or pose skeletons drawn over the image depending on the active model.

4. Use Your Camera

Click Open Camera to start real-time detection from your webcam. Grant camera permissions when prompted. The app runs inference continuously at up to 60 FPS.

If you have multiple cameras (e.g., front and back), use the camera selector dropdown to switch between them.

5. Adjust Confidence

The CONF. slider in the toolbar controls the minimum confidence threshold. Drag it to filter out low-confidence detections. Default is 55%.
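The slider's effect is easy to sketch. The app does this in TypeScript inside its worker; the following Python sketch is illustrative only and assumes detections arrive as (score, class_id) rows:

```python
import numpy as np

# Hypothetical raw detections from one frame: (score, class_id) rows.
detections = np.array([
    [0.92, 0],   # person
    [0.61, 16],  # dog
    [0.40, 2],   # car (below the 55% default)
])

DEFAULT_SCORE_THRESHOLD = 0.55  # mirrors the slider's default

# Keep only rows whose confidence meets the threshold.
kept = detections[detections[:, 0] >= DEFAULT_SCORE_THRESHOLD]
print(len(kept))  # 2
```

Dragging the slider simply moves the cutoff; no re-inference is needed, since filtering happens on the already-computed scores.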

6. Inspect Detections

The sidebar panel lists all detected objects with their class name and confidence score. Click any detection to highlight it on the canvas. For pose models, clicking a person reveals their individual keypoints.

7. Save Results

Click Save in the sidebar to download the annotated image with all bounding boxes, masks, and labels burned in.

8. Add Custom Models

Click the + button next to the model selector to add your own ONNX model. Provide a URL to any publicly hosted .onnx file, select the task type (Detection, Segmentation, or Pose), and tag its precision.

Tips for Best Performance

  • Use Chrome or Edge for WebGPU support — Firefox and Safari use WASM fallback.
  • Close other GPU-intensive tabs (games, video editors) for smoother inference.
  • The first inference after load is a "warm-up" and may be slower. Subsequent frames are faster.
  • If the model feels slow, switch to WASM — some integrated GPUs perform better on CPU.

Developer Guide

Clone, build, extend, or bring your own model.

Prerequisites

  • Node.js: ≥ 18
  • pnpm: ≥ 9
  • Framework: Next.js 16

1. Clone the Repository

git clone https://github.com/pranta-barua007/yolo11-onnx.git
cd yolo11-onnx

2. Install Dependencies

pnpm install

3. Start Development Server

pnpm dev

Open http://localhost:3000 in a WebGPU-capable browser. The app hot-reloads on file changes.

4. Project Structure

src/
├── app/              # Next.js App Router pages
│   ├── page.tsx      # Workspace (main inference UI)
│   ├── about/        # Technical overview
│   └── guide/        # This guide
├── components/       # UI components
│   ├── Header.tsx    # Global navigation
│   ├── MediaDisplay/ # Image/camera display
│   └── ModelStatus/  # Detection sidebar
├── hooks/            # React hooks
│   ├── useYoloModel  # Model lifecycle
│   ├── useCamera     # Camera stream
│   └── useFps        # FPS counter
├── workers/          # Web Worker pipeline
│   └── workerPipeline.ts
└── utils/            # Inference utilities
    ├── img_preprocess.ts
    ├── mask_processing.ts
    └── draw_bounding_boxes.ts

5. Export Your Model (ONNX)

Use the Ultralytics Python SDK to export your trained YOLO model to ONNX format. Run this in a Google Colab notebook or your local Python environment.

Standard Export (FP32)

from ultralytics import YOLO

model = YOLO("yolo11s.pt")
model.export(format="onnx", opset=12, dynamic=False, nms=False)

WebGPU-Safe Half-Precision (FP16) — Recommended

Reduces model size by 50% with faster inference on WebGPU hardware.

Do NOT use half=True directly

Ultralytics' half=True blindly converts all ops to float16, including Resize, which WebGPU doesn't support. This causes Invalid data type runtime errors.

Step 1 — Export as FP32:

from ultralytics import YOLO

model = YOLO("yolo11s.pt")
model.export(format="onnx", opset=12, dynamic=False, nms=False)

Step 2 — Convert to FP16 with op blocking:

ONNX Runtime's converter keeps incompatible ops (like Resize) in FP32 and inserts Cast nodes automatically.

import onnx
from onnxruntime.transformers.float16 import convert_float_to_float16

model = onnx.load("yolo11s.onnx")

model_fp16 = convert_float_to_float16(
    model,
    keep_io_types=True,
    op_block_list=["Resize", "GridSample"]
)

onnx.save(model_fp16, "yolo11s_webgpu.onnx")
print("✅ WebGPU-safe FP16 model saved")

Install: pip install onnx onnxruntime
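The 50% size figure follows directly from the dtype widths: FP32 stores 4 bytes per weight, FP16 stores 2. A quick numpy sanity check:

```python
import numpy as np

# 1,000 fake weights; a real model simply has many more of these.
weights_fp32 = np.random.rand(1000).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes, weights_fp16.nbytes)  # 4000 2000
```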

6. Add Your Own ONNX Model

Place your .onnx model in the public/models/ directory. The model must follow the standard YOLO output format:

  • Detection: Output shape [1, 4+C, N] where C = classes, N = proposals
  • Segmentation: Two outputs — detections + prototype masks [1, 32, H, W]
  • Pose: Output shape includes 17 × 3 keypoint channels
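To see what the detection shape implies in practice, here is a minimal numpy sketch of decoding a [1, 4+C, N] tensor. The values (COCO's C = 80, the typical N = 8400 for a 640x640 input, and one fabricated proposal) are assumptions for illustration, not pulled from the app:

```python
import numpy as np

C, N = 80, 8400  # assumed: COCO class count, proposal count for 640x640 input
out = np.zeros((1, 4 + C, N), dtype=np.float32)  # stand-in for the session output

# Fabricate one confident proposal: (cx, cy, w, h) plus a 0.9 score for class 0.
out[0, :4, 0] = [320, 240, 100, 80]
out[0, 4, 0] = 0.9

preds = out[0].T             # (N, 4+C): one row per proposal
boxes = preds[:, :4]         # cx, cy, w, h
scores = preds[:, 4:]        # per-class confidences
class_ids = scores.argmax(axis=1)
best = scores.max(axis=1)

keep = best >= 0.55          # same thresholding the CONF. slider controls
print(int(keep.sum()), class_ids[keep])
```

A real decoder also rescales boxes back to the original image and runs NMS (the models are exported with nms=False); this only shows the tensor layout.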

You can add models at runtime via the + button in the UI. For built-in models, add a new SelectItem in StatusBar.tsx.

7. Build for Production

pnpm build
# Static output → out/ (configured for GitHub Pages)

The project is configured with output: "export" for static hosting. Deploy the out/ directory to any static host (GitHub Pages, Vercel, Netlify, S3).

8. Customize the UI

The design system uses Tailwind CSS with shadcn-style tokens. Theme colors are defined in globals.css as CSS custom properties. Both light and dark modes are supported via next-themes.

To change the primary accent, update the --primary HSL values in your CSS.

9. Extend the Pipeline

The inference pipeline lives in src/workers/workerPipeline.ts. All model loading, pre-processing, inference, and post-processing happens inside this Web Worker.

To add custom post-processing (e.g., counting objects, tracking across frames), modify the worker's message handler or add new utility functions in src/utils/.
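The worker itself is TypeScript, but the logic of a simple extension such as per-class counting is easy to sketch. This Python sketch assumes post-processed detections as (class_name, score) pairs:

```python
from collections import Counter

# Hypothetical post-processed detections for one frame.
detections = [("person", 0.91), ("person", 0.78), ("dog", 0.66)]

# Tally how many instances of each class were detected.
counts = Counter(name for name, _ in detections)
print(counts["person"], counts["dog"])  # 2 1
```

The same per-frame tally, ported to the worker's message handler, is the starting point for features like occupancy counting or cross-frame tracking.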

10. Personalize & Customize

To make this application your own, you will want to update several hardcoded default parameters scattered throughout the codebase.

1. Default Confidence Threshold

The default confidence threshold is set to 0.55 (55%). To change the starting value for the slider, update DEFAULT_SCORE_THRESHOLD in src/hooks/useYoloModel.ts.

2. Dataset Classes

The default 80 COCO classes are stored in src/utils/yolo_classes.json. Update this file with your own dataset classes so the UI correctly auto-colors your bounding boxes.
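As a sketch, a replacement classes file for a hypothetical 3-class safety dataset might be generated like this. The index-to-name mapping shown is an assumption; check the shipped yolo_classes.json for the exact shape before overwriting it:

```python
import json

# Assumed format: class index (as a string key) mapped to class name.
# Indices must match the class order your model was trained with.
classes = {"0": "helmet", "1": "vest", "2": "person"}

payload = json.dumps(classes, indent=2)
print(payload)
```

The class count here must also match the C dimension of your model's output tensor, or scores will be mislabeled.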

3. Application URLs & Links

The GitHub link in the top right is located inside src/components/Header.tsx. You should also update the URLs in the package.json file.

4. Built-in Models

The default available models in the dropdown (e.g., YOLO11 Nano, Nano Seg) are hardcoded inside the Select menu in src/components/media-display/StatusBar.tsx. Update or remove the SelectItem elements to match the models in your public/models/ folder.

Common Issues

  • Model cache stale? Clear the yolo-model-cache entry in DevTools → Application → Cache Storage. Cached models are not auto-invalidated, so stale entries persist until cleared.
  • WebGPU not available? Check chrome://flags/#enable-unsafe-webgpu. On Linux, you may also need --enable-features=Vulkan.
  • Camera permission denied? Ensure you're on localhost or HTTPS. Browsers block camera access on insecure origins.