Guide
Whether you want to run inference right now or build on top of the codebase — pick your path below.
App User
Just want to use the app? Open it in your browser, pick a model, and start detecting objects in images or camera feeds.
Developer
Clone the repo, run it locally, add your own ONNX models, or extend the pipeline with custom post-processing.
App User Guide
No installation needed — everything runs in the browser.
Requirements
WebGPU delivers the best performance. If your browser or hardware doesn't support it, the app automatically falls back to multi-threaded WASM.
Open the App
Navigate to the Workspace page. The app will instantly begin loading the default model (yolo11n-seg).
First load downloads ~13MB. After that, the model is cached in your browser's IndexedDB — no re-download needed.
Select Model & Device
Use the toolbar at the top of the workspace to pick your model and execution provider.
Upload an Image
Click Upload Image or drag-and-drop any image onto the workspace. You can also click one of the example thumbnails at the bottom to try a pre-loaded sample.
Inference runs automatically once the image is loaded. You'll see bounding boxes, segmentation masks, or pose skeletons drawn over the image depending on the active model.
Use Your Camera
Click Open Camera to start real-time detection from your webcam. Grant camera permissions when prompted. The app runs inference continuously at up to 60 FPS.
If you have multiple cameras (e.g., front and back), use the camera selector dropdown to switch between them.
Adjust Confidence
The CONF. slider in the toolbar controls the minimum confidence threshold. Drag it to filter out low-confidence detections. Default is 55%.
Inspect Detections
The sidebar panel lists all detected objects with their class name and confidence score. Click any detection to highlight it on the canvas. For pose models, clicking a person reveals their individual keypoints.
Save Results
Click Save in the sidebar to download the annotated image with all bounding boxes, masks, and labels burned in.
Add Custom Models
Click the + button next to the model selector to add your own ONNX model. Provide a URL to any publicly hosted .onnx file, select the task type (Detection, Segmentation, or Pose), and tag its precision.
Tips for Best Performance
- Use Chrome or Edge for WebGPU support — Firefox and Safari use WASM fallback.
- Close other GPU-intensive tabs (games, video editors) for smoother inference.
- The first inference after load is a "warm-up" and may be slower. Subsequent frames are faster.
- If the model feels slow, switch to WASM — on some machines with integrated GPUs, inference runs faster on the CPU.
Developer Guide
Clone, build, extend, or bring your own model.
Prerequisites
Clone the Repository
git clone https://github.com/pranta-barua007/yolo11-onnx.git
cd yolo11-onnx
Install Dependencies
pnpm install
Start Development Server
pnpm dev
Open http://localhost:3000 in a WebGPU-capable browser. The app hot-reloads on file changes.
Project Structure
src/
├── app/ # Next.js App Router pages
│ ├── page.tsx # Workspace (main inference UI)
│ ├── about/ # Technical overview
│ └── guide/ # This guide
├── components/ # UI components
│ ├── Header.tsx # Global navigation
│ ├── MediaDisplay/ # Image/camera display
│ └── ModelStatus/ # Detection sidebar
├── hooks/ # React hooks
│ ├── useYoloModel # Model lifecycle
│ ├── useCamera # Camera stream
│ └── useFps # FPS counter
├── workers/ # Web Worker pipeline
│ └── workerPipeline.ts
└── utils/ # Inference utilities
├── img_preprocess.ts
├── mask_processing.ts
└── draw_bounding_boxes.ts
Export Your Model (ONNX)
Use the Ultralytics Python SDK to export your trained YOLO model to ONNX format. Run this in a Google Colab notebook or your local Python environment.
Standard Export (FP32)
from ultralytics import YOLO
model = YOLO("yolo11s.pt")
model.export(format="onnx", opset=12, dynamic=False, nms=False)
WebGPU-Safe Half-Precision (FP16) — Recommended
Reduces model size by 50% with faster inference on WebGPU hardware.
Do NOT use half=True directly
Ultralytics' half=True blindly converts all ops to float16, including Resize, which WebGPU doesn't support. This causes Invalid data type runtime errors.
Step 1 — Export as FP32:
from ultralytics import YOLO
model = YOLO("yolo11s.pt")
model.export(format="onnx", opset=12, dynamic=False, nms=False)
Step 2 — Convert to FP16 with op blocking:
ONNX Runtime's converter keeps incompatible ops (like Resize) in FP32 and inserts Cast nodes automatically.
import onnx
from onnxruntime.transformers.float16 import convert_float_to_float16
model = onnx.load("yolo11s.onnx")
model_fp16 = convert_float_to_float16(
model,
keep_io_types=True,
op_block_list=["Resize", "GridSample"]
)
onnx.save(model_fp16, "yolo11s_webgpu.onnx")
print("✅ WebGPU-safe FP16 model saved")
Install: pip install onnx onnxruntime
Add Your Own ONNX Model
Place your .onnx model in the public/models/ directory. The model must follow the standard YOLO output format:
- Detection: Output shape [1, 4+C, N], where C = classes and N = proposals
- Segmentation: Two outputs — detections plus prototype masks [1, 32, H, W]
- Pose: Output shape includes 17 × 3 keypoint channels
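As a rough sketch of how post-processing interprets that [1, 4+C, N] detection output (illustrative only — the function name and toy tensor below are not from the repo, whose actual decoding runs inside the Web Worker pipeline):

```typescript
// Sketch: decoding a raw YOLO detection tensor of shape [1, 4+C, N],
// stored as a flat Float32Array in channel-major order. Channels 0-3
// hold box centers/sizes (cx, cy, w, h); channels 4..4+C hold
// per-class scores.
interface Detection {
  box: [number, number, number, number]; // cx, cy, w, h
  classId: number;
  score: number;
}

function decodeDetections(
  data: Float32Array, // flat buffer of length (4 + numClasses) * numProposals
  numClasses: number,
  numProposals: number,
  confThreshold = 0.55, // matches the app's default CONF. of 55%
): Detection[] {
  const out: Detection[] = [];
  for (let i = 0; i < numProposals; i++) {
    // Find the best-scoring class for this proposal.
    let score = 0;
    let classId = -1;
    for (let c = 0; c < numClasses; c++) {
      const s = data[(4 + c) * numProposals + i];
      if (s > score) {
        score = s;
        classId = c;
      }
    }
    if (score >= confThreshold) {
      out.push({
        box: [
          data[0 * numProposals + i],
          data[1 * numProposals + i],
          data[2 * numProposals + i],
          data[3 * numProposals + i],
        ],
        classId,
        score,
      });
    }
  }
  return out;
}

// Toy tensor: 2 classes, 3 proposals; only proposal 0 clears the threshold.
const raw = new Float32Array(6 * 3);
raw[4 * 3 + 0] = 0.9; // proposal 0, class 0
raw[5 * 3 + 1] = 0.3; // proposal 1, class 1 (filtered out at 0.55)
const detections = decodeDetections(raw, 2, 3);
```

Segmentation and pose models follow the same pattern, with extra channels (mask coefficients or keypoints) appended per proposal.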
You can add models at runtime via the + button in the UI. For built-in models, add a new SelectItem in StatusBar.tsx.
Build for Production
pnpm build # Static output → out/ (configured for GitHub Pages)
The project is configured with output: "export" for static hosting. Deploy the out/ directory to any static host (GitHub Pages, Vercel, Netlify, S3).
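That setting lives in the Next.js config. A minimal sketch — the repo's actual config file (and any extra options such as basePath for GitHub Pages) may differ:

```typescript
// next.config.ts — minimal sketch, not the repo's actual file.
// `output: "export"` makes `pnpm build` emit a static site into out/.
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  output: "export",
};

export default nextConfig;
```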
Customize the UI
The design system uses Tailwind CSS with shadcn-style tokens. Theme colors are defined in globals.css as CSS custom properties. Both light and dark modes are supported via next-themes.
To change the primary accent, update the --primary HSL values in your CSS.
Extend the Pipeline
The inference pipeline lives in src/workers/workerPipeline.ts. All model loading, pre-processing, inference, and post-processing happens inside this Web Worker.
To add custom post-processing (e.g., counting objects, tracking across frames), modify the worker's message handler or add new utility functions in src/utils/.
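For example, a per-frame object counter is only a few lines. This sketch (countObjects is a hypothetical helper, not an existing repo function) could live in src/utils/ and be called from the worker's message handler after decoding:

```typescript
// Sketch of a frame-level post-processing helper: aggregate detected
// class indices into a { label: count } summary.
function countObjects(
  classIds: number[],
  classNames: string[],
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const id of classIds) {
    const name = classNames[id] ?? `class_${id}`; // fall back for unknown ids
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Three detections in one frame: two persons and a car.
const summary = countObjects([0, 0, 2], ["person", "bicycle", "car"]);
```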
Personalize & Customize
To make this application your own, you will want to update several hardcoded default parameters scattered throughout the codebase.
1. Default Confidence Rate
The default confidence threshold is set to 0.55 (55%). To change the starting value for the slider, update DEFAULT_SCORE_THRESHOLD in src/hooks/useYoloModel.ts.
2. Dataset Classes
The default 80 COCO classes are stored in src/utils/yolo_classes.json. Update this file with your own dataset classes so the UI correctly auto-colors your bounding boxes.
3. Application URLs & Links
The GitHub link in the top right is located inside src/components/Header.tsx. You should also update the URLs in the package.json file.
4. Built-in Models
The default available models in the dropdown (e.g., YOLO11 Nano, Nano Seg) are hardcoded inside the Select menu in src/components/media-display/StatusBar.tsx. Update or remove the SelectItem elements to match the models in your public/models/ folder.
Common Issues
- Model cache stale? Clear the yolo-model-cache entry in DevTools → Application → Cache Storage. The app caches models in IndexedDB and doesn't auto-invalidate.
- WebGPU not available? Check chrome://flags/#enable-unsafe-webgpu. On Linux, you may also need --enable-features=Vulkan.
- Camera permission denied? Ensure you're on localhost or HTTPS. Browsers block camera access on insecure origins.
Ready to start?