Perception Engineer
Our Mission
We believe robotics is entering a transformative decade, comparable to the one that followed the arrival of the internet. Advances in AI, cloud computing, and hardware are reshaping what autonomous systems can do. Our mission is to build the intelligent software that powers physical security in the real world, enabling robots and sensors to handle dangerous and critical tasks that humans shouldn't have to. By engineering the orchestration layer for intelligent security, we aim to create a world that is safer, more secure, and more resilient.
We're a strong founding team based in Zurich, backed by visionary investors and advisors. We are engineering the future of security today!
The Role
As a Perception Engineer, you will build and own the perception systems behind our multi-sensor surveillance and robotics stack. This is a builder's role first: you start from the best available models, off-the-shelf where they do the job and trained in-house where they don't, and turn them into reliable, real-time systems that detect, track, and re-identify objects across cameras and run efficiently on edge hardware. We care less about novel papers and more about whether what you build works on a live site and stays working.
What You'll Work On
Design and ship cross-camera object re-identification (ReID) for people and vehicles across distributed camera networks, in all lighting and environmental conditions
Build high-throughput object detection and multi-object tracking pipelines that run across many simultaneous video streams
Integrate and adapt vision-language models (VLMs) for open-vocabulary detection, scene understanding, and operator-facing situational awareness
Own the video ingestion and streaming path (RTSP, WebRTC) from camera to model, with attention to latency, resilience, and dropped-frame handling
Optimize and deploy models on edge hardware, using TensorRT, quantization (INT8/FP16), pruning, and other techniques to hit real-time targets on Jetson and other edge-class devices
Evaluate, fine-tune, and integrate existing open-source and commercial models, and train custom models when off-the-shelf options fall short, knowing when each is the right call
Work closely with the software and robotics teams so perception output feeds downstream autonomy and alerting
Who We're Looking For
We're looking for a strong engineer who has shipped computer vision or ML systems into production, ideally in an embodied, multi-camera, or multi-modal setting. You think rigorously about data, evaluation, and where models break under real-world constraints. You're happy taking an existing state-of-the-art model and doing the unglamorous work of making it fast, reliable, and deployable on hardware, and equally happy training your own when nothing off-the-shelf fits. You measure success by whether the system works on a live site, not by novelty.
Your Background
Proven track record building and shipping computer vision or ML systems in production (3+ years or equivalent depth)
Strong understanding of SOTA techniques for cross-camera ReID, object detection, and multi-object tracking
Hands-on experience with VLMs, and a working grasp of the current model landscape
Solid grasp of real-time video streaming (RTSP, WebRTC) and the realities of multi-stream pipelines
Experience taking models from prototype to deployment on edge hardware: TensorRT, quantization, latency tuning
Strong Python and PyTorch (or JAX) skills, with a systems-builder mindset: you ship working systems, building on state-of-the-art models and training your own models when needed
Broad ML literacy beyond vision, and solid software engineering hygiene: version control, reproducibility, evaluation discipline
Strong data pipeline skills: curating, cleaning, labelling, and managing large image and video datasets
Nice to have:
Experience with NVIDIA DeepStream
MLOps experience: model versioning, CI/CD for ML, monitoring deployed models in the field
Familiarity with RAG and LLM-based pipelines, and integrating them into a wider system
Background in surveillance, robotics, or safety-critical / defence systems, including on-prem or air-gapped deployment
Publications at top venues (ICCV, CVPR, ICLR, NeurIPS) are a plus but not expected; shipped systems matter more
What We Offer
Mission: Build the perception systems that decide how robots and sensors see and understand the real world. Engineering with deployment, not papers without impact.
Autonomy: Real independence on what to train, how to evaluate, and how to ship.
Compute: Access to the compute and field data you need to do serious work.
Team: Work alongside PhD-level co-founders in AI, Robotics, and Physics, plus a strong founding engineering team.
Location: In-person in Zurich, Switzerland, with remote considered for exceptional candidates.
Culture: Small, international founding team that's serious about building, but doesn't take itself too seriously. Curiosity and quirks welcome.
About TechTree's client
We believe robotics is entering a transformative decade. Our mission is to build the intelligent software that powers physical security in the real world, enabling robots and sensors to handle dangerous and critical tasks that humans shouldn't have to. We're a strong founding team based in Zurich, backed by visionary investors and advisors.