GrowthStage
Computer Vision Engineer
Requirements
The role requires strong hands-on experience in building computer vision systems and proficiency in Python and performance-oriented stacks. Candidates should have experience with various computer vision techniques and the ability to work in ambiguous environments.
Job Description
About the Company
We are hiring a Computer Vision Engineer to build the visual understanding layer for an AI-native product.
About the Role
You will work on systems that interpret live visual input, understand what is happening in real time, and turn messy visual data into reliable product context. This could include screen understanding, object and UI detection, OCR, tracking, segmentation, visual embeddings, video understanding, and multimodal reasoning. This is a high-ownership engineering role for someone who enjoys taking ambiguous product needs, choosing the right technical approach, and shipping production systems that are fast, reliable and useful.
Responsibilities
- Build real-time or near-real-time computer vision pipelines for live visual input.
- Detect and interpret objects, UI states, entities, scenes, changes and other relevant visual signals.
- Develop tracking and temporal reasoning systems that understand what changes frame-to-frame.
- Evaluate and combine OCR, object detection, segmentation, visual embeddings, VLMs, classical CV and custom models.
- Optimise inference for latency, throughput, model size, GPU memory and production reliability.
- Design clean APIs and event streams that expose visual signals to product, reasoning, retrieval or automation systems.
- Create evaluation loops, confidence thresholds and fallback behaviours for uncertain visual outputs.
- Work closely with product and engineering teams to turn prototype models into robust user-facing systems.
- Help shape the early architecture for a vision system that can scale across use cases and environments.
Qualifications
- Strong hands-on experience building computer vision systems in production or production-like environments.
- Experience with real-time or low-latency visual processing.
- Strong Python skills and experience with at least one performance-oriented stack such as C++, CUDA, TensorRT, OpenCV, ONNX Runtime, OpenVINO, Metal or similar.
- Experience with one or more of: object detection, segmentation, OCR, visual embeddings, tracking, scene understanding, video understanding, SLAM, 3D/spatial computing or multimodal/VLM systems.
- Strong judgement around latency, throughput, batching, model size, GPU memory, confidence scoring and runtime behaviour.
- Ability to prototype quickly, measure performance, improve systematically and ship.
- Comfort working in ambiguous product environments where the right technical approach may need to be discovered.
- Pragmatic engineering instincts: you care about model quality, but also about whether the system works reliably for users.
Nice to Have
- Experience with screen capture, streaming video, WebRTC, media pipelines or low-latency desktop/mobile applications.
- Experience deploying CV models on constrained hardware, edge devices, mobile GPUs/NPUs or real-time production systems.
- Experience with VLMs, CLIP-style embeddings, multimodal retrieval, RAG, knowledge graphs or agentic systems.
- Experience in gaming, robotics, autonomy, AR/VR, industrial automation, healthcare imaging, security, sports analytics or another vision-heavy domain.
- Open-source work, research, demos or side projects showing strong visual and technical taste.
Role Details
- Location: London preferred, with flexibility depending on the team setup.
- Working arrangement: Hybrid or on-site for close product and engineering collaboration.
- Compensation: £80,000-£150,000 plus equity (suggested range for an early-stage London role).
- Visa sponsorship: Available where applicable.
Pay range and compensation package
£80,000-£150,000 plus equity.
Equal Opportunity Statement
We are committed to diversity and inclusivity.
About GrowthStage
You want a senior full-stack engineer who can help take your product from PoC to production. We can help you find the right candidate. We provide technically vetted, S-tier candidates, specialising in full-stack engineers, AI/LLM engineers and PhD-level ML talent. Why us? We are run by PhDs, CTOs and founding engineers. We have hands-on experience with the profile you want to hire for, because we are the profile you want to hire. We are deeply technical, thoughtful and strategic in our placements.