Member of Technical Staff (Data): World Models
Remote
Full Time
1 month ago
Senior Level
Engineering
Worldwide
$80K - $120K USD per year

Job Description

Location

Remote

Employment Type

Full time

Location Type

Remote

Department

Artificial Intelligence, AI Engineering

Your Charter

  • Data at Scale: Own the pipelines and storage systems that feed petabyte-scale multimodal datasets into model training.
  • Sustainable Platforms: Build automated, efficient tooling and systems that process data at scale and handle many small, heterogeneous datasets.

Required Skillsets

  • Data Engineering: Knowledge of Python ETL pipelines and supporting infrastructure, data formats, and storage systems at scale.
  • ML Data Ops: Experience managing datasets, annotations, and data versioning for model training.
  • Basic ML Knowledge: A solid grasp of ML fundamentals, needed to collaborate effectively with researchers and make sound data platform decisions.
  • Agentic Engineering: Skilled at writing high-quality specifications for AI agents, while maintaining effective human review of AI-generated work.

Responsibilities

  • Design, automate, maintain, and optimize Python ETL pipelines (Spark/Ray) for large-scale multimodal data.
  • Build and maintain data cataloging, lineage, quality tooling, integrity verification, access controls, and lifecycle management systems.
  • Provide guidance, internal tools, and documentation to colleagues on data best practices.
  • Serve as a custodian of the company’s datasets, ensuring overall data health, quality, and discoverability.

Challenges You'll Tackle

  • Implement high-performance, multimodal data pipelines capable of processing petabyte-scale datasets on tens of thousands of CPUs and hundreds of GPUs.
  • Evolve data formats,...
Job Expired

This job posting has expired and is no longer accepting applications.

About Moonvalley

Moonvalley AI Inc. offers Marey, a studio-grade quality video system with frame-perfect control, trained on licensed data. Marey enables precise model controls for filmmakers including pose transfer, camera control, motion transfer, and trajectory control. The system generates image-to-video and text-to-video content with cinematic detail, natural physics, dynamic motion, composable layers, and advanced lighting. Marey is commercially safe and built for high-resolution cinematic footage.
