Data Systems and ML Engineer
Job Description
The data platform is currently based on a combination of Python, PostgreSQL, dbt, Dagster, and cloud services (AWS and GCP). You'll have the opportunity to expand and transform these services to support our ambitious growth plan.
You'll play a crucial role in the tools and processes that allow us to:
Continuously collect large datasets from various sources, then filter, sort, process, store, and route the data into our training pipelines, R&D experiments, and analytics solutions. Importantly, we expect to leverage AI-agent pipelines to ingest messy data locked in documents and images.
Support data access for our R&D team by contributing to our ETL processes (APIs, dbt, PostgreSQL) and our core data-access library in Python: pnx.
Expand our data monitoring and data-quality control using pipelines, models, dashboards, alerts, tracing products, etc.
Efficiently train new models, evaluate them, and release them.
Serve frequen...