We're looking for a highly skilled Data Engineer to join our team. You'll be the first data hire, working closely with the backend team and reporting directly to the CEO/CTO. This is a fully on-site role in San Francisco, ideal for someone who wants to work with a lean, high-impact product team.
Core Responsibilities:
- You will be part of a small team, with significant ownership and autonomy over the work you manage.
- You will have the freedom to suggest and drive organization-wide initiatives.
- Build & Own ETL Pipelines: Aggregate and normalize data from public APIs and web-scraped sources; clean and structure property data for downstream use
- Design Scalable MySQL Schemas: Optimize relational models for query performance, indexing, and future scalability
- Automate Workflows: Set up robust orchestration using tools such as Airflow or dbt, replacing ad-hoc scripting with maintainable pipelines
- Ensure Data Quality: Implement validation checks and monitoring across pipelines to ensure consistency, completeness, and reliability
- Work Cross-Functionally: Collaborate with backend developers to deliver structured data that feeds app features, analytics, and AI models
Requirements:
- Strong experience with TypeScript and Python for building production-grade data pipelines
- Expertise in MySQL schema & database design, performance tuning, and query optimization
- Comfortable working with semi-structured data from various sources (JSON, CSV, XML, APIs, etc.)
- Experience building web scrapers and scripts to aggregate unstructured data from web sources
- Familiarity with orchestration and transformation tools (e.g., Airflow, dbt) and data processing best practices
- Solid understanding of data engineering principles: lineage, auditing, versioning, and scalability
- Familiarity with integrating LLM APIs
- Familiarity with prop-tech/appraisals, real estate, or geospatial data (nice to have)