Brightwheel is seeking a Staff Data Engineer to join the Data Engineering team.
As a Staff Data Engineer at Brightwheel, you will play a key role in the implementation and evolution of our web scraping and data platform. You will be a technical leader responsible for crafting and implementing a best-in-class web scraping strategy and infrastructure. You will build and scale pipelines that collect millions of records across hundreds of sites, storing them as well-modeled data that enables insights for our Analytics team and our customers.
You are passionate about data engineering and possess deep technical skills. You have built scalable web scraping platforms from the ground up. You have experience juggling multiple projects with shifting priorities while continuing to deliver value to the business. You are a curious, detail-oriented self-starter who wants to take full ownership of high-impact projects with visibility throughout the organization.
What You'll Do
Use modern tooling to build a robust, extensible, and performant web scraping platform
Build thoughtful, reliable data acquisition and integration solutions to meet business requirements and data sourcing needs
Deliver best-in-class infrastructure solutions for flexible, repeatable data applications across disparate sources
Troubleshoot, improve, and scale existing data pipelines, models, and solutions
Build upon Data Engineering's CI/CD deployments and infrastructure-as-code for provisioning AWS and third-party (Apify) services
Qualifications, Skills, & Abilities
5+ years of work experience as a data or full-stack engineer, coding in Python
5+ years of experience building web scraping tools in Python, using Beautiful Soup, Scrapy, Selenium, or similar tooling
3-5 years of experience with CI/CD deployments
Strong knowledge of HTML, CSS, JavaScript, and browser behavior
Experience with RESTful APIs and JSON/XML data formats
Knowledge of cloud platforms and containerization technologies (e.g., Docker, Kubernetes)
Advanced understanding of how at least one big data processing technology works under the hood (e.g., Spark, Hadoop, HDFS, Redshift, BigQuery, or Snowflake)
Excellent analytical, problem-solving, and troubleshooting skills to manage complex process and technology issues without guidance
Preferred Experience
2+ years of experience developing in Airflow
2+ years deploying infrastructure as code within AWS or a similar cloud
2+ years deploying microservices and/or APIs within a cloud environment
1+ years integrating scraped data into ML/AI pipelines for data enrichment and/or sentiment analysis