Lead Software Engineer, Data

Job Description

We are looking for a Lead Software Engineer - Data to join our growing Webapps Team! Reporting to the Engineering Manager, you will evolve our data models in several styles of datastores, improve internal tooling to allow data self-service, operationalize production-grade data pipelines, and build out strong data store abstractions for our web-application services. As part of this role, you'll collaborate with a world-class team, experience growth and mentorship, and apply data engineering solutions to shape the future of scientific discovery.


You Will:
  • Leverage a deep understanding of the business context and the team’s goals to unlock independent technical decisions in the face of open-ended requirements
  • Identify and address misalignments between project outcomes and Platform/team goals or customer needs
  • Proactively identify new opportunities (from both internal and external sources), and advocate for and implement improvements to the current state of projects, potentially with a broader business impact across multiple teams or products
  • Take responsibility for your own decisions and mistakes on projects, take action to prevent them in the future, and share what you learned with others to help prevent recurrences
  • When you encounter barriers, unblock yourself and the team by proactively assessing and eliminating the root cause and focusing on solutions
  • Respond to operational issues with urgency, drive that urgency within your team, and own resolution within your sphere of responsibility
  • Actively seek out and eliminate sources of toil on the team and help reduce the impact of priority shifts
  • Be unafraid of challenging the status quo and proposing newer technologies or ways of working
  • Proactively create and update playbooks or retros for components in your domain
  • Act as a Technical DRI on multiple projects and enforce quality standards on the team's code
  • Scale data pipelines to allow our data to go from research to platform quickly and reliably
  • Manage sources that contain both semi-structured and unstructured data
  • Collaborate with ML engineers, data engineers, and scientists to solve complex document mining challenges, helping us capture and model additional scientific experiments
  • Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
  • Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci, and oversee architectural decisions in your area
  • Lead sprint planning, estimations, and code + design reviews
  • Partner with the Engineering Manager and other Tech Leads to recruit and build the team
  • Provide troubleshooting analysis and resolution
  • Spend ~50% of your time with hands-on coding tasks

You Have:
  • A degree in Computer Science/Engineering or a related field within science
  • 8+ years of experience working as a professional developer
  • Experience in Python with related data engineering libraries (Pandas, Spark, etc.)
  • Expertise in relational databases and SQL (e.g., PostgreSQL, MySQL) and experience with vector and graph databases (e.g., Pinecone, Neo4j)
  • Experience with cloud reference architectures and developing specialized stacks on cloud services
  • Deep understanding of data quality and integration test frameworks
  • Conceptual understanding of data-driven software engineering environments and of how to use data to make decisions
  • A track record in building high-quality, maintainable code
  • Strong communication, collaboration, and strategic problem-solving skills
  • A proactive approach to driving team success and improving workflows