We are looking for a Lead Software Engineer - Data to join our growing Webapps Team! Reporting to the Engineering Manager, you will evolve our data models in several styles of datastores, improve internal tooling to allow data self-service, operationalize production-grade data pipelines, and build out strong data store abstractions for our web-application services. As part of this role, you'll collaborate with a world-class team, experience growth and mentorship, and apply data engineering solutions to shape the future of scientific discovery.
You Will:
Leverage a deep understanding of the business context and the team’s goals to make independent technical decisions in the face of open-ended requirements
Identify and address misalignments between project outcomes and Platform/team goals or customer needs
Proactively identify new opportunities (from both internal and external sources), and advocate for and implement improvements to the current state of projects, potentially with broader business impact across multiple teams or products
Take responsibility for your own decisions and mistakes on projects and take action to prevent them in the future. Embrace and share the learnings with others to help prevent future occurrences
When you encounter barriers, unblock yourself and the team by proactively assessing and eliminating the root cause and focusing on solutions
Respond to operational issues with urgency, drive that urgency within your team, and own resolution within your sphere of responsibility
Actively seek out and eliminate sources of toil on the team and help reduce the impact of priority shifts
Be unafraid of challenging the status quo and proposing newer technologies or ways of working
Proactively create and update playbooks or retros for components in your domain
Act as a Technical DRI on multiple projects and enforce quality standards on the team's code
Scale data pipelines to allow our data to go from research to platform quickly and reliably
Manage sources that contain both semi-structured and unstructured data
Collaborate with the ML, Data Engineering, and Science teams to solve complex document mining challenges, helping us capture and model additional scientific experiments
Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci, and be responsible for overseeing architectural decisions in your area
Lead sprint planning, estimations, and code + design reviews
Partner with the Engineering Manager and other Tech Leads to recruit and build the team
Provide troubleshooting analysis and resolution
Spend ~50% of your time on hands-on coding tasks
You Have:
A degree in Computer Science/Engineering or a related field within science
8+ years of experience working as a professional developer
Experience in Python with related data engineering libraries (Pandas, Spark, etc.)
Expertise in relational databases and SQL (e.g., PostgreSQL, MySQL) and experience with vector and graph databases (e.g., Pinecone, Neo4j)
Experience with cloud reference architectures and developing specialized stacks on cloud services
Deep understanding of data quality and integration test frameworks
Conceptual understanding of data-driven software engineering environments and of how to use data to make decisions
A track record in building high-quality, maintainable code
Strong communication, collaboration, and strategic problem-solving skills
A proactive approach to driving team success and improving workflows