Avomind

Data Engineer

Job Description

This is a remote freelance position: no employment contract, relocation, or visa support is possible.

About the Company
Our client is an innovative biotechnology company dedicated to unlocking the secrets of aging by leveraging advanced artificial intelligence and machine learning technologies. By analyzing extensive biological datasets, the company seeks to identify and develop therapeutics that can slow, halt, or even reverse age-related decline. Focused on understanding the molecular mechanisms behind aging and age-associated diseases, they are at the forefront of pioneering research to extend healthy lifespans and improve overall wellness.

Role Overview:
They are seeking a Data Engineer for their client (a biotech company in the USA, PT timezone) to design, build, and optimise scalable data solutions that support the organisation's analytical and operational needs. The role involves developing data pipelines and implementing data quality checks. Domain expertise in biological or scientific data is a significant advantage.


Key Responsibilities: 
• Design, build, and maintain ETL/ELT pipelines to process and transform data efficiently. 
• Develop and optimise scalable data architectures in the cloud. 
• Implement and maintain data cataloging solutions to ensure discoverability and governance. 
• Build APIs and integrations for seamless data exchange across systems. 
• Perform data quality checks and implement automated testing frameworks to ensure data accuracy and reliability. 
• Collaborate with teams to build self-service systems and promote data democratisation. 
• Document and maintain data engineering processes and best practices. 

Requirements

Qualifications:
• Strong experience in data engineering and cloud platforms (preferably GCP).
• Proficiency in programming languages such as Python, SQL, and shell scripting.
• Familiarity with data catalog tools (e.g., DataHub, Apache Atlas) and metadata management.
• Experience building and maintaining scalable ETL pipelines with orchestration tools (Dagster, Airflow).
• Understanding of API development and integration.
• Knowledge of data governance and data quality principles.
• Background in biological or scientific data is a plus but not mandatory; domain expertise can substitute for formal education.
• Strong problem-solving skills and the ability to work with cross-functional teams.
• Excellent communication skills in English, both written and verbal.