We are seeking a highly skilled and experienced Senior Big Data Engineer to join our dynamic team. The ideal candidate will have a strong background in developing batch processing systems, with extensive experience in the Apache Hadoop ecosystem, Oozie, and Airflow, and a solid understanding of public cloud technologies, especially GCP. This is a remote role requiring excellent communication skills and the ability to solve complex problems independently and creatively.
Work Location: US-Remote
Key Responsibilities:
Develop scalable and robust code for batch processing systems, working with technologies such as Hadoop, Oozie, Pig, Hive, MapReduce, Spark (Java), Python, and HBase.
Develop, manage, and optimize data workflows using Oozie and Airflow within the Apache Hadoop ecosystem.
Leverage GCP for scalable big data processing and storage solutions.
Implement automation and DevOps best practices, including CI/CD and Infrastructure as Code (IaC).
Qualifications:
Bachelor's degree in Computer Science, Software Engineering, or a related field of study.
Experience with managed cloud services and understanding of cloud-based batch processing systems are critical.
Proficiency in Oozie, Airflow, MapReduce, and Java.
Strong programming skills with Java (specifically Spark), Python, Pig, and SQL.
Expertise in public cloud services, particularly in GCP.
Proficiency in the Apache Hadoop ecosystem, including Oozie, Pig, Hive, and MapReduce.
Familiarity with BigTable and Redis.
Experience applying infrastructure and DevOps principles in daily work, using tools for continuous integration and continuous deployment (CI/CD) and Infrastructure as Code (IaC), such as Terraform, to automate and improve development and release processes.
Proven experience in engineering batch processing systems at scale.
Must Have:
5+ years of experience in customer-facing software/technology or consulting.
5+ years of experience with “on-premises to cloud” migrations or IT transformations.
5+ years of experience building and operating solutions on GCP (ideally) or AWS/Azure.