Data Engineer (Mid to Senior)

  • Umpisa Inc

Job Description

At Umpisa Inc., our mission is to make the Philippines globally known as a tech hub.

Umpisa Inc. is a progressive technology services company that partners with select industries, clients and people to work on pioneering and industry-changing solutions via digital transformation, modern software development and venture building.

We create world-class, impactful products and solutions that help organizations and individuals live better lives. We offer demanding, challenging, and rewarding careers in software development, product development, emerging technologies, and more for the right candidates.

Essential Skills:

  • Aligns with our values: Excellence, Integrity, Professionalism, People Success, Customer Success, Fun, Innovation and Diversity
  • Strong communication skills
  • Strong problem-solving and analytical skills
  • Willing to work as part of a self-organizing Scrum team in a scaled agile framework
  • A self-starter who loves to collaborate with the team and client

Job Summary

The Data Engineer will be responsible for designing, building, and maintaining the data architecture, pipelines, and systems that enable the efficient collection, processing, and storage of data across the organization. You’ll work closely with data analysts and other technical teams to ensure that the data infrastructure is robust, scalable, and able to handle the increasing volume and complexity of data. The role will empower teams to leverage data for actionable insights and strategic decision-making.

Requirements

Key Responsibilities:

  • Data Pipeline Development: Design, build, and maintain scalable data pipelines for collecting, processing, and storing data from various internal and external sources.
  • Data Integration: Work with various data sources, including APIs, databases, and cloud services, to integrate and centralize data into a unified data warehouse or data lake.
  • ETL/ELT Processes: Develop and optimize ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes to ensure efficient and accurate data ingestion and transformation (a minimal sketch follows this list).
  • Data Warehousing & Architecture: Design and implement data architecture solutions to ensure data is structured, stored, and accessible for analysis. This includes the maintenance of relational and non-relational databases (e.g., SQL, NoSQL, data lakes).
  • Data Quality & Governance: Monitor and enforce data quality standards, data governance policies, and best practices for data security, privacy, and compliance.
  • Automation: Automate repetitive data processing tasks to ensure efficiency, scalability, and reliability in data workflows.
  • Collaboration: Work closely with data scientists, data analysts, and business intelligence teams to understand their data needs and ensure that data infrastructure supports their requirements.
  • Performance Optimization: Identify and implement optimizations to improve the performance of data systems, pipelines, and queries, ensuring timely and reliable access to data.
  • Documentation & Support: Maintain detailed documentation of data architecture, pipeline processes, and best practices to ensure knowledge sharing and system sustainability.
  • Monitoring & Troubleshooting: Monitor data systems and pipelines to proactively identify issues, perform debugging, and implement solutions to ensure system reliability.
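
For illustration only: a minimal sketch of the kind of daily ETL pipeline this role builds, assuming Apache Airflow (one of the orchestration tools named under Preferred Qualifications; Airflow 2.4+ API). The DAG id, source records, and warehouse table below are hypothetical placeholders, not an actual Umpisa pipeline.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Pull raw records from an upstream source (hypothetical data).
        return [{"id": 1, "amount": "19.99"}, {"id": 2, "amount": "5.00"}]

    def transform(ti):
        # Cast string amounts to floats; a real task would also
        # validate and drop malformed rows.
        rows = ti.xcom_pull(task_ids="extract")
        return [{**row, "amount": float(row["amount"])} for row in rows]

    def load(ti):
        # Stub for writing cleaned rows to the warehouse
        # (e.g., Redshift, BigQuery, or Snowflake).
        rows = ti.xcom_pull(task_ids="transform")
        print(f"Loading {len(rows)} rows into the warehouse")

    with DAG(
        dag_id="daily_sales_etl",  # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)
        extract_task >> transform_task >> load_task

In day-to-day work, the extract and load steps would use the connections configured for a client's actual sources and warehouse rather than in-memory stubs.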

Minimum Requirements:

  • Trustworthy, self-motivated, responsive, and sharp
  • Strong listening, verbal, written, and presentation communication skills
  • Excellent attention to detail
  • Strong organization and time-management skills

Preferred Qualifications

  • Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field. Master’s degree is a plus.
  • 3+ years of experience in data engineering or a related technical field.
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Strong experience with data warehousing solutions (e.g., Amazon Redshift, Google BigQuery, Snowflake, or similar).
  • Expertise in building and managing ETL/ELT pipelines using tools like Apache Airflow, Apache Kafka, Talend, or similar.
  • Solid understanding of relational and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB, Cassandra, etc.).
  • Experience with cloud computing platforms (AWS, Google Cloud, Azure) and services like S3, Lambda, Dataflow, or Databricks.
  • Familiarity with containerization tools (e.g., Docker, Kubernetes) is a plus.
  • Experience with data modeling and schema design for large-scale data systems.
  • Knowledge of version control systems (e.g., Git).
  • Strong problem-solving skills with a focus on scalable, efficient data processing and storage.
  • Experience in optimizing database performance and query efficiency.
  • Ability to work effectively with both technical and non-technical teams.
  • Excellent written and verbal communication skills for documenting workflows and presenting complex concepts.
  • Familiarity with machine learning pipelines and frameworks (e.g., TensorFlow, PyTorch) is a plus.
  • Experience with real-time data processing systems (e.g., Apache Flink, Spark Streaming).
  • Knowledge of data privacy regulations (e.g., GDPR, CCPA) and implementing compliance measures.
  • Experience working in an Agile environment.