Sayari provides instant access to structured business information from hundreds of millions of corporate, legal, and trade records for a variety of use cases. As a member of Sayari's data team, you will work with our Product and Software Engineering teams to build the graph that underlies Sayari’s products.
Job Responsibilities
Build and maintain ETL pipelines to process and export record data to the Sayari Graph application
Develop and improve entity resolution processes
Implement logic to calculate and export risk information
Work with the product team and other development teams to collect and refine requirements
Run and maintain regular data releases
Required Skills & Experience
Expertise with Python and a JVM programming language (e.g., Scala)
Expertise with SQL (e.g., Postgres) and NoSQL (e.g., Cassandra, Elasticsearch, Memgraph) databases
7+ years of experience designing, maintaining, and orchestrating ETL pipelines (e.g., Apache Spark, Apache Airflow) in cloud-based environments (e.g., GCP, AWS, or Azure)
Desired Skills & Experience
Experience with entity resolution, graph theory, and/or distributed computing
Experience with Kubernetes
Experience working as part of an agile development team using Scrum, Kanban, or similar