Contribution and key responsibilities:
- Design, develop, and maintain scalable batch ETL and near-real-time data pipelines and architectures for various parts of our business, built on fast-moving and varied data sources with hundreds of thousands of changes per day
- Ensure all data provided is of the highest quality, accuracy, and consistency
- Identify, design, and implement internal process improvements for optimizing data delivery and re-designing data pipelines for greater scalability
- Build out new API integrations to support continuing increases in data volume and complexity
- Communicate with data scientists, DevOps engineers, and BI analysts to understand business processes and data needs for specific features
For efficient and effective performance in this role:
- 2+ years of experience in data engineering, data platforms, BI, or data-centric applications, such as data warehouses, operational data stores, and data integration projects
- Experience with one or more ML workflow orchestration frameworks (Apache Airflow, Kubeflow, MLflow, etc.)
- Proficiency in SQL and PL/SQL programming and experience working with Oracle databases
- Excellent coding skills in Python
- Experience with software development automation tools like Jenkins and with version control platforms such as GitHub or GitLab
- Understanding of containerization and orchestration technologies like Docker/Kubernetes
Prospects and opportunities:
- To be successful and receive recognition - with us, the path from a personal initiative to an established working practice is short and fast
- To feel understood and supported - every day we build and preserve the pleasure of working together