We are looking for a skilled Data Engineer to join our team and help build and maintain our data infrastructure. The ideal candidate will be responsible for designing, implementing, and managing our data processing systems and pipelines. You will work closely with data scientists, analysts, and other teams to ensure efficient and reliable data flow throughout the organization.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines for batch and real-time processing.
- Implement ETL processes to extract data from various sources and load it into data warehouses or data lakes.
- Optimize data storage and retrieval processes for improved performance.
- Collaborate with data scientists and analysts to understand their data requirements and provide appropriate solutions.
- Ensure data quality, consistency, and reliability across all data systems.
- Develop and maintain data models and schemas.
- Implement data security measures and access controls.
- Troubleshoot data-related issues and optimize system performance.
- Stay up-to-date with emerging technologies and industry trends in data engineering.
- Document data architectures, pipelines, and processes.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2-4 years of experience in data engineering or similar roles.
- Strong programming skills in Python, Java, or Scala.
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL).
- Familiarity with cloud platforms (AWS, Azure, or Google Cloud) and their data services.
- Knowledge of data warehousing concepts and ETL best practices.
- Experience with version control systems (e.g., Git).
- Understanding of data cleansing, data modeling, and database design principles.
- Experience with the Azure data platform (Azure Data Factory, Databricks) is a plus.
- Familiarity with data visualization tools (e.g., Tableau, Power BI).
- Knowledge of stream processing technologies (e.g., Kafka) and experience integrating data from diverse sources (e.g., APIs, Google BigQuery, MongoDB, SFTP).
- Experience with containerization technologies (e.g., Docker).
- Experience working with large datasets and optimizing performance during development.
- Understanding of machine learning concepts and data science workflows.
- Solid problem-solving skills and attention to detail.
- Good communication skills and ability to work with technical and non-technical team members.