Data Pipeline Development: Design, develop, and maintain efficient ETL (Extract, Transform, Load) processes and data pipelines to collect, process, and store data from various sources.
Data Warehouse Management: Create and manage data warehouses and data lakes, optimising storage and query performance for both structured and unstructured data.
Data Quality Assurance: Implement data quality checks, validation processes, and error handling to ensure data accuracy and consistency.
Database Management: Administer and optimise relational and NoSQL databases, ensuring data integrity and high availability.
Performance Tuning: Identify and address performance bottlenecks in data pipelines and databases to improve overall system efficiency.
Data Security: Implement data security measures and access controls to protect sensitive data assets.
Collaboration: Work with data scientists, analysts, and other stakeholders to understand their data needs and support their analytics and reporting projects.
Documentation: Maintain clear and comprehensive documentation for data processes, pipelines, and infrastructure.
Monitoring and Troubleshooting: Monitor data pipelines and databases, identify issues proactively, and resolve data-related problems promptly.
Requirements:
Bachelor's degree in Computer Science, Information Technology, or a related field (advanced degrees are a plus).
4+ years of experience in data engineering roles.
Proficiency in programming languages such as Python, Java, or Scala.
Experience with data warehousing solutions (e.g., Amazon Redshift, Google BigQuery) and database systems (e.g., PostgreSQL, MongoDB).
Strong knowledge of ETL processes, data integration, and data modelling.
Familiarity with data orchestration and workflow management tools (e.g., Apache Airflow).
Understanding of data security best practices and data governance principles.
Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment.
Strong communication skills and the ability to explain complex technical concepts to non-technical team members.