Key information
Data Platform Developer (m/f/d) with Big Data Experience - remote/Berlin
Position: Not specified
Start: 17 Feb 2025
End: 31 Dec 2025
Location: Berlin, Germany
Collaboration type: Project only
Hourly rate: Not specified
Last updated: 20 Jan 2025
Project description and requirements
For our customer, we are looking for a Data Platform Developer (m/f/d) with Big Data experience.
Start: approx. 15 February 2025
Location: remote and on-site in Berlin
Full-/Part-time: Full-time
Tasks:
Data/Data Mesh capabilities development:
- Design, develop, and maintain scalable data architectures, including databases, data lakes, and data warehouses.
- Implement best practices for data storage, retrieval, and processing.
- Drive the adoption of Data Mesh principles, promoting decentralized data ownership and architecture.
- Conceptualize, design, and implement Data Mesh Proof of Concepts (PoCs) to validate decentralized data architectures.
- Implement a comprehensive data catalog to document metadata, data lineage, and data dictionaries for all data assets. Ensure that data is easily discoverable and accessible across the organization.
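For illustration only, a minimal Python sketch of the metadata such a catalog entry could carry; the dataclass and all asset names are hypothetical and not tied to any specific catalog product:

    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    # Hypothetical, minimal shape of one catalog entry: every data asset
    # carries ownership, a data dictionary, and lineage so it stays
    # discoverable across the organization.
    @dataclass
    class CatalogEntry:
        name: str                # fully qualified asset name, e.g. "sales.orders"
        owner: str               # owning domain team (decentralized ownership)
        description: str         # human-readable summary for discoverability
        columns: dict[str, str]  # data dictionary: column name -> type
        upstream: list[str] = field(default_factory=list)  # lineage: source assets
        updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    # Registering one asset (all values are placeholders).
    orders = CatalogEntry(
        name="sales.orders",
        owner="sales-domain-team",
        description="One row per confirmed customer order.",
        columns={"order_id": "string", "amount": "decimal(10,2)", "placed_at": "timestamp"},
        upstream=["crm.customers", "shop.checkout_events"],
    )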
Data Products:
- Develop and maintain reusable data products that serve various business units. These data products should ensure high-quality, reliable data that can be easily integrated and utilized across different platforms and applications.
- Design and implement data models supporting business requirements.
- Collaborate closely with data scientists and analysts to understand and structure data needs.
- Develop and maintain ETL (Extract, Transform, Load) processes for moving and transforming data from various sources into the data infrastructure, aligning with Data Mesh principles.
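Since Airflow appears among the must-have orchestrators below, here is a minimal sketch of what such an ETL pipeline could look like as an Airflow DAG; the DAG name, schedule, and task bodies are placeholders:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Placeholder callables; a real pipeline would read from source systems
    # and publish into the owning domain's data product storage.
    def extract():
        print("pull raw records from the source system")

    def transform():
        print("clean and conform records to the data product schema")

    def load():
        print("publish the result to the data product's storage")

    with DAG(
        dag_id="orders_data_product_etl",  # hypothetical pipeline name
        start_date=datetime(2025, 2, 17),
        schedule="@daily",  # Airflow 2.4+ parameter name
        catchup=False,
    ) as dag:
        extract_t = PythonOperator(task_id="extract", python_callable=extract)
        transform_t = PythonOperator(task_id="transform", python_callable=transform)
        load_t = PythonOperator(task_id="load", python_callable=load)

        extract_t >> transform_t >> load_t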
Data Quality and Governance:
- Implement and enforce data quality standards and governance policies.
- Develop and maintain data documentation for metadata, lineage, and data dictionaries.
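As one concrete, hypothetical example of an enforced quality standard, a small Python check a batch could be required to pass before publication; column names and rules are illustrative:

    import pandas as pd

    def orders_quality_report(df: pd.DataFrame) -> list[str]:
        """Return the list of violated rules; empty means the batch passes."""
        failures = []
        if df["order_id"].isna().any():
            failures.append("order_id must not be null")
        if df["order_id"].duplicated().any():
            failures.append("order_id must be unique")
        if (df["amount"] < 0).any():
            failures.append("amount must be non-negative")
        return failures

    # A passing example batch.
    batch = pd.DataFrame({"order_id": ["a-1", "a-2"], "amount": [19.90, 5.00]})
    assert orders_quality_report(batch) == []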
Platform Adaptation:
- Design and implement Kubernetes-based deployment strategies for scalable, reliable, and manageable data technologies.
- Collaborate with DevOps and Infrastructure teams to optimize data technology deployment processes within a Kubernetes environment.
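For a sense of what a Kubernetes-based deployment of a data component can look like programmatically, a sketch using the official kubernetes Python client; the image, namespace, replica count, and resource figures are placeholders, and a real setup might instead use manifests, Helm, or an operator:

    from kubernetes import client, config

    # Assumes a reachable cluster and a valid kubeconfig.
    config.load_kube_config()

    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="spark-history-server"),
        spec=client.V1DeploymentSpec(
            replicas=2,
            selector=client.V1LabelSelector(match_labels={"app": "spark-history"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": "spark-history"}),
                spec=client.V1PodSpec(containers=[
                    client.V1Container(
                        name="history-server",
                        image="apache/spark:3.5.0",  # placeholder image/tag
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "500m", "memory": "1Gi"},
                            limits={"cpu": "1", "memory": "2Gi"},
                        ),
                    ),
                ]),
            ),
        ),
    )

    client.AppsV1Api().create_namespaced_deployment(
        namespace="data-platform", body=deployment
    )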
Documentation:
- Document Data Mesh implementations, PoC results, and best practices for knowledge sharing and future reference.
Profile Requirements
The ideal candidate is at a mid/senior level, combining a strong technical background in data engineering with expertise in implementing data architectures, solid collaboration skills, and an innovative mindset to contribute effectively to the organization's data strategy.
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- 5+ years of general IT experience.
- 3+ years of Big Data experience.
- Proven experience as a Data Engineer with a focus on designing and implementing scalable data architectures.
- Extensive experience in developing and maintaining databases, data lakes, and data warehouses.
- Hands-on experience with ETL processes and data integration from various sources.
- Familiarity with modern data technologies and cloud services.
- Proficient in designing and implementing data models to meet business requirements.
- Extensive experience with data catalog technology in conjunction with Data Mesh.
- A keen interest in staying updated on emerging technologies in the data engineering and Data Mesh space.
- Ability to evaluate and recommend the adoption of new tools and technologies.
- An innovative mindset to propose solutions that enhance the organization's data architecture.
Must-haves:
- Proven hands-on software development experience.
- Proficiency in data processing languages such as SQL, Java, Python, or Scala.
- Knowledge of and experience with at least some of the following data technologies/frameworks:
RDBMS (PostgreSQL, MySQL, etc.)
NoSQL storage (MongoDB, Cassandra, Neo4j, etc.)
Time series (InfluxDB, OpenTSDB, TimescaleDB, Prometheus, etc.)
Workflow orchestration (Airflow, Oozie, etc.)
Data integration/ingestion (Flume, etc.)
Messaging/data streaming (Kafka, RabbitMQ, etc.)
Data processing (Spark, Flink, etc.), and/or their cloud-provided counterparts, i.e., cloud data/analytics services (GCP, Azure, AWS); see the streaming sketch after this list
- Familiarity with reference Big Data architectures (Warehouse, Data Lake, Data Lakehouse) and their implementation.
- Experience in implementing and operating data-intensive applications.
- A strong focus on DataOps/DevOps.
- Good Kubernetes (k8s) knowledge and experience.
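To tie several of the listed technologies together, a minimal Spark Structured Streaming sketch that reads order events from Kafka and lands them in a data lake as Parquet; broker, topic, schema, and paths are placeholders, and the spark-sql-kafka connector must be on the classpath:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructType

    spark = SparkSession.builder.appName("orders-stream").getOrCreate()

    # Hypothetical event schema.
    schema = StructType().add("order_id", StringType()).add("amount", DoubleType())

    orders = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "orders")
        .load()
        .select(from_json(col("value").cast("string"), schema).alias("o"))
        .select("o.*")
    )

    # Continuously land the parsed events in the lake; the checkpoint
    # location makes the stream restartable.
    query = (
        orders.writeStream
        .format("parquet")
        .option("path", "/data/lake/orders")
        .option("checkpointLocation", "/data/checkpoints/orders")
        .start()
    )
    query.awaitTermination()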
Must-have language skills:
Proficiency in both spoken and written English (at least C1).
Nice-to-haves:
- Deeper k8s skills and experience (e.g., experience developing k8s operators and/or k8s operators for Big Data technologies).
- In-depth knowledge of best practices in data privacy and data protection.
- Proven hands-on experience with Data Mesh principles.
- Data platform development and/or operations experience.
- Knowledge of and experience with lifecycle management in data (e.g., CD4ML, MLOps, …).
- Proficiency in German.