* Collect business requirements and define robust data models and architectures
* Design and build scalable and reliable data pipelines and workflows in cloud environments
* Apply DevOps practices, including basic Git workflows and involvement in CI/CD pipelines.
* Contribute to maintaining data quality, security, and data governance standards across all data-related activities.
* Collaborate with cross-functional teams to ensure data solutions align with business needs and quality standards.
* Specify and design presentation interfaces with optimal usability and user experience
* Document processes and tasks to ensure explainability and understanding across the team
* Support the integration of AI-based enrichment and transformation processes into existing data pipelines and workflows.
The following knowledge, experience and skills are required to perform the tasks listed above:
* Business analysis & requirements gathering: ability to collect, analyse and translate business needs into technical specifications.
* Data modelling & architecture design: skills in designing conceptual, logical and physical data models.
* ETL/ELT and data integration: ability to extract, transform, load, clean and merge datasets from multiple sources.
* Building data pipelines & workflows: experience with automated workflows and orchestration tools.
* Big data management: ability to handle large and complex datasets efficiently.
Specific expertise:
* Excellent knowledge of Python, Spark and SQL
* Excellent knowledge of designing and building ETL pipelines using tools such as Azure Synapse, Microsoft Fabric and/or AWS Glue
* Excellent knowledge of data modelling and database design principles using the Medallion Architecture
* Good knowledge of business intelligence tools, notably Microsoft Power BI
* Knowledge of Machine Learning, Natural Language Processing and Large Language Models (LLMs) fundamentals
Additional skills:
* Understanding of Microsoft Power Platform (e.g., Power Automate, SharePoint Lists)
* Good knowledge of Microsoft Fabric components (Lakehouses, Pipelines, Dataflows Gen2, Notebooks, Semantic Models)
* Good knowledge of cloud environments (AWS or Microsoft Azure)
* Understanding of DevOps practices, including Git workflows and CI/CD pipelines, and experience with tools such as Azure DevOps, GitHub and GitLab.
* Knowledge of no-code/low-code data science platforms such as KNIME and/or Dataiku.
* Familiarity with the European Commission IT ecosystem and best practices
* Experience documenting and organising processes using task management tools (e.g., Jira, OpenProject) and documentation platforms (e.g., Confluence, GitLab Wiki, GitHub Wiki).