Xebia sp. z o.o. - Senior Data Engineer (Java & Spark)

Job description

Hello, let’s meet!

Who We Are

While Xebia is a global tech company, in Poland our roots come from two teams – PGS Software, known for world-class cloud and software solutions, and GetInData, a pioneer in Big Data. Today, we’re a team of 1,000+ experts delivering top-notch work across cloud, data, and software. And we’re just getting started.

What We Do

We work on projects that matter – and that make a difference. From fintech and e-commerce to aviation, logistics, media, and fashion, we help our clients build scalable platforms, data-driven solutions, and next-gen apps using ML, LLMs, and Generative AI. Our clients include Spotify, Disney, ING, UPS, Tesco, Truecaller, AllSaints, Volotea, Schmitz Cargobull, Allegro, and InPost.

We value smart tech, real ownership, and continuous growth. We use modern, open-source stacks, and we’re proud to be trusted partners of Databricks, dbt, Snowflake, Azure, GCP, and AWS. Fun fact: we were the first AWS Premier Partner in Poland!

Beyond Projects

What makes Xebia special? Our community. We run events like the Data&AI Warsaw Summit, organize meetups (Software Talks, Data Tech Talks), and have a culture that actively supports your growth via Guilds, Labs, and personal development budgets – for both tech and soft skills. It’s not just a job. It’s a place to grow.

What sets us apart?

Our mindset. Our vibe. Our people. And while that’s hard to capture in text – come visit us and see for yourself.

You will be:

developing new features and integrations within OpenLineage and the Marquez metadata service,
building integrations with data processing systems such as Apache Spark, Airflow, dbt, and others,
collaborating with open-source users, partners, and the wider community,
sharing your work through articles, blogs, or talks at open-source events (optionally).

Job requirements

Your profile:

5+ years of professional experience in data engineering,
hands-on experience with OpenLineage,
strong programming skills in Java,
proven experience with Apache Spark,
solid understanding of data pipeline orchestration and transformation (Airflow, dbt),
comfortable working in collaborative, open-source environments,
communicative, proactive, and eager to work at the intersection of data engineering and open-source development.

Nice to have:

experience with Hive, Flink, BigQuery, Snowflake,
knowledge of Amundsen or Looker,
familiarity with modern collaboration tools (GitHub, JIRA, Slack, Google Meet).

Working from the EU and a valid permit to work in the EU are required.

Candidates must have an active VAT status in the EU VIES registry: https://ec.europa.eu/taxation_customs/vies/

Recruitment Process:

CV review – HR call – Technical Interview (with live coding) – Client Interview (with live coding) – Hiring Manager Interview – Decision
