Ripjar

Data Engineer

Reposted 9 Days Ago
In-Office or Remote
Hiring Remotely in Bristol, England, GBR
Junior
Build and operate distributed ingestion and processing pipelines, ensure reliability and performance, define data contracts, add observability and testing, improve platform reliability and CI/CD, and participate in design/code reviews and incident retrospectives.
The summary above was generated by AI

About Ripjar

Ripjar was founded by veterans of GCHQ to bring national security-grade intelligence tools to the fight against financial crime. Financial crime funds human trafficking, terrorism, corruption and sanctions evasion on a global scale, and the organisations on the front line need technology built to match the threat.

Today, Ripjar's AI-native software and data fusion products are used by governments, the world's largest banks, and global enterprises to automate the detection, investigation and monitoring of serious financial crime. Every day, hundreds of customers and thousands of daily active users rely on the platform to screen hundreds of millions of names for risk in real time, prevent money laundering and stop terrorist financing.

If you want your work to matter, this is where it happens.

We are a remote-first team, with a head office based in Cheltenham. This position is open to UK-wide candidates. If you are based near Cheltenham, you are more than welcome to work from our office at any time.

About the Role

We see a Data Engineer as a software engineer who specialises in distributed data systems. You’ll join the Data Engineering team, whose prime responsibility is the development and operation of the Data Collection Hub, a platform that ingests data from many sources, processes/enriches it, and distributes it to multiple downstream systems.

We’re looking for someone with 2+ years of industry experience building and operating production software who enjoys working across data pipelines, distributed systems, and operational reliability.

What you’ll do

  • Engineer distributed ingestion services that reliably pull data from diverse sources, handle messy real-world edge cases, and deliver clean, well-structured outputs to multiple downstream products.
  • Build high-throughput processing components (batch and/or near-real-time) with a focus on performance, scalability, and predictable cost, using strong profiling and measurement practices.
  • Design and evolve data contracts (schemas, validation rules, versioning, backward compatibility) so downstream teams can build with confidence.
  • Own production quality: write maintainable code, strong unit/integration tests, and add the observability you need (metrics/logs/tracing) to diagnose issues quickly.
  • Improve platform reliability by hardening pipelines against partial failures, retries, rate limits, data drift, and infrastructure issues, then codify those learnings into better tooling and guardrails.
  • Contribute to CI/CD and developer experience: faster builds, better test signal, safer releases, and automated operational checks.
  • Participate in design reviews, code reviews, incident retrospectives, and iterative delivery, making pragmatic trade-offs and documenting them clearly.

Technology Stack

  • Languages: Predominantly Python and Node.js
  • Distributed/data platforms: HDFS, HBase, Spark, plus increasing use of Kubernetes and cloud services
  • Storage/search: MongoDB, OpenSearch
  • Orchestration: Airflow, Dagster, NiFi
  • Tooling: GitHub, GitHub Actions, Rundeck, Jira, Confluence
  • Deployment/config: Ansible (physical), Terraform / Argo CD / Helm (Kubernetes)
  • Development environment: MacBook (typical)

Requirements

Essential:

  • 2+ years building and operating production software systems
  • Fluency in at least one programming language (Python/Node.js a plus)
  • Experience debugging moderately complex systems and improving reliability/performance
  • Strong fundamentals: data structures, testing, version control, Linux basics

Nice to have:

  • Spark/PySpark experience
  • Hadoop ecosystem exposure (HDFS/HBase)
  • Workflow orchestration (Airflow/Dagster/NiFi)
  • Search/indexing (OpenSearch, MongoDB)
  • Kubernetes and infrastructure-as-code
  • Degree in Computer Science or another numerate subject

Benefits

  • Competitive salary DOE
  • 25 days annual leave plus your birthday off, in addition to bank holidays, rising to 30 days after 5 years of service
  • Remote working
  • Private family healthcare
  • 35-hour working week
  • Employee Assistance Programme
  • Company contributions to your pension
  • Pension salary sacrifice
  • Enhanced maternity/paternity pay
  • The latest tech, including a top-of-the-range MacBook Pro
