
Ripjar

Data Engineer

Reposted 9 Days Ago
In-Office or Remote
4 Locations
Junior
Build and operate distributed ingestion and processing pipelines, ensure reliability and performance, define data contracts, add observability and testing, improve platform reliability and CI/CD, and participate in design/code reviews and incident retrospectives.

About Ripjar

Ripjar specialises in the development of software and data products that help governments and organisations combat serious financial crime. Our technology is used to identify criminal activity such as money laundering and terrorist financing, enabling organisations to enforce sanctions at scale to help combat rogue entities and state actors.

Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an ever-growing archive of billions of news articles covering most languages going back over 30 years, sanctions and watchlist data provided by governments, and vast organisation and ownership datasets.

About the Role

We see a Data Engineer as a software engineer who specialises in distributed data systems. You’ll join the Data Engineering team, whose prime responsibility is the development and operation of the Data Collection Hub, a platform that ingests data from many sources, processes/enriches it, and distributes it to multiple downstream systems.

We’re looking for someone with 2+ years of industry experience building and operating production software who enjoys working across data pipelines, distributed systems, and operational reliability.

What you’ll do

  • Engineer distributed ingestion services that reliably pull data from diverse sources, handle messy real-world edge cases, and deliver clean, well-structured outputs to multiple downstream products.
  • Build high-throughput processing components (batch and/or near-real-time) with a focus on performance, scalability, and predictable cost, using strong profiling and measurement practices.
  • Design and evolve data contracts (schemas, validation rules, versioning, backward compatibility) so downstream teams can build with confidence.
  • Own production quality: write maintainable code, strong unit/integration tests, and add the observability you need (metrics/logs/tracing) to diagnose issues quickly.
  • Improve platform reliability by hardening pipelines against partial failures, retries, rate limits, data drift, and infrastructure issues—then codify those learnings into better tooling and guardrails.
  • Contribute to CI/CD and developer experience: faster builds, better test signal, safer releases, and automated operational checks.
  • Participate in design reviews, code reviews, incident retrospectives, and iterative delivery—making pragmatic trade-offs and documenting them clearly.
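The responsibilities above mention designing data contracts with validation rules, versioning, and backward compatibility. As a rough illustration only, here is a minimal sketch in Python (the team's predominant language) of what a versioned contract with a backward-compatible evolution might look like. The `Contract` API, field names, and `ARTICLE_V1`/`ARTICLE_V2` schemas are all invented for illustration and are not Ripjar's actual system:

```python
# Hypothetical sketch: a versioned data contract with validation and a
# backward-compatible schema evolution (adding an optional field).
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldSpec:
    name: str
    type_: type
    required: bool = True


@dataclass(frozen=True)
class Contract:
    version: int
    fields: tuple[FieldSpec, ...]

    def validate(self, record: dict[str, Any]) -> list[str]:
        """Return a list of violations; an empty list means the record conforms."""
        errors = []
        for spec in self.fields:
            if spec.name not in record:
                if spec.required:
                    errors.append(f"missing required field: {spec.name}")
            elif not isinstance(record[spec.name], spec.type_):
                errors.append(f"{spec.name}: expected {spec.type_.__name__}")
        return errors


# v1 of a hypothetical news-article contract
ARTICLE_V1 = Contract(1, (
    FieldSpec("id", str),
    FieldSpec("title", str),
    FieldSpec("published_at", str),
))

# v2 adds an *optional* language field, so every record that conformed to
# v1 still conforms to v2 -- a backward-compatible evolution.
ARTICLE_V2 = Contract(
    2, ARTICLE_V1.fields + (FieldSpec("language", str, required=False),)
)

record = {"id": "a1", "title": "Example", "published_at": "2024-01-01"}
print(ARTICLE_V1.validate(record))  # []
print(ARTICLE_V2.validate(record))  # [] -- old records still pass
```

The design point this sketches is the one the bullet hints at: downstream teams can keep consuming old records because new schema versions only add optional fields rather than changing or removing existing ones.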

Technology Stack

  • Languages: Predominantly Python and Node.js
  • Distributed/data platforms: HDFS, HBase, Spark, plus increasing use of Kubernetes and cloud services
  • Storage/search: MongoDB, OpenSearch
  • Orchestration: Airflow, Dagster, NiFi
  • Tooling: GitHub, GitHub Actions, Rundeck, Jira, Confluence
  • Deployment/config: Ansible (physical), Terraform / Argo CD / Helm (Kubernetes)
  • Development environment: MacBook (typical)

Requirements

Essential:

  • 2+ years building and operating production software systems
  • Fluency in at least one programming language (Python/Node.js a plus)
  • Experience debugging moderately complex systems and improving reliability/performance
  • Strong fundamentals: data structures, testing, version control, Linux basics

Nice to have:

  • Spark/PySpark experience
  • Hadoop ecosystem exposure (HDFS/HBase)
  • Workflow orchestration (Airflow/Dagster/NiFi)
  • Search/indexing (OpenSearch, MongoDB)
  • Kubernetes and infrastructure-as-code
  • Degree in Computer Science or another numerate discipline

Benefits

  • Competitive salary (DOE)
  • 25 days annual leave plus your birthday off, in addition to bank holidays, rising to 30 days after 5 years of service
  • Remote working
  • Private family healthcare
  • 35-hour working week
  • Employee Assistance Programme
  • Company contributions to your pension
  • Pension salary sacrifice
  • Enhanced maternity/paternity pay
  • The latest tech, including a top-of-the-range MacBook Pro

Top Skills

Python, Node.js, PySpark, Spark, HDFS, HBase, Hadoop, Kubernetes, MongoDB, OpenSearch, Airflow, Dagster, NiFi, GitHub, GitHub Actions, Rundeck, Jira, Confluence, Ansible, Terraform, Argo CD, Helm


