
Definitive Healthcare

Data Engineer

Reposted 2 Days Ago
Remote or Hybrid
Hiring Remotely in Framingham, MA
Mid level
Design, build, and maintain scalable ETL/ELT pipelines using Python, Spark, Databricks, Airflow, and SSIS. Integrate and cleanse diverse healthcare datasets, implement Unity Catalog for metadata and governance, tune Spark and JVM performance, support the Medallion architecture, and collaborate with cross-functional teams to automate CI/CD, observability, and data quality processes.
The summary above was generated by AI
About Definitive Healthcare:
At Definitive Healthcare (NASDAQ: DH), we're passionate about turning data, analytics, and expertise into meaningful intelligence that helps our customers achieve success and shape the future of healthcare. We empower them to uncover the right markets, opportunities, and people, paving the way for smarter decisions and greater impact.
Headquartered just outside of Boston, Massachusetts, Definitive Healthcare operates across North America, Europe, and India, supporting a growing global client base of more than 2,400 customers since our founding in 2011.
We're also a great place to work. In 2024 and 2025, we earned multiple workplace honors, including Built In's 100 Best Places to Work in Boston (both years), a Stevie Bronze Award for Great Employers, and recognition as a Great Place to Work in India.
We foster a collaborative, inclusive culture where diverse perspectives drive innovation. Through programs like DefinitiveCares and our employee-led affinity groups, we strive to promote connection, education, and inclusion.
We are looking for a Data Engineer who is passionate about building scalable data pipelines, working with complex healthcare datasets, and contributing to a modern, cloud-native data architecture.
If you thrive in a fast-paced, data-driven environment and have strong experience with Python, Spark, Databricks, AWS, SQL, and related technologies, we'd love to hear from you.
What You'll Do:
Design and Develop Data Pipelines:
  • Develop and maintain robust data pipelines using Python, Spark, Databricks, SQL, and SSIS
  • Develop, maintain, and optimize ETL workflows using SQL Server Integration Services (SSIS) within Visual Studio, enabling reliable data ingestion, transformation, and automation across enterprise data pipelines
  • Implement and orchestrate ETL/ELT workflows using SSIS
  • Build reliable, repeatable processes that support the ingestion and transformation of large healthcare datasets

Data Integration and Management:
  • Integrate data from diverse sources (AWS, on-prem, third-party vendors) into our enterprise data platform
  • Work with a wide range of file formats including CSV, XML, Parquet, Delta, and more
  • Apply strong data quality, cleansing, and curation practices to ensure accuracy and consistency
  • Optimize storage and compute resources for performance, cost, and scalability
  • Automate observability and monitoring across data pipelines and workloads

Metadata Management and Governance:
  • Implement and manage Unity Catalog for metadata, lineage, and access control
  • Ensure adherence to data governance, security, and privacy standards
  • Maintain clear documentation, data dictionaries, and lineage tracking
  • Contribute to automation of data observability and governance workflows

Performance Tuning and Troubleshooting:
  • Tune and optimize Spark jobs for speed, reliability, and cost efficiency
  • Diagnose and resolve performance bottlenecks across distributed systems
  • Apply JVM tuning and Spark optimization techniques to improve throughput

Data Maturity Lifecycle:
  • Support and enhance our Medallion architecture (bronze/silver/gold) to improve data quality and usability
  • Ensure data is processed, enriched, and validated at each stage of the lifecycle

Collaboration and Continuous Improvement:
  • Partner with data scientists, analysts, product teams, and business stakeholders to understand data needs
  • Implement CI/CD pipelines to streamline deployment and testing of data assets
  • Stay current with emerging technologies and bring forward recommendations to evolve our data platform

What You Bring:
Technical Skills:
  • Strong programming experience in SQL and Python or Scala
  • Hands-on experience with Apache Spark and Databricks
  • Knowledge of data cleansing, curation, and quality frameworks
  • Familiarity with Unity Catalog or other metadata management tools
  • Understanding of data governance, security, and compliance best practices
  • Experience working with AWS cloud services
  • Experience implementing or working within a Medallion architecture

Soft Skills:
  • Strong analytical and problem-solving abilities
  • Excellent communication and cross-functional collaboration skills
  • Ability to work independently and within a team environment
  • High attention to detail and commitment to quality

Preferred Qualifications:
  • AWS certifications (e.g., AWS Certified Data Analytics)
  • Experience with SQL and NoSQL databases
  • Background in a fast-paced, data-centric SaaS or healthcare environment

Compensation and Benefits
The salary range for this position is $69,000 - $129,000 per year, which represents the base pay the company reasonably and in good faith expects to pay for this role. Actual pay within this range will be determined based on factors such as relevant experience, skills, and qualifications.
Depending on the position, employees may also be eligible to participate in a company bonus or commission plan. All employees are eligible for a comprehensive benefits package, including medical, dental, and vision coverage, unlimited paid time off, and participation in the company's 401(k) plan with employer contribution.
Why we love Definitive, and why you will too!
  • Industry-leading products
  • Work hard, and have fun doing it
  • Incredibly fast growth means limitless opportunity
  • Flexible and dynamic culture
  • Work alongside some of the most talented and dedicated teammates
  • DefinitiveCares, our community service group, gives all of us a chance to give back
  • Competitive benefits package including great healthcare benefits and a 401(k) match

What our Employees are saying about us on Glassdoor:
"Great Work atmosphere, great work life balance, excellent company to work for, amazing top notch product, incredible customer service, lots of tools to help you succeed."
-Business Development Manager
"Great team. Amazing growth. Employees are treated very well."
-Research Analyst
"I have waited 36 years to work at a dream job for a dream company and I am so happy to have finally got there."
-Profile Analyst
If you don't fit all of these qualifications, but believe you're still a great fit, feel free to apply and tell us why in your cover letter.
If you are a California, Colorado, New York City or Washington resident and this role is a remote role, you can receive additional information about the compensation and benefits for this role, which we will provide upon request.
Definitive Hiring Philosophy
Definitive Healthcare is an equal opportunity employer that celebrates diversity and is committed to creating an inclusive workplace with equal opportunity for all applicants and teammates. Our goal is to recruit the most talented people from a diverse candidate pool regardless of race, color, religion, age, gender, gender identity, sexual orientation, or any other status. If you're interested in working in a fast-growing, exciting work environment, we encourage you to apply!
Privacy
Your privacy is important to us. Please review our Candidate Privacy Notice which tells you how we use and process your personal information.
Please note: All communications regarding the hiring process at Definitive Healthcare will come directly from one of our corporate recruiters or coordinators using an @definitivehc.com email address. We do not advertise open roles on Facebook and will never request money transfers or ask candidates to purchase equipment with a promise of reimbursement. If you receive any suspicious communication, please contact [email protected] to verify your status in the application process.

Top Skills

Apache Airflow
Spark
AWS
CSV
Databricks
Delta
GitLab CI
Jenkins
JVM
Medallion Architecture
NoSQL
Parquet
Python
Scala
SQL
SSIS
Unity Catalog
XML
