Senior Data Engineer
Mass General Brigham
Mass General Brigham relies on a wide range of professionals, including doctors, nurses, business people, tech experts, researchers, and systems analysts, to advance our mission. As a not-for-profit, we support patient care, research, teaching, and community service, striving to provide exceptional care. We believe that high-performing teams drive groundbreaking medical discoveries and invite all applicants to join us and experience what it means to be part of Mass General Brigham.
Job Summary
Channing Division of Network Medicine - Chronic Disease & Epidemiology Unit
The Data Engineer will play a crucial role in our research enterprise by modernizing the data warehouse of the Nurses’ Health Studies, a respected and far-reaching investigation into women’s health. We are seeking an individual who can redesign the data interface and successfully increase access and efficiency by improving the data model, structure, and workflow. The NHS research platform has been well served for decades by a homegrown computing infrastructure. This legacy system is now showing its age, however, affecting productivity and data-sharing ability. The NHS data collection continues to grow in scale and scope, requiring advanced and innovative methodologies for storage, access, sharing, and analysis.
We seek a talented and experienced Data Engineer to build interactive access queries through Gen3 (gen3.org) and to develop accurate and automated data processing pipelines that implement validation tools. The ideal candidate will excel at self-direction, task management, creative problem-solving, and iterative communication with stakeholders. The Data Engineer must be someone who thrives in a dynamic academic research environment.
The Nurses’ Health Studies (NHS), renowned for epidemiological research of chronic disease, encompass four primary studies with over 300,000 participants followed for over four decades. The NHS research platform includes extensive questionnaire data covering disease, lifestyle, and nutrition, a biobank with millions of specimens and thousands of associated biomarkers, GIS-based exposure measures, imaging, and a wide range of disease outcomes. NHS researchers are among the world's foremost health researchers from Mass General Brigham, Harvard T.H. Chan School of Public Health, and other top institutions worldwide. Our data infrastructure team includes experts with decades of experience and dedication to the NHS. NHS researchers have published thousands of scientific papers that have helped improve the treatment and prevention of chronic disease.
Qualifications
PRINCIPAL DUTIES AND RESPONSIBILITIES:
Design and implement scalable data pipelines and storage solutions to accommodate the growth of NHS
Optimize data access and retrieval methods for maximum efficiency
Attain project goals through individual contributions and successful development team leadership
Develop and oversee an integration plan for new processes and tools
Meet regularly with stakeholders to assess data requirements and validate solutions
Combine strategic planning with hands-on work when delivering action plans for data and analytics solutions to stakeholders
Establish data governance practices to ensure data quality, security, and compliance
Understand and maintain health data compliance requirements
QUALIFICATIONS:
Master’s or PhD in Computer Science, Data Science, Statistics, or related field
3+ years data engineering experience
In-depth knowledge of GraphQL and/or other graph query languages
Experience with Gen3 query development and data submission workflows
Experience with Kubernetes management. Extensive knowledge of PostgreSQL.
PostgreSQL and Kubernetes experience is mandatory.
Previous experience with SQL preferred
Extensive UNIX scripting experience for Kubernetes and file operations
Proficient in Python and/or Java, including the use of APIs or web services
Experience with pandas and large-scale data movement. Experience with data modeling and architecture.
Understanding of health data compliance, security plans, data protection, and the types of controls associated with them is a plus
SKILLS/ ABILITIES/ COMPETENCIES REQUIRED:
Promote a collegial, team-oriented work style
Work with IT and Project Management personnel to establish strategy, deadlines and resource needs
Foresee obstacles, identify workarounds, leverage resources, and rally teammates
Proven analytical and problem-resolution skills
Proactive, innovative, self-motivated, and able to focus in a dynamic working environment
Strong grasp of business processes, industry trends, and organizational goals to align technology decisions with business objectives
Stay current with updates to established data technologies as well as emerging technologies
Excellent presentation skills
Excellent written and oral communication skills are critical; must summarize and present data and information in a highly effective manner by phone, email, or video chat
Additional Job Details (if applicable)
Physical Requirements
- Standing Occasionally (3-33%)
- Walking Occasionally (3-33%)
- Sitting Constantly (67-100%)
- Lifting Occasionally (3-33%) 20lbs - 35lbs
- Carrying Occasionally (3-33%) 20lbs - 35lbs
- Pushing Rarely (Less than 2%)
- Pulling Rarely (Less than 2%)
- Climbing Rarely (Less than 2%)
- Balancing Occasionally (3-33%)
- Stooping Occasionally (3-33%)
- Kneeling Rarely (Less than 2%)
- Crouching Rarely (Less than 2%)
- Crawling Rarely (Less than 2%)
- Reaching Occasionally (3-33%)
- Gross Manipulation (Handling) Constantly (67-100%)
- Fine Manipulation (Fingering) Frequently (34-66%)
- Feeling Constantly (67-100%)
- Foot Use Rarely (Less than 2%)
- Vision - Far Constantly (67-100%)
- Vision - Near Constantly (67-100%)
- Talking Constantly (67-100%)
- Hearing Constantly (67-100%)
Remote Type
Work Location
Scheduled Weekly Hours
Employee Type
Work Shift
Pay Range
$92,102.40 - $134,056.00/Annual
Grade
7
EEO Statement:
Mass General Brigham Competency Framework
At Mass General Brigham, our competency framework defines what effective leadership “looks like” by specifying which behaviors are most critical for successful performance at each job level. The framework comprises ten competencies (half People-Focused, half Performance-Focused), which are defined by observable and measurable skills and behaviors that contribute to workplace effectiveness and career success. These competencies are used to evaluate performance, make hiring decisions, identify development needs, mobilize employees across our system, and establish a strong talent pipeline.