We’re building a world of health around every individual — shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold ourselves accountable and prioritize safety and quality in everything we do. Join us and be part of something bigger – helping to simplify health care one person, one family and one community at a time.
Position Summary
CVS Health's Analytics & Behavior Change (A&BC) team is an organization working to solve some of the most challenging problems at the intersection of technology and healthcare. A&BC leverages advanced analytics, clinical informatics, and hypothesis-driven approaches to transform data into actionable, customer-centric insights that drive growth, improve health outcomes, and expand access to healthcare across all CVS Health businesses. Our teams build next-generation data and AI products that help power CVS Health to make healthier happen for 100+ million customers.
The A&BC organization is looking to grow its Clinical Data Science & AI team. Join us as we embark on an exciting journey to drive a transformational shift in how CVS Health leverages clinical data and analytics to become the leader in consumer healthcare in the U.S.
As a Data Scientist - Clinical AI, you are tasked with activating CVS Health's clinical data repository to improve outcomes across multiple lines of business and use cases. You will serve as a bridge between clinical data assets and the analysts, data scientists, and business partners who consume them, ensuring data is accessible, well-documented, fit for purpose, and aligned with clinical and regulatory standards.
You Will
- Extract signal from unstructured clinical text. Apply NLP and language model techniques to clinical notes, CCD documents, and other free-text clinical data to generate structured, actionable features for downstream analytics and predictive models.
- Build and fine-tune Small Language Models (SLMs). Design, train, and evaluate domain-specific SLMs tailored to clinical use cases — balancing performance, cost, latency, and compliance requirements.
- Utilize LLMs where applicable. Leverage large language models where they add clear value (e.g., training data creation, entity extraction, zero-shot classification) while knowing when traditional ML, rules-based approaches, or simpler statistical methods are the right tool for the job.
- Develop predictive analytics solutions. Build and validate predictive models using both classical ML (gradient boosting, logistic regression, survival analysis) and modern deep learning approaches to support clinical decision-making and population health initiatives.
- Conduct rigorous Exploratory Data Analysis (EDA). Deeply explore clinical datasets — structured and unstructured — to uncover patterns, assess data quality, identify feature candidates, and inform modeling strategy before jumping to solutions.
- Communicate findings clearly. Present methodology, results, and recommendations to technical and non-technical stakeholders through well-crafted visualizations, notebooks, and presentations. Translate complex AI/ML concepts into language that clinical and business partners can act on.
- Collaborate across teams. Work with machine learning engineers, data engineers, clinical informaticists, and business partners to ensure clinical data pipelines support AI/ML workflows and that model outputs are integrated into products and decision-making processes.
- Stay current and stay curious. Continuously evaluate emerging techniques in NLP, foundation models, and clinical AI. Bring new ideas to the team, prototype rapidly, and advocate for approaches grounded in evidence rather than hype.
- Uphold data governance standards. Ensure all work complies with HIPAA, data privacy regulations, and internal data stewardship policies, particularly when handling PHI and unstructured clinical text.
Required Qualifications
- 2+ years of experience in data science, machine learning, or applied NLP — preferably in healthcare or a similarly regulated domain.
- Hands-on experience with NLP — text preprocessing, tokenization, named entity recognition (NER), text classification, topic modeling, or similar techniques applied to real-world unstructured data.
- Practical experience with LLMs and/or SLMs — prompt engineering, fine-tuning, RAG architectures, evaluation frameworks, or deploying language models in production or research settings.
- Strong foundation in traditional machine learning — supervised and unsupervised methods, feature engineering, model selection, cross-validation, and performance evaluation.
- Best coding practices – you use version control (Git/GitHub), commit your work regularly, write clean and reproducible code, and understand that well-organized repositories are as important as well-built models.
- Deep EDA skills — ability to systematically explore datasets, identify data quality issues, surface insights, and make informed decisions about modeling approach before writing a single line of model code.
- Proficiency in Python (pandas, scikit-learn, PyTorch or TensorFlow, Hugging Face Transformers) and SQL for working with large-scale healthcare datasets.
- Experience with cloud-based data and ML platforms, preferably Google Cloud Platform (GCP) — BigQuery, Vertex AI, or equivalent.
- Excellent presentation and communication skills — you can stand in front of a room and clearly explain what you built, why you built it that way, and what it means for the business.
- Judgment and common sense — you understand that not every problem needs an LLM, you meet your deadlines, you ask for help when you're stuck, and you don't over-engineer solutions.
- A genuine curiosity and desire to learn — you read papers, you try new tools, you ask "why," and you're energized by problems you haven't solved before. You know when a rabbit hole is worth diving into and when to pull back, stay focused, and deliver.
Preferred Qualifications
- Experience working with clinical text data — clinical notes, discharge summaries, pathology reports, or similar unstructured healthcare documents.
- Knowledge of clinical coding systems and terminologies (ICD-10, SNOMED-CT, LOINC, RxNorm, CPT, NDC, UMLS) and their relevance to NLP pipelines.
- Familiarity with clinical data standards (HL7, FHIR, CCD/C-CDA) and common data models (e.g., OMOP).
- Experience building or contributing to clinical NLP pipelines — entity extraction, relation extraction, negation detection, or section segmentation from clinical narratives.
- Experience with model evaluation in clinical contexts — understanding of sensitivity/specificity tradeoffs, clinical validation, and responsible AI practices in healthcare.
- Familiarity with MLOps practices — model versioning, experiment tracking, CI/CD for ML, model monitoring.
- Experience working directly with clinical stakeholders (physicians, nurses, clinical operations teams, etc.) and tailoring presentations, findings, and recommendations to the appropriate audience level – from executive summaries for leadership to detailed methodology reviews for technical notes.
- Privacy, security, and compliance experience: HIPAA/HITRUST, de-identification/tokenization, PHI handling.
Education
- Bachelor’s degree in health informatics, biostatistics, computer science, data science, mathematics, biomedical informatics, or a related field, or an equivalent combination of formal education and experience.
- Master's degree or higher in Health Informatics, Biomedical Informatics, Clinical Informatics, Public Health, Epidemiology, Data Science, or a related field is a plus – but not a substitute for demonstrated ability to ship real-world solutions.
- Clinical background (RN, PharmD, MD, or similar) with transition into data science or AI is a genuine differentiator for this role.
Anticipated Weekly Hours
40
Time Type
Full time
Pay Range
The Typical Pay Range For This Role Is
$79,310.00 - $158,620.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above.
Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.
Great Benefits For Great People
We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families.
This full‑time position is eligible for a comprehensive benefits package designed to support the physical, emotional, and financial well‑being of colleagues and their families. The benefits for this position include medical, dental, and vision coverage, paid time off, retirement savings options, wellness programs, and other resources, based on eligibility.
Additional details about available benefits are provided during the application process and on Benefits Moments.
We anticipate the application window for this opening will close on: 07/31/2026
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.