Certificate in Clinical Informatics and Data Science: 3 Digital Badges

For More Information

For more information about the Certificate in Clinical Informatics and Data Science, please contact Noëlle Foster, PhD, at nsf44@rwjms.rutgers.edu.

For more information on badging, please see the Microcredentialing and Digital Badging at Rutgers University and Digital Badges at RBHS pages.

Description of the Clinical Informatics and Data Science Certificate

This certificate will be granted upon completion of a sequence of three digital badges covering the process and methods of machine learning as used in clinical research and reporting.  It is designed for students who are already professionally engaged in health care or public health and who want to improve their technical skills.  Using asynchronous lectures, quizzes, and a final project, students will develop the knowledge and skills needed to take advantage of new health information technologies for clinical research, quality reporting, surveillance, and care delivery improvement.  Topics center largely on various methods of machine learning as applied in health care and the context in which they are used, including compliance and legal issues and fair AI considerations.

Each digital badge is taught asynchronously, begins on the first day of every other month, and stays open for eight weeks.  Knowledge and skills are assessed via a quiz at the end of each module.  Participants must score a minimum of 80% on each quiz before moving on to the next module.  The final project is incorporated into the third digital badge and requires the student to design and execute their own project, which is assessed by program staff.  Because the final project requires a fair amount of advance planning, please review the final project description to ensure you can meet all the requirements before deciding to pursue the full certificate.

The ideal participant for this certificate will possess a background in clinical research, public health, or healthcare delivery, and have a basic understanding of statistics and computing.  Familiarity with HIPAA and the health care environment is necessary.

Earners of this certificate will have demonstrated an understanding of the process and methods for applying advanced machine learning and data science methods to clinical data for research, clinical, and public health reporting use. This certificate is issued by the Robert Wood Johnson Medical School and the New Jersey Alliance for Clinical and Translational Science (NJACTS).


Course Instructors

Branimir Ljubic, Ph.D., M.D. – Dr. Ljubic will be the primary instructor for the core technical elements (computer science methods, machine learning) of the program.  He is widely published on the use of machine learning and other computer methods in clinical applications.  He holds a Ph.D. in Computer and Information Science from Temple University and an M.D. from Medical School, University of Belgrade.  He is the Senior Scientist for Clinical, Medical Informatics and Advanced Research Computing in the Rutgers Office of Advanced Research Computing (OARC).  The OARC is the university’s centralized resource for research computing, including machine learning and clinical informatics.  As part of the Office of Information Technology (OIT), OARC provides Rutgers researchers with essential computing, networking, storage, and data-handling capabilities, and offers students exposure, training, and education in advanced research computing (HPC), clinical and medical informatics, machine learning methods, data analysis, and programming languages.  In addition to these core services, OARC manages critical resources such as the Clinical Research Data Warehouse (CRDW), REDCap for secure data collection and management, and the Research Data Storage (RDS) system.

Noëlle Foster, Ph.D. – Dr. Foster specializes in the collection and preparation of secondary data for use in research, and will teach the compliance and data preparation sections.  She also serves as the overall badge coordinator and administrator, and is your first point of contact for general questions.  She is the Principal Clinical Informatics Analyst for the RWJMS Psychiatry Department.  She holds an MS in Information Technology (Database Systems) and a Ph.D. in Science and Technology Studies, both from Rensselaer Polytechnic Institute.

Course One: Core Skills

This is the first of three digital badges that comprise the Certificate in Clinical Informatics and Data Science.  Participants will be assessed on core technical skills, the data science workflow, and the compliance and legal context in which clinical data is collected and used.


  • Introduction and Getting Started
  • Compliance Considerations and Secondary Data Use
  • Basics of Data Cleanup and Preparation
  • Clinical Informatics and Advanced Research Computing
  • Introduction to Machine Learning
  • Linear Models: Linear Regression, SVM, Logistic Regression
  • Traditional Learning Models: k-Nearest Neighbor
  • Traditional Learning Models: Decision Trees
  • Ensemble Methods: Random Forest
  • Traditional Learning Models: Perceptron

Course Two: Advanced Skills

In this second badge, we will cover more sophisticated methods for knowledge discovery.


  • Deep Machine Learning Methods: Artificial Neural Networks
  • Deep Machine Learning Methods: Convolutional Neural Networks
  • Deep Machine Learning Methods: Recurrent Neural Networks
  • Deep Machine Learning Methods: Generative Adversarial Networks
  • Data Mining Introduction
  • Data Mining Methods
  • Unsupervised Learning: Clustering

Course Three: Specialized Methods

This badge will focus on specialized methods for unstructured data.  We will also touch on emerging social considerations of machine learning such as the use of artificial intelligence in clinical sites and its potential to ameliorate or reinforce social disadvantages.  This badge will include the final project. 


  • Probabilistic Models: Bayes
  • Social Networks: Basic Theory
  • Social Networks: Random and Power Law Networks
  • Natural Language Processing
  • Emerging Considerations in Clinical Data
  • Image Processing
  • Final Project

Final Project

For the final project, each student will be expected to design and execute a machine learning project on a subject of their choosing.  The student will be responsible for sourcing the data, including any compliance and IRB considerations.  In the event the student wishes to use data from an existing project, it will be the student’s responsibility to add the named program staff to their IRB protocol to allow data sharing in advance of project start. 

There are open de-identified data sets that can be used for this project.

The student will turn in two products for successful project completion:

  • Project proposal (3 pages):
    1. A description of the proposed project, including the question you want this project to answer
    2. A high-level description of the selected dataset
    3. The method selected and a justification for why it is the best solution to answer your proposed question
    4. What transformations, if any, will be made to the data to facilitate use in the chosen method
    5. A work plan, including the estimated level of effort for each task. This can be in the form of a Gantt or PERT chart.
  • A final presentation (10 minutes) and written report (3 pages)
    1. The presentation is to be recorded and uploaded to Canvas.  It should be framed as a discussion of the results and the answer to your question, as if presenting to a supervisor, and the slides should include any associated tables or visualizations.
    2. The three-page report should discuss the process itself, including any unexpected problems along the way, how realistic your work plan ended up being, and your lessons learned for the next project.

Each student is expected to work on their own individual project.  If members of the same team are participating at the same time and wish to collaborate on an existing project, that may be possible provided each student owns different pieces of the project and can justify which parts are whose.

Projects will be graded pass/fail. 

No Course Fees

This program is free, funded entirely by NJACTS (UL1 TR003017, KL2 TR003018, and TL1 TR0030). 

Frequently Asked Questions (FAQ)

Is it really free?

Yes, it is.  This program is part of NJACTS and funded under UL1 TR003017, KL2 TR003018, and TL1 TR0030.

I’m a medical, nursing, or undergraduate student planning to pursue a clinical career, so not quite yet professionally engaged.  Can I enroll?

Medical and nursing students are welcome to enroll.  Undergraduates should already have some evidence of clinical work experience, such as internships, volunteer activities, or lab assistant positions.  Please contact Dr. Foster to evaluate whether your experience is sufficient.

How strong should my computer skills already be?

This certificate assumes familiarity with computing sufficient to do 100-level undergraduate statistics, which would include the basics of data technologies such as R and Python.  Resources for learning these tools are made available at the beginning of the certificate, but the tools themselves will not be taught.  If you have never heard of either, you are unlikely to be able to keep up with the lectures, which will focus on the machine learning libraries.

Do I have to be CITI certified to enroll?

If you are planning to do your final project using something other than an open and freely available de-identified dataset, you will need to meet all the obligations of your governing IRB, which will include CITI training. 

Will there be protected health information in the materials?

No.  All data used in lectures and presentations will be synthetic or fully de-identified.  If your final project will involve PHI, you will need to add the reviewers to your IRB protocol before it can be assessed.

How long will it take for me to complete the certificate?

You will have one year from the start to complete the full certificate.  If you work through steadily and without pausing, it could be done in six months, depending on your final project.

What are the learning outcomes associated with earning this certificate?

This certificate is intended to give people who are already well grounded in health care the technical skills needed to make use of health data.  Specifically, earners will be able to:

  1. Determine the best approach to answering questions using clinical data-driven computational methods
  2. Break down the question for solution using available data and machine learning tools
  3. Design and complete a project based on these approaches
  4. Understand the context in which these efforts are undertaken, including legal and social considerations

It is designed as a complement to an existing clinical career and is not intended to prepare you for a purely technical career in health information technology. 

Where or how will I be able to apply this information?

There are three areas in which this kind of work is necessary and useful: research, quality assurance and reporting, and clinical care delivery, which can use machine learning algorithms to guide suggestions for improved care at the time of delivery. 
