Mukherjee Lab

Glioblastoma Outcomes & Palliative Care Analysis

I collaborated with the Mukherjee Lab to analyze National Cancer Database (NCDB) data
and evaluate how palliative care and facility characteristics affect outcomes for
brain tumor patients. I contributed to data extraction, processing, and segmentation,
and to the analytic pipelines and summary tables that informed our findings.

My Role: Data analyst and research assistant
Tools: R, Python, pandas, NumPy, lifelines, Matplotlib
Timeline: Jan 2024 – Present
Focus Areas: Survival analysis, logistic regression, data segmentation

R Code Repository

Statistical analyses and model logic were organized in R. The local script includes cohort filtering, recoding, descriptive statistics, and tests for group differences.

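# Cohort setup: split the NCDB extract by PALLIATIVE_CARE code
# Code 0 -> citrue
citrue <- NCDBMeningioma_1[
  NCDBMeningioma_1$PALLIATIVE_CARE == 0, ]

# Code 4 -> pctrue
pctrue <- NCDBMeningioma_1[
  NCDBMeningioma_1$PALLIATIVE_CARE == 4, ]

# Combined cohort. Assumed to be the row-bind of the two subsets,
# since this excerpt does not show where ciandpctrue is defined.
ciandpctrue <- rbind(citrue, pctrue)

# Binary recode: "1" where PALLIATIVE_CARE == 4, "0" otherwise
plcrecode <- factor(ifelse(
  ciandpctrue$PALLIATIVE_CARE == 4,
  "1", "0"))

# Compare age between cohorts with a Wilcoxon rank-sum test;
# exact = FALSE requests the normal approximation
agep <- wilcox.test(
  citrue$AGE, pctrue$AGE,
  exact = FALSE)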
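
For readers who want to see what the age comparison computes, here is a minimal standard-library Python sketch of a two-sided Wilcoxon rank-sum test using the normal approximation with tie and continuity corrections, which is what `wilcox.test(..., exact = FALSE)` requests under R's defaults. The function name `rank_sum_test` and the sample rows are made up for illustration; they are not part of the lab's script or the NCDB data.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (W, p), where W is the rank sum of x minus its minimum value.
    """
    n_x, n_y = len(x), len(y)
    n = n_x + n_y
    pooled = sorted(list(x) + list(y))

    # Give each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w = sum(rank_of[v] for v in x) - n_x * (n_x + 1) / 2

    # Variance of W under the null, corrected for ties.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    variance = n_x * n_y / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # every observation tied: no evidence of a shift
        return w, 1.0

    diff = w - n_x * n_y / 2
    # Continuity correction: shrink the difference by 0.5 toward zero.
    correction = math.copysign(0.5, diff) if diff != 0 else 0.0
    z = (diff - correction) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w, p


# Made-up rows for illustration only; not NCDB data.
records = [
    {"AGE": 52, "PALLIATIVE_CARE": 0},
    {"AGE": 61, "PALLIATIVE_CARE": 0},
    {"AGE": 58, "PALLIATIVE_CARE": 0},
    {"AGE": 66, "PALLIATIVE_CARE": 0},
    {"AGE": 73, "PALLIATIVE_CARE": 4},
    {"AGE": 79, "PALLIATIVE_CARE": 4},
    {"AGE": 68, "PALLIATIVE_CARE": 4},
]

# Same split as the R script: code 0 versus code 4.
curative_ages = [r["AGE"] for r in records if r["PALLIATIVE_CARE"] == 0]
palliative_ages = [r["AGE"] for r in records if r["PALLIATIVE_CARE"] == 4]

w_stat, p_value = rank_sum_test(curative_ages, palliative_ages)
print(f"W = {w_stat}, p = {p_value:.4f}")
```

The sketch assumes at least two observations in total and no missing ages; a real analysis would handle missing values before testing.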

Python Project (Data Segmentation)

I also contributed to a Python project structure used for segmentation and inference workflows, with reusable training, evaluation, and inference entry points.

gbm_segmentation_project/
|-- Resources/Toby/
|   |-- main.py
|   |-- train.py
|   |-- inference.py
|   |-- train_unetr.py
|   |-- transformer.py
|   |-- utils.py
|   |-- config.py
|   |-- requirements.txt
|   |-- combine_views.ipynb
|   `-- model_train.sh

mode = train | evaluate | inference
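
A mode switch like the one above is often wired up with `argparse` and a dispatch table. The following is a minimal sketch under that assumption: the `--mode` flag, the `run_*` handlers, and their return values are hypothetical stand-ins, not the contents of the project's main.py.

```python
import argparse


# Hypothetical handlers; a real project would call into its own
# training, evaluation, and inference code here.
def run_train():
    return "training"


def run_evaluate():
    return "evaluating"


def run_inference():
    return "running inference"


# Dispatch table: one entry per supported mode.
MODES = {
    "train": run_train,
    "evaluate": run_evaluate,
    "inference": run_inference,
}


def build_parser():
    parser = argparse.ArgumentParser(description="Segmentation entry point (sketch)")
    parser.add_argument("--mode", choices=sorted(MODES), default="train",
                        help="which workflow to run")
    return parser


def main(argv):
    """Parse an explicit argument list and run the selected workflow."""
    args = build_parser().parse_args(argv)
    return MODES[args.mode]()


print(main(["--mode", "evaluate"]))
```

Keeping the modes in a dictionary means adding a workflow is a one-line change, and `choices` makes argparse reject an unknown mode before any work starts.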

Open Python Entry Point

Key Contributions

  • Segmented and cleaned national database records for downstream analysis.
  • Built reproducible processing and variable-derivation workflows.
  • Performed exploratory analysis to inform downstream modeling.
  • Assisted with Kaplan-Meier survival analysis and logistic regression modeling.
  • Generated publication-ready tables and figure-ready summaries.

KM Curve by Facility Type

p = 0.00043

Survival differs by facility characteristics after cohort stratification.

KM Curve by Treatment Intent

p < 0.0001

Treatment intent separates the survival curves strongly across follow-up.

KM Curve for Palliative Care

p < 0.0001

Palliative-care status was associated with a distinct survival profile.
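
To make the curves above concrete, here is a small standard-library sketch of the Kaplan-Meier product-limit estimator. It is an illustration only: the function name and the toy follow-up data are invented, and in practice a library such as lifelines (listed under Tools) provides this through its `KaplanMeierFitter` class.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient
    events:    1 if the event (death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Toy follow-up data for illustration only; not NCDB data.
toy_durations = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]

for time, prob in kaplan_meier(toy_durations, toy_events):
    print(f"t = {time}: S(t) = {prob:.2f}")
```

Censored patients never trigger a drop in the curve, but they do shrink the risk set, which is why later steps are larger than a naive proportion would suggest.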

Baseline Characteristics

Characteristic   Curative   Palliative   P
Mean Age         64 ± 15    73 ± 15      <0.001
Female           26.9%      29.7%        0.299
White            87.1%      82.3%        Ref
Black            12.5%      10.5%        0.400

Logistic Regression

Variable         Univariable OR   Multivariable AOR   P
Increasing Age   1.05             1.05                <0.001
Charlson-Deyo    2.29             1.30                0.344
Insurance        1.14             1.73                0.447
Facility Type    2.53             5.36                0.001
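
One way to read the age row: if the odds ratio of 1.05 is per one-year increase in age and age enters the model as a linear term (both are assumptions, since the table does not state the unit), the ratio for a larger age gap follows by compounding on the log-odds scale. The helper name `rescale_ratio` is hypothetical.

```python
import math


def rescale_ratio(ratio_per_unit, units):
    """Ratio implied for a `units`-sized increase, given a per-unit ratio.

    Works on the log scale: log-odds add, so odds ratios multiply.
    """
    return math.exp(units * math.log(ratio_per_unit))


# Assumes the reported 1.05 is per one year of age.
per_decade_or = rescale_ratio(1.05, 10)
print(f"Odds ratio per 10 years of age: {per_decade_or:.2f}")
```

Under those assumptions, a per-year ratio that looks small corresponds to roughly a 1.6-fold difference in odds across a decade.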

Overall Survival

Variable            Hazard Ratio       P
Increasing Age      1.03 [1.03-1.04]   <0.001
Male vs Female      1.06 [1.04-1.07]   <0.001
Charlson-Deyo       1.39 [1.30-1.49]   <0.001
Private Insurance   0.90 [0.86-0.94]   <0.001