I have had too many friends with cancer, and that is what motivated me to build Fusion Oncology. During my Master's thesis at the University of Pittsburgh, I hit a major breakthrough: a novel method for fusing XGBoost and BERT into a single, high-complexity intelligence model. I am now turning those same architectures against cancer. I have spent my life reverse engineering machines to hunt for vulnerabilities in silicon, and I realized that the most complex and broken binary on the planet is the human cell. For too long, "precision oncology" has been stalled by siloed models and a growing mountain of genomic noise. This project changes that by treating cancer as a technical exploit.
This isn't just another script; it is an open-source framework of more than 19,000 lines, backed by 305 tests, that achieves true multi-modal integration. By PCA-compressing 768-dimensional DNABERT-2 sequence embeddings and weighting them by the drug sensitivity of 1,002 GDSC cell lines, I've built a single, jointly learned XGBoost classifier. This fusion produces the Fusion Index, calculated as Importance × Instability × 1000, a metric for therapeutic vulnerability validated against 62 COSMIC driver genes. This isn't just a list of targets; it's a path from raw drug sensitivity data to a life-saving treatment roadmap in one click.
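As a minimal sketch of the Fusion Index arithmetic only (not the framework's actual code): the formula Importance × Instability × 1000 comes from the post, while the `GeneScore` type, the function names, the placeholder gene labels, and every number below are assumptions made up for illustration. How importance and instability are actually computed is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneScore:
    gene: str           # placeholder label, not a real result
    importance: float   # assumed: a per-gene importance score from the classifier
    instability: float  # assumed: a per-gene instability measure


def fusion_index(importance: float, instability: float) -> float:
    """Importance x Instability x 1000, as stated in the post."""
    return importance * instability * 1000


def rank_by_fusion_index(scores):
    """Return (gene, index) pairs sorted from highest to lowest index."""
    ranked = [(s.gene, fusion_index(s.importance, s.instability)) for s in scores]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Made-up inputs purely to exercise the formula.
example = [
    GeneScore("GENE_C", importance=0.01, instability=0.25),
    GeneScore("GENE_A", importance=0.02, instability=0.5),
    GeneScore("GENE_B", importance=0.05, instability=0.1),
]

for gene, index in rank_by_fusion_index(example):
    print(f"{gene}: {index:.2f}")
```

Because the index is a plain product, a gene ranks highly only when both factors are large; the factor of 1000 just rescales small values into a more readable range.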
@kaggle Notebook: https://www.kaggle.com/code/mytechnotalent/fusion-oncology
GitHub Code: https://github.com/mytechnotalent/fusion_oncology
#CancerResearch #PrecisionMedicine #ArtificialIntelligence #OpenSource #Genomics #DrugDiscovery #Kaggle







