"Cellular morphology emerges from polygenic, distributed transcriptional variation", Paylakhi et al. 2026
https://www.biorxiv.org/content/10.64898/2026.03.12.711281v1

#CellBiology #transcriptomics #CellPainting #RNAseq


Height and most disease risk are known polygenic traits: characteristics governed by multiple genes at different loci instead of a select few. Though we are beginning to understand how genetic variation impacts cell morphology, whether such an analogous polygenic architecture operates at the cellular level, where morphology integrates cytoskeletal organization, organelle positioning, and metabolic state, has yet to be systematically tested. Here, we demonstrate that cellular morphology behaves as a polygenic trait by integrating multimodal modeling, perturbation profiling, and population-scale genetic variation. A shared latent-space autoencoder trained on four large perturbation datasets predicts morphology from gene expression and generalizes without retraining to matched RNA-seq and Cell Painting profiles from 100 genetically diverse iPSC donors. The model predicted 17 morphological features (R > 0.6, permutation FDR q < 0.05), enriched for spatial organelle distribution and cytoskeletal architecture. Predictive performance does not arise from dominant gene-phenotype relationships: individual genes contribute modestly, and marginal gene-morphology correlations are uniformly weak, revealing a distributed regulatory architecture. Despite this polygenicity, CRISPR perturbation data from the JUMP consortium validates specific model-prioritized genes, such as the cytoskeletal regulator TIAM1, membrane trafficking factor RAB31, and mitochondrial-associated membrane transporter ABCC5, as molecular anchors whose disruption produces feature-specific morphological shifts. Transcriptome-wide association analyses identify correlational variant-gene-morphology chains linking cis-regulatory variation through mitochondrial metabolism (PDHX) and iron transport (SLC11A2) to cellular architecture. These results establish cellular morphology as a polygenic systems phenotype, extending the omnigenic framework to the cellular level and providing a biological basis for interpreting cross-modal prediction in functional genomics.

Competing Interest Statement: The authors have declared no competing interest.

AnalytiXIN Fellowship in Life Sciences
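
The abstract reports features passing "R > 0.6, permutation FDR q < 0.05" without saying how those numbers were computed. The following is a minimal sketch of one way such a filter could work, not the authors' code. It assumes Pearson correlation between predicted and observed feature values, a one-sided null built by shuffling donors, and Benjamini-Hochberg correction across features. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def permutation_pvalue(pred, obs, n_perm=1000, seed=0):
    """One-sided permutation p-value for the observed correlation.

    Assumption: the null is built by shuffling the observed values
    across donors, which breaks the prediction/observation pairing.
    """
    rng = np.random.default_rng(seed)
    r_obs = pearson_r(pred, obs)
    null = np.array(
        [pearson_r(pred, rng.permutation(obs)) for _ in range(n_perm)]
    )
    # The +1 terms keep the estimated p-value strictly above zero.
    p = (1 + np.sum(null >= r_obs)) / (n_perm + 1)
    return r_obs, float(p)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


# Synthetic stand-in for 100 donors: one feature that is genuinely
# predictable and one that is pure noise (illustrative data only).
rng = np.random.default_rng(42)
observed_good = rng.normal(size=100)
predicted_good = observed_good + 0.5 * rng.normal(size=100)
observed_noise = rng.normal(size=100)
predicted_noise = rng.normal(size=100)

r_good, p_good = permutation_pvalue(predicted_good, observed_good)
r_noise, p_noise = permutation_pvalue(predicted_noise, observed_noise)
q_values = benjamini_hochberg([p_good, p_noise])

# A feature is kept only if it clears both thresholds from the abstract.
passing = [
    r > 0.6 and q < 0.05
    for r, q in zip([r_good, r_noise], q_values)
]
```

With many features, the q-value step matters more than the raw p-values: a feature can look significant on its own and still fail once all tested features are taken into account.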

bioRxiv

STAR Suite: Integrating transcriptomics through AI software engineering in the NIH MorPhiC consortium https://www.biorxiv.org/content/10.64898/2026.03.09.710580v1

" In just four months, a single developer added over 92,000 lines to the original 28,000-line codebase to produce four unified modules: STAR-core, STAR-Flex, STAR-Perturb, and STAR-SLAM that can be installed as a pre-compiled binary without introducing any new dependencies. "

#AI #rnaseq


To accommodate rapid methodological turnover, bioinformatics pipelines typically consist of discrete binaries linked via scripts. While flexible, this architecture relies on intermediate files, sacrificing performance, and treating complex codebases as static silos. For example, the STAR aligner (Dobin et al., 2013), the standard engine for transcriptomics, uses an external script for adapter trimming, necessitating the decompression and re-compression of large files. These limitations presented scalability problems for uniform processing of data in the NIH MorPhiC consortium. We present our solution, STAR Suite, a human-engineered and AI-implemented modernization that integrates functionality directly into the C++ source. In just four months, a single developer added over 92,000 lines to the original 28,000-line codebase to produce four unified modules: STAR-core, STAR-Flex, STAR-Perturb, and STAR-SLAM that can be installed as a pre-compiled binary without introducing any new dependencies. This work demonstrates a new paradigm for the rapid evolution of high-performance bioinformatics software.

Competing Interest Statement: L.H.H. and K.Y.Y. have equity interest in Biodepot LLC. The terms of this arrangement have been reviewed and approved by the University of Washington in accordance with its policies governing outside work and financial conflicts of interest in research.

National Institutes of Health, U24HG012674

bioRxiv

[Edit] I think I finally have some winners WITH cell identities, though I had to dig for that part. Thanks!

Is anyone working in #Genomics #SingleCell #RNASeq aware of a good data source for human kidney single-cell data that has *labeled* cell types?

We have a #GWAS for which the target tissue would be the kidney. I can't move forward with cell type analyses until I find some labeled cell type data to connect back to the association data.

Pipeline release! nf-core/scnanoseq v1.2.2 - Thallium Tiger!

Please see the changelog: https://github.com/nf-core/scnanoseq/releases/tag/1.2.2

#10xgenomics #longreadsequencing #nanopore #rnaseq #scrnaseq #singlecell #nfcore #openscience #nextflow #bioinformatics

Meet Koushik, our newest bioinformatics team member! 💻

He is a master's student who is passionate about RNA-seq data analysis using machine learning and is working with Shakunthala.

https://www.izmb.uni-bonn.de/en/pbb/team

#Bioinformatics #RNAseq #AI #MachineLearning #BigData #DataScience #Python
@boas_pucker

In the Plant Bioinformatics course this week, students learn how to critically assess papers in scientific journals.

Focus:
• Is the data available?
• Are the analyses reproducible?
• Are the claims supported by the evidence?

Training the next generation of scientists to value reproducibility and transparency.

Course material:
https://github.com/bpucker/PlantBioinformatics

#RNAseq #Bioinformatics #OpenScience #PeerReview
@PuckerLab

Pipeline release! nf-core/rnaseq v3.23.0 - Gallium Gecko!
RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
Please see the changelog: https://github.com/nf-core/rnaseq/releases/tag/3.23.0

#rna #rnaseq #nfcore #openscience #nextflow #bioinformatics

Release nf-core/rnaseq v3.23.0 - Gallium Gecko · nf-core/rnaseq

What's Changed Bump version after release 3.22.2 by @pinin4fjords in #1663 Update preprocessing with multiple rRNA removal tools by @pinin4fjords in #1664 Add ARM containers, changelog updates, an...

GitHub

New AI methods let scientists merge RNA‑seq, imaging and other data, revealing hidden cellular states. This multimodal approach could accelerate discoveries in cell biology and computational biology. Learn how machine learning bridges data integration across experiments. #MultimodalAI #CellBiology #RNAseq #ComputationalBiology

🔗 https://aidailypost.com/news/ai-enables-scientists-integrate-multiple-cell-measurements

#Genetics #Genomics #Bioinformatics

Does anyone have favorite R packages for cell-type classification of #PBMCs in #SingleCell #RNASeq? I've used scGate with manual marker decisions in the past, but it's been a while since I've had to do it, so I'm not sure how the landscape has changed.