Michał Krassowski

52 Followers
54 Following
13 Posts
Bioinformatician applying statistics and omics methods to study endometriosis and pain conditions. PhD candidate at Oxford WRH, based at WCHG. Own opinions.
Website: https://www.wrh.ox.ac.uk/team/michal-krassowski
Open source: @krassowski
LinkedIn: https://linkedin.com/in/michal-krassowski/
What kind of classification is it even supposed to be? 7 kinds of fish and then wheat?
That feeling when a PubMed search returns a single result, which then turns out to be a page of the "Encyclopedia of Neuroscience" listing all terms starting with a given letter, and all your search terms just happen to be mentioned somewhere in that mega-article.

If you ever wanted to, you can now display and interact with LocusZoom plots directly in a Python kernel for Jupyter: https://github.com/krassowski/jupyter-locuszoom

This is at the PoC stage, so please let me know if you find it useful, in which case I will find time to polish the rough edges.

GitHub - krassowski/jupyter-locuszoom: LocusZoom in Jupyter notebooks


This is neat: CorDiffViz: an R Package for visualizing multi-omics differential correlation networks (https://github.com/sqyu/CorDiffViz)

Especially the demo website: https://diffcornet.github.io/CorDiffViz/demo.html. Having an interactive network to play around with different methods is useful for building intuition and for didactic purposes (but of course it could be misused).

GitHub - sqyu/CorDiffViz: Visualization for Differential Correlation Matrices

Recent metaboGWAS studies are much nicer to work with. I'm so happy to see "effect allele" and "non-effect allele" (rather than the old, confusing RA, which some used for "risk allele" and others for "reference allele"!), reporting of position together with build number, and inclusion of SNP/metabolite detection numbers.

How to quickly obtain amino acid change information for all known variants in a given genomic region if ANNOVAR is overkill? Well, of course, use an API from someone who has figured it out already. There are a number of services which can help (Entrez, Biomart, ALFA, UCSC).

I've added an example of how to do that from Python using the new version of the easy-entrez client for the Entrez API: https://github.com/krassowski/easy-entrez#obtaining-amino-acids-change-information-for-variants-in-given-range

#bioinformatics #entrez

GitHub - krassowski/easy-entrez: Retrieve PubMed articles, text-mining annotations, or molecular data from >35 Entrez databases via easy to use Python package - built on top of Entrez E-utilities API.


I don't know what is more annoying: the fact that many articles are still being published behind a paywall on Wiley Online Library, or that we cannot use university credentials to get past the paywall because the website has silly JavaScript errors like `Uncaught ReferenceError: thiss is not defined` (literally a typo dangling another day!). As if the system was designed to withhold knowledge!

#opendata #openaccess

ChatGPT knows some things about Mendelian Randomisation.

While it easily produces nonsensical answers after being told the previous answer was wrong, the default answers are often correct. I expected the standard three assumptions but

#CausalInference

How do Reactome pathways get collapsed to gene sets?

I noticed that GMT files from Reactome (and Reactome in MSigDB) sometimes include identifiers of genes which are negatively regulated by the given pathway. That looks like a bad thing for GSEA and friends. I wonder whether this is intentional or not. An example is IL18 in the IL10 signalling pathway (https://reactome.org/content/detail/R-HSA-6783783).
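
To check a GMT file for such cases, something along these lines might do. This is only a sketch: it assumes the usual tab-separated GMT layout (set name, a description or ID column, then gene symbols), the `TOY_GMT` string is a hand-written toy line rather than real Reactome output, and the `negatively_regulated` mapping is a hypothetical, manually curated list that you would have to supply yourself.

```python
def parse_gmt(text):
    """Parse GMT text into {set_name: (description, set_of_genes)}."""
    gene_sets = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # GMT columns: set name, description/ID, then one gene per column
        name, description, *genes = line.split("\t")
        gene_sets[name] = (description, set(genes))
    return gene_sets


def flag_genes(gene_sets, suspects):
    """For each gene set, list the suspect genes that it actually contains."""
    flagged = {}
    for name, (_description, genes) in gene_sets.items():
        hits = sorted(genes & suspects.get(name, set()))
        if hits:
            flagged[name] = hits
    return flagged


# Toy, hand-written GMT line (NOT copied from a real Reactome release)
TOY_GMT = "Interleukin-10 signaling\tR-HSA-6783783\tIL10\tIL10RA\tIL10RB\tIL18\n"

# Hypothetical manual curation: genes believed to be negatively regulated
negatively_regulated = {"Interleukin-10 signaling": {"IL18"}}

gene_sets = parse_gmt(TOY_GMT)
flagged = flag_genes(gene_sets, negatively_regulated)
print(flagged)
```

The hard part, knowing which genes are negatively regulated, is not solved here; the sketch only shows that once such a list exists, filtering the gene sets before running GSEA is a few lines.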

Reactome | Interleukin-10 signaling

Reactome is pathway database which provides intuitive bioinformatics tools for the visualisation, interpretation and analysis of pathway knowledge.

What does negative R-squared mean?

I learned of another way to think about it today. In probability models Efron's R2 seems equivalent to the Brier Skill Score, so we can interpret a negative value as very poor relative calibration.

Maybe it is obvious, but phrasing it in terms of calibration makes it more intuitive to me (and it shows beautifully on calibration plots). I shared a comparison of the Efron's R2 and BSS equations at https://stats.stackexchange.com/a/596505/233900.
#statistics
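
A quick numerical check of that equivalence, as a minimal sketch. It assumes binary outcomes, predicted probabilities, and a Brier Skill Score whose reference forecast is the base rate of the same sample; the function names and the toy numbers are made up for illustration.

```python
def efron_r2(y, p):
    """Efron's R2: 1 - sum((y - p)^2) / sum((y - mean(y))^2)."""
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def brier_skill_score(y, p):
    """BSS = 1 - BS / BS_ref, with the sample base rate as the reference forecast."""
    base_rate = sum(y) / len(y)
    bs_ref = brier_score(y, [base_rate] * len(y))
    return 1 - brier_score(y, p) / bs_ref


# Toy data: four binary outcomes with two sets of predicted probabilities
y = [0, 0, 1, 1]
sensible = [0.1, 0.2, 0.8, 0.9]   # probabilities point the right way
inverted = [0.9, 0.8, 0.2, 0.1]   # probabilities point the wrong way

for label, p in [("sensible", sensible), ("inverted", inverted)]:
    print(label, round(efron_r2(y, p), 6), round(brier_skill_score(y, p), 6))
```

With this choice of reference the 1/n factors cancel, so both functions compute the same quantity, and the inverted predictions come out negative: worse than always predicting the base rate.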
