Really good post about the state of social mobility in academia:
https://www.tanjabhuiyan.com/blog/the-empty-promise-of-social-mobility-through-education
| matrix | @mtekman:matrix.org |
| gitlab and github | https://www.gitlab.com/mtekman and https://www.github.com/mtekman |
| ORCID and GSC | https://orcid.org/0000-0002-4181-2676 and https://scholar.google.com/citations?hl=en&tzom=-60&user=HVwU31YAAAAJ |
@galaxyproject almost everyone from the Freiburg Galaxy team going to #gcc2024 in Brno this June is going to travel there by train. Per person that saves ~ 200 kg of CO2.
Overall carbon footprint from traveling to GCC will, of course, be dominated by flights from outside the EU so how much potential is there for tweaking, e.g., the footprint of transatlantic flights?
Following are the results of the quick research I did this morning:
Have you ever had two #chromosome 9's? Well today I have.
One more reason that I prefer #R Dataframes to #Python Dataframes (#pandas). In R, there is rarely any uncertainty when it comes to loading in genomic data.
Image1 shows a 140k row table generated with Pandas containing just "9" or "X" for the chromosome.
Image2 shows how that dataset is read easily by R, but misinterpreted by pandas unless you set the datatypes yourself.
I've heard of #chromosome_duplication but this is pushing it
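Setting the datatypes yourself, as mentioned above, could look something like this. It is only a minimal sketch: the column name `chrom` and the tiny inline table are assumptions standing in for the real 140k row file, which isn't shown here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the table in the post: a single chromosome
# column (the name "chrom" is an assumption) holding only "9" or "X".
csv_text = "chrom\n" + "\n".join(["9"] * 5 + ["X"] * 5) + "\n"

# Declaring the dtype up front keeps every value a string, so "9" cannot
# end up as an integer in some rows and a string in others.
df = pd.read_csv(io.StringIO(csv_text), dtype={"chrom": str})

chrom_values = sorted(df["chrom"].unique())
print(chrom_values)
```

With the dtype pinned, grouping or counting by chromosome sees one "9" rather than two.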
Turn an old eReader into an Information Screen (Nook STR)
https://shkspr.mobi/blog/2020/02/turn-an-old-ereader-into-an-information-screen-nook-str/
Here's a quick tutorial for turning an old Nook into a passive display. This is an update to my 2013 post. End Result: an eInk screen which displays the trains I can catch from my local station. It shows the next few available trains, and whether they're delayed. It also shows how long until the […]
Well that don’t look right at all
So they’re -9 compressed bz2 files
$ file *.bz2
[...]
DRR187559_1.fastqsanger.bz2: bzip2 compressed data, block size = 900k
DRR187559_2.fastqsanger.bz2: bzip2 compressed data, block size = 900k
And when looking for the bzip2 header that indicates compression and start of file we see:
$ grep BZh9 -c *.bz2
1.bz2:0
2.bz2:0
3.bz2:0
4.bz2:0
5.bz2:0
6.bz2:0
7.bz2:0
8.bz2:0
9.bz2:1
DRR187559_1.fastqsanger.bz2:229
DRR187559_2.fastqsanger.bz2:259
The first 8 lines are expected: the header is BZh followed by the compression level, so BZh9 wouldn’t show up in 1-8, which were compressed with the associated compression levels.
But the last two, uhhh, how did you possibly generate bzip2 files with that many headers? Apparently that can happen through concatenation.
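To see how concatenation leaves extra headers behind, here is a small sketch using Python's standard `bz2` module. The two FASTQ-like records are made up for illustration; they are not from the DRR187559 files.

```python
import bz2

# Two hypothetical records, each compressed as its own -9 bzip2 stream.
first = bz2.compress(b"@read1\nACGT\n+\nIIII\n", compresslevel=9)
second = bz2.compress(b"@read2\nTTGA\n+\nIIII\n", compresslevel=9)

# Joining the bytes has the same effect as `cat a.bz2 b.bz2 > c.bz2`.
combined = first + second

# Each stream keeps its own header, so a second one sits right at the join.
header_at_start = combined[:4]
header_at_join = combined[len(first):len(first) + 4]
print(header_at_start, header_at_join)
```

Every stream appended this way adds one more header, which is one plausible route to a file with hundreds of them.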
Fun fact: bzip2 reads _2 fine.
Funner fact: basically no other implementations do. I.e. most bioinformatics tools. They just read the first entry and are done. But we only know this because it’s split mid-read, unlike _1 which runs successfully while actually failing.
$ fastqc DRR187559_1.fastqsanger.bz2
application/x-bzip2
Started analysis of DRR187559_1.fastqsanger.bz2
Analysis complete for DRR187559_1.fastqsanger.bz2
fastqc DRR187559_1.fastqsanger.bz2 4.67s user 0.35s system 150% cpu 3.334 total
FastQC reports 1927 reads which is off by, a lot. (451782 is the correct value.) We’d never know unless we carefully checked this.
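The silent undercount can be imitated in a few lines of Python. This is only a sketch of the general failure mode with made-up records, not FastQC's actual code: `bz2.decompress` reads every concatenated stream, while a single `bz2.BZ2Decompressor` stops at the end of the first one and raises no error.

```python
import bz2

# Hypothetical FASTQ records, each compressed as its own bzip2 stream.
records = [f"@read{i}\nACGT\n+\nIIII\n".encode() for i in range(4)]
combined = b"".join(bz2.compress(record) for record in records)


def count_reads(fastq_bytes):
    """Count FASTQ records, assuming exactly four lines per record."""
    return fastq_bytes.count(b"\n") // 4


# Multi-stream aware: decompresses all four streams.
all_reads = count_reads(bz2.decompress(combined))

# Single-stream decoder: finishes after the first stream, no error raised.
decoder = bz2.BZ2Decompressor()
first_only = count_reads(decoder.decompress(combined))
leftover = len(decoder.unused_data)  # bytes of the streams never read

print(all_reads, first_only, leftover > 0)
```

Comparing a tool's read count against the count from a full decompression is one way to catch this.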
So if your tool breaks on a bzip2 file, try decompressing and recompressing, and updating your resume on LinkedIn while you find a new career.
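The decompress and recompress step could be sketched like this in Python. The helper name `recompress_single_stream` and the temp file names are hypothetical; the idea is that `bz2.open` reads across concatenated streams and writes the result back out as one stream.

```python
import bz2
import os
import shutil
import tempfile


def recompress_single_stream(src_path, dst_path):
    """Rewrite a (possibly multi-stream) bzip2 file as one single stream."""
    # Reading via bz2.open walks through all concatenated streams;
    # writing via bz2.open produces exactly one stream.
    with bz2.open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Demo on a hypothetical two-stream file built in a temp directory.
records = [b"@r1\nACGT\n+\nIIII\n", b"@r2\nTTGA\n+\nIIII\n"]
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "multi.fastq.bz2")
    dst_path = os.path.join(tmp, "single.fastq.bz2")
    with open(src_path, "wb") as handle:
        for record in records:
            handle.write(bz2.compress(record))  # one stream per record
    recompress_single_stream(src_path, dst_path)
    with open(dst_path, "rb") as handle:
        rewritten = handle.read()

# A decoder that only understands one stream now recovers everything.
decoder = bz2.BZ2Decompressor()
recovered = decoder.decompress(rewritten)
print(recovered == b"".join(records), decoder.unused_data == b"")
```

On the command line, piping `bzip2 -dc` into `bzip2 -9` should have the same effect.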
New preprint from Tanja Bhuiyan
(@TanjaBhuiyan6, @tanjabhuiyan.bsky.social)
> Delighted to share our study on basal #transcription factor TFIID and an unexpected link to #RNA splicing. #GeneRegulation #condensates