#Duckdb #htslib #Genomics #Bioinformatics #RStats duckhts: Read HTS (VCF/BCF/BAM/CRAM/FASTA/FASTQ/GTF/GFF) files in DuckDB via htslib. Rduckhts: 'DuckDB' High Throughput Sequencing File Formats Reader Extension genomic.social/@bioinfhotep...

SMT (@[email protected])

#Genomics #Bioinformatics
Release of duckhts: an #htslib-based #Duckdb extension for High Throughput Sequencing file formats
https://duckdb.org/community_extensions/extensions/duckhts
Allied self-contained #RStats package Rduckhts: https://rgenomicsetl.r-universe.dev/Rduckhts

For now the lifecycle of the APIs (DuckDB C API extension and R package) is "experimental". They mostly target some ETL use cases, but we will add some useful scalar and aggregation functions, including Heng Li's C genomic ranges.

Feedback and testing welcome!

#RStats Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'. Sitting on the shoulders of the great #htslib API and the DuckDB C API. Package: rgenomicsetl.r-universe.dev/Rduckhts
Maybe the fastest route from BCF/VCF to #RStats DataFrames, using #htslib and the #duckdb C API. It easily takes the title of fastest BCF/VCF-to-Parquet converter in #RStats (there are no other R options :D). This was motivated, among other things, by the idea of trying out #DuckLake in a familiar field. github.com/RGenomicsETL...

playing with GNU #guile and #htslib.

(define with_vcf (lambda (f action)
  ;; Read the header, then call (action header record) for each record in f.
  (let ((header (bcf_hdr_read f))
        (b (bcf_init1)))
    (while (>= (bcf_read f header b) 0) (action header b))
    (bcf_hdr_destroy header)
    (bcf_destroy b)
    (hts_close f))))

#bioinformatics #guile #scheme #vcf #lisp

Release 1.23 · samtools/htslib

Download the source code here: htslib-1.23.tar.bz2. (The "Source code" downloads are generated by GitHub and are incomplete as they are missing some generated files.) Updates HTSlib 1.22 changed ...

#htslib #Bioinformatics #GenomicsIO
@yokofakun any idea what the fastest method is to get the nth BCF record using htslib or bcftools.h without explicit loops? (Trying out something similar to your project https://github.com/lindenb/rbcf but with ALTREP.)
I guess it is possible to work out which block to (lazily) parse if one knows the block offsets and the number of records per block.
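The block-lookup idea in this post can be sketched in Python as a cumulative sum plus a binary search. This is only an illustration: `locate_record` and the `records_per_block` list are hypothetical, and it assumes a side index of per-block record counts has already been built somehow, which is not something this thread shows htslib providing.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_record(n, records_per_block):
    """Return (block_index, index_within_block) for the 0-based record n.

    records_per_block is a hypothetical side index: entry i holds the
    number of records stored in compressed block i.
    """
    if n < 0:
        raise IndexError("record index must be non-negative")
    # cumulative[i] = total number of records in blocks 0..i inclusive
    cumulative = list(accumulate(records_per_block))
    if not cumulative or n >= cumulative[-1]:
        raise IndexError("record index out of range")
    # First block whose cumulative count exceeds n holds the record.
    block = bisect_right(cumulative, n)
    before = cumulative[block - 1] if block > 0 else 0
    return block, n - before
```

With the block index in hand, only that one block would need to be decompressed and parsed, and the within-block index says how many records to skip inside it. A real implementation would also need the file offset of each block, stored alongside the counts.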

via Stephen Turner on X: "Everyone's giving NotebookLM their papers, theses, marketing materials, etc. to generate these "Podcasts".

I gave it the full text of kseq.h from seqtk."

https://notebooklm.google.com/notebook/39e3670a-dfa5-4808-aa11-06ab8c0cfb9d/audio?original_referer=https:%2F%2Ft.co%23&pli=1

#IA #podcast #htslib #bioinformatics

PR accepted! New option "--drop-genotypes" in #bcftools concat: https://github.com/samtools/bcftools/pull/1911 🥳🥳

#bioinformatics #genomics #htslib

drop-genotypes for concat by lindenb · Pull Request #1911 · samtools/bcftools

Hi all, This PR adds a new option -G/--drop-genotypes for concat. Motivation: I often want to just create a list of sites from a set of vcf with many samples split a list of vcf into a list inte...
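To illustrate what a sites-only list means for text VCF, here is a minimal Python sketch that keeps the eight fixed columns and drops FORMAT and the per-sample columns. It is not the bcftools implementation; `drop_genotypes` is a hypothetical helper, and this sketch leaves the `##` meta-information lines untouched.

```python
def drop_genotypes(vcf_lines):
    """Yield VCF lines with the FORMAT and per-sample columns removed."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines pass through unchanged in this sketch.
            yield line
        else:
            # The #CHROM header line and data lines keep only the eight
            # fixed columns: CHROM POS ID REF ALT QUAL FILTER INFO.
            yield "\t".join(line.split("\t")[:8])
```

For example, passing an open text VCF file object to `drop_genotypes` yields lines that can be written straight back out as a sites-only file.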
