Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed.
Homepage: https://iphylo.blogspot.com/
Atom Feed: https://iphylo.blogspot.com/feeds/posts/default

https://iphylo.blogspot.com/2023/08/document-layout-analysis.html #ABBYY #CRF #DjVu #DocumentLayout #HOCR

Some notes to self on document layout analysis. I’m revisiting the problem of taking a PDF or a scanned document and determining its structure (for example, where is the title, abstract, bibliography, where are the figures and their captions, etc.). There are lots of papers on this topic, and lots of tools.

Document layout analysis

Blog by Rod Page on biodiversity informatics, taxonomy, systematics, phylogeny, knowledge graphs, and other topics.

 https://doi.org/10.59350/qkn8x-mgz20

Recently I’ve been messing about with DNA barcodes. I’m junior author with David Schindel on a forthcoming book chapter, “Creating Virtuous Cycles for DNA Barcoding: A Case Study in Science Innovation, Entrepreneurship, and Diplomacy”, and I’ve blogged about “Adventures in machine learning: iNaturalist, DNA barcodes, and Lepidoptera”.

Sub-second searching of millions of DNA barcodes using a vector database
