Assaf Arbelle

43 Followers
50 Following
39 Posts
Husband, Father of 3 amazing daughters, Research Scientist and Manager of the AI-Vision Grp @IBMResearch. Interested in Computer Vision, Multimodal Learning, Self Supervised Learning and much much more.
LinkedIn: https://www.linkedin.com/in/assafarbelle/
Twitter: https://twitter.com/ArbelleAssaf
Google Scholar: https://scholar.google.co.uk/citations?user=uU_V_PsAAAAJ&hl=en&oi=ao

"Page Layout Analysis of Text-heavy Historical Documents: a Comparison of Textual and Visual Approaches. (arXiv:2212.13924v1 [cs.IR])" — A dataset of ca. 300 annotated pages sampled from 19th-century commentaries, and an assessment of how well two transformers segment such pages into regions of interest.

Paper: http://arxiv.org/abs/2212.13924

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Page Layout Analysis of Text-heavy Historical Documents: a Comparison of Textual and Visual Approaches

Page layout analysis is a fundamental step in document processing which enables segmenting a page into regions of interest. With highly complex layouts and mixed scripts, scholarly commentaries are text-heavy documents which remain challenging for state-of-the-art models. Their layout varies considerably across editions, and their most important regions are defined mainly by semantic rather than graphical characteristics such as position or appearance. This setting calls for a comparison between textual, visual and hybrid approaches. We therefore assess the performance of two transformers (LayoutLMv3 and RoBERTa) and an object-detection network (YOLOv5). While the results show a clear advantage in favor of the latter, we also list several caveats to this finding. In addition to our experiments, we release a dataset of ca. 300 annotated pages sampled from 19th century commentaries.

arXiv.org
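As a side note for anyone new to layout analysis: predicted regions (e.g. from a detector like YOLOv5) are typically scored against the annotated ground truth by intersection-over-union. The paper doesn't publish its evaluation code, but the core metric is simple enough to sketch — a minimal IoU for axis-aligned boxes, with box coordinates assumed to be (x1, y1, x2, y2):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero when the boxes don't overlap.
    inter = max(0, xb - xa) * max(0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A predicted region is usually counted as correct when its IoU with a ground-truth region exceeds a threshold such as 0.5.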

Hi all #NeurIPS people!

I just landed yesterday and after a bit of sleep I am ready for the conference!

If anyone here wants to come say hi, I'll be at our poster in Hall J # 1016. Amit Alfassy and I will be there from 11AM to 1PM.

https://nips.cc/virtual/2022/poster/55625

#CV #ML #ComputerVision #NewPaper #FoundationModels #VisionAndLanguage

arxiv paper status

https://status.arxiv.org/

arXiv Operational Status

Keep up to date with any interruptions to our service which may be affecting you.

Quick reminder that @NeuripsConf is right around the corner!

I have seen a lot of requests for more discussion / advertising of papers — so much so that I will start posting some of my old ones! :)

So, I encourage everyone not to be shy and to share their upcoming papers. Use the #NeurIPS2022 hashtag so it's easy for others to search for. And most importantly, have fun in New Orleans!

I'll be at my alma mater Hebrew University on Mon Nov 28 to talk about AI, drug discovery and causal inference at IBM Research. The event is open to all students and faculty, and refreshments will be served. :) @LChoshen will be there too!
There are a few slots open, register at https://forms.gle/oKLFAFDwzWcfvfJR9
When Academia meets Industry: Come celebrate 50 years of IBM Research in Israel

IBM Research Lab in Israel is happy to invite you to celebrate our 50th year. Monday, November 28th, 2022 17:00-19:00 @ Computer science building cafeteria (3rd floor), Hebrew University

Hi all, looking forward to following paper threads and sharing some cool work. Currently at Google Brain in France working on making RL useful and practical.

145 new papers under the cs.CV category on arXiv.org today — about 50 of them are categorised as updates/cross-posts. So around 95 are brand new papers 🙂

What with Stable Diffusion 2.0 being released today, I’m torn between getting on SD and reading the papers. What to do?

#AI #CV #DeepLearning #MachineLearning

Hello world!

I am a PhD Candidate at the AI Lab of the VUB in Brussels. My research focuses on #MARL and more specifically on learning to communicate, to collaborate and to explore.

#reinforcementlearning
#introduction

“Accelerating Diffusion Sampling with Classifier-based Feature Distillation” — a proposed method to speed up diffusion-based image generation, so that an image can be generated in fewer steps.

Paper: https://arxiv.org/abs/2211.12039

No code/demo

#AI #CV #NewPaper #DeepLearning #MachineLearning

Accelerating Diffusion Sampling with Classifier-based Feature Distillation

Although diffusion models have shown great potential for generating higher-quality images than GANs, slow sampling speed hinders their wide application in practice. Progressive distillation has thus been proposed for fast sampling, by progressively aligning the output images of an N-step teacher sampler with an N/2-step student sampler. In this paper, we argue that this distillation-based acceleration method can be further improved, especially for few-step samplers, with our proposed Classifier-based Feature Distillation (CFD). Instead of aligning output images, we distill the teacher's sharpened feature distribution into the student with a dataset-independent classifier, making the student focus on those important features to improve performance. We also introduce a dataset-oriented loss to further optimize the model. Experiments on CIFAR-10 show the superiority of our method in achieving high quality and fast sampling. Code will be released soon.

arXiv.org
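Since the paper's code isn't out yet, here is my own toy sketch of the core idea as I read the abstract: project teacher and student features through a shared classifier head, sharpen the teacher's distribution with a low softmax temperature, and penalise the student's divergence from it. The function and parameter names, the KL formulation, and the temperature value are all my assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(z, temp=1.0):
    """Temperature-scaled softmax over the last axis (low temp = sharper)."""
    z = z / temp
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def feature_distillation_loss(teacher_feats, student_feats, classifier_w, temp=0.5):
    """KL divergence from the student's class distribution to the teacher's
    sharpened one, with both projected through a shared classifier head."""
    p_teacher = softmax(teacher_feats @ classifier_w, temp=temp)  # sharpened target
    p_student = softmax(student_feats @ classifier_w, temp=1.0)
    eps = 1e-12  # avoid log(0)
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=-1)
    return float(np.mean(kl))
```

In a real training loop this term would be minimised alongside the usual distillation objective; the point of the sketch is just that the target is a sharpened feature distribution, not the teacher's output image.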

“SinDiffusion: Learning a Diffusion Model from a Single Natural Image” — A diffusion model that learns from a single input image and can be used for a variety of tasks such as image generation, text-to-image, image-to-image, and image harmonisation?

Sounds perfect, doesn’t it?

Paper: https://arxiv.org/abs/2211.12445
Code: https://github.com/weilunwang/sindiffusion

#AI #CV #NewPaper #DeepLearning #MachineLearning

SinDiffusion: Learning a Diffusion Model from a Single Natural Image

We present SinDiffusion, which leverages denoising diffusion models to capture the internal distribution of patches from a single natural image. SinDiffusion significantly improves the quality and diversity of generated samples compared with existing GAN-based approaches. It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale, instead of multiple models with progressively growing scales, which is the default setting in prior work. This avoids the accumulation of errors, which causes characteristic artifacts in generated results. Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics; we therefore redesign the network structure of the diffusion model. Coupling these two designs enables us to generate photorealistic and diverse images from a single image. Furthermore, SinDiffusion can be applied to various applications, e.g., text-guided image generation and image outpainting, due to the inherent capability of diffusion models. Extensive experiments on a wide range of images demonstrate the superiority of our proposed method for modeling the patch distribution.

arXiv.org
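For those wondering what "the internal distribution of patches from a single image" means in practice: a single image already provides a large training set of overlapping patches. This is just an illustrative sketch of that idea (patch size and stride are arbitrary choices of mine, not the paper's), not SinDiffusion's actual data pipeline:

```python
import numpy as np

def extract_patches(img, patch=8, stride=4):
    """Slide a patch×patch window over a 2-D image and stack all crops.

    Returns an array of shape (num_patches, patch, patch); each crop is
    one sample from the image's internal patch distribution.
    """
    h, w = img.shape
    crops = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            crops.append(img[y:y + patch, x:x + patch])
    return np.stack(crops)
```

Even a modest 256×256 image yields thousands of overlapping patches at small strides, which is what makes training a generative model on one image feasible at all.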