Ankit Sharma

98 Followers
493 Following
154 Posts
ML/AI Engineer by day 🐤 | Hackerman by night 🐦 | Weird photographer 📷 | Hobbyist developer
#emacs #macOS #cpp #rust #python #iOS #swift #machinelearning #deeplearning #softwareengineering #mlops #nlp #llm
Website: https://nezubn.com
Twitter: https://twitter.com/nezubn
GitHub: https://github.com/ankitsharma07
LinkedIn: https://www.linkedin.com/in/ankitkumar1107

Welcome to a new browser!

> Ladybird uses a brand new engine based on web standards, without borrowing any code from other browsers. It started as a humble HTML viewer for the SerenityOS hobby project, but since then it's grown into a full cross-platform browser project supporting Linux, macOS, and other Unix-like systems. — https://ladybird.org/announcement.html

This is good for the Web!

Announcing the Ladybird Browser Initiative

We've created a US non-profit to develop Ladybird into a truly independent web browser...

This intro to Linear Algebra is what you really needed but never had.

Link: https://pabloinsente.github.io/intro-linear-algebra

Introduction to Linear Algebra for Applied Machine Learning with Python

the GitHub feed used to be so good.. I used to discover a lot of alpha there

wtf did they change it into.. now I get a couple of weird updates and then nothing; tried filtering but nothing works well

📚 Pretty nervous about this, but here goes. Here's a side project I've been working on for the past few months. It's called Viberary and it's a semantic search engine. It gives you book recommendations based on ✨vibes. ✨ You enter a search query like "funny scifi" and it returns a list of (hopefully!) good recommendations.

https://viberary.pizza/

There is an about page that explains the data, model, etc. It's still pretty early stages but it's been a labor of love for me.

Viberary

Find your book vibe semantically!
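Semantic search engines like this are usually built by embedding books and queries into a shared vector space and ranking by cosine similarity. This is not Viberary's actual code — just a minimal sketch of the idea, with hand-made toy vectors standing in for real sentence embeddings:

```python
import numpy as np

# Toy "embeddings": in a real system these would come from a sentence-encoder
# model; here they are made-up 3-d vectors standing in for book vibes.
books = {
    "The Hitchhiker's Guide to the Galaxy": np.array([0.9, 0.9, 0.1]),
    "Blood Meridian":                        np.array([0.1, 0.0, 0.9]),
    "Project Hail Mary":                     np.array([0.6, 0.7, 0.3]),
}

def cosine(a, b):
    # Cosine similarity: dot product of the vectors over the product of norms.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec, k=2):
    # Rank every book by similarity to the query embedding, return top-k titles.
    scored = sorted(books.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [title for title, _ in scored[:k]]

# A query embedding for something like "funny scifi" would land near the
# humorous sci-fi corner of this toy space.
print(search(np.array([0.9, 0.9, 0.1])))
```

The nice property is that the query never has to share keywords with a book's metadata — "vibes" match as long as the encoder maps them to nearby vectors.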

@zhenyi hey, I was going through the Sessions app you created last year and I was wondering how you pulled the videos in there. Is there an API?
@dansup did you find any?
@xenodium interesting.. got many things which I will try and experiment with
@xenodium are you using Spacemacs or vanilla Emacs?

Fast Distributed Inference Serving for Large Language Models

FastServe improves the average and tail job completion time by up to 5.1x and 6.4x, respectively, compared to the state-of-the-art solution Orca.

https://arxiv.org/abs/2305.05920


Large language models (LLMs) power a new generation of interactive AI applications exemplified by ChatGPT. The interactive nature of these applications demands low latency for LLM inference. Existing LLM serving systems use run-to-completion processing for inference jobs, which suffers from head-of-line blocking and long latency. We present FastServe, a distributed inference serving system for LLMs. FastServe exploits the autoregressive pattern of LLM inference to enable preemption at the granularity of each output token. FastServe uses preemptive scheduling to minimize latency with a novel skip-join Multi-Level Feedback Queue scheduler. Based on the new semi-information-agnostic setting of LLM inference, the scheduler leverages the input length information to assign an appropriate initial queue for each arrival job to join. The higher priority queues than the joined queue are skipped to reduce demotions. We design an efficient GPU memory management mechanism that proactively offloads and uploads intermediate state between GPU memory and host memory for LLM inference. We build a system prototype of FastServe and experimental results show that compared to the state-of-the-art solution vLLM, FastServe improves the throughput by up to 31.4x and 17.9x under the same average and tail latency requirements, respectively.
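The core scheduling idea — preempt at the granularity of each output token, and let a job's input length pick its starting priority level so it skips queues it would be demoted from anyway — can be sketched as a toy simulation. This is my own rough illustration, not the paper's implementation; the cost proxy (`input_len // 100`) and the power-of-two quanta are assumptions for the sake of the example:

```python
from collections import deque

# Toy skip-join MLFQ: level i gets a quantum of 2**i output tokens.
NUM_QUEUES = 4
QUANTA = [2 ** i for i in range(NUM_QUEUES)]  # tokens per turn: 1, 2, 4, 8

class Job:
    def __init__(self, name, input_len, output_len):
        self.name = name
        self.input_len = input_len
        self.remaining = output_len  # output tokens still to generate

def initial_queue(job):
    # Skip-join: a new job joins the first level whose quantum covers a
    # rough, length-proportional cost proxy, skipping higher-priority
    # queues it would immediately be demoted from.
    cost = max(1, job.input_len // 100)
    for i, quantum in enumerate(QUANTA):
        if quantum >= cost:
            return i
    return NUM_QUEUES - 1

def run(jobs):
    queues = [deque() for _ in range(NUM_QUEUES)]
    for job in jobs:
        queues[initial_queue(job)].append(job)
    finished = []
    while any(queues):
        # Always serve the highest-priority non-empty queue.
        level = next(i for i, q in enumerate(queues) if q)
        job = queues[level].popleft()
        # Generate up to `quantum` tokens; generation is preemptible
        # between tokens, which is what makes per-token scheduling possible.
        steps = min(QUANTA[level], job.remaining)
        job.remaining -= steps
        if job.remaining == 0:
            finished.append(job.name)
        else:
            # Used the full quantum without finishing: demote one level.
            queues[min(level + 1, NUM_QUEUES - 1)].append(job)
    return finished

print(run([Job("long", 800, 10), Job("short", 50, 2)]))
```

In this toy run the short job finishes before the long one even though the long job arrived first, which is exactly the head-of-line blocking that run-to-completion serving cannot avoid.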
