Friday 13 December 2024
InnoDB is the default storage engine for MySQL. In this article, we discuss the history of InnoDB B-tree latch optimization.
OrioleDB is a table storage extension for Postgres. It is designed to be a drop-in replacement for Postgres' existing storage engine. In this article, we discuss how OrioleDB solves some of the wicked problems in Postgres.
This technical blog explains how CREATE INDEX CONCURRENTLY (CIC) works and how it avoids locking the table against writes. The distinguishing feature of CIC is that it can build a new index on a table without blocking updates, inserts, or deletes.
In this article, I would like to go over the mathematical process of training and optimizing a simple 4-layer neural network. I believe this would help the reader understand how backpropagation works as well as realize its importance.
In this article, I use a top-down approach to explain linear algebra for deep learning, first presenting the applications and uses and then drilling down into the underlying concepts.
Explore the architecture of Transformers, the models that have revolutionized data handling through self-attention mechanisms.
In this article, we discuss various technical considerations when implementing RAG, exploring the concepts of chunking, query augmentation, hierarchies, multi-hop reasoning, and knowledge graphs. We also discuss unsolved problems and opportunities in the RAG infrastructure space, and introduce some infrastructure solutions for building RAG pipelines.
A vector database indexes and stores vector embeddings for fast retrieval and similarity search, with capabilities like CRUD operations, metadata filtering, horizontal scaling, and serverless deployment. This article explains how vector databases work and how they can be used for semantic search and similarity matching.
txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
Retrieval-augmented generation (RAG) is an architecture that supplies your generative AI application's large language model (LLM) with the most relevant proprietary, private, or dynamic data at the time it performs a task, in order to improve its accuracy and performance.
We start by introducing key fine-tuning (FT) concepts and techniques, then finish with a concrete example of how to fine-tune a model (locally) using Python and Hugging Face’s software ecosystem.
Transformers require a specific input format, which includes tokenization, mapping and padding. This article explains how to prepare text data for transformer models.
Large language models are a type of neural network that can be trained to perform a variety of natural language processing tasks.
Implementing a payment system is not a leisurely task. Reliability and correctness are critical. Furthermore, successful businesses generate a lot of payment requests as they scale. Since even a small amount of downtime could mean a lot of lost revenue, a payment system must be highly available and fault-tolerant. In this article, we discuss the design of such a system.
Cloudflare has open-sourced Pingora, a Rust framework for building fast, programmable HTTP proxies and network services, which Cloudflare developed to replace NGINX in its own infrastructure.