"Hello World!"

Fourier Series and Fourier Transform - III

Certain feelings in my body lead me to believe that I have to study Fourier series and the Fourier transform for a better understanding of probability theory, measure theory, entropy, and information theory.

Fourier Series and Fourier Transform - II

Certain feelings in my body lead me to believe that I have to study Fourier series and the Fourier transform for a better understanding of probability theory, measure theory, entropy, and information theory.

Fourier Series and Fourier Transform - I

Certain feelings in my body lead me to believe that I have to study Fourier series and the Fourier transform for a better understanding of probability theory, measure theory, entropy, and information theory.

Maximum Entropy Distributions

The connection between entropy and probability distributions is a fascinating one. In this post, I will explore that connection and show how to use it to derive the most likely probability distribution given a set of constraints.
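
As a taste of what the post covers, here is a minimal numerical sketch (my own toy example, assuming NumPy and SciPy, not code from the post): among all distributions on a die's faces {1, ..., 6} with mean fixed at 4.5, find the one with maximum entropy.

```python
import numpy as np
from scipy.optimize import minimize

# Toy illustration: maximum-entropy distribution on {1,...,6} with mean 4.5.
x = np.arange(1, 7)
target_mean = 4.5

def neg_entropy(p):
    # Negative Shannon entropy; the small epsilon guards against log(0).
    return np.sum(p * np.log(p + 1e-12))

constraints = [
    {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},             # sums to 1
    {"type": "eq", "fun": lambda p: np.dot(p, x) - target_mean},  # fixed mean
]
bounds = [(0.0, 1.0)] * 6
p0 = np.full(6, 1.0 / 6.0)  # start from the uniform distribution

result = minimize(neg_entropy, p0, bounds=bounds,
                  constraints=constraints, method="SLSQP")
print(result.x)             # weights tilt exponentially toward larger faces
print(np.dot(result.x, x))  # ≈ 4.5
```

The numerical solution matches the known analytic form p_i ∝ exp(λ x_i): under a mean constraint, the maximum-entropy distribution is an exponential-family (Gibbs) distribution.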

Variational Inference (2)

Modern Bayesian statistics relies on models for which the posterior is not easy to compute and corresponding algorithms for approximating them. Variational inference is one of the most popular methods for approximating the posterior. In this post, we will introduce the basic idea of variational inference and its application to a simple example.

Variational Inference (1)

Modern Bayesian statistics relies on models for which the posterior is not easy to compute and corresponding algorithms for approximating them. Variational inference is one of the most popular methods for approximating the posterior. In this post, we will introduce the basic idea of variational inference and its application to a simple example.
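
To make the idea concrete before diving in, here is a minimal coordinate-ascent (CAVI) sketch for the classic Normal-Gamma toy model (Gaussian data with unknown mean and precision). The model, hyperparameters, and data are my own illustrative choices, not necessarily those used in the post; only NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=200)   # toy data (made up)
N, xbar, xsq = len(x), x.mean(), np.sum(x**2)

# Model: x_i ~ N(mu, 1/tau), mu ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0)
mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0

# Mean-field factorization q(mu, tau) = q(mu) q(tau); coordinate ascent.
mu_n = (lam0 * mu0 + N * xbar) / (lam0 + N)    # optimal mean of q(mu), fixed
a_n = a0 + (N + 1) / 2                         # optimal shape of q(tau), fixed
E_tau = a0 / b0                                # initial guess for E[tau]
for _ in range(50):
    lam_n = (lam0 + N) * E_tau                 # precision of q(mu)
    # Expected squared deviations under q(mu)
    E_dev = xsq - 2 * mu_n * N * xbar + N * (mu_n**2 + 1 / lam_n)
    E_prior = lam0 * ((mu_n - mu0) ** 2 + 1 / lam_n)
    b_n = b0 + 0.5 * (E_dev + E_prior)         # rate of q(tau)
    E_tau = a_n / b_n

print(f"q(mu) ~ N({mu_n:.3f}, {1 / lam_n:.5f})")  # posterior mean ≈ 2.0
print(f"E[tau] = {E_tau:.3f} (true precision = {1 / 1.5**2:.3f})")
```

Each update holds one factor fixed and sets the other to its optimal form, so every iteration can only increase the evidence lower bound (ELBO).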

Metropolis-Hastings Algorithm

The Metropolis-Hastings algorithm is a Markov chain Monte Carlo (MCMC) algorithm that generates a sequence of random variables from a probability distribution from which direct sampling is difficult.
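
Here is a minimal random-walk Metropolis-Hastings sketch, assuming NumPy; the bimodal target density and the tuning constants are my own toy choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Unnormalized target (made up): an equal mixture of two Gaussians.
def log_target(x):
    return np.logaddexp(-0.5 * (x - 2) ** 2, -0.5 * (x + 2) ** 2)

def metropolis_hastings(log_p, x0=0.0, n_samples=10_000, step=1.0):
    samples = np.empty(n_samples)
    x = x0
    for i in range(n_samples):
        proposal = x + step * rng.normal()      # symmetric random-walk proposal
        log_alpha = log_p(proposal) - log_p(x)  # proposal terms cancel by symmetry
        if np.log(rng.uniform()) < log_alpha:   # accept with prob. min(1, alpha)
            x = proposal
        samples[i] = x
    return samples

samples = metropolis_hastings(log_target)
print(samples.mean(), samples.std())  # roughly 0 and ~2.2 for this bimodal target
```

Note that only an unnormalized density is needed: the unknown normalizing constant cancels in the acceptance ratio, which is precisely why MCMC works where direct sampling fails.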

Approximating the Posterior

When we use Bayesian inference, we need to compute the posterior distribution, which is often intractable. In this post, we will look at some methods for approximating it.

Conjugate Families

When we build a model, we need to choose a prior distribution. If the prior and the resulting posterior belong to the same family of distributions, the prior is called a conjugate prior, and the posterior can be reused as the prior for the next batch of data. In this post, we will look at some of the most common conjugate priors.
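
As a concrete instance, here is a small Beta-Binomial sketch (assuming SciPy; the coin-flip counts are made up): because the Beta prior is conjugate to the binomial likelihood, each posterior is again a Beta and can serve directly as the prior for the next batch.

```python
from scipy import stats

# Beta prior on a coin's heads probability (toy hyperparameters).
alpha, beta = 2.0, 2.0

# Observe data in two batches; conjugacy means the update is just addition.
for heads, tails in [(7, 3), (12, 8)]:
    alpha += heads
    beta += tails
    print(f"posterior: Beta({alpha:.0f}, {beta:.0f}), "
          f"mean = {alpha / (alpha + beta):.3f}")

# Updating on all the data at once gives the same answer as sequential updates.
print(stats.beta(2 + 19, 2 + 11).mean())  # 21/34 ≈ 0.618
```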

The Beta-Binomial Bayesian Model

With more data being generated day by day, I believe Bayesian statistics is the way to go, which is why I'm writing this series of posts on Bayesian statistics. In this post, I'll revisit the Beta-Binomial Bayesian model and show how the two communities (Python and R) have implemented it.

Gradient Methods

Gradient descent is one of the most popular optimization algorithms in machine learning, and it applies to both convex and non-convex problems. In this post, we will learn about the key ideas behind gradient descent and how to use it to solve optimization problems.
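
To preview the key idea, here is a minimal gradient-descent sketch on a toy convex quadratic (my own example, assuming NumPy): for f(x) = 0.5 x^T A x - b^T x the minimizer solves Ax = b, which gives an easy correctness check.

```python
import numpy as np

# Toy convex quadratic: f(x) = 0.5 x^T A x - b^T x, gradient A x - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b

x = np.zeros(2)
lr = 0.1                   # step size; must be < 2 / lambda_max(A) to converge
for _ in range(200):
    x -= lr * grad(x)      # step in the direction of steepest descent

print(x)                   # ≈ np.linalg.solve(A, b) = [0.2, 0.4]
```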

From SVD to PCA

The applications of Singular Value Decomposition (SVD) are manifold. In this post, we will focus on the application of SVD to Principal Component Analysis (PCA), a great tool for dimensionality reduction.
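
Here is a minimal sketch of the SVD-to-PCA route (assuming NumPy, with made-up data): center the data, take the SVD, and read the principal axes off the right singular vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))        # toy data matrix, rows = samples

# PCA via SVD: center the data, then X_c = U S Vt; rows of Vt are the
# principal axes and S**2/(n-1) are the explained variances.
X_c = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_c, full_matrices=False)

components = Vt[:2]                   # top-2 principal directions
scores = X_c @ components.T           # data projected onto them
explained_var = S**2 / (len(X) - 1)   # eigenvalues of the covariance matrix

# Sanity check: the same eigenvalues come from the covariance matrix directly.
eigvals = np.linalg.eigvalsh(np.cov(X_c, rowvar=False))[::-1]
print(np.allclose(explained_var, eigvals))  # True
```

Working from the SVD of the centered data avoids ever forming the covariance matrix, which is better numerically.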

QR Factorization

A QR factorization is a factorization of a matrix A into a product A = QR of an orthogonal matrix Q and an upper triangular matrix R. This kind of decomposition is useful in solving linear least squares problems and in the eigendecomposition of a matrix, which shows the structure of the matrix in terms of its eigenvalues and eigenvectors.
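
As a small illustration (assuming NumPy and SciPy; the random system is made up), here is QR used to solve a least-squares problem by back substitution.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 3))   # toy overdetermined system
b = rng.normal(size=50)

# Thin QR: A = QR with Q (50x3) having orthonormal columns,
# R (3x3) upper triangular.
Q, R = np.linalg.qr(A)

# Least squares: minimize ||Ax - b||. The normal equations reduce to
# R x = Q^T b, solved by back substitution since R is triangular.
x = solve_triangular(R, Q.T @ b)

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```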

Solving Linear Systems

Linear systems of equations are the bread and butter of numerical linear algebra. Solving them is at the core of many machine learning algorithms and engineering applications.
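
A minimal sketch, assuming NumPy and SciPy with a made-up 2x2 system: use a direct solver rather than forming the inverse, and reuse an LU factorization when the same matrix appears with many right-hand sides.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 3.0], [6.0, 3.0]])
b = np.array([10.0, 12.0])

# Direct solve (LAPACK under the hood) -- never form the inverse explicitly.
x = np.linalg.solve(A, b)
print(x)  # [1. 2.]

# Factor once, solve cheaply for each new right-hand side.
lu, piv = lu_factor(A)
print(lu_solve((lu, piv), b))  # same solution
```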

Floating-Point Arithmetic

Floating-point arithmetic is a way of representing real numbers in a computer. The representation is not exact, but it is fast and efficient, and it is a fundamental concept in numerical computing.
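
A few classic one-liners illustrate the point (assuming NumPy for the machine-epsilon query):

```python
import numpy as np

# Decimal fractions rarely have exact binary representations.
print(0.1 + 0.2 == 0.3)          # False
print(f"{0.1 + 0.2:.17f}")       # 0.30000000000000004

# Machine epsilon: the gap between 1.0 and the next representable double.
print(np.finfo(np.float64).eps)  # 2.220446049250313e-16

# Catastrophic cancellation: subtracting nearly equal numbers loses digits.
x = 1e-8
print((1 - np.cos(x)) / x**2)        # 0.0 -- digits destroyed by cancellation
print(2 * np.sin(x / 2)**2 / x**2)   # ≈ 0.5, the mathematically correct limit
```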

Dirichlet Distribution and Its Applications

From latent Dirichlet allocation to Bayesian inference, and beyond, the Dirichlet distribution is a powerful tool in the data scientist's toolbox.
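
A quick sketch of what the distribution looks like in practice (assuming NumPy; the concentration parameters are my own choices): every draw is itself a probability vector, which is why the Dirichlet serves as a prior over discrete distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each Dirichlet draw is non-negative and sums to 1.
for alpha in ([0.1, 0.1, 0.1], [1.0, 1.0, 1.0], [10.0, 10.0, 10.0]):
    sample = rng.dirichlet(alpha)
    print(alpha, "->", np.round(sample, 3), "sum =", sample.sum())
# Small alphas concentrate mass on the simplex corners (sparse draws);
# large alphas concentrate draws near the uniform distribution.
```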

Conjugate Priors - Binomial Beta Pair

Bayesian inference is almost 'everywhere' in data science; with the advance of computational power, it is now possible to apply it to high-dimensional data. In this post, we will discuss the conjugate prior for the binomial distribution.

The Johnson-Lindenstrauss Lemma

In the era of AI, the Johnson-Lindenstrauss lemma provides the mathematical foundation for the random projections and dimensionality reduction behind many machine learning and deep learning systems, including large language models such as ChatGPT.
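
A minimal sketch of the lemma in action, assuming NumPy and SciPy with made-up dimensions: a scaled random Gaussian projection approximately preserves all pairwise distances.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, k = 100, 10_000, 1_000      # points, ambient dim, target dim (made up)
X = rng.normal(size=(n, d))

# A random Gaussian projection scaled by 1/sqrt(k) approximately preserves
# pairwise Euclidean distances -- the heart of the JL lemma.
P = rng.normal(size=(d, k)) / np.sqrt(k)
Y = X @ P

ratios = pdist(Y) / pdist(X)       # projected / original distance, all pairs
print(ratios.min(), ratios.max())  # both close to 1 (distortion of a few percent)
```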

Locality Sensitive Hashing (LSH)

LSH is recognized as a key breakthrough that has had great impact in many fields of computer science including computer vision, databases, information retrieval, machine learning, and signal processing.
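
Here is a minimal random-hyperplane (SimHash-style) sketch for cosine similarity, assuming NumPy; the dimensions and vectors are my own toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_bits = 128, 16                    # vector dim and hash length (made up)
planes = rng.normal(size=(n_bits, d))  # random hyperplanes shared by all items

def simhash(v):
    # Sign pattern of projections onto random hyperplanes; vectors with a
    # small angle between them land on the same side of most planes.
    return tuple((planes @ v > 0).astype(int))

v = rng.normal(size=d)
u = v + 0.1 * rng.normal(size=d)       # a near-duplicate of v
w = rng.normal(size=d)                 # an unrelated vector

hamming = lambda a, b: sum(x != y for x, y in zip(a, b))
print(hamming(simhash(v), simhash(u)))  # small: near-duplicates mostly collide
print(hamming(simhash(v), simhash(w)))  # ~n_bits/2 for unrelated vectors
```

Bucketing items by their hash turns nearest-neighbor search into cheap hash lookups, at the price of a controllable error probability.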

Approximating Distinct Elements in a Stream

This post explains a probabilistic counting algorithm with which one can estimate the number of distinct elements in a large collection of data in a single pass.
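
Here is a minimal sketch of one classic such algorithm, the Flajolet-Martin counter, in plain Python (my own toy implementation; the XOR-mask "hash function" is a stand-in for a proper random hash family):

```python
import random

def trailing_zeros(h, bits=64):
    # Index of the lowest set bit of h (an all-zero hash maps to `bits`).
    return (h & -h).bit_length() - 1 if h else bits

def estimate_distinct(stream, seed=0):
    # Flajolet-Martin sketch: remember the largest count R of trailing zeros
    # seen in any item's hash; about 2^R distinct items explain that.
    mask = random.Random(seed).getrandbits(64)  # toy stand-in for a random hash
    R = 0
    for item in stream:
        h = (hash(item) ^ mask) & (2**64 - 1)
        R = max(R, trailing_zeros(h))
    return 2 ** R

# 100k events but only 5000 distinct users; one pass, O(1) memory.
stream = (f"user-{i % 5000}" for i in range(100_000))
print(estimate_distinct(stream))  # a rough power-of-two estimate near 5000
```

Real implementations average many such sketches (or use HyperLogLog) to tighten the crude power-of-two estimate.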

Develop Some Fluency in Probabilistic Thinking (Part III)

The foundation of machine learning and data science is probability theory. In this post, we will develop some fluency in probabilistic thinking through different examples, which prepare data scientists well for the sexiest job of the 21st century.

Develop Some Fluency in Probabilistic Thinking (Part II)

The foundation of machine learning and data science is probability theory. In this post, we will develop some fluency in probabilistic thinking through different examples, which prepare data scientists well for the sexiest job of the 21st century.

Develop Some Fluency in Probabilistic Thinking (Part I)

The foundation of machine learning and data science is probability theory. In this post, we will develop some fluency in probabilistic thinking through different examples, which prepare data scientists well for the sexiest job of the 21st century.

Probability Review

From time to time, we need to review the definitions and basic concepts of probability theory.