Machine Learning, Deep Learning, Transformer Architecture and Implications for Blockchain

Victor Yeo
3 min read · Jan 2, 2024

Machine learning is a domain within the realm of Artificial Intelligence (AI) that uses algorithms and models capable of learning from data and independently making predictions or decisions.

Machine learning plays an important role in AI by utilizing algorithms to examine and interpret vast amounts of data, detect patterns, and acquire knowledge from the learning process.

Machine learning algorithms

There are three types of machine learning models:

Supervised learning: Using labeled data, it aims to do the following:
regression, to estimate continuous values
classification, to assign data to discrete categories

Unsupervised learning: Using unlabeled data, it aims to discover structure, such as clusters of similar data points, from patterns.

Reinforcement learning: The process learns iteratively by interacting with an environment and is rewarded for desired behaviors. A minimal sketch of the first two types follows this list.
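As a minimal sketch of supervised versus unsupervised learning, assuming scikit-learn is available (the data here is made up purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

# Supervised learning: labeled data (inputs X with known targets y).
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])
model = LinearRegression().fit(X, y)   # regression: estimate values
print(model.predict([[5.0]]))          # predicted target for a new input

# Unsupervised learning: unlabeled data, grouped by pattern.
points = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
clusters = KMeans(n_clusters=2, n_init=10).fit(points)
print(clusters.labels_)                # cluster assignment per point
```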

[Image: Machine Learning Algorithms]

The above image shows the different types of machine learning algorithms. This article will not go into further detail on the individual algorithms.

Deep learning algorithms

Deep learning is a branch of machine learning. It uses multiple layers of neural networks to acquire knowledge and make predictions. There are a few types of deep learning models:

Feedforward network: consists of layers of neurons arranged so that information flows only in the forward direction, from input to output (a minimal sketch follows this list).

Convolutional neural network (CNN): used for learning from images or video.

Recurrent neural network (RNN): used for sequential data such as text, speech, and time series. Long Short-Term Memory (LSTM) is an RNN architecture that excels at capturing long-term dependencies in data.

Attention network: maps a query and a set of key-value pairs to an output. Attention is a way of computing the relevance of different parts of the input and output sequences, by focusing on the words that are most relevant to the current output word.
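As a minimal sketch of a feedforward pass, assuming NumPy is available (the layer sizes and weights are made up; real networks learn the weights by gradient descent):

```python
import numpy as np

# Hypothetical sizes: 4 inputs, 8 hidden units, 3 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def relu(x):
    return np.maximum(0, x)

def forward(x):
    h = relu(x @ W1 + b1)   # hidden layer: information flows forward only
    return h @ W2 + b2      # output layer (logits)

print(forward(rng.normal(size=4)))
```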

Transformer Architecture

The transformer architecture revolutionizes the attention network by using self-attention. ChatGPT, and the large language model (LLM) it is built on, is an example use case of the transformer architecture.

The transformer model runs as follows (highly simplified):

Input text is transformed into input embeddings (vectors of numbers). The input embeddings are summed with positional encodings to capture positional information, then fed into the encoder block.
The output sequence goes through a similar process in the decoder block. Finally, the output of the decoder block is processed by a softmax layer to generate a prediction for the next word of the output sequence.
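As a rough NumPy sketch of the input side and the final softmax (the sinusoidal positional encoding follows the formula in "Attention Is All You Need"; the encoder and decoder stacks themselves are elided, and all sizes and values here are made up):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(...)
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / 10000 ** (2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

vocab_size, d_model, seq_len = 1000, 64, 10
embedding_table = np.random.randn(vocab_size, d_model)  # learned in practice
token_ids = np.array([5, 42, 7, 0, 0, 0, 0, 0, 0, 0])   # toy tokenized input

# Embeddings summed with positional encodings, then fed to the encoder.
x = embedding_table[token_ids] + positional_encoding(seq_len, d_model)
# ... encoder and decoder stacks omitted ...

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.random.randn(vocab_size)  # stand-in for the decoder output
next_word_probs = softmax(logits)     # probability of each vocabulary word
```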

[Image: Transformer Architecture]

There are two types of attention in a transformer network: self-attention and encoder-decoder attention.

Self-attention is when the encoder or the decoder computes the attention scores between its own input/output embeddings.

Encoder-decoder attention is when the decoder attends over the encoder output: the queries come from the decoder's self-attention layer, while the keys and values come from the encoder's representation of the input sequence.

The transformer architecture proposes using scaled dot-product attention to implement self-attention. The dot products are scaled by the inverse square root of the key dimension so that large dot products do not push the softmax into regions with extremely small gradients, which would slow down training.

Attention(Q, K, V) = softmax(QKᵀ / √d_k) V (taken from "Attention Is All You Need")

Q, K, V are the queries, keys, and values respectively. The query is the information that is being searched for, the key is the context or reference, and the value is the content that is returned for a match.
MatMul is matrix multiplication; it returns the matrix product of two arrays.
SoftMax turns a vector of scores into a probability distribution.
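A minimal NumPy sketch of the formula above (the tensor sizes are made up; real implementations batch this and split it across multiple attention heads):

```python
import numpy as np

def softmax(z):
    # numerically stable softmax along the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # MatMul, then scale by 1/sqrt(d_k)
    weights = softmax(scores)         # one probability distribution per query
    return weights @ V                # weighted sum of values

rng = np.random.default_rng(0)
enc_out = rng.normal(size=(6, 16))    # encoder output ("memory")
dec_h = rng.normal(size=(4, 16))      # decoder hidden states

# Self-attention: Q, K, V all come from the same sequence.
self_attn = scaled_dot_product_attention(dec_h, dec_h, dec_h)
# Encoder-decoder attention: Q from the decoder, K and V from the encoder.
cross_attn = scaled_dot_product_attention(dec_h, enc_out, enc_out)
print(self_attn.shape, cross_attn.shape)  # (4, 16) (4, 16)
```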

The scaled dot-product attention is used in every attention layer of the transformer, in both the encoder and the decoder blocks.

Implications for Blockchain

An open question for the future: whether the transformer model can be trained so that it can create and deploy a blockchain network based on user inputs and specifications.
