Chinese Truman

Understanding Transformer Visualization Concepts

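I recently needed to visualize the intermediate layers of a Transformer network to make the network easier to analyze, so I am recording some commonly used concepts here. The two most common methods are Attention Rollout and Attention Flow. Both compute token attentions recursively across every layer of the network. They differ mainly in what they assume about how attention weights in lower layers affect the information flow to higher layers, and in whether they compute the correlations between token attentions. To…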

2021-06-17

Visualization

#Visualization #Algorithms
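
The recursive, layer-by-layer computation mentioned in the excerpt above can be sketched for Attention Rollout as follows. This is a minimal sketch, not the post's own code. It assumes the per-layer attention matrices are already averaged over heads, and it assumes the residual connection is modeled by mixing each matrix equally with the identity, which is a common formulation of Attention Rollout. The names `matmul` and `attention_rollout` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def attention_rollout(layer_attentions):
    """Combine per-layer attention recursively, lowest layer first.

    layer_attentions: list of n x n matrices whose rows sum to 1
    (assumed to be averaged over attention heads already).
    """
    n = len(layer_attentions[0])
    # Start from the identity: before any layer, each token is only itself.
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layer_attentions:
        # Assumption: model the residual connection as 0.5 * A + 0.5 * I.
        mixed = [
            [0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        # Recursive step: this layer's attention applied to the rollout so far.
        rollout = matmul(mixed, rollout)
    return rollout


layers = [
    [[0.9, 0.1], [0.4, 0.6]],  # layer 1 attention (toy values)
    [[0.5, 0.5], [0.2, 0.8]],  # layer 2 attention (toy values)
]
result = attention_rollout(layers)
for row in result:
    print([round(value, 4) for value in row])
```

Each row of the result still sums to 1, so row i can be read as how much of each input token is estimated to reach token i at the top layer under these assumptions.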

kwargs.pop

pop(key[, default]): If key is in the dictionary, remove it and return its value, else return default. If default is not given and key is not in the dictionary, a KeyError is raised. The purpose of kwargs.pop() is…

2021-06-14

python

#python
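
The `dict.pop` behavior quoted in the excerpt above can be illustrated with a short sketch. The function `make_label` and its `color` and `size` options are hypothetical names chosen for illustration. The pattern shown, pulling known options out of `kwargs` and rejecting whatever is left, is one common use and not necessarily the one the post describes.

```python
def make_label(text, **kwargs):
    # Remove the options this function handles itself; each pop returns the
    # supplied value, or the default when the key is absent.
    color = kwargs.pop("color", "black")
    size = kwargs.pop("size", 12)
    # Anything still left in kwargs was not recognized.
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    return f"{text} ({color}, {size}pt)"


label = make_label("hello", color="red")
print(label)

options = {"a": 1}
first = options.pop("a")         # key present: removed and its value returned
second = options.pop("a", None)  # key absent, default given: default returned
try:
    options.pop("a")             # key absent, no default: raises KeyError
    raised = False
except KeyError:
    raised = True
print(first, second, raised)
```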

Understanding ViT Patch Embedding

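In ViT (Vision Transformer), Patch Embedding converts the original 2D image into a sequence of 1D patch embeddings. Suppose the input image has dimensions $H \times W \times C$, denoting its height, width, and number of channels. The Patch Embedding operation splits the input image into $N$ patches, each of size $P^2C$, and reshapes them into dimensions $N \times (P^2C…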

2021-06-11

Computer Vision

#Computer Vision
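
The split-and-flatten step described in the excerpt above can be sketched in plain Python. This is a minimal sketch under stated assumptions: the image is a nested list of shape H x W x C, the patch size P divides both H and W so that N = HW / P^2, and only the reshaping is shown, not the learned linear projection that ViT applies afterwards. The name `image_to_patches` is hypothetical.

```python
def image_to_patches(image, patch_size):
    """Split an H x W x C nested-list image into N flattened patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("H and W must be divisible by the patch size")
    patches = []
    # Walk the patch grid row by row, left to right.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat = []
            # Flatten one P x P x C patch into a vector of length P*P*C.
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    flat.extend(image[row][col])
            patches.append(flat)
    return patches


H, W, C, P = 4, 6, 3, 2
# Toy image whose values encode their own position, to make patches easy to check.
image = [
    [[(r * W + c) * C + ch for ch in range(C)] for c in range(W)]
    for r in range(H)
]
patches = image_to_patches(image, P)
print(len(patches), len(patches[0]))  # N and P*P*C
```

With these toy dimensions the result has N = 6 patches of length 12 each.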