in between listening to the lectures i was leetcoding. my deep learning class taught about CNNs. a CNN is a type of neural network architecture that learns spatial patterns through hierarchical feature extraction, loosely mimicking human visual processing.
core components:
- convolutional layer: applies filters to detect features (edge -> shape -> object)
- filters/kernels: small matrices (e.g. 3x3 or 5x5) that slide across the image
- output dim: (input size - filter size) / stride + 1, without padding (see the numpy sketch after this list)
- padding: add a border of zeros around the input to preserve spatial dimensions
- "same" padding: (filter size - 1) / 2 keeps the output the same size as the input (stride 1, odd filter size)
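here's a quick numpy sketch (my own toy code, not from the lecture) of the sliding-filter operation and where the output dim formula comes from:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """naive 2d convolution (technically cross-correlation), no padding.
    output dim follows (input size - filter size) / stride + 1."""
    h, w = image.shape
    kh, kw = kernel.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # 5x5 input
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)  # 3x3 vertical edge detector (illustrative)
print(conv2d(image, edge_filter).shape)            # (3, 3) -> (5 - 3) / 1 + 1 = 3
```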
example (3x3 filter, stride 1):
- padding = (filter size - 1) / 2 = (3 - 1) / 2 = 1
- input: 5x5 image, add a 1-pixel border → 7x7
- filter: 3x3, so output size = (7 - 3) / 1 + 1 = 5x5
- the output size matches the input size, so spatial dimensions are preserved
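a tiny helper (my own naming) to sanity-check that arithmetic; it folds padding into the general formula (input size - filter size + 2*padding) / stride + 1:

```python
def output_size(input_size, filter_size, stride=1, padding=0):
    """spatial output dim of a conv layer, padding included"""
    return (input_size - filter_size + 2 * padding) // stride + 1

pad = (3 - 1) // 2                               # "same" padding for a 3x3 filter
print(output_size(5, 3, stride=1, padding=pad))  # 5 -> matches the input size
print(output_size(5, 3, stride=1, padding=0))    # 3 -> shrinks without padding
```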
what do 1x1 filters do?
- dimensionality reduction: reduce the number of channels while preserving spatial dimensions
- computational efficiency: fewer channels makes the following convolutions cheaper
- reduce parameters
- control network depth: add layers and nonlinearity without changing spatial size
- GoogLeNet/Inception networks popularized this
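a quick pytorch sketch (framework and the 256/64 channel sizes are my own choices, purely illustrative) of a 1x1 conv squeezing channels and cutting parameters, the same bottleneck trick the inception modules use:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 256, 28, 28)              # (batch, channels, height, width)

reduce = nn.Conv2d(256, 64, kernel_size=1)   # 1x1 conv: 256 -> 64 channels
print(reduce(x).shape)                       # torch.Size([1, 64, 28, 28]) -> spatial dims preserved

# parameter savings: squeeze channels with a 1x1 conv before the expensive 3x3 conv
direct = nn.Conv2d(256, 256, kernel_size=3, padding=1)
bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1),
    nn.Conv2d(64, 256, kernel_size=3, padding=1),
)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(direct), count(bottleneck))      # ~590k vs ~164k parameters
```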