Computer Vision Problems
1.
Computer Vision Problems
- Image Classification: 64x64 image, Cat? (0/1)
- Object detection
- Neural Style Transfer
Andrew Ng
2.
Deep Learning on large images
Cat? (0/1)
64x64
3.
Computer Vision Problem
vertical edges
horizontal edges
4.
Vertical edge detection
6x6 image:
3  0  1  2  7  4
1  5  8  9  3  1
2  7  2  5  1  3
0  1  3  1  7  8
4  2  1  6  2  8
2  4  5  2  3  9
3x3 filter:
1  0  -1
1  0  -1
1  0  -1
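A minimal sketch of how the 3x3 vertical edge filter could be slid over the 6x6 image from this slide, assuming no padding and a stride of 1. The function name `conv2d_valid` is hypothetical, and the operation is the cross-correlation that deep learning texts usually call convolution.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the products."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the current window and the filter.
            out[i, j] = np.sum(image[i:i + f_h, j:j + f_w] * kernel)
    return out

# The 6x6 image from the slide.
image = np.array([[3, 0, 1, 2, 7, 4],
                  [1, 5, 8, 9, 3, 1],
                  [2, 7, 2, 5, 1, 3],
                  [0, 1, 3, 1, 7, 8],
                  [4, 2, 1, 6, 2, 8],
                  [2, 4, 5, 2, 3, 9]])

# Vertical edge filter: +1 column on the left, -1 column on the right.
vertical_filter = np.array([[1, 0, -1],
                            [1, 0, -1],
                            [1, 0, -1]])

edges = conv2d_valid(image, vertical_filter)
print(edges)  # 4x4 result; the top-left entry is -5
```

The output is 4x4 because a 3x3 window fits in four positions along each side of a 6x6 image.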
5.
Vertical edge detection examples
Example 1 (bright on the left, dark on the right):
6x6 image, every row: 10 10 10 0 0 0
3x3 filter, every row: 1 0 -1
4x4 output, every row: 0 30 30 0
Example 2 (dark on the left, bright on the right):
6x6 image, every row: 0 0 0 10 10 10
3x3 filter, every row: 1 0 -1
4x4 output, every row: 0 -30 -30 0
6.
Valid and Same convolutions
“Valid”: No padding.
“Same”: Pad so that the output size is the same as the input size.
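The two cases can be written out as a short worked formula. This is a sketch that assumes an n x n input, an f x f filter with odd f, padding p, and a stride of 1; the symbols n and f are assumptions introduced here for illustration, and the valid case uses p = 0.

```latex
\begin{align*}
\text{Valid:}\quad & (n \times n) * (f \times f) \;\to\; (n - f + 1) \times (n - f + 1) \\
\text{Same:}\quad  & n + 2p - f + 1 = n \;\Longrightarrow\; p = \frac{f - 1}{2}
\end{align*}
```

For example, a 3x3 filter would need p = 1 and a 5x5 filter would need p = 2 to keep the size unchanged.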
7.
Summary of convolutions
padding p
stride s
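A small sketch of how padding p and stride s determine the output size, assuming a square n x n input and a square f x f filter (n, f, and the function name `conv_output_size` are assumptions for illustration). The division is rounded down, so windows that would hang over the edge are dropped.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the output: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

print(conv_output_size(6, 3))            # valid convolution
print(conv_output_size(6, 3, p=1))       # same convolution
print(conv_output_size(7, 3, p=0, s=2))  # strided convolution
```

With these inputs the three calls give 4, 6, and 3.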
8.
Multiple filters
Input: 6x6x3
Filter 1: 3x3x3, output 4x4
Filter 2: 3x3x3, output 4x4
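A sketch of how two 3x3x3 filters applied to a 6x6x3 input could each produce a 4x4 map, with the maps stacked into a 4x4x2 volume. The values are random placeholders, and the function name `conv_volume` and the (filters, height, width, channels) layout are assumptions for illustration.

```python
import numpy as np

def conv_volume(image, filters):
    """image: (n_h, n_w, c); filters: (k, f_h, f_w, c). No padding, stride 1."""
    n_h, n_w, _ = image.shape
    k, f_h, f_w, _ = filters.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1, k))
    for idx in range(k):
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # Each filter spans all input channels and yields one number per position.
                window = image[i:i + f_h, j:j + f_w, :]
                out[i, j, idx] = np.sum(window * filters[idx])
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6, 3))       # placeholder 6x6x3 input
filters = rng.standard_normal((2, 3, 3, 3))  # two placeholder 3x3x3 filters

feature_maps = conv_volume(image, filters)
print(feature_maps.shape)  # (4, 4, 2)
```

The number of output channels equals the number of filters, not the number of input channels.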
9.
Pooling layer: Max pooling
4x4 input:
1  3  2  1
2  9  1  1
1  3  2  3
5  6  1  2
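A minimal sketch of max pooling on the 4x4 input from this slide. The window size f = 2 and stride s = 2 are assumptions (they are not stated on the slide text), and `max_pool2d` is a hypothetical helper name.

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    """Take the maximum of each f x f window, moving s cells at a time."""
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]])

pooled = max_pool2d(x)
print(pooled)  # [[9 2]
               #  [6 3]]
```

Pooling has hyperparameters (f and s) but no parameters to learn.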
10.
Pooling layer: Average pooling
4x4 input:
1  3  2  1
2  9  1  1
1  4  2  3
5  6  1  2
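A short sketch of average pooling on the 4x4 input from this slide, assuming non-overlapping 2x2 windows (f = 2, s = 2, which the slide text does not state). It uses a reshape so that each window becomes its own block, which only works when the stride equals the window size.

```python
import numpy as np

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 4, 2, 3],
              [5, 6, 1, 2]], dtype=float)

f = 2  # assumed window size and stride
h, w = x.shape

# Axes become (block row, row in block, block column, column in block).
blocks = x.reshape(h // f, f, w // f, f)

# Average over the two "inside the block" axes.
averaged = blocks.mean(axis=(1, 3))
print(averaged)  # [[3.75 1.25]
                 #  [4.   2.  ]]
```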
11.
Types of layers in a convolutional network:
- Convolution
- Pooling
- Fully connected
12.
Outline
Classic networks:
• LeNet-5
• AlexNet
• VGG
ResNet
Inception
13.
LeNet-5
Input: image of a handwritten digit (7)
avg pool: f=2, s=2
avg pool: f=2, s=2
FC: 120
FC: 84
[LeCun et al., 1998. Gradient-based learning applied to document recognition]
14.
AlexNet
MAX-POOL
MAX-POOL
MAX-POOL
3x3
= 9216
4096
4096
Softmax 1000
[Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks]
15.
VGG-16
CONV = 3x3 filter, s = 1, same
Input: 224x224x3
POOL
[CONV 128] x2
POOL
POOL
POOL
POOL
FC 4096
FC 4096
Softmax 1000
[Simonyan & Zisserman 2015. Very deep convolutional networks for large-scale image recognition]
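A rough sketch of how the spatial size shrinks through VGG-16, assuming each POOL is a 2x2 max pool with stride 2 (an assumption, since the pooling settings are not in the slide text) and using the slide's 3x3, stride-1, same-padded convolutions. The helper names are hypothetical and channel counts are left out.

```python
def same_conv(n, f=3, s=1):
    """Same-padded convolution: pad by (f - 1) / 2 so the size is unchanged at s = 1."""
    p = (f - 1) // 2
    return (n + 2 * p - f) // s + 1

def pool(n, f=2, s=2):
    """Assumed 2x2 pooling with stride 2: halves the side length."""
    return (n - f) // s + 1

size = 224          # input is 224x224x3
trace = [size]
for _ in range(5):  # five CONV groups, each followed by a POOL
    size = same_conv(size)
    size = pool(size)
    trace.append(size)

print(trace)  # [224, 112, 56, 28, 14, 7]
```

Under these assumptions only the pooling layers change the height and width; the convolutions leave them untouched.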
16.
Inception network
[Szegedy et al., 2014, Going Deeper with Convolutions]
17.
What are localization and detection?
Image classification
Classification with localization
Detection
18.
Classification with localization
1 - pedestrian
2 - car
3 - motorcycle
4 - background
19.
Defining the target label y
1 - pedestrian
2 - car
3 - motorcycle
4 - background
Need to output class label (1-4)
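One common way to encode such a target is sketched below. The layout y = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3] is an assumption here, since the slide text does not show it: p_c flags whether an object is present, the b values describe the bounding box, and c_1 to c_3 one-hot encode classes 1 to 3. The helper `make_target` and the box values are hypothetical.

```python
import numpy as np

# Classes 1-3; class 4 (background) is treated as "no object".
CLASSES = ("pedestrian", "car", "motorcycle")

def make_target(class_name=None, box=None):
    """Build y = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3] (assumed layout)."""
    y = np.zeros(8)
    if class_name is None:
        # Background: p_c = 0; the remaining entries are not used.
        return y
    y[0] = 1.0                               # p_c: an object is present
    y[1:5] = box                             # b_x, b_y, b_h, b_w
    y[5 + CLASSES.index(class_name)] = 1.0   # one-hot class entry
    return y

y_car = make_target("car", box=(0.5, 0.7, 0.3, 0.4))  # hypothetical box
y_background = make_target()
print(y_car)
print(y_background)
```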
20.
Sliding windows detection
21.
Evaluating object localization
“Correct” if IoU ≥ 0.5
More generally, IoU is a measure of the overlap between two bounding boxes.
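A minimal sketch of intersection over union (IoU) for two axis-aligned boxes. The (x1, y1, x2, y2) corner format is an assumption made for illustration; the slide does not specify how boxes are stored.

```python
def iou(box_a, box_b):
    """Intersection area divided by union area for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes
```

The value ranges from 0 for disjoint boxes to 1 for identical boxes.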
22.
Non-max suppression example
23.
Non-max suppression algorithm
Each output prediction is a detection probability p_c together with a bounding box.
Discard all boxes with p_c below a chosen threshold.
While there are any remaining boxes:
• Pick the box with the largest p_c. Output that as a prediction.
• Discard any remaining box whose IoU with the box output in the previous step is above a chosen threshold.
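The steps above can be sketched as follows. The threshold defaults (0.6 for the score, 0.5 for IoU), the (score, box) tuple format, the (x1, y1, x2, y2) box corners, and the example boxes are all assumptions for illustration, not values taken from the slide text.

```python
def iou(box_a, box_b):
    """IoU for boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(predictions, score_threshold=0.6, iou_threshold=0.5):
    """predictions: list of (score, box). Thresholds are assumed defaults."""
    # Step 1: discard low-confidence boxes.
    remaining = [p for p in predictions if p[0] > score_threshold]
    kept = []
    while remaining:
        # Step 2: output the most confident remaining box.
        best = max(remaining, key=lambda p: p[0])
        kept.append(best)
        # Step 3: drop that box and everything that overlaps it too much.
        remaining = [p for p in remaining
                     if p is not best and iou(p[1], best[1]) < iou_threshold]
    return kept

# Hypothetical detections: two overlapping boxes, one separate box, one weak box.
detections = [(0.9, (0, 0, 10, 10)),
              (0.8, (1, 1, 11, 11)),
              (0.7, (20, 20, 30, 30)),
              (0.3, (40, 40, 50, 50))]

kept = non_max_suppression(detections)
print([score for score, _ in kept])  # [0.9, 0.7]
```

With several object classes, a common choice is to run this procedure once per class.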
24.
Non-max suppression example
Detection probabilities of the candidate boxes: 0.6, 0.8, 0.9, 0.7, 0.7
25.
Anchor box example
Anchor box 1:
y =
Anchor box 2: