We trained a convolutional neural network (CNN) based on the SegNet encoder-decoder architecture to colorize black-and-white portraits.
Photos of faces were downloaded from the Labeled Faces in the Wild (LFW) database by Huang et al., hosted on the University of Massachusetts website. Of these images, 2,000 were selected for training and 500 for validation during training. All 2,500 images were resized to 256 × 256 pixels. The network inputs were created by converting the images to black and white, while the targets were the original color images.
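The data preparation described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the project: it assumes the images are already loaded and resized into a NumPy array of shape (N, 256, 256, 3), that pixel values are scaled to [0, 1], and that grayscale conversion uses the common BT.601 luma weights. The function names (`rgb_to_gray`, `make_pairs`, `split_train_val`) are hypothetical, and random arrays stand in for the LFW photos.

```python
import numpy as np


def rgb_to_gray(rgb):
    """Convert an array of shape (..., 3) in [0, 1] to grayscale of shape (..., 1)."""
    # BT.601 luma weights; the exact conversion used originally is an assumption.
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb @ weights  # weighted sum over the channel axis
    return gray[..., np.newaxis]


def make_pairs(images_uint8):
    """Build (input, target) arrays from (N, 256, 256, 3) uint8 RGB images."""
    targets = images_uint8.astype(np.float32) / 255.0  # color targets in [0, 1]
    inputs = rgb_to_gray(targets).astype(np.float32)   # single-channel inputs
    return inputs, targets


def split_train_val(inputs, targets, n_train, n_val, seed=0):
    """Shuffle once, then take disjoint training and validation subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(inputs))
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    return (inputs[train_idx], targets[train_idx]), (inputs[val_idx], targets[val_idx])


# Stand-in data: 10 random "images" instead of the 2,500 LFW photos.
demo_images = np.random.default_rng(0).integers(
    0, 256, size=(10, 256, 256, 3), dtype=np.uint8
)
x, y = make_pairs(demo_images)
(train_x, train_y), (val_x, val_y) = split_train_val(x, y, n_train=8, n_val=2)
print(train_x.shape, train_y.shape, val_x.shape, val_y.shape)
```

With the real dataset, the same split would be called with `n_train=2000` and `n_val=500`.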
The CNN architecture was based on SegNet (Badrinarayanan et al.), modified so that each input image has dimensions 256 × 256 × 1 (black and white) and each output has dimensions 256 × 256 × 3 (color). ReLU activation was used for all layers except the last convolutional layer, which used sigmoid activation. The optimizer was RMSProp with a learning rate of 0.0001, and the loss function was binary cross-entropy. The model was trained for 100 epochs.
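To illustrate how a sigmoid output layer pairs with a binary cross-entropy loss for color prediction, the sketch below computes the loss on a single 256 × 256 × 3 output. It is a NumPy illustration only, not the training code: it assumes the color targets are scaled to [0, 1] so that each channel value can be compared with a sigmoid output, and the names `sigmoid` and `binary_cross_entropy` and the random stand-in data are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Squash pre-activations into (0, 1), matching the scaled color range."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_cross_entropy(target, pred, eps=1e-7):
    """Mean per-pixel, per-channel binary cross-entropy."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    losses = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(np.mean(losses))


rng = np.random.default_rng(0)
# Stand-in for the last convolutional layer's pre-activation output.
logits = rng.normal(size=(1, 256, 256, 3))
# Stand-in for a color target image scaled to [0, 1] (an assumption).
target = rng.uniform(size=(1, 256, 256, 3))

pred = sigmoid(logits)
loss = binary_cross_entropy(target, pred)
print(pred.shape, loss)
```

With targets in [0, 1], this loss is smallest when the prediction equals the target at every pixel, which is what makes it usable as a regression-style objective here.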
Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. University of Massachusetts, Amherst, Technical Report 07-49, October 2007.
Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. 2016.