This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Extending the DCGAN example implemented with the Gluon API to provide a more straightforward evaluation of the generated images #12790

Merged: 31 commits, Oct 21, 2018
Changes from 12 commits

Commits (31)
8865af0
add inception_score to metric dcgan model
pengxin99 Oct 10, 2018
3655df5
Update README.md
pengxin99 Oct 10, 2018
bd8c6b0
add two pic
pengxin99 Oct 10, 2018
f0860b6
Merge branch 'dcgan-inception_score' of https://github.com/pengxin99/…
pengxin99 Oct 10, 2018
8a7eadd
updata readme
pengxin99 Oct 10, 2018
6322a34
updata
pengxin99 Oct 10, 2018
8c6f058
Update README.md
pengxin99 Oct 10, 2018
ec7fd88
add license
pengxin99 Oct 11, 2018
9a4c1e4
Merge branch 'dcgan-inception_score' of https://github.com/pengxin99/…
pengxin99 Oct 11, 2018
27b8837
refine1
pengxin99 Oct 12, 2018
ef64b31
refine2
pengxin99 Oct 12, 2018
5664831
refine3
pengxin99 Oct 12, 2018
371828d
fix review comments
pengxin99 Oct 17, 2018
2c82b5a
Update README.md
pengxin99 Oct 17, 2018
9396b02
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
618e97d
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
6602495
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
2978122
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
e9043e1
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
1da9dbe
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
0c29ef7
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
d47e07d
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
8a08248
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
8f6c527
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
74c8275
Update example/gluon/DCGAN/README.md
aaronmarkham Oct 18, 2018
68030d0
modify sn_gan file links to DCGAN
pengxin99 Oct 18, 2018
c4a7616
update pic links to web-data
pengxin99 Oct 18, 2018
1cee0c4
update the pic path of readme.md
pengxin99 Oct 19, 2018
6181c15
rm folder pic/, and related links update to https://github.com/dmlc/w…
pengxin99 Oct 19, 2018
1f0e7cb
Merge branch 'dcgan-inception_score' of https://github.com/pengxin99/…
pengxin99 Oct 19, 2018
18065e8
Update README.md
juliusshufan Oct 19, 2018
31 changes: 31 additions & 0 deletions example/gluon/DCgan/README.md
@@ -0,0 +1,31 @@
# DCgan in MXNet
Contributor:
Maybe we should call it DCGAN everywhere.


Train the DCgan with MXNet, and evaluate the DCgan model with inception_score.
Contributor:
Deep Convolutional Generative Adversarial Network (DCGAN) implementation with Apache MXNet Gluon. This implementation uses inception_score to evaluate the model.

You can use this reference implementation on the MNIST and CIFAR-10 datasets.


The DCgan model is from: [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434)

The inception score implementation refers to [openai/improved-gan](https://github.com/openai/improved-gan).

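For reference, the training script below imports `get_inception_score` from the bundled `inception_score.py` and calls it on each batch of generated images. A minimal usage sketch (the random batch only stands in for generator output and illustrates the call pattern used in `dcgan.py`):

```python
import mxnet as mx
from inception_score import get_inception_score  # provided in this example folder

# stand-in for a batch of generated images, NCHW layout, values roughly in [-1, 1]
fake = mx.nd.random.normal(0, 1, shape=(64, 3, 64, 64))
score, _ = get_inception_score(fake)  # same call pattern as in dcgan.py
print('inception_score:', score)
```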

#### Generated pic (CIFAR-10 dataset)
![Generated pic](https://github.com/pengxin99/incubator-mxnet/blob/dcgan-inception_score/example/gluon/DCgan/pic/fake_img_iter_13900.png)

#### Generated pic (MNIST dataset)
![Generated pic](https://github.com/pengxin99/incubator-mxnet/blob/dcgan-inception_score/example/gluon/DCgan/pic/fake_img_iter_21700.png)

#### inception_score on CPU and GPU (the real images' score is around 3.3)
CPU & GPU

![inception_socre_with_cpu](https://github.com/pengxin99/incubator-mxnet/blob/dcgan-inception_score/example/gluon/DCgan/pic/inception_score_cifar10_cpu.png)
![inception_score_with_gpu](https://github.com/pengxin99/incubator-mxnet/blob/dcgan-inception_score/example/gluon/DCgan/pic/inception_score_cifar10.png)
## Quick start
Use the command below to see the configurations you can set:
```bash
python dcgan.py -h
```

Use the command below to train the DCgan model with the default configuration and dataset (CIFAR-10), and evaluate it with inception_score:
```bash
python dcgan.py
```
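The options can be combined; for example, the following command (flags taken from the argument parser in `dcgan.py`, values chosen only for illustration) trains on MNIST for 10 epochs on a GPU and writes results to a custom folder:

```bash
python dcgan.py --dataset mnist --nepoch 10 --cuda --outf ./results_mnist
```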
Empty file added example/gluon/DCgan/__init__.py
Empty file.
317 changes: 317 additions & 0 deletions example/gluon/DCgan/dcgan.py
@@ -0,0 +1,317 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import matplotlib as mpl
mpl.use('Agg')
from matplotlib import pyplot as plt

import argparse
import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
from mxnet import autograd
import numpy as np
import logging
from datetime import datetime
import os
import time

from inception_score import get_inception_score

def fill_buf(buf, i, img, shape):
Contributor:
Please add code docs for readability and easier maintenance of this script.

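# Copies image `img` into tile slot `i` of the grid buffer `buf` used by visual();
# `shape` is the (height, width) of a single image.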
n = buf.shape[0]//shape[1]
m = buf.shape[1]//shape[0]

sx = (i%m)*shape[0]
sy = (i//m)*shape[1]
buf[sy:sy+shape[1], sx:sx+shape[0], :] = img
return None

def visual(title, X, name):
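# Tiles a batch of images (NCHW) into one grid image, rescales it to uint8, and saves it to file `name` with matplotlib.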
assert len(X.shape) == 4
X = X.transpose((0, 2, 3, 1))
X = np.clip((X - np.min(X))*(255.0/(np.max(X) - np.min(X))), 0, 255).astype(np.uint8)
n = np.ceil(np.sqrt(X.shape[0]))
buff = np.zeros((int(n*X.shape[1]), int(n*X.shape[2]), int(X.shape[3])), dtype=np.uint8)
for i, img in enumerate(X):
fill_buf(buff, i, img, X.shape[1:3])
buff = buff[:,:,::-1]
plt.imshow(buff)
plt.title(title)
plt.savefig(name)


parser = argparse.ArgumentParser(description='Train a DCgan model for image generation '
                                 'and then use inception_score to evaluate the result.')
parser.add_argument('--dataset', type=str, default='cifar10', help='dataset to use. options are cifar10 and mnist.')
parser.add_argument('--batch-size', type=int, default=64, help='input batch size')
parser.add_argument('--nz', type=int, default=100, help='size of the latent z vector')
parser.add_argument('--ngf', type=int, default=64, help='number of channels in each generator filter layer')
parser.add_argument('--ndf', type=int, default=64, help='number of channels in each discriminator filter layer')
parser.add_argument('--nepoch', type=int, default=25, help='number of epochs to train for')
parser.add_argument('--lr', type=float, default=0.0002, help='learning rate, default=0.0002')
parser.add_argument('--beta1', type=float, default=0.5, help='beta1 for adam. default=0.5')
parser.add_argument('--cuda', action='store_true', help='enables cuda')
Contributor:
Why is this required if --ngpu is given?

parser.add_argument('--ngpu', type=int, default=1, help='number of GPUs to use')
parser.add_argument('--netG', default='', help="path to netG (to continue training)")
parser.add_argument('--netD', default='', help="path to netD (to continue training)")
parser.add_argument('--outf', default='./results', help='folder to output images and model checkpoints')
parser.add_argument('--check-point', default=True, help="save results at each epoch or not")
parser.add_argument('--inception_score', type=bool, default=True, help='To record the inception_score, default is True.')

opt = parser.parse_args()
print(opt)

logging.basicConfig(level=logging.DEBUG)
ngpu = int(opt.ngpu)
nz = int(opt.nz)
ngf = int(opt.ngf)
ndf = int(opt.ndf)
nc = 3
if opt.cuda:
ctx = mx.gpu(0)
Contributor:
not using ngpu?

else:
ctx = mx.cpu()
check_point = bool(opt.check_point)
outf = opt.outf
dataset = opt.dataset

if not os.path.exists(outf):
os.makedirs(outf)


def transformer(data, label):
# resize to 64x64
data = mx.image.imresize(data, 64, 64)
# transpose from (64, 64, 3) to (3, 64, 64)
data = mx.nd.transpose(data, (2, 0, 1))
# normalize to [-1, 1]
data = data.astype(np.float32)/128 - 1
# if image is greyscale, repeat 3 times to get RGB image.
if data.shape[0] == 1:
data = mx.nd.tile(data, (3, 1, 1))
return data, label


# load the dataset, batched into batch_size samples at a time
def get_dataset(dataset):
# mnist
if dataset == "mnist":
train_data = gluon.data.DataLoader(
gluon.data.vision.MNIST('./data', train=True, transform=transformer),
batch_size=opt.batch_size, shuffle=True, last_batch='discard')

val_data = gluon.data.DataLoader(
gluon.data.vision.MNIST('./data', train=False, transform=transformer),
batch_size=opt.batch_size, shuffle=False)
# cifar10
elif dataset == "cifar10":
train_data = gluon.data.DataLoader(
gluon.data.vision.CIFAR10('./data', train=True, transform=transformer),
batch_size=opt.batch_size, shuffle=True, last_batch='discard')

val_data = gluon.data.DataLoader(
gluon.data.vision.CIFAR10('./data', train=False, transform=transformer),
batch_size=opt.batch_size, shuffle=False)

return train_data, val_data


def get_netG():
# build the generator
netG = nn.Sequential()
with netG.name_scope():
# input is Z, going into a convolution
netG.add(nn.Conv2DTranspose(ngf * 8, 4, 1, 0, use_bias=False))
netG.add(nn.BatchNorm())
netG.add(nn.Activation('relu'))
# state size. (ngf*8) x 4 x 4
netG.add(nn.Conv2DTranspose(ngf * 4, 4, 2, 1, use_bias=False))
netG.add(nn.BatchNorm())
netG.add(nn.Activation('relu'))
# state size. (ngf*8) x 8 x 8
netG.add(nn.Conv2DTranspose(ngf * 2, 4, 2, 1, use_bias=False))
netG.add(nn.BatchNorm())
netG.add(nn.Activation('relu'))
# state size. (ngf*8) x 16 x 16
netG.add(nn.Conv2DTranspose(ngf, 4, 2, 1, use_bias=False))
netG.add(nn.BatchNorm())
netG.add(nn.Activation('relu'))
# state size. (ngf*8) x 32 x 32
netG.add(nn.Conv2DTranspose(nc, 4, 2, 1, use_bias=False))
netG.add(nn.Activation('tanh'))
# state size. (nc) x 64 x 64

return netG


def get_netD():
# build the discriminator
netD = nn.Sequential()
with netD.name_scope():
# input is (nc) x 64 x 64
netD.add(nn.Conv2D(ndf, 4, 2, 1, use_bias=False))
netD.add(nn.LeakyReLU(0.2))
# state size. (ndf) x 32 x 32
netD.add(nn.Conv2D(ndf * 2, 4, 2, 1, use_bias=False))
netD.add(nn.BatchNorm())
netD.add(nn.LeakyReLU(0.2))
# state size. (ndf) x 16 x 16
netD.add(nn.Conv2D(ndf * 4, 4, 2, 1, use_bias=False))
netD.add(nn.BatchNorm())
netD.add(nn.LeakyReLU(0.2))
# state size. (ndf) x 8 x 8
netD.add(nn.Conv2D(ndf * 8, 4, 2, 1, use_bias=False))
netD.add(nn.BatchNorm())
netD.add(nn.LeakyReLU(0.2))
# state size. (ndf) x 4 x 4
netD.add(nn.Conv2D(2, 4, 1, 0, use_bias=False))

return netD

def get_configurations(netG, netD):
# loss
loss = gluon.loss.SoftmaxCrossEntropyLoss()

# initialize the generator and the discriminator
netG.initialize(mx.init.Normal(0.02), ctx=ctx)
netD.initialize(mx.init.Normal(0.02), ctx=ctx)

# trainer for the generator and the discriminator
trainerG = gluon.Trainer(netG.collect_params(), 'adam', {'learning_rate': opt.lr, 'beta1': opt.beta1})
trainerD = gluon.Trainer(netD.collect_params(), 'adam', {'learning_rate': opt.lr, 'beta1': opt.beta1})

return loss, trainerG, trainerD


def ins_save(inception_score):
# draw the inception_score curve
length = len(inception_score)
x = np.arange(0, length)
plt.figure(figsize=(8.0, 6.0))
plt.plot(x, inception_score)
plt.xlabel("iter/100")
plt.ylabel("inception_score")
plt.savefig("inception_score.png")


# main function
def main():

# to get the dataset and net configuration
train_data, val_data = get_dataset(dataset)
netG = get_netG()
netD = get_netD()
loss, trainerG, trainerD = get_configurations(netG, netD)

# set labels
real_label = mx.nd.ones((opt.batch_size,), ctx=ctx)
fake_label = mx.nd.zeros((opt.batch_size,), ctx=ctx)

metric = mx.metric.Accuracy()
print('Training... ')
stamp = datetime.now().strftime('%Y_%m_%d-%H_%M')

iter = 0

# lists for tracking the training metrics
loss_d = []
loss_g = []
inception_score = []

for epoch in range(opt.nepoch):
tic = time.time()
btic = time.time()
for data, _ in train_data:
############################
# (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
###########################
# train with real_t
data = data.as_in_context(ctx)
noise = mx.nd.random.normal(0, 1, shape=(opt.batch_size, nz, 1, 1), ctx=ctx)

with autograd.record():
output = netD(data)
output = output.reshape((opt.batch_size, 2))
Contributor:
why is this required?

Contributor (author):
The reshape is to make sure the output shape is correct for the loss calculation.

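For context (derived from the get_netD definition above; this explanation is not in the thread itself): the final Conv2D(2, 4, 1, 0) runs on a 4x4 feature map, so netD outputs shape (batch_size, 2, 1, 1), and the reshape flattens it to (batch_size, 2) as SoftmaxCrossEntropyLoss expects. A small standalone check of that last layer:

```python
import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn

head = nn.Conv2D(2, 4, 1, 0, use_bias=False)   # the last discriminator layer
head.initialize(mx.init.Normal(0.02))
feat = nd.random.normal(shape=(8, 512, 4, 4))  # (N, ndf*8, 4, 4) from the earlier blocks
out = head(feat)
print(out.shape)                               # (8, 2, 1, 1)
print(out.reshape((8, 2)).shape)               # (8, 2), the shape fed to the loss
```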
errD_real = loss(output, real_label)
metric.update([real_label, ], [output, ])
Contributor:
Please move metric update and backward pass out of autograd.record()
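A minimal sketch of the suggested change, reusing the variable names defined earlier in main() (illustrative only, not the committed fix): keep only the forward passes and the loss computation inside autograd.record(), and call backward() and metric.update() outside the recording scope.

```python
with autograd.record():
    # forward passes and losses only
    output_real = netD(data).reshape((opt.batch_size, 2))
    errD_real = loss(output_real, real_label)
    fake = netG(noise)
    output_fake = netD(fake.detach()).reshape((opt.batch_size, 2))
    errD_fake = loss(output_fake, fake_label)
    errD = errD_real + errD_fake
# backward pass and metric bookkeeping moved outside the recording scope
errD.backward()
metric.update([real_label], [output_real])
metric.update([fake_label], [output_fake])
trainerD.step(opt.batch_size)
```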


fake = netG(noise)
output = netD(fake.detach())
output = output.reshape((opt.batch_size, 2))
errD_fake = loss(output, fake_label)
errD = errD_real + errD_fake
errD.backward()
metric.update([fake_label,], [output,])

trainerD.step(opt.batch_size)

############################
# (2) Update G network: maximize log(D(G(z)))
###########################
with autograd.record():
output = netD(fake)
output = output.reshape((-1, 2))
errG = loss(output, real_label)
errG.backward()
Contributor:
Move backward pass out of autograd.record


trainerG.step(opt.batch_size)

name, acc = metric.get()
# logging.info('speed: {} samples/s'.format(opt.batch_size / (time.time() - btic)))
Contributor:
remove?

logging.info('discriminator loss = %f, generator loss = %f, binary training acc = %f at iter %d epoch %d'
% (mx.nd.mean(errD).asscalar(), mx.nd.mean(errG).asscalar(), acc, iter, epoch))
if iter % 100 == 0:
Contributor:
default iter is 25. why % 100 here?

Contributor (author):
This '100' means that every 100 iterations the generated images and the inception_score are saved; I have set it as a parameter. A default iteration count was not set before, and the default number of epochs is 25.

visual('gout', fake.asnumpy(), name=os.path.join(outf, 'fake_img_iter_%d.png' % iter))
visual('data', data.asnumpy(), name=os.path.join(outf, 'real_img_iter_%d.png' % iter))
# record the metric data
loss_d.append(errD)
loss_g.append(errG)
if opt.inception_score:
score, _ = get_inception_score(fake)
inception_score.append(score)

iter = iter + 1
btic = time.time()

name, acc = metric.get()
metric.reset()
logging.info('\nbinary training acc at epoch %d: %s=%f' % (epoch, name, acc))
logging.info('time: %f' % (time.time() - tic))

# save check_point
if check_point:
netG.save_parameters(os.path.join(outf,'generator_epoch_%d.params' %epoch))
netD.save_parameters(os.path.join(outf,'discriminator_epoch_%d.params' % epoch))

# save parameter
netG.save_parameters(os.path.join(outf, 'generator.params'))
netD.save_parameters(os.path.join(outf, 'discriminator.params'))

# visualize the inception_score curve as a picture
if opt.inception_score:
ins_save(inception_score)


if __name__ == '__main__':
if opt.inception_score:
print("Use inception_score to metric this DCgan model, the reusult is save as a picture named \"inception_score.png\"!")
main()

