Results
It is difficult to evaluate this model, because there is no clear metric to measure the “realness” of images.
Here are some of the results I get on my validation set:
I have also tried generating with false captions, to see whether the generator has learned to use them:
We do not see any clear differences, so I think the conditioning on the caption has little effect in this generator model. However, the loss Ld_fake_captions goes to zero on the discriminator, which means that the discriminator predicts the reject class (0) for images from the dataset paired with an invalid caption. The discriminator is therefore able to use the information contained in the embeddings.
The generator's inability to use them then points to a problem in the generator architecture.
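To make the argument about Ld_fake_captions concrete, here is a minimal sketch of how such a term could be computed. It assumes the discriminator outputs a probability of "accept" and that the loss is a binary cross-entropy against the reject class (0). The helper names `bce`, `mismatch_captions` and `ld_fake_captions` are hypothetical, and the cyclic shift is only one simple way to build invalid pairs (it assumes the captions in a batch are distinct).

```python
import math


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def mismatch_captions(captions):
    """Pair each image with the caption of the next image (cyclic shift).

    Assumption: captions in the batch are distinct, so every image
    ends up with a caption that does not describe it.
    """
    return captions[1:] + captions[:1]


def ld_fake_captions(disc_probs):
    """Mean BCE against the reject class (0).

    disc_probs: discriminator outputs for real images paired with
    mismatched captions (hypothetical interface).
    """
    return sum(bce(p, 0) for p in disc_probs) / len(disc_probs)


# A discriminator that ignores captions would still accept the real images
# (outputs near 1), giving a large loss; one that reads the embeddings
# rejects them (outputs near 0), giving a loss near zero.
loss_ignoring = ld_fake_captions([0.9, 0.95, 0.9])
loss_using = ld_fake_captions([0.01, 0.02, 0.01])
```

Under these assumptions, a loss close to zero can only be reached if the discriminator distinguishes valid from invalid captions on otherwise identical real images, which is the reasoning used above.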
In future work, I will address this issue.
In the meantime, we can vary the conditioning to observe how much the model has learned to depend on it.
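One possible way to run such an experiment, sketched under assumptions: keep the noise vector fixed, slide the caption embedding linearly from one caption to another, and measure how much the generated image changes. The `generator(z, embedding)` call signature and the helper names are hypothetical, and images are represented as flat lists of floats purely for illustration.

```python
def lerp(a, b, t):
    """Linear interpolation between two embeddings of equal length."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def conditioning_sweep(generator, z, emb_a, emb_b, steps=5):
    """Generate images with fixed noise z while moving from emb_a to emb_b."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    outputs = []
    for i in range(steps):
        t = i / (steps - 1)
        outputs.append(generator(z, lerp(emb_a, emb_b, t)))
    return outputs


def mean_abs_diff(img_a, img_b):
    """Average absolute pixel difference between two flat images."""
    return sum(abs(x - y) for x, y in zip(img_a, img_b)) / len(img_a)


# Toy stand-ins for a trained generator (assumptions, not the real model):
def ignores_caption(z, emb):
    return list(z)


def uses_caption(z, emb):
    return [zi + emb[0] for zi in z]


z = [0.0, 0.0]
flat = conditioning_sweep(ignores_caption, z, [0.0], [1.0], steps=3)
varied = conditioning_sweep(uses_caption, z, [0.0], [1.0], steps=3)
dependence_flat = mean_abs_diff(flat[0], flat[-1])
dependence_varied = mean_abs_diff(varied[0], varied[-1])
```

A difference near zero between the first and last image of the sweep would suggest that the generator ignores the caption embedding, while a larger value would indicate some dependence on it.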
Observation
Room pictures
The room pictures are well reconstructed, maybe because rooms are geometric, with a fairly simple structure. Another possibility is that the training set contains a lot of room-related pictures. The vehicles are also well reconstructed.
Faces
The model has difficulty generating faces and complex patterns. Maybe I need to train the model for longer or increase the number of layers to allow it to express more complex structures.
More results
Here are more results, with the associated captions shown in tooltips.