Layout and context understanding for image synthesis with scene graphs
College
College of Computer Studies
Department/Unit
Software Technology
Document Type
Conference Proceeding
Source Title
Proceedings - International Conference on Image Processing, ICIP
Volume
2019-September
First Page
1905
Last Page
1909
Publication Date
9-1-2019
Abstract
Advances in text-to-image synthesis have made it possible to generate remarkable images from textual descriptions. However, these methods are designed to generate only one object with varying attributes. They struggle with complex descriptions containing multiple arbitrary objects, since these require information on the placement and size of each object in the image. Recently, a method that infers object layouts from scene graphs has been proposed as a solution to this problem. However, that method uses only object labels to describe the layout, which fails to capture the appearance of some objects. Moreover, its model is biased towards generating rectangular-shaped objects in the absence of ground-truth masks. In this paper, we propose an object encoding module that captures object features and supplies them as additional information to the image generation network. We also introduce a graph-cut-based segmentation method that infers object masks from bounding boxes to better model object shapes. Our method produces more discernible images with more realistic shapes than those generated by the current state-of-the-art method. © 2019 IEEE.
Digital Object Identifier (DOI)
10.1109/ICIP.2019.8803182
Recommended Citation
Talavera, A., Tan, D., Azcarraga, A. P., & Hua, K. (2019). Layout and context understanding for image synthesis with scene graphs. Proceedings - International Conference on Image Processing, ICIP, 2019-September, 1905-1909. https://doi.org/10.1109/ICIP.2019.8803182
Disciplines
Computer Sciences
Keywords
Image analysis; Computer graphics