CS492(C): Diffusion and Flow Models
Minhyuk Sung, KAIST, Fall 2025
Visual Generation Contest
Submission Due: December 6 (Saturday), 23:59 KST
Where to submit: KLMS
Presentation Session 1: December 8 (Monday), 10:30 a.m. - 11:50 a.m. KST
Presentation Session 2: December 10 (Wednesday), 10:30 a.m. - 11:50 a.m. KST
What to Do
In this contest, you will explore recent advances in image generative models and unleash their capabilities to create visual content in various forms. Specifically, your task is to produce any form of visual content, including but not limited to:
- 2D images
- Panoramas
- 3D assets
- Mesh textures
- etc.
While topics from lectures, such as text-to-image foundation models, score distillation, and inference-time guidance, provide a solid starting point, you are strongly encouraged to push further by exploring advanced techniques. Remember that this is a visual content creation project, so think carefully about the creative and artistic qualities of your final output, and aim to produce something that captures attention at a glance.
Important Notes
PLEASE READ THE FOLLOWING CAREFULLY! Any violation of the rules or failure to properly cite existing code, models, or papers used in the project in your write-up will result in a zero score.
- For pretrained models, you are required to use only the following two models
- While architectural modifications using ControlNet, LoRA, or additional neural networks for guidance (e.g., pretrained reward models) are permitted, the use of any other pretrained generative models is not allowed. Commercial generative models are also not permitted. Fine-tuning or training a generative model from scratch is allowed.
- You are required to work with the same team members as in the previous project.
- Your final submission must include data such as images or other visual outputs generated using your own code and the designated model.
- DO NOT USE the following (violations will result in a zero score):
  - Any commercial image generators (e.g., ChatGPT, Gemini)
  - Any commercial software
  - Any existing visual assets from online repositories whether free or paid (please check out clarifications below)
  - Closed-source software (e.g., PolyCam, Luma AI)
- Failure to cite existing code, models, and assets used in the project in the write-up will result in a zero score.
Clarifications
You are allowed to use existing visual data only for fine-tuning the backbone model (with or without ControlNet, LoRA, or additional neural networks for guidance). However:
- Only free data may be used. Paid data are not permitted.
- At inference time, you must not use existing data or data generated by other models/services as references or inputs. Final outputs must be generated purely from your trained model.
- You must not paste, trace, or composite any existing data or data generated by other models/services into your final outputs, even if heavily edited.
- All pretrained checkpoints and datasets used must be clearly listed in your report, with links.
What to Submit
- A poster
  - Please use the poster template in the link above.
  - Name the submission file "Team_{TEAM ID}_poster.pdf" (e.g., Team_10_poster.pdf).
  - Format: PDF
  - File size limit: 10 MB
  - If you do not submit your poster by the deadline, you will NOT be allowed to present at the poster session, resulting in a zero score in peer evaluation.
- Source code and data
  - Ensure that your work can be reproduced with the code and data you provide.
  - Code or data that cannot reproduce your results will be treated as not submitted, resulting in a zero score.
- Write-up
  - It is not required to use the template in the link above, but your write-up must include:
    - Project title
    - Teammates' names and student IDs
    - Brief description of the visual content
    - Discussion of technical aspects
    - Discussion of artistic aspects
    - Reproduction steps using your code and data
  - Name the submission file "Team_{TEAM ID}_writeup.pdf" (e.g., Team_10_writeup.pdf).
  - Format: PDF
  - Length limit: Up to 4 pages in A4 size, excluding references.
  - File size limit: 10 MB
  - Properly cite all the code and resources you have used. Missing references will be considered misconduct, resulting in a zero score.
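The naming, format, and size rules above can be checked locally before uploading. The following is only a minimal sketch, not an official checker: the function names `expected_names` and `check_submission` are hypothetical, "10 MB" is assumed to mean 10 * 1024 * 1024 bytes, and the PDF test only looks at the file signature.

```python
from pathlib import Path

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
# The assignment page does not say which convention applies.
MAX_BYTES = 10 * 1024 * 1024


def expected_names(team_id: int) -> list[str]:
    """Build the two PDF file names from the naming rule, e.g. Team_10_poster.pdf."""
    return [f"Team_{team_id}_poster.pdf", f"Team_{team_id}_writeup.pdf"]


def check_submission(folder: str, team_id: int) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name in expected_names(team_id):
        path = Path(folder) / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        if path.stat().st_size > MAX_BYTES:
            problems.append(f"over 10 MB: {name}")
        with path.open("rb") as f:
            # PDF files start with the signature "%PDF-".
            if f.read(5) != b"%PDF-":
                problems.append(f"not a PDF: {name}")
    return problems
```

For example, `check_submission(".", team_id=10)` inspects the current folder for Team 10's poster and write-up. The sketch does not check the page limit or the source code and data.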
Evaluation
Your results will be evaluated based on Technology Score and Creativity Score by the instructor, TAs, and your peer classmates.
- Technology Score [1–5 range]: This indicates the technical novelty and difficulty in creating and rendering the object/scene.
- Creativity Score [1–5 range]: This indicates the originality and artistic value of the rendering outputs.
For both, higher is better, and only integer scores are permitted.
Each individual will assign a Technology Score and a Creativity Score to the results of all classmates. The scores given by each person are normalized across all the results that person scored. Your final score for each criterion is then the average of these normalized scores.
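The exact normalization is not specified here. As an illustration only, the sketch below assumes a per-rater z-score (subtract that rater's mean, divide by that rater's standard deviation) followed by a plain average per team. The function names and the sample ratings are hypothetical, and the procedure actually used in grading may differ.

```python
from statistics import mean, pstdev


def normalize_rater(scores: dict[str, int]) -> dict[str, float]:
    """Z-score one rater's scores across all the teams that rater scored."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A rater who gives every team the same score carries no ranking signal.
        return {team: 0.0 for team in scores}
    return {team: (s - mu) / sigma for team, s in scores.items()}


def final_scores(ratings: dict[str, dict[str, int]]) -> dict[str, float]:
    """Average the normalized scores each team received, for one criterion."""
    per_team: dict[str, list[float]] = {}
    for rater_scores in ratings.values():
        for team, z in normalize_rater(rater_scores).items():
            per_team.setdefault(team, []).append(z)
    return {team: mean(zs) for team, zs in per_team.items()}


# Hypothetical ratings for one criterion: rater -> {team: integer score in 1..5}.
ratings = {
    "rater_a": {"Team_1": 5, "Team_2": 3},  # a lenient rater
    "rater_b": {"Team_1": 2, "Team_2": 1},  # a strict rater
}
result = final_scores(ratings)
```

Under this assumption, a lenient rater and a strict rater who rank the teams the same way contribute equally, since only each rater's relative preferences survive normalization.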
Presentation Sessions
- All sessions are mandatory. Your attendance will be checked at the beginning, middle, and end of each session.
- Your camera and mic must be turned on at all times.
- Find your poster BEFORE the sessions, and check that your camera and mic work and that you can share your screen.
- You will play the role of presenter in one session and questioner in the other. Odd-numbered teams will present on Monday, and even-numbered teams will present on Wednesday.
- Each session will begin with a 1-minute pitch from the presenters. When it's your turn, please follow these steps:
  - Introduce your team (team ID and your names).
  - Share your screen and present your poster in 1 minute.
- In the session in which you present, you must remain on standby at your poster for the entire session and wait for the questioners.
- The instructor will stop by each poster according to the schedule below (subject to change).
- All discussions during the poster sessions must be held in English.
Grading
There are no late days. Submit on time.
Late submission: Zero score.
Missing any of the poster, code/data, or write-up: Zero score.
Wrong format: 10% penalty for each.
Absence (at any time) from the sessions: 20% penalty for each session.
Turning off camera or mic: 20% penalty for each session.
Not staying at the poster during the presentation: 20% penalty.