All images on this site were generated by a machine learning model.
I claim no legal ownership or rights to any of the images generated by this model. However, all silhouettes and logos depicted are property of their respective owners.
Technical Details
Model
- NVIDIA’s StyleGAN2-ADA architecture.
- Increased feature maps for both generator and discriminator, inspired by l4rz (NSFW) and Gwern. As noted by l4rz, increasing high-resolution feature maps greatly increases quality in images with fine details. Moreover, mixed-precision training makes these high-resolution layers cheap in compute and memory.
- Disabled path-length regularization.
- Gamma (R1-regularization) lowered from 10 to 1.
- Minibatch standard deviation increased to 20.
- Disabled style mixing.
- Model trained with inverted colors (see “General notes on training”).
- Custom non-square resolution output images (384w * 512h). Code adapted from: eps696.
- Model trained for about 20 days on an RTX 3090 GPU.
- “Future” style sneakers were made by fine-tuning the model on a curated subset of sneakers.
- Color sliders were made using SeFa.
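SeFa finds editing directions in closed form: they are the eigenvectors of AᵀA with the largest eigenvalues, where A is the weight matrix of the layer that first transforms the latent code. The block below is a minimal NumPy sketch of that idea, not the code used for this site. The function name `sefa_directions`, the toy weight matrix, and the slider value `alpha` are illustrative assumptions.

```python
import numpy as np


def sefa_directions(weight: np.ndarray, k: int):
    """Return the top-k unit directions n that maximize ||weight @ n||^2.

    weight has shape (out_features, latent_dim); each returned direction
    has shape (latent_dim,).
    """
    gram = weight.T @ weight                  # symmetric (latent_dim, latent_dim)
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    return eigvecs[:, order].T, eigvals[order]


# Toy stand-in for a real layer weight (assumption: a trained model would
# supply its own style/affine weights here).
weight = np.diag([3.0, 1.0, 2.0])
directions, strengths = sefa_directions(weight, k=2)

# A slider then moves a latent code along one of the directions.
latent = np.zeros(3)
alpha = 2.0                                   # hypothetical slider value
edited = latent + alpha * directions[0]
```

Which direction ends up controlling color has to be found by inspecting the generated images; the factorization only ranks directions by how strongly they change the layer's output.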
Dataset
- Training dataset consists of ~50,000 images of sneakers scraped from web shops and sneaker marketplaces.
- All images were standardized using simple image manipulation scripts built on Pillow and OpenCV.
- Silhouettes with large numbers of colorways (Vans, Nike Jordan 1) were partially filtered out.
- Boring sneakers were largely filtered out.
- Many additional interesting sneakers were found by browsing old web shops through the Wayback Machine. These searches surfaced sneakers from short-lived trends of some years ago.
- Weird and fun sneaker brands were found by following lookatsangi on Instagram.
- A painful amount of dataset filtering was done manually.
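The exact standardization steps are not described here. One plausible version, sketched with Pillow, assumes each image is scaled to fit the 384 × 512 output resolution and centered on a white canvas. `fit_box` and `standardize` are hypothetical helper names, and the padding strategy is a guess rather than the pipeline actually used.

```python
from PIL import Image

TARGET_W, TARGET_H = 384, 512  # output resolution listed under "Model"


def fit_box(w, h, target_w=TARGET_W, target_h=TARGET_H):
    """Size and offset that fit a w x h image inside the target, centered."""
    scale = min(target_w / w, target_h / h)
    new_w = max(1, round(w * scale))
    new_h = max(1, round(h * scale))
    off_x = (target_w - new_w) // 2
    off_y = (target_h - new_h) // 2
    return new_w, new_h, off_x, off_y


def standardize(img):
    """Resize without distortion and pad to TARGET_W x TARGET_H with white."""
    img = img.convert("RGB")
    new_w, new_h, off_x, off_y = fit_box(*img.size)
    resized = img.resize((new_w, new_h), Image.LANCZOS)
    canvas = Image.new("RGB", (TARGET_W, TARGET_H), (255, 255, 255))
    canvas.paste(resized, (off_x, off_y))
    return canvas


# Example with a synthetic black 1000 x 500 image in place of a scraped photo.
sample = Image.new("RGB", (1000, 500), (0, 0, 0))
result = standardize(sample)
```

Padding instead of stretching keeps every silhouette at its original aspect ratio, at the cost of some unused canvas.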
General notes on training
- White backgrounds on images appear to destabilize training. I am not completely sure why this happens, but it seems to happen to other people as well (https://github.com/NVlabs/stylegan2-ada-pytorch/issues/157). The problem seems to appear when augmentations are enabled. Making backgrounds black via naive background removal caused artifacts, and out-of-the-box background removal tools worked poorly on these specific images. It took me a surprisingly long time to arrive at the fix that solved all of these issues: simply inverting the colors of the images for training, and inverting them back after generation.
- Lowering gamma and disabling path-length regularization slightly decreased the perceptual quality of images but improved their variety. For the intended purpose of this site, trading some quality for variety was well worth it.
- Achieving both varied and high-quality results on sneaker images using GANs is surprisingly difficult.
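The color inversion trick is simple enough to sketch in a few lines. This is an illustration with NumPy on 8-bit image arrays, not the code used for this site; `invert_colors` is a hypothetical helper name. Because inversion is its own inverse, the same function turns white backgrounds black before training and restores the original colors after generation.

```python
import numpy as np


def invert_colors(image: np.ndarray) -> np.ndarray:
    """Invert an 8-bit image: white (255) becomes black (0) and vice versa."""
    if image.dtype != np.uint8:
        raise TypeError("expected an 8-bit (uint8) image array")
    return 255 - image


# A plain white 384 x 512 (width x height) image stands in for a product photo.
white_background = np.full((512, 384, 3), 255, dtype=np.uint8)
training_input = invert_colors(white_background)  # background is now black
restored = invert_colors(training_input)          # identical to the original
```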
Credits
Special thanks:
- Obormot for the JavaScript template.
- Gwern, Arfa, l4rz, and Aydao for inspiration and their technical writeups.
- The wonderful people at Cognito for helping me with the website setup and design.
- Logo made using https://pixel-me.tokyo/en/.
- Nearcyan for allowing me to use his code from TADNE.
- DiffManagement for technical support and guidance.