The Future
Here are some unsorted ramblings on future project ideas floating around in my head. I can't guarantee I'll ever actually implement any of this, so don't take it as a "roadmap". A lot of the loftier goals will run into the hard limit of how much money I have to spend on high-end GPUs, which isn't much. I'm not going to give any references for the jargon; this is more for my own notekeeping than anything.
End Game: a system you give a description of the image (or animation) you want (e.g. "A female cat anthro with extremely large breasts has vaginal sex with a male fox anthro with a large penis drawn cartoony") and it outputs images to that specification. Maybe one could even incorporate the ability to "encode" new characters and situations, use "heatmaps" to sketch what the final result should look like, etc. A lofty goal, definitely not doable unless I had at least some funding, but hypothetically feasible.
-Scale up ERIS-1 to higher resolutions (specifically 128x128) DONE!
-Download and munge the entire e621 dataset
-Create ERIS variants with different architectures. Specifically, I am interested in the effects of significantly deeper networks.
-Try sparse autoencoders and deep belief networks instead of Generative Adversarial Networks
-Train a VGG-19-like network to tag images (see the first sketch after this list)
-Use things learned from previous experiments to create a system that takes a list of tags as input and outputs an image (see the second sketch after this list)
-Train extremely specific networks as feature extractors to use in better generation and classification
-Experiment with attention for image composition
-Create a livestream game where users in chat can feed in tags/suggestions that are used to create images live
-Use similar techniques plus frame motion to generate videos (Most likely not feasible with current technology)
-Incorporate 3D simulation to allow "rotating" of characters, objects and scenes
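For illustration, here is a minimal sketch of what the VGG-19-like tagging network could look like, written in PyTorch. Everything here (the TagNet name, the tag vocabulary size, the layer sizes) is a placeholder assumption, not actual ERIS code; the point is just that tagging is a multi-label problem, so the head uses independent sigmoids with binary cross-entropy rather than a single softmax.

```python
# Hypothetical sketch of a VGG-style multi-label image tagger.
# TagNet, N_TAGS and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

N_TAGS = 1000  # assumed tag vocabulary size


def conv_block(in_ch, out_ch, n_convs):
    """VGG-style block: repeated 3x3 convs followed by 2x2 max pooling."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)


class TagNet(nn.Module):
    def __init__(self, n_tags=N_TAGS):
        super().__init__()
        # Shallower than the real VGG-19, but the same conv/conv/pool pattern.
        self.features = nn.Sequential(
            conv_block(3, 64, 2),
            conv_block(64, 128, 2),
            conv_block(128, 256, 3),
            conv_block(256, 512, 3),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(512, n_tags),  # raw logits, one per tag
        )

    def forward(self, x):
        return self.classifier(self.features(x))


# An image can carry many tags at once, so train with per-tag
# sigmoid + binary cross-entropy instead of a single softmax.
model = TagNet()
criterion = nn.BCEWithLogitsLoss()
images = torch.randn(8, 3, 128, 128)                 # dummy batch at 128x128
targets = torch.randint(0, 2, (8, N_TAGS)).float()   # dummy multi-hot tag vectors
loss = criterion(model(images), targets)
loss.backward()
```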
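And a second sketch, same caveats, of the "tags in, image out" idea framed as a conditional generator: a multi-hot tag vector is concatenated with a noise vector and upsampled to a 128x128 image with transposed convolutions. The names and dimensions (TagConditionedGenerator, Z_DIM) are assumptions for illustration, not the actual plan.

```python
# Hypothetical sketch of a tag-conditioned image generator.
# All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

N_TAGS = 1000   # assumed tag vocabulary size
Z_DIM = 128     # assumed noise dimensionality


class TagConditionedGenerator(nn.Module):
    def __init__(self, n_tags=N_TAGS, z_dim=Z_DIM):
        super().__init__()
        # Project noise + tag condition to a 4x4 feature map, then upsample
        # 4 -> 8 -> 16 -> 32 -> 64 -> 128 pixels.
        self.project = nn.Linear(z_dim + n_tags, 512 * 4 * 4)

        def up(in_ch, out_ch):
            return nn.Sequential(
                nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )

        self.decode = nn.Sequential(
            up(512, 256), up(256, 128), up(128, 64), up(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
            nn.Tanh(),  # images scaled to [-1, 1]
        )

    def forward(self, z, tags):
        h = self.project(torch.cat([z, tags], dim=1))
        return self.decode(h.view(-1, 512, 4, 4))


# Usage: a batch of noise vectors plus multi-hot tag vectors -> 128x128 RGB images.
gen = TagConditionedGenerator()
z = torch.randn(4, Z_DIM)
tags = torch.zeros(4, N_TAGS)
tags[:, [12, 345, 678]] = 1.0   # arbitrary example tag indices
fake = gen(z, tags)             # shape: (4, 3, 128, 128)
```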