
Bio Generator and Image Captioning

Writer: Temitayo Shorunke

This week, I focused on two parts of our capstone project: developing a bio generator and generating image captions for training data. Initially, I used a pre-trained GPT-2 model from the transformers library to create bios. Despite extensive fine-tuning, the model struggled with consistency and accuracy, producing repetitive and insufficiently diverse outputs. Recognizing these limitations, I pivoted to a new approach: a random bio generator. This script draws from extensive lists of gender-specific first names, last names, occupations, hometowns, and other details. It parses user prompts to determine gender and age group, ensuring diverse and realistic bios. This solution proved reliable and flexible, meeting our immediate needs effectively.
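The template-based approach can be sketched in plain Python. This is a minimal illustration, not the project's actual script: the name, occupation, and hometown lists here are placeholder stand-ins, and the keyword-based prompt parsing is one simple way to pick out gender and age group.

```python
import random

# Illustrative stand-in data; the real generator uses much larger lists.
FIRST_NAMES = {
    "male": ["James", "Daniel", "Tunde"],
    "female": ["Amara", "Sophia", "Yemi"],
}
LAST_NAMES = ["Okafor", "Nguyen", "Smith"]
OCCUPATIONS = ["graphic designer", "nurse", "software engineer"]
HOMETOWNS = ["Lagos", "Austin", "Toronto"]
AGE_RANGES = {"young": (18, 30), "middle-aged": (31, 55), "older": (56, 80)}

def parse_prompt(prompt):
    """Pick gender and age group out of a free-text prompt via keywords."""
    text = prompt.lower()
    # Check "female"/"woman" first, since "male" is a substring of "female".
    gender = "female" if ("female" in text or "woman" in text) else "male"
    age_group = next((g for g in AGE_RANGES if g in text), "young")
    return gender, age_group

def generate_bio(prompt):
    """Assemble one bio sentence from randomly chosen list entries."""
    gender, age_group = parse_prompt(prompt)
    low, high = AGE_RANGES[age_group]
    first = random.choice(FIRST_NAMES[gender])
    last = random.choice(LAST_NAMES)
    return (f"{first} {last} is a {random.randint(low, high)}-year-old "
            f"{random.choice(OCCUPATIONS)} from {random.choice(HOMETOWNS)}.")
```

Because the output is assembled from curated lists rather than sampled from a language model, every bio is well-formed by construction, which is what made this approach more reliable than the fine-tuned GPT-2.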


Simultaneously, I worked on an image captioning script for FaceCraft, crucial for generating training data for the GAN networks. The script uses the VisionEncoderDecoderModel from Hugging Face's transformers library, specifically the "nlpconnect/vit-gpt2-image-captioning" model, to generate a caption for each image, which is vital for understanding and labeling the training data. Although the script ran efficiently, the generated captions were not as detailed as expected, often lacking the specificity and richness needed for high-quality training data. This shortcoming led me to consider alternative methods for improving the detail and accuracy of the image captions.
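The core of such a captioning script is short. Below is a minimal sketch of the standard usage pattern for this model, assuming transformers, torch, and Pillow are installed and the model weights can be downloaded from the Hugging Face Hub; the generation settings (beam count, max length) are illustrative choices, not the project's exact configuration.

```python
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

MODEL_ID = "nlpconnect/vit-gpt2-image-captioning"

# Load the ViT encoder + GPT-2 decoder, the image preprocessor, and the tokenizer.
model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
processor = ViTImageProcessor.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

def caption_image(path):
    """Return a single generated caption string for the image at `path`."""
    image = Image.open(path).convert("RGB")
    # The processor resizes/normalizes the image into ViT's expected tensor shape.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # Beam search tends to give slightly more coherent captions than greedy decoding.
    output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

Since the caption length is capped and the model was trained on generic image-caption pairs, its descriptions stay short and general, which matches the lack of detail observed above.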


Overall, this week’s work highlighted the importance of flexibility and adaptation in project development. The random bio generator provided a practical solution, while the image captioning script laid the groundwork for further improvements in FaceCraft.


©2022 by Temitayo's Portfolio
