ORPO: Preference Optimization without the Supervised Fine-tuning (SFT) Step
Cornell University Researchers Introduce Reinforcement Learning for Consistency Models for Efficient Training and Inference in Text-to-Image Generation
Memory and new controls for ChatGPT
Knowledge Bases for Amazon Bedrock now supports metadata filtering to improve retrieval accuracy
Reinforcement Learning: Introduction and Main Concepts, by Vyacheslav Efimov (Apr 2024)
Meet Sailor: A Family of Open Language Models Ranging from 0.5B to 7B Parameters for Southeast Asian (SEA) Languages