Preference optimization, instruction following, multimodal alignment, evaluation loops, and data recipes that make model behavior measurable.
Research Console / Personal OS Hub
Winston / Why Not Sleep
Why Not Sleep is the public routing layer for long-form engineering work, playable experiments, project notes, and personal archives. It favors durable links, reproducible context, and finished artifacts over feed-shaped noise.
- LLM/MM post-training
- Agentic RL
- Generative search ads
- Recommendation systems
- Static publishing
- Browser games
About
Large-model systems, multimodal post-training, and agentic optimization.
Winston is a large-model and multimodal algorithm engineer focused on LLM/MM post-training, agentic reinforcement learning, and generative search advertising and recommendation systems. This site is the public index for technical writing, games, project notes, design studies, manuscripts, and slower personal records.
Training and evaluation patterns for agents that plan, use tools, recover from errors, and improve through interaction instead of static prompting alone.
Retrieval, ranking, generation, auction-aware objectives, user intent modeling, and feedback systems for search and recommendation surfaces.
Channel Index
Seven channel routes, one durable front door.
Current Shape
Artifacts first. Claims later.
Reserved subdomains