Current thread
Language-conditioned control without retraining.
I’m exploring RL agents that can adapt their behaviour to natural-language goals without learning a fresh policy for every task variant. The motivating question is how to build agents that are more flexible at inference time while still relying on stable training pipelines.
The work combines hybrid control ideas, learned language representations, and a practical concern for what a small team can actually train, inspect, and iterate on.
Questions I care about
- RL architectures that separate reusable competence from task-specific objectives
- Evaluation setups that make behavioural changes obvious and measurable
- Tooling that shortens the loop between experiment design and implementation
- Interfaces between language signals and continuous-control policies