Hacker News
Is One Layer Enough? A Single Transformer Layer Matches Full-Parameter RL Train
(arxiv.org)
95 points
by tcp_handshaker
6 hours ago
22 comments