I completely ignored Anthropic’s advice and wrote a more elaborate test prompt based on a use case I’m familiar with, so that I can audit the agent’s code quality. In 2021, I wrote a script to scrape metadata for the videos on a given channel using YouTube’s Data API, but the API is poorly and counterintuitively documented and my Python scripts aren’t great. I subscribe to the SiIvagunner YouTube account which, as a part of the channel’s gimmick (musical swaps with different melodies than the ones expected), posts hundreds of videos per month with nondescript thumbnails and titles, so view counts are the only obvious signal of which videos are the best. The video metadata could be used to surface good videos I missed, so I had a fun idea to test Opus 4.5:
September 2025: I added the Dreame Aqua10 Ultra Roller as the best robot vacuum for pet hair on carpet and shifted the Roborock Saros 10R (previously named the best robot vacuum overall) to the best robot vacuum for pet hair on hard floors.
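In semiconductors, the scandium shortage is equally worrying. Dylan Patel, founder and CEO of the research firm SemiAnalysis, notes that US chipmakers’ scandium inventories are running low, which could jeopardize production of next-generation 5G chips.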