{"id":965,"date":"2025-08-14T08:43:14","date_gmt":"2025-08-14T07:43:14","guid":{"rendered":"https:\/\/metrics.blogg.gu.se\/?p=965"},"modified":"2025-08-14T08:43:14","modified_gmt":"2025-08-14T07:43:14","slug":"software-on-demand-from-ides-to-intent","status":"publish","type":"post","link":"https:\/\/metrics.blogg.gu.se\/?p=965","title":{"rendered":"Software on Demand: from IDEs to Intent"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand-1024x683.png\" alt=\"\" class=\"wp-image-966\" srcset=\"https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand-1024x683.png 1024w, https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand-300x200.png 300w, https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand-768x512.png 768w, https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand-1200x800.png 1200w, https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand-1320x880.png 1320w, https:\/\/metrics.blogg.gu.se\/files\/2025\/08\/software_on_demand.png 1536w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/a><\/figure>\n\n\n\n<p>OpenAI\u2019s latest keynote put one idea forward: coding is shifting from writing <strong>lines<\/strong> to expressing <strong>intent<\/strong>. With GPT-5\u2019s push into agentic workflows\u2014and concrete coding gains on benchmarks like SWE-bench Verified\u2014the \u201csoftware on demand\u201d era is no longer speculative. You describe behavior; an agent plans, scaffolds, implements, runs tests, and iterates. Humans stay in the loop as product owners and reviewers. <\/p>\n\n\n\n<p>What\u2019s different now isn\u2019t just better autocomplete. 
OpenAI\u2019s platform updates (Responses API + agent tooling) are standardizing how models call tools, navigate repos, and execute tasks, turning LLMs into reliable collaborators rather than clever chatbots. The keynote storyline mirrored what many teams are seeing: agents that can reason across files, run tests, and honor constraints\u2014then explain their choices.<\/p>\n\n\n\n<p>There\u2019s still daylight between today\u2019s agents and fully autonomous engineers\u2014OpenAI itself acknowledged the limits\u2014but the arc is clear. In the near term, expect product teams to specify features as executable specs: a prompt plus acceptance tests. Agents draft code; CI catches regressions; humans approve merges. The payoff is faster iteration and broader access: more people can \u201cprogram\u201d without memorizing frameworks, while specialists curate architecture, performance, and safety. <a href=\"https:\/\/www.theguardian.com\/technology\/2025\/aug\/07\/openai-chatgpt-upgrade-big-step-forward-human-jobs-gpt-5?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">The Guardian<\/a><\/p>\n\n\n\n<p>If you\u2019re experimenting, start small: encode user stories as tests, let an agent propose patches, and gate everything behind your normal review. The orgs that win won\u2019t be the ones that replace engineers\u2014they\u2019ll be the ones that <strong>instrument<\/strong> intent, tests, and guardrails so agents can ship value on demand.<\/p>\n\n\n\n<p>I&#8217;m already past that &#8211; experimenting broadly with prompts that write requirements and LLMs that use design patterns, and developing add-ins for Visual Studio to make these tools available. 
<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Suggested research &amp; resources<\/h2>\n\n\n\n<ul>\n<li><strong>SWE-bench &amp; SWE-bench Verified<\/strong> \u2013 Real-world GitHub issue benchmark (plus a human-validated subset) used to measure end-to-end software fixing by LLMs\/agents. Great for evaluating \u201csoftware on demand\u201d claims. <a href=\"https:\/\/arxiv.org\/abs\/2310.06770?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">arXiv<\/a>, <a href=\"https:\/\/openai.com\/index\/introducing-swe-bench-verified\/?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI<\/a><\/li>\n\n\n\n<li><strong>SWE-agent (NeurIPS 2024)<\/strong> \u2013 Shows that agent-computer interfaces (file navigation, test execution) dramatically improve automated software engineering. Useful design patterns for your own agents. <a href=\"https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2024\/hash\/5a7c947568c1b1328ccc5230172e1e7c-Abstract-Conference.html?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">proceedings.neurips.cc<\/a>, <a href=\"https:\/\/arxiv.org\/abs\/2405.15793?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">arXiv<\/a><\/li>\n\n\n\n<li><strong>AutoDev (Microsoft, 2024)<\/strong> \u2013 Framework for autonomous planning\/execution over repos, with strong results on code and test generation; a good reference for multi-tool agent loops. 
<a href=\"https:\/\/arxiv.org\/html\/2403.08299v1?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">arXiv<\/a>, <a href=\"https:\/\/visualstudiomagazine.com\/articles\/2024\/03\/20\/autodev.aspx?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">Visual Studio Magazine<\/a><\/li>\n\n\n\n<li><strong>OpenAI: New tools for building agents (2025)<\/strong> \u2013 Overview of the Responses API and how to wire tools\/function-calling for robust agent behavior. <a href=\"https:\/\/openai.com\/index\/new-tools-for-building-agents\/?utm_source=chatgpt.com\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI\u2019s latest keynote put one idea forward: coding is shifting from writing lines to expressing intent. With GPT-5\u2019s push into agentic workflows\u2014and concrete coding gains on benchmarks like SWE-bench Verified\u2014the \u201csoftware on demand\u201d era is no longer speculative. You describe behavior; an agent plans, scaffolds, implements, runs tests, and iterates. 
Humans stay in the loop &hellip; <a href=\"https:\/\/metrics.blogg.gu.se\/?p=965\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Software on Demand: from IDEs to Intent&#8221;<\/span><\/a><\/p>\n","protected":false},"author":68,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"_links":{"self":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts\/965"}],"collection":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/users\/68"}],"replies":[{"embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=965"}],"version-history":[{"count":1,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts\/965\/revisions"}],"predecessor-version":[{"id":967,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts\/965\/revisions\/967"}],"wp:attachment":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=965"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=965"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=965"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}