Can You Trust GPT with Your System Design? Testing AI’s Architectural IQ

Image by Vinson Tan ( 楊 祖 武 ) from Pixabay

https://ieeexplore.ieee.org/document/10978937

We’ve all seen Large Language Models (LLMs) write impressive snippets of code or debug a tricky function. But can an AI actually understand the soul of a system? Can it explain the “why” behind a complex architectural decision?

The paper, “Do Large Language Models Contain Software Architectural Knowledge? An Exploratory Case Study with GPT,” puts this to the test. Researchers ran a case study with 14 software engineers to see if GPT could navigate the Architectural Knowledge (AK) of a massive, real-world system: the Hadoop Distributed File System (HDFS).

The Experiment: AI vs. The Ground Truth
Engineers grilled GPT with questions ranging from basic component identification to deep design rationales. GPT’s answers were then compared against a verified “ground truth” drawn from HDFS documentation.

The Results
The study revealed a fascinating dichotomy in GPT’s performance:

  • Recall was moderate: GPT is surprisingly good at “remembering” things. It could often identify the correct architectural components and general concepts buried in its training data.
  • Precision was low: GPT struggled with accuracy, frequently providing answers that sounded authoritative but were technically incorrect or “hallucinated.”

When asked about design rationales (why a specific solution was chosen) or quality attribute solutions, GPT’s performance dipped significantly. It can tell you what is there, but it struggles to explain the engineering trade-offs.

The Takeaway for Architects
The engineers in the study rated GPT’s trustworthiness as only moderate. The verdict is clear: GPT is a fantastic tool for initial discovery and brainstorming, but it cannot be used as a source of truth for critical system design.

The Bottom Line: Treat LLMs as junior architects with a photographic memory but a shaky grasp of logic. They are great for a first draft, but expert human validation remains the most important step in the process.

GenAI: The Architect’s New Brainstorming Buddy, Not a Replacement

https://ieeexplore.ieee.org/document/11015085

For years, software architects have operated in an “automation gap.” While developers enjoy robust CI/CD pipelines and automated testing, architects have largely relied on manual whiteboarding and expert intuition. With the rise of Generative AI (GenAI), many wonder: Is the gap finally closing?

In this paper, researchers provide a reality check. Their verdict? GenAI is a powerful “tutor” and “brainstormer,” but it isn’t ready to take the captain’s chair.

Where GenAI Shines

The study identifies a high “GenAI Fit” for creative, open-ended tasks. It excels at:

  • Brainstorming: Identifying potential stakeholders or generating design alternatives.
  • Drafting: Creating well-formed Architecturally Significant Requirements (ASRs) from raw notes.
  • Summarization: Condensing complex documentation into digestible views.

Where GenAI Falls Short

However, the “gap” remains for high-fidelity tasks. GenAI struggles with objective analysis. It can’t reliably prioritize requirements, verify the correctness of architectural views, or resolve conflicting design decisions. These tasks require the subjective judgment and deep organizational context that only a human architect possesses.

The Future: Hybrid Workflows

The path forward isn’t replacing architects with bots; it’s about hybrid workflows. By pairing GenAI with traditional tools (like static analyzers) to fact-check its “hallucinations,” we can finally automate the tedious parts of architecting while leaving the critical, high-stakes decisions to the experts.
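One way to picture such a hybrid workflow, purely as a sketch: treat the GenAI draft as a set of claimed facts and diff it against what a static analyzer actually extracted. The function, the “caller->callee” claim format, and the example sets below are all invented for illustration; they are not from the paper.

```python
def fact_check(genai_claims, analyzer_facts):
    """Cross-check architectural claims from a GenAI draft (e.g. a
    claimed service dependency) against ground truth extracted by a
    traditional static analyzer. Returns three buckets."""
    confirmed = genai_claims & analyzer_facts      # safe to keep
    hallucinated = genai_claims - analyzer_facts   # needs human review
    missed = analyzer_facts - genai_claims         # draft is incomplete
    return confirmed, hallucinated, missed

# Hypothetical dependency claims, written as "caller->callee" strings
draft = {"order->payment", "order->shipping", "ui->billing"}
facts = {"order->payment", "order->shipping", "order->inventory"}
confirmed, hallucinated, missed = fact_check(draft, facts)
print(sorted(hallucinated))  # ['ui->billing'] -- the draft's invented dependency
```

The point of the shape, not the code: hallucinations become a reviewable diff instead of something a human has to spot by reading prose.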

The Bottom Line: Use GenAI to widen your perspective and draft your docs, but keep your hands on the wheel when it comes to the “why” behind your system.

Is Your Microservice Architecture Causing Heartburn? The Cost of Static Chaos on Runtime Speed

Image by stux from Pixabay

https://cs.gssi.it/catia.trubiani/download/2025-ICSA-Correlation-Architecture-Performance-Antipatterns.pdf

In the world of microservices, we often chase the dream of independent deployment, rapid scaling, and resilient services. We focus on the dynamic—the Kubernetes pods autoscaling, the latency spikes, the load balancer metrics. We assume that if we have a robust runtime, our architecture is sound.

But this study suggests we have been ignoring a crucial connection. We are too often treating the symptoms, not the disease.

The research team, using the massive Train Ticket benchmark system, decided to prove something architects have suspected for years: The way you draw your boxes and arrows directly dictates your application’s carbon footprint and response time.

They didn’t just guess; they used advanced tooling to quantify the chaos. By combining service call dependency mapping with Design Structure Matrices (DSM) that also tracked subtle entity-sharing (services talking behind each other’s backs via a shared database), they revealed invisible architectural decay. They matched static Architecture Antipatterns (e.g., “Cliques”—tightly clustered groups that must change together) against dynamic Performance Antipatterns (e.g., “Blobs”—services that become bottlenecks).
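As a rough illustration of the static side, clique-like clusters can be spotted in a service call graph by finding strongly connected components: groups whose members all (transitively) call each other and therefore tend to change and deploy together. The graph and service names below are hypothetical, and this tiny Tarjan-style sketch stands in for the paper’s much richer DSM tooling.

```python
def find_cliques(calls):
    """Return strongly connected components with more than one service
    in a call-dependency graph (dict: service -> list of callees).
    Uses Tarjan's algorithm; recursion is fine at sketch scale."""
    index, low, stack, on_stack, sccs = {}, {}, [], set(), []
    counter = [0]

    def dfs(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in calls.get(v, ()):
            if w not in index:
                dfs(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    nodes = set(calls) | {w for ws in calls.values() for w in ws}
    for v in nodes:
        if v not in index:
            dfs(v)
    return [c for c in sccs if len(c) > 1]

# Hypothetical call graph: order, payment and inventory call each other
calls = {
    "order": ["payment", "inventory"],
    "payment": ["order"],
    "inventory": ["order"],
    "ui": ["order"],
}
print(find_cliques(calls))  # one cluster: order, payment, inventory together
```

Everything inside such a cluster is one deployment unit in disguise, however many repositories it is split across.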

The results are a wake-up call for any DevOps team trying to scale a legacy monolith that’s masquerading as microservices.

A Roadmap to Technical Debt Management
The impact on practice is clear. This study validates that we must merge static and dynamic analysis. We cannot separate the “Dev” and “Ops.”

Stop Guessing: You cannot optimize what you cannot measure. Utilize tooling that visualizes both runtime traffic and structural dependencies.

Prioritize Refactoring: Performance monitoring based on real operational profiles tells you where the bottleneck is. Combining this with architecture analysis tells you why it is there and which structural repair will deliver the greatest performance ROI.

Green Your Code: Every redundant service call, every unneeded database join, and every “Chatty Service” antipattern is wasted energy. Good architecture is sustainable architecture.
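The “where plus why” idea in the list above can be caricatured in a few lines: rank services by the product of a structural signal (fan-in from the dependency matrix) and a runtime signal (observed p95 latency). The metric, the numbers, and the service names are invented for illustration; real prioritization needs the full DSM and operational profiles the study describes.

```python
def refactoring_priority(fan_in, p95_latency_ms):
    """Toy ROI score: services that are both heavily depended upon
    (structural) and slow at runtime (dynamic) float to the top."""
    scores = {
        svc: fan_in.get(svc, 0) * p95_latency_ms.get(svc, 0.0)
        for svc in set(fan_in) | set(p95_latency_ms)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical measurements
fan_in = {"order": 7, "payment": 3, "ui": 0}
p95_latency_ms = {"order": 480.0, "payment": 120.0, "ui": 90.0}
for svc, score in refactoring_priority(fan_in, p95_latency_ms):
    print(svc, score)  # "order" ranks first: high coupling AND high latency
```

A service that is slow but depended on by nobody is a local problem; a service that is slow and sits in everyone’s call path is an architectural one.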

It’s time to stop thinking that Kubernetes will save your tangled architecture. The next time you see a latency spike, don’t just add more pods. Check your blueprints. The fastest system is one that doesn’t have to do unnecessary work.