AI Applications
How can we define an AI application? Simply put, it is an application that uses an LLM to solve a problem or complete a task. Classic examples are a chatbot or an on-call assistant.
AI has many compelling use cases and enormous potential to streamline processes within a company, but it is important to choose the right approach when building these applications. Here we look at two of the many ways to develop an AI application.
The first approach relies on complex frameworks such as LangChain, where the entire application depends on LLM calls from input to output. We can compare this to a black box: the developer does not see how the framework works in detail, only that some input produces some output. The second approach uses a minimal wrapper and makes simple LLM API calls only where they are needed, giving the developer much greater insight into how the application runs. In this blog post, we compare these two approaches and their advantages and disadvantages.
Complex AI/LLM Frameworks
As mentioned, the kind of framework we're talking about here is exemplified by LangChain, an open-source library that has become very popular for writing AI applications. It supports many LLM providers, such as OpenAI, Anthropic, and Gemini, and works as a wrapper over the providers' raw API calls. This means it completely separates the application developer from details such as individual API calls and prompt construction, creating an abstraction layer. It provides ready-made components such as LLM agents and memory management, and allows creating so-called chains: sequences of steps that lead to some result. For example:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
This chain implements RAG, answering user questions by searching for information in a vector database. A fairly complex system like RAG can clearly be assembled in just a few lines (in a real application this piece of code alone is not enough, but even then it is still only ~50 lines; the full example can be found here), which is a huge advantage when building PoC (Proof of Concept) applications. It may, however, be less suitable for building stable production applications.
All these components allow AI applications to be created quickly. But there is one fundamental problem: the abstraction layer acts as a black box. We cannot see down to the lowest level of how a LangChain application actually works, which makes it very difficult to debug and trace problems in a production environment. Finding problems is especially important in AI applications because of the nature of LLMs: they are unpredictable, and given the same input they will likely produce a different output each time. This makes the application non-deterministic, which leads to undesirable behavior from the end customer's perspective. We cannot avoid this fact, but complex frameworks like LangChain only worsen the situation with their large abstraction layer.
API Calls
This approach to building an application is the opposite of using frameworks: it relies on nearly raw API calls, with only a very small, lightweight wrapper around them. An example call to an OpenAI LLM might look like this:
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Write a one-sentence bedtime story about a unicorn.",
)
At first glance, using only raw API calls may seem laborious and tedious. We thought the same when we started developing AI applications, but this approach has proven itself to us. With these simple calls, only a very small abstraction sits over the LLM itself. We also have much greater control over the LLM calls, which lets us build more predictable applications; as a result, far fewer undesirable situations and bugs occur. Tracing and debugging work the same as in regular applications, and we have full visibility into the entire runtime of the application rather than working with a black box as with LangChain.
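As a sketch of what that looks like in practice, an LLM call can be wrapped with ordinary logging like any other external dependency (the `traced` decorator and `classify` function below are illustrative, not part of any SDK):

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm")

def traced(fn):
    """Log the latency of any callable, LLM call or not."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        logger.info("%s took %.2fs", fn.__name__, time.perf_counter() - start)
        return result
    return wrapper

@traced
def classify(text: str) -> str:
    # In a real application this body would be a raw
    # client.responses.create(...) call.
    return "severity=2"
```

Because nothing is hidden behind a framework, the same decorators, breakpoints, and stack traces work for LLM calls as for the rest of the codebase.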
Another advantage of this approach is the ability to use LLM calls only where they are truly needed; not every part of an application needs AI. As mentioned, LLMs are inherently unpredictable, so we want to involve them only where it is actually necessary. Take searching through JSON files: if LangChain were used and a JSON object appeared mid-application, say from an API call, LangChain would send that JSON to the LLM to find some needed key. Yet standard JSON processing would suffice: load it into memory and look up the key without involving an LLM at all. The same applies to LLM tool calls.
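For instance, picking a key out of a JSON payload needs no LLM at all; the payload and keys below are made up for illustration:

```python
import json

# Hypothetical response from an upstream API call mid-application.
payload = '{"incident": {"id": "INC-42", "service": "billing", "severity": 2}}'

# Standard JSON processing: deterministic, instant, and free.
data = json.loads(payload)
severity = data["incident"]["severity"]
print(severity)  # → 2
```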
A drawback of this approach is that switching LLM providers can be more difficult, since their APIs differ slightly. This is easily mitigated by writing a very simple wrapper over the calls, since conceptually all LLM providers work with the same input; it is just defined differently.
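Such a wrapper can stay very small. A minimal sketch, assuming a hypothetical `Backend` protocol (the names `LLMClient` and `OpenAIBackend` are illustrative; other providers would get their own backend class):

```python
from dataclasses import dataclass
from typing import Protocol

class Backend(Protocol):
    """Conceptually, every provider takes a model and a prompt and returns text."""
    def complete(self, model: str, prompt: str) -> str: ...

class OpenAIBackend:
    def complete(self, model: str, prompt: str) -> str:
        # Mirrors the raw call shown earlier; only the request shape is provider-specific.
        from openai import OpenAI
        response = OpenAI().responses.create(model=model, input=prompt)
        return response.output_text

@dataclass
class LLMClient:
    backend: Backend
    model: str

    def ask(self, prompt: str) -> str:
        # Switching providers means swapping the backend, nothing else.
        return self.backend.complete(self.model, prompt)
```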
Structured Output from an LLM Model
The standard output of an LLM API call is unstructured. We can compare it to an answer from a person: the response to the same question will likely never be the same twice, unless it is a yes/no question. This is why structured output is used. It lets you define the data structure that the LLM must return, typically as a JSON schema. This feature brings a bit more predictability to our AI applications.
The OpenAI Python client cleverly uses the Pydantic library, which is commonly used for defining types in Python. Instead of writing out a complex JSON schema by hand, it is enough to create a model and use it in the LLM call. Pydantic also supports descriptions on individual fields, which are then carried into the structured output definition. For example:
from pydantic import Field

class InitialResponseSchema(ChoiceModel):
    """
    Entrypoint to the request chain from the user.
    """

    service: Service = Field(description="Service the on call issue is about")
    severity: int = Field(
        description="The severity of the request (1=high, 2=medium, 3=low)."
    )

response = client.responses.parse(
    model="gpt-5",
    input="Classify the request",
    text_format=InitialResponseSchema,
)
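Under the hood, the field descriptions end up in a JSON schema that is sent to the provider. Pydantic can show that schema directly; the `Triage` model below is a simplified stand-in assuming a plain `BaseModel` base:

```python
from pydantic import BaseModel, Field

class Triage(BaseModel):
    severity: int = Field(
        description="The severity of the request (1=high, 2=medium, 3=low)."
    )

# The JSON schema the structured-output definition is built from.
schema = Triage.model_json_schema()
print(schema["properties"]["severity"])
```

The field type and description appear in the schema, so the LLM is constrained to return an integer and sees the severity scale we documented.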
Conclusion
This blog post described two ways to write AI applications, along with their advantages and disadvantages. For our purposes, we chose the raw API calls approach, as it gives us greater stability, predictability, and easier debugging.