By the end of Episode 3 (video), the assistant could act. Tool calls let it look up orders, initiate returns, and create support tickets — real backend operations against Oracle, not simulated responses. But every one of those operations happened synchronously inside a single HTTP request. The chat endpoint called a tool method, that method did the work inline, and the response went back to the caller. All of it blocking, all of it inside the same transaction.
That works for simple demos. It starts breaking when the work is slow, depends on external systems, has multiple steps, or needs to be retried independently of the chat request.
The single most important idea in this episode is direct: the assistant starts workflows. The backend owns workflows.
In Episode 3, the tool was doing everything: validating the request, writing to Oracle, returning a result. If validation failed, the model got a clear error to relay. If it succeeded, the row was written and the request was done. Clean and correct for the demo.
But it ties the chat request tightly to the outcome of the workflow. If the workflow takes five seconds, the user waits five seconds for a reply. If the workflow involves multiple downstream steps, all of them need to complete inside the HTTP timeout. If something fails midway, the tool fails, and the model tries to explain an error that probably makes no sense to a customer.
The event-driven model separates those concerns. The tool’s job is to validate that the request makes sense and publish an event. The consumer’s job is to pick up that event and do the actual work. The user gets a fast response either way.
Figure 1 — The key architectural shift. On the left, the Episode 3 synchronous path: the tool call, validation, Oracle write, and response all happen in the same HTTP thread. On the right, the Episode 4 path: the tool call publishes an event and the response returns immediately at the response boundary. A downstream consumer handles validation and the database write in a separate transaction.
What changes
The architecture from Episodes 1 through 3 carries forward unchanged. The memory advisor, the vector-store advisor, and the Oracle-backed persistence are all still there. The chat client configuration is the same. AgentTools still exposes the same three @Tool methods with the same descriptions.
What changes is what those methods do internally, and what powers the new workflow layer: Oracle TxEventQ.
Oracle TxEventQ supports the Kafka wire protocol, but here it is accessed through JMS via the Oracle AQ JMS starter. From the Spring application’s perspective, the queue looks like any other JMS destination — JmsTemplate for publishing, @JmsListener for consuming. Nothing Kafka-specific in the application code.
The reason this matters: by Episode 4, Oracle is handling relational state, vector retrieval, conversation memory, and event streaming. No additional infrastructure.
Two event types. The constants on the record itself keep string literals out of the rest of the code. @JsonInclude(JsonInclude.Include.NON_NULL) means unused fields are omitted from serialized JSON — a return event does not include issue or priority, a support ticket event does not include reason.
The conversationId field carries the conversation ID from the original chat request through to the consumer. The consumer knows which conversation triggered the workflow. That is useful if the system eventually needs to send a message back into the conversation when work completes.
How the tool changes
The most visible change is in AgentTools. The initiateReturn method went from doing validation and database writes inline to doing a quick existence check and publishing an event:
@Tool(description="Initiate a return for an eligible delivered ShopAssist order after backend validation.")
@Transactional(readOnly=true)
publicStringinitiateReturn(
@ToolParam(description="The ShopAssist order ID, for example ORD-1001.")StringorderId,
@ToolParam(description="The customer's reason for the return.")Stringreason,
return"Order %s was not found, so a return workflow could not be started.".formatted(normalizedOrderId);
}
workflowEventPublisher.publish(newWorkflowEvent(
WorkflowEvent.RETURN_REQUESTED,
UUID.randomUUID(),
Instant.now(clock),
conversationId(toolContext),
normalizedOrderId,
reason.trim(),
null,
null
));
return"Return workflow started for order %s.".formatted(normalizedOrderId);
}
The method is now @Transactional(readOnly = true). It only writes one thing: nothing. It confirms the order exists, publishes a RETURN_REQUESTED event, and returns. The return string is “Return workflow started” rather than “Return initiated” — a deliberate phrasing change that the system prompt picks up on.
The third parameter, ToolContext toolContext, is new. Spring AI passes tool context to any tool method that declares it. AssistantService populates it with the conversation ID at call time:
The interface abstraction means unit tests can inject an in-memory publisher without touching JMS at all. The real implementation serializes the event to JSON, sends it as a JMS text message, and logs the key identifiers.
"Discarding unknown workflow event type eventType={} eventId={}",
event.eventType(),
event.eventId()
);
}
}
@JmsListener and @Transactional together mean the message dequeue and the database write are part of the same transaction. If the database write fails, the message stays on the queue.
The business validation that lived inside AgentTools in Episode 3 has moved to the consumer. handleReturnRequested re-checks order status, the return window, and whether a return is already in progress before calling save():
logger.info("Return workflow updated order state orderId={} status={}",orderId,order.getStatus());
}
The consumer does not trust the event blindly. It re-validates because events can be replayed or arrive out of order. The idempotency check — if the status is already PREPARING_RETURN, log and return without error — means processing the same event twice has no effect.
Figure 2 — The full event-driven flow. A chat request arrives, AgentTools confirms the order exists, JmsWorkflowEventPublisher serializes and sends the event to SHOPASSIST_WORKFLOW_TEQ, and the response returns at that point. Separately, WorkflowEventConsumer picks up the event via @JmsListener, re-validates the business rules, and writes to Oracle. The consumer’s @Transactional covers the dequeue and the database write as one unit.
The queue setup
Oracle TxEventQ is created by a SQL script that runs on every container start:
The EXCEPTION block catches the Oracle error codes for “queue already exists” and “queue table already exists” and silently continues. This makes the script safe to run against an existing volume — the first run creates the queue, every subsequent run does nothing.
The same script grants enqueue and dequeue privileges to the application schema, so the Spring application uses the same database credentials for JMS messaging that it uses for JDBC everywhere else.
The system prompt
The system prompt was updated to reflect the workflow semantics:
app:
assistant:
system-prompt:>
You are ShopAssist, a concise and practical support assistant for a demo
electronics store. Use retrieved policy context when it is available.
Use prior messages only when they are available through the active
conversation ID. Do not invent policy details. Do not invent order
details. Use tools for order status lookup, return initiation, and
support ticket creation. Treat tool results as the source of truth for
business actions and explain validation failures clearly. When a tool
returns a workflow-started message, relay it directly to the user. Do
not imply the action has already completed. Do not invent a workflow
status. If the answer is not grounded in retrieved context, current
conversation history, or tool results, say you do not know. Do not share
memory across conversation IDs.
The critical addition: “When a tool returns a workflow-started message, relay it directly to the user. Do not imply the action has already completed.”
Without that instruction, a model will naturally rephrase “Return workflow started for order ORD-1001” into something like “I’ve initiated your return” — which implies instant completion. That would be inaccurate and would confuse users who check their order status immediately afterward. The prompt constraint prevents it. This is a good example of the system prompt doing coordination work that code cannot easily do.
Startup behaviour
On startup, DataSeeder drains any stale messages from the queue before seeding the demo orders:
logger.info("Drained stale workflow event from {} on startup",queueName);
}
}
Events published in a previous run persist in TxEventQ across container restarts because the queue is backed by Oracle’s durable storage. Draining on startup ensures that old events from a previous demo session do not get processed unexpectedly when the application restarts with freshly seeded data.
Trying it
The same four demo orders from previous episodes are seeded: ORD-1001 (delivered, within the 30-day return window), ORD-1002 (shipped), ORD-1003 (delivered, outside the return window), ORD-1004 (processing).
Return workflow:
curl-s-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"Initiate a return for ORD-1001 because the product was defective."}' | jq
The tool confirms the order exists, publishes a RETURN_REQUESTED event, and returns immediately. In the application logs you will see two lines in quick succession: the publisher logging Published workflow event and the consumer logging Received workflow event, followed by Return workflow updated order state. The response was already back at the client before the consumer finished.
Support ticket workflow:
curl-s-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"Create a high-priority support ticket for ORD-1002 because shipping is stuck."}' | jq
The tool verifies the order exists, publishes a SUPPORT_TICKET_REQUESTED event, and returns. The consumer inserts the ticket row with priority HIGH.
After both requests, the database reflects the results:
SELECT ORDER_ID, STATUS FROM CUSTOMER_ORDER;
ORD-1001 shows PREPARING_RETURN. The other three orders are unchanged.
SELECT TICKET_ID, ORDER_ID, PRIORITY, STATUS FROM SUPPORT_TICKET;
One ticket row for ORD-1002 with priority HIGH and status OPEN.
The important thing to notice: the assistant reported “Return workflow started for ORD-1001” rather than “Return initiated”. The system prompt worked. The model did not imply the return was already complete.
Where things stand
Episode 1 made the assistant knowledgeable. Episode 2 made it remember. Episode 3 made it act. Episode 4 connects those actions to backend workflows.
Figure 3 — Oracle’s role across all four episodes. The relational tables and vector store arrived in Episode 1. Chat memory was added in Episode 2. The support ticket table came in Episode 3. Oracle TxEventQ event streaming arrived in Episode 4. Single database connection pool, no additional infrastructure.
At this point Oracle is the backing store for every layer of the application: relational order data in CUSTOMER_ORDER, vectorized policy documents in the Oracle Vector Store table, conversation history in SPRING_AI_CHAT_MEMORY, support ticket records in SUPPORT_TICKET, and event streaming through SHOPASSIST_WORKFLOW_TEQ. All of it through one database, one connection pool.
The assistant’s role throughout the series has stayed consistent. It retrieves knowledge. It remembers conversations. It initiates actions. It starts workflows. In every case, the backend owns what happens next. That boundary — the model orchestrates, the backend decides and executes — is what makes the system trustworthy rather than unpredictable.
/metrics responds. Prometheus scrapes the target. Grafana shows panels. Maybe oracledb_up is 1 in your deployed version, and database-side signals finally sit near application latency, deployment events, queue behavior, and other service telemetry.
That is a useful milestone.
It is not production readiness.
Productionizing Oracle Database Metrics Exporter means controlling what the exporter can read, who can scrape it, what labels leave the database boundary, and who owns alerts, dashboards, runbooks, upgrades, and readiness review.
A working local scrape proves that Oracle Database Metrics Exporter can connect to a database and expose metrics in a Prometheus-compatible format. Production asks a different set of questions. What database identity does the exporter use? Which database views and tables can that identity read? Where are credentials, connect strings, and wallets stored? Who can scrape /metrics? Which labels leave the database boundary? Are custom SQL metrics safe at production scale? Which alerts page humans? Who owns the exporter when it breaks?
The production model is simple to state and easy to underestimate:
Treat Oracle Database Metrics Exporter as both a privileged Oracle Database client and a private scrape target.
That is the thesis of this article. Productionizing Oracle Database Metrics Exporter is not mainly about getting another container to run. It is about controlling what the exporter can read, who can scrape it, what labels leave the database boundary, and who owns the alerts, dashboards, runbooks, upgrades, and rollback path.
For developers building AI and database-backed applications, this matters because Oracle Database may sit directly in the request path. A RAG service might use vector search, JSON metadata filters, relational joins, queue tables, conversation memory, and audit writes. An agentic workflow system may persist state, tool calls, retries, and human approval steps. An ingestion service may compete with retrieval traffic.
When those systems slow down, database observability becomes part of user experience.
The production controls, however, are not AI-specific. They are the same controls you would want for any database-backed service: least privilege, protected credentials, private scraping, reviewed labels, safe custom SQL, actionable alerts, and owned runbooks.
Oracle Database Metrics Exporter is not trying to be Oracle Enterprise Manager, OCI Database Management, or a commercial APM/database-monitoring suite. Those platforms can provide broader discovery, incident management, tuning workflows, dashboards, and operational governance, depending on deployment, licensing, configuration, and operating model. The exporter’s value is narrower: it brings Oracle Database metrics into Prometheus-compatible pipelines that many developer platforms already operate.
That narrower scope is useful. It also means production readiness is your responsibility.
A few terms before the production review
Before we move into the review path, it helps to fix the vocabulary. The terms are common, but small differences matter in production conversations.
An exporter is a process that collects or queries data from another system and exposes it in a metrics format. In this case, Oracle Database Metrics Exporter connects to Oracle Database and exposes metrics for a scraper.
A scraper is a monitoring component, usually Prometheus or an OpenTelemetry Collector Prometheus receiver, that periodically reads the exporter’s metrics endpoint.
Least privilege means granting only the database and platform access needed for approved metrics, not broad access because it is convenient.
A custom SQL metric is a metric defined by an operator or application team using a SQL query, rather than a default metric shipped with the exporter.
Label cardinality is the number of distinct label values, or label combinations, that a metric can produce.
A runbook is a short operational document that tells responders what an alert means, what to check first, how to reduce impact, and when to escalate.
A production-readiness review is a pre-rollout review that checks identity, grants, secrets, network exposure, metrics, labels, alerts, dashboards, runbooks, ownership, upgrades, and rollback criteria.
Those definitions matter because they keep the review grounded. This is not only a deployment review. It is a trust-boundary review.
Version and naming notes
Verify names, image paths, and tags against the current Oracle documentation before rollout.
At the time of writing (May 2026) the current version of Oracle Database Metrics Exporter is 2.3.1, and Oracle’s installation docs used an image path in the form container-registry.oracle.com/database/observability-exporter:2.3.1.
Treat that as a review point, not as a promise that your environment should use the same version.
Before rollout, verify the current release, image tag, command-line flags, default metrics, and configuration syntax against Oracle’s repository and documentation. Also inspect the runtime --help, the default metrics file for the exact tag, and live /metrics output from the image you deploy.
Understand how production changes the boundary
In local development, the topology is usually compact. One person can understand the whole path.
Developer laptop
├─ Oracle Database or test database
├─ Oracle Database Metrics Exporter
├─ Prometheus
└─ Grafana
That setup is valuable because it proves the mechanics. The exporter can connect to Oracle Database. Prometheus can scrape it. Grafana can query Prometheus. The developer can troubleshoot each hop.
Production changes the boundary.
In local development, /metrics is a convenience. In production, /metrics is an internal data boundary.
In local development, broad test access may be acceptable for a disposable lab. In production, exporter grants must be reviewed metric by metric.
In local development, labels are exploratory. In production, labels affect cost, privacy, retention, remote write, dashboard access, and alert annotations.
In local development, the developer owns everything. In production, ownership must be explicit.
The readiness path is less about one deployment shape and more about a sequence of decisions.
This is a readiness flow, not a universal deployment architecture. It shows the decisions that turn “the demo works” into “this exporter is an operated production component.”
Bring the right context to the review
A production review cannot approve an exporter in the abstract. Reviewers need to know exactly what will run, what it will read, what it will expose, and who will operate it.
Bring the exporter version, image tag, runtime --help output, enabled default metrics, custom SQL metrics, and sample /metrics output from the deployed version. Bring the database topology, required grants, credential path, wallet or TCPS requirements, exporter-to-database network path, and scraper-to-exporter network path. Bring the Prometheus or OpenTelemetry Collector configuration, retention and remote-write destinations, dashboard audience, alert routing, upgrade plan, and rollback plan.
Database topology matters. A deployment may be single-instance, CDB/PDB, RAC, Autonomous Database, managed service, on-premises, cloud-hosted, or a combination. One grant recipe and one network policy rarely cover all of those cases.
The review should include application developers, DBAs, SREs or platform engineers, security reviewers, observability owners, and the product or application owner if alerts imply user impact.
This may sound heavy for a “small exporter.” It is not heavy for a component that connects to Oracle Database, queries database views, exposes database-derived metrics, and feeds alerting systems.
Start with the database identity
The first production decision is the database identity.
The exporter should use a dedicated monitoring identity where possible, not an application schema, personal account, DBA account, or shared administrative user.
That identity matters because the exporter is a database client. Every metric maps to some database query. The grants behind those queries define what database information can leave the database boundary. A dedicated identity simplifies access review, audit, rotation, revocation, and incident response.
Oracle’s exporter docs recommend connecting with the lowest possible privileges and roles necessary for the exporter to run. Oracle Database privilege and role behavior is covered in the Oracle Database Security Guide, and many default metrics query dynamic performance views documented in the Oracle Database Reference.
This is not unique to Oracle Database Metrics Exporter. Datadog, Dynatrace, New Relic, custom SQL jobs, and other Oracle Database monitoring paths also need a database identity, grants, credentials, and network reachability.
The safe question is not “Which tool avoids privileges?” The safe question is “Which metrics are approved, which objects do they require, and who reviews the grants?”
-- Illustrative only: grant only views required by approved metrics.
GRANTSELECTON SYS.GV_$INSTANCE TO exporter_monitor;
GRANTSELECTON SYS.GV_$SESSIONTO exporter_monitor;
This snippet is not a complete grant recipe. It shows the shape of a reviewed monitoring identity. The object list must come from the exact default metrics and custom metrics you approve for your topology.
Test grants using the same connection method and session role behavior the exporter uses. Do not assume a query succeeds for the monitoring user in SQL Developer, SQLcl, or an administrative session in exactly the same way it will succeed inside the exporter.
Oracle’s docs may list broad roles such as SELECT_CATALOG_ROLE as a way to make built-in metrics work. A broad role may be documented as a convenience path, but it is not automatically least privilege. Least privilege means reviewing the enabled metrics and granting only the required access where practical for your environment. If your organization allows SELECT_CATALOG_ROLE for operational reasons, document that as an explicit risk acceptance rather than describing it as least privilege.
Do not assume one grant recipe works for single-instance databases, CDB/PDB deployments, RAC, Autonomous Database, managed database services, and cloud-hosted databases. The topology matters.
Manage credentials and wallets as production secrets
After identity comes credential handling.
Treat exporter credentials with the same seriousness as application database credentials. The secret set may include the username, password, password files if supported by the deployed exporter version, Oracle Wallet or TCPS material where required, connect strings, service names, database endpoints, vault references, and tokens used by the deployment platform.
Kubernetes Secrets are a delivery mechanism, not a complete secret-management solution. They can be part of an approved path, but they do not answer rotation, access review, encryption, redaction, backup, or incident response by themselves. See the Kubernetes Secrets documentation for the platform behavior, then apply your organization’s secret-management policy.
Oracle’s 2.3.1 docs show environment-variable connection settings and a configuration-file model that includes fields such as passwordFile and tnsAdmin. Verify the exact field names, nesting, and supported credential methods for the image tag you deploy.
A conceptual configuration shape might look like this:
# Conceptual pattern only. Verify exact field names, nesting,
# and supported credential methods against the current Oracle docs
The important point is not this exact YAML. The important point is the production habit: do not hard-code passwords in manifests or examples. Do not include real service names, tenant names, wallet paths, or connect descriptors in shared docs. Choose one supported credential path, document it, rotate it, and test exporter behavior during rotation.
If the database requires wallets or TCPS, review filesystem permissions, mount paths, renewal process, wallet distribution, wallet revocation, logging behavior, backup exposure, and incident response process.
If you use a vault integration, verify the exact syntax and support level for the deployed exporter version. Do not assume support for OCI Vault, Azure Key Vault, HashiCorp Vault, Secure External Password Store, or external authentication unless the current docs and your runtime test confirm it.
Also check logs. Exporter logs can be useful during connection failures, but logs may include service names, wallet paths, privilege errors, SQL errors, or other sensitive context. Redaction belongs in the production design, not only in the incident review.
Before sharing logs, screenshots, dashboard JSON, alert examples, or runbook excerpts outside the approved audience, review them for database names, service names, schema names, SQL text, tenant identifiers, wallet paths, hostnames, and internal network details.
Keep /metrics private
Once the exporter can connect to the database, protect the scrape endpoint.
The exporter endpoint should not be exposed as a public internet endpoint.
/metrics contains database-derived operational data. Labels may reveal system names, SQL identifiers, SQL text, schema names, service names, queue names, usernames, tenant identifiers, prompts, documents, workflow names, or application-specific values. Even if your initial metric set looks harmless, future default metrics, custom SQL metrics, or exporter upgrades can change what appears at the endpoint.
Prometheus’ model makes the endpoint powerful because scrapers can pull metrics directly. That same model makes endpoint exposure important. Prometheus documents the scrape configuration model in its configuration reference and explains metric labels in the data model.
The practical takeaway is simple: if a client can scrape the exporter, it can see what the exporter exposes before downstream relabeling, dropping, retention, or remote-write filtering.
A private Kubernetes ClusterIP Service is not, by itself, an access-control model. NetworkPolicy can help restrict access, but Kubernetes NetworkPolicy enforcement depends on the cluster networking implementation. Creating a NetworkPolicy in a cluster whose CNI does not enforce NetworkPolicy does not protect the endpoint. Also remember that NetworkPolicy is additive: pods are not isolated for ingress or egress until a policy selects them for that direction. Test enforcement from the Prometheus pod, from an unapproved pod in the same namespace, and from an unapproved pod in another namespace.
TLS and authentication through --web.config.file are version-sensitive and should be verified against the exporter’s current support for Prometheus exporter-toolkit-style web configuration. Do not assume TLS or authentication is enabled by default. Also treat TLS and authentication as policy decisions, not only feature checks: confirm whether your organization requires encrypted scrape traffic, authenticated scrapers, certificate rotation, service-mesh policy, or additional authorization controls for database-derived metrics.
Here is an ingress NetworkPolicy pattern that allows only an approved Prometheus pod to reach the exporter:
# Pattern only. Verify namespace labels, pod labels, port names,
# and CNI NetworkPolicy enforcement in your cluster.
# This is an L3/L4 control, not a replacement for TLS, authentication,
# service mesh policy, RBAC, or data-governance review.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-prometheus-to-oracle-exporter
namespace: app-observability
spec:
podSelector:
matchLabels:
app: oracle-db-metrics-exporter
policyTypes:
- Ingress
ingress:
-from:
-namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
podSelector:
matchLabels:
app.kubernetes.io/name: prometheus
ports:
-protocol: TCP
port: 9161
That pattern is useful only if the labels, namespaces, ports, and CNI enforcement match your cluster. It does not replace TLS, authentication, RBAC, service mesh policy, or data-governance review.
NetworkPolicy limits which pods can open a connection; it does not prove the client is an approved scraper identity at the application layer and does not encrypt the scrape payload.
Production review also includes the exporter’s outbound path to Oracle Database:
# cloud gateways, service mesh, DNS behavior, and CNI behavior may require
# a different model. DNS can require TCP/53 as well as UDP/53.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: restrict-oracle-exporter-egress
namespace: app-observability
spec:
podSelector:
matchLabels:
app: oracle-db-metrics-exporter
policyTypes:
- Egress
egress:
-to:
-ipBlock:
cidr: 10.20.30.40/32
ports:
-protocol: TCP
port: 1521
-to:
-namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
-protocol: UDP
port: 53
-protocol: TCP
port: 53
RAC, SCAN listeners, Autonomous Database, private endpoints, cloud gateways, DNS behavior, service mesh sidecars, and CNI-specific FQDN policy can require different designs. An egress policy that is too narrow can break database failover, wallet retrieval, or DNS. An egress policy that is too broad may fail the review.
For RAC, SCAN listeners, private endpoints, and managed database services, prefer a topology-specific tested egress design over a single hard-coded database IP if failover or endpoint rotation is part of the service behavior.
The right answer is topology-specific and tested.
Scrape deliberately with Prometheus or the OpenTelemetry Collector
After the endpoint is private, make the scrape path explicit.
Oracle Database Metrics Exporter exposes Prometheus-compatible metrics. Prometheus scrapes /metrics, stores time series, evaluates PromQL, and can run alert rules. The OpenTelemetry Collector can scrape Prometheus-format metrics through the Prometheus receiver and forward metrics through a Collector pipeline. Grafana visualizes metrics by querying Prometheus or another metrics backend.
Do not blur those roles. Grafana does not make the exporter safe. It makes reviewed metrics visible. The OpenTelemetry Collector can be part of the scrape-and-forward path, but do not claim the exporter natively pushes OTLP unless you verify that behavior for the exact exporter version.
A Prometheus scrape pattern should make interval and timeout deliberate:
# Pattern only. Verify service discovery, TLS, authentication,
# relabeling, job labels, timeout, and interval for your environment.
Prometheus scrape syntax is documented in the Prometheus configuration reference. Static targets keep the example short; production may use Kubernetes service discovery, file service discovery, relabeling, TLS, authentication, or a service mesh.
If your platform standardizes on OpenTelemetry Collector, the Collector can own the scrape:
# Pattern only. Verify that your Collector distribution includes
# the Prometheus receiver and that your backend accepts the resulting metrics.
# Set according to your gateway's actual TLS configuration.
insecure: false
service:
pipelines:
metrics:
receivers:[prometheus]
processors:[batch]
exporters:[otlp]
The OpenTelemetry Collector configuration model is documented in the Collector docs, and the Prometheus receiver lives in the OpenTelemetry Collector Contrib project. Verify receiver availability in the Collector distribution you deploy.
Review scrape interval, scrape timeout, exporter query timeout if supported, and custom metric timeout together.
A short scrape interval can increase database query frequency. A long scrape interval can delay detection and reduce dashboard resolution. Multiple Prometheus servers or Collectors scraping the same exporter can multiply database work if the exporter collects on scrape rather than serving cached results. Verify the exporter’s collection behavior for your deployed version before adding HA scrapers, federation, or parallel Collector pipelines.
Decide which system owns scrape-health alerting before production rollout.
Review default metrics before enabling them broadly
Default metrics are useful because they give teams a starting point. They are still database queries and labels.
Before enabling them broadly, inspect the exact default metrics for the exporter tag you deploy. Use Oracle docs, release notes, and the repository as source anchors, but do not assume the repository main branch matches your deployed image. Compare the tagged default metrics file, the image contents if available, and live /metrics output from your environment.
For each enabled metric, record the metric name, metric type, database views or SQL behind it, required grants, exposed labels, expected cardinality, dashboard or alert usage, interpretation owner, and approval decision. Also record whether any label can contain sensitive identifiers. That review note does not need to be complex, but it should be explicit enough that a DBA, application owner, SRE, and security reviewer can understand what the exporter is allowed to expose.
For the live /metrics review, capture a sample from the same network path the scraper will use. Confirm the exporter build/version if exposed, the oracledb_up value if that metric is present, expected default metrics, expected custom metrics, unexpected labels, high-cardinality labels, sensitive label values, scrape duration or error self-metrics if present, and whether metric names match dashboard and alert queries.
Inspect oracledb_up, if present in your deployed version, because it tells you whether the database is reachable from the exporter’s point of view. Confirm the metric name in live /metrics before wiring dashboards or alerts.
Also inspect exporter self-metrics if they are present in your deployed version. Examples to look for may include scrape error, scrape duration, and scrape count metrics, but names and semantics must be confirmed from live /metrics for the exact image you deploy.
Those metrics help you distinguish database signals from exporter collection problems.
Then inspect session metrics, activity metrics, wait metrics, tablespace or resource metrics, and top SQL metrics if present in your deployed version. Labels such as sql_id and sql_text need special review. SQL IDs, SQL text, usernames, schemas, module names, service names, tenant identifiers, queue names, and workflow names may be sensitive depending on your environment.
Top SQL metrics are triage hints, not a tuning workflow. They can point responders toward the next question, but they do not replace AWR, ASH, SQL Monitor, execution-plan analysis, SQL Tuning Advisor, ADDM, Performance Hub, or DBA workflows.
Truncated SQL text is not automatically safe. A default metric is not automatically approved for every environment.
Metrics guide investigation. They do not prove root cause by themselves.
Add custom SQL only after review
Custom SQL metrics are powerful because they expose business-relevant database signals. They are risky because they can add query cost, sensitive labels, fragile SQL, new grants, and high-cardinality dimensions.
Good custom metric candidates answer bounded operational questions. A queue-depth metric by a small approved queue-name list can be useful. So can failed job counts by a bounded status list, stale workflow counts aggregated by workflow type, ingestion backlog by approved pipeline name, or RAG document indexing backlog aggregated by approved status categories.
Poor candidates are metrics that turn user, request, document, prompt, session, tenant, URL, file, exception, raw SQL text, or workflow-instance values into labels. Queries that scan large application tables every scrape are also poor candidates, even if they look harmless in development.
A bounded queue-depth metric might have this shape:
# Pattern only. Verify current custom metric format, field names,
# column-name matching behavior, timeout behavior, and metric-name
# generation against the exporter docs and the exact image tag you deploy.
#
# Oracle SQL returns unquoted identifiers in uppercase metadata. Depending
# on exporter behavior, you may need quoted aliases such as "queue_name"
# and "depth", or you may need to use the uppercase names expected by the
# exporter version you run.
[[metric]]
context = "app_queue"
request = """
SELECT
queue_name AS "queue_name",
COUNT(*) AS "depth"
FROM app_work_queue
WHERE status = 'READY'
GROUP BY queue_name
"""
labels = ["queue_name"]
[metric.metricsdesc]
depth = "Number of ready items in an approved application queue"
This example is intentionally small. It assumes queue_name is a short approved list, not a tenant ID, workflow ID, or arbitrary customer-provided value. It also assumes the query cost has been reviewed at production scale.
Before using a custom metric in production, test the exact query and exporter configuration together and confirm that the returned column names match the labels and metric descriptors expected by the exporter.
For application tables, review the execution plan, indexes, expected row counts, and concurrency impact at the chosen scrape interval. A custom metric that performs a full table scan every 30 seconds is production workload, not passive observation.
For each custom SQL metric, ask what operational question it answers, who will use it, which dashboard or alert consumes it, which grants it requires, how expensive it is at production scale, which scrape interval and timeout apply, what happens if it fails, whether labels are bounded, whether labels are safe to expose, and who owns the query when schema changes.
Custom SQL metrics are production code. They need owners, tests, review, disablement paths, and retirement criteria.
Control label cardinality before it controls your backend
Label cardinality is the number of distinct label values, or combinations of label values, a metric can produce. In Prometheus-style metrics, every unique combination of metric name and label values becomes a distinct time series.
That is useful when labels are bounded and meaningful. It becomes expensive and noisy when labels contain unbounded values such as user IDs, request IDs, document IDs, prompt IDs, session IDs, SQL text, or workflow instance IDs.
The Prometheus data model identifies each time series by metric name and label set. Changing label values, adding labels, or removing labels changes the resulting time series. The Prometheus configuration docs also warn that label dropping must preserve meaningful and unique series.
Dropping labels is not always safe; it can merge series and change the meaning of a metric.
Prefer bounded labels such as status, region, approved service name, queue name from a small approved list, wait class, or a small reviewed workflow type list.
Avoid unbounded labels such as user ID, request ID, document ID, session ID, prompt ID, URL, SQL text, workflow instance ID, exception text, or anything derived from arbitrary user or workload input.
Treat SQL text as sensitive by default. Treat sql_id as review-worthy because it can produce many series and may be correlated with query text elsewhere. Treat schema names, usernames, service names, tenant identifiers, prompt IDs, document IDs, queue names, and workflow IDs as exported data, not harmless metadata.
You can inspect exporter series and high-risk labels with scoped PromQL patterns:
count by (__name__) ({job="oracle-db-metrics-exporter"})
count by (__name__, sql_id) ({job="oracle-db-metrics-exporter", sql_id!=""})
count by (__name__, sql_text) ({job="oracle-db-metrics-exporter", sql_text!=""})
Use broad PromQL selectors carefully in large production backends. Scope by job, namespace, environment, database service, or another approved label so the inspection query itself does not become a problem.
Do not rely only on backend filtering. Prometheus relabeling, remote-write filtering, Collector processors, and backend retention policies can reduce downstream storage or visibility, but they do not protect the exporter endpoint itself.
If a label appears at /metrics, any approved scraper or accidental endpoint exposure can see it before downstream filtering.
Cardinality review belongs before rollout, not after storage costs rise, dashboard queries slow down, or alert labels leak sensitive identifiers.
Alert on failures that need action
Alerts should start with failures that require action, not with every interesting database signal.
Good first alert categories include Prometheus being unable to scrape the exporter, the exporter being reachable while oracledb_up reports database reachability failure if that metric is present, exporter scrape or collection errors if those self-metrics are present, scrape duration approaching the timeout budget if that self-metric is present, custom metric timeouts or failures if the exporter exposes that signal, missing required metrics, and cardinality spikes after a deployment or metric change.
Be cautious with raw database-performance alerts until you have baselines and ownership.
Prometheus alert rules are documented in the Prometheus alerting rules guide. The following example is a pattern, not a universal rule file.
Verify metric names, labels, thresholds, severity, and routing against your exporter version and alerting standards.
# Pattern only. Confirm these metric names and semantics in live /metrics
# for your exact exporter version before enabling these rules.
# The self-metric names shown here are examples to verify, not guarantees.
groups:
-name: oracle-db-metrics-exporter.rules
rules:
-alert: OracleDatabaseMetricsExporterTargetDown
expr: up{job="oracle-db-metrics-exporter"} == 0
for: 5m
labels:
severity: warning
annotations:
summary:"Oracle Database Metrics Exporter target is down"
description:"Prometheus cannot scrape the Oracle Database Metrics Exporter target."
summary:"Oracle Database Metrics Exporter scrape duration is high"
description:"Exporter collection duration is approaching the expected scrape budget."
The up alert is Prometheus-side target health. The oracledb_up alert, if that metric exists with the expected semantics in your deployed version, is database reachability from the exporter’s point of view, not proof of application health or full database health.
Also decide how to handle missing expected metrics. An expression such as oracledb_up == 0 does not alert if the oracledb_up series is absent. If the metric is required for your production design, add a separate absent-series check or dashboard validation, and test it during exporter upgrades, relabeling changes, and Collector pipeline changes.
The scrape-error and scrape-duration examples require exporter self-metrics that must be confirmed in your deployed version. Tune thresholds to your scrape_timeout, exporter query timeout if supported, database topology, and approved custom metrics.
Avoid paging on raw waits, sessions, or top SQL without baselines. Prefer warnings, tickets, or investigation dashboards until the team understands normal workload patterns. Do not include sensitive label values in alert annotations.
Route exporter health alerts to the monitoring-chain owner. Route database reachability alerts according to the platform or DBA incident model. Route custom metric alerts to the team that owns the metric’s business meaning.
If Enterprise Manager, OCI Database Management, Datadog, Dynatrace, New Relic, Grafana, Prometheus, and cloud alarms all watch the same database, alert ownership matters more than alert volume. Choose which system pages humans, which system opens tickets, and which systems provide context only.
A duplicate page is not resilience. It is operational noise.
Write runbooks before alerts page someone
Every paging alert should have a runbook before it pages a human.
A useful runbook does not need to be long. It needs to tell responders what the alert means, what user impact is known or unknown, what to check first, what common causes exist, what mitigations are safe, when to escalate, and how to roll back or disable a broken component.
A runbook for database reachability from the exporter might look like this:
Runbook pattern:
Alert:
OracleDatabaseMetricsExporterDatabaseUnreachable
Meaning:
The exporter is reachable, but its Oracle Database reachability check is failing.
Possible user impact:
Unknown from this alert alone. Check application health and database service status.
First checks:
- Is the exporter target still up?
- Did the database service, listener, wallet, DNS, route, or firewall change?
- Did the exporter credential expire, rotate, or lock?
- Is the database service reachable from the exporter network?
- Do exporter logs show account lock, password, role, privilege, wallet, or service-name errors?
- Did a CDB/PDB service name or Autonomous Database wallet change?
Escalate to:
DBA/platform owner for database service or credential issues.
SRE/observability owner for exporter, scrape, or network path issues.
Safe mitigations:
- Roll back the last exporter configuration change if it caused the failure.
- Restore the prior secret or wallet version if approved.
- Disable a broken custom metric if it is blocking collection and the runbook allows it.
Do not:
- Share unredacted logs.
- Add broad grants during incident response without approval.
- Assume this alert proves application outage or database root cause.
For OracleDatabaseMetricsExporterTargetDown, first check whether the exporter pod or process is running, whether the Service or target address is correct, whether labels or service discovery changed, whether NetworkPolicy blocks Prometheus, whether TLS or authentication is misconfigured, whether the scrape timeout changed, and whether Prometheus can reach the endpoint from its own network.
For a database reachability alert such as oracledb_up == 0, first confirm that the metric exists and has the expected semantics for your exporter version. Then check whether the exporter itself is reachable, whether the database service, listener, wallet, DNS, route, or firewall changed, whether the credential expired or locked, whether the service is reachable from the exporter network, and whether exporter logs show account, role, privilege, wallet, or service-name errors.
For missing expected metrics, check the exporter version, live /metrics output, relabeling rules, Collector pipeline, backend ingestion, metric name changes, and dashboard or alert query assumptions.
For exporter scrape errors, identify which metric or query failed. Check grants, topology-specific views, custom SQL changes after schema migration, query timeouts, wallet paths, credential paths, and log redaction before sharing details.
For slow scrape duration, check whether custom metrics changed, whether scrape interval or timeout changed, whether a query scans more data after workload growth, whether the database is under load, whether multiple scrapers are hitting the same exporter, and whether the exporter is collecting too much per scrape.
For a cardinality spike, identify which metric name grew and which label drove the growth. Check recent custom metric deployments, default metric changes, exporter upgrades, and whether a label started carrying tenant, user, request, SQL text, document, prompt, or workflow identifiers.
Escalation should match the failure mode. Exporter process or scrape path issues belong with SRE, platform, or observability owners. Database reachability and grants belong with DBA or platform owners. Custom SQL metric failures belong with the application owner plus DBA review. Sensitive labels involve security and observability owners. Dashboard or alert query breakage belongs with the observability owner.
Build dashboards for decisions, not decoration
A production dashboard should help responders decide what to do next. It should not be a wall of raw database counters.
Grafana dashboards are documented in the Grafana dashboards docs, and Grafana can query Prometheus through the Prometheus data source. But Grafana is the dashboard and, in some organizations, the alerting surface. It does not make the exporter safe; it makes reviewed metrics visible.
A practical production dashboard should start with monitoring-chain health: Prometheus target status for the exporter, exporter scrape errors if exposed, and exporter scrape duration if exposed. Then it should show database reachability through oracledb_up if present in your deployed version and recent reachability transitions.
After that, add collection reliability and cost: last scrape error if exposed, scrape totals if exposed, collection duration versus scrape timeout if exposed, and custom metric failures or timeouts if exposed.
Only then should the dashboard move into workload context: sessions and activity indicators, wait categories with bounded labels, and top SQL or hotspot signals only if labels and audience are approved.
Application/database signals can follow: reviewed queue depth, ingestion backlog, workflow state counts, or RAG indexing backlog by approved status. Finally, add change overlays for application deployments, database changes, schema migrations, exporter upgrades, credential rotations, and network-policy changes.
The best panels answer operational questions:
Is the monitoring path healthy?
Is the database reachable from the exporter?
Did collection cost change?
Which bounded workload category changed?
Where should we look next?
Which recent deployment, schema migration, credential rotation, or network-policy change lines up with the signal?
Do not expose SQL text panels broadly. Do not put sensitive labels in public team dashboards. Use scoped variables to avoid massive PromQL fan-out. Review dashboard variables as carefully as panels. A variable query that lists sql_text, usernames, tenant identifiers, service names, or workflow IDs can expose sensitive values even if no panel displays them directly.
Document the intended audience for each dashboard.
A dashboard for DBAs can reasonably expose different details than a broad application team dashboard.
Top SQL or hotspot panels should be restricted or omitted unless SQL labels have been reviewed and the audience is appropriate. Use top SQL signals as hints for the next question, not as proof of root cause.
Assign operating ownership
If everyone can use the exporter but nobody owns it, the exporter becomes another production dependency with no accountable operator.
Ownership must be explicit before rollout. A short owner map is usually more valuable than a long architecture document.
Before rollout, name the team or person responsible for exporter deployment, exporter database identity, grant approval, default metric approval, custom SQL metric approval, Kubernetes Service, NetworkPolicy, secret delivery, Prometheus or Collector scrape configuration, alert routing, dashboard access, response when up == 0, response when database reachability fails from the exporter, exporter upgrade review, and emergency disablement of a broken custom metric.
A workable responsibility split often looks like this:
Application developers propose application-specific custom metrics and explain their business meaning.
Observability teams review metric naming, labels, cardinality, dashboards, retention, and remote write.
Product or application owners define user impact and which alerts justify paging.
Shared responsibility is fine. Unowned responsibility is not. Ownership must include break/fix, upgrades, credential rotation, dashboards, alerts, and emergency disablement.
The emergency disablement point matters. A broken custom SQL metric can fail after a schema migration. A new label can create a cardinality spike. A credential rotation can break collection. Someone needs authority to disable, roll back, or restrict the exporter path safely while preserving the incident trail.
Review upgrades before changing the exporter
Treat an exporter upgrade like a monitoring schema change. It may not change application code, but it can change what your observability platform stores, alerts on, and exposes to users.
Before changing the exporter version, compare the old and new behavior:
Exporter upgrade review prompt:
Before changing the exporter version, compare:
- Release notes and changelog.
- Image tag.
- Runtime --help output.
- Default metric files.
- /metrics output before and after.
- Added, removed, or renamed metrics.
- Changed metric types.
- Added, removed, or changed labels.
- Cardinality impact.
- Grants required by default metrics.
- Custom metric behavior.
- Wallet, password file, vault, and external-auth configuration.
- --web.config.file behavior if TLS/auth is used.
- Prometheus scrape success.
- OpenTelemetry Collector scrape path, if used.
- Dashboard queries.
- Alert rules.
- Rollback steps.
- Approval owner.
Use the Oracle GitHub releases, repository, documentation, runtime --help, default metrics file, and live /metrics output for the exact image tag. Avoid unreviewed latest tags in production.
Metric names, labels, and types can change. New default metrics may require new grants. Removed or renamed metrics can break dashboards and alerts. Added labels can increase cardinality or expose sensitive data. Credential, wallet, password-file, vault, TLS, authentication, query timeout, and custom metric syntax can change.
Migration from an older or community exporter such as iamseth/oracledb_exporter is not “just swap the image.” Metric names, labels, grants, dashboards, alerts, and scrape behavior may differ.
Grafana Alloy’s prometheus.exporter.oracledb is an adjacent route for Grafana-centric teams, but it should be reviewed as its own exporter implementation and lifecycle. Do not assume it is identical to Oracle Database Metrics Exporter unless you verify the embedded implementation, metric set, labels, grants, wallet behavior, and upgrade path.
Use a production-readiness review before rollout
A deployment is ready for shared or production use only after the team can answer the production questions, not just after the endpoint returns metrics.
Use this review near the end of rollout planning:
Production-readiness review:
A deployment is ready for shared or production use only after the team can answer:
- Identity: Is the exporter using a dedicated monitoring identity?
- Grants: Are grants approved for the exact default and custom metrics enabled?
- Topology: Have CDB/PDB, RAC, Autonomous Database, managed database, or single-instance differences been reviewed?
- Secrets: Are passwords, wallets, connect strings, and tokens delivered through approved secret paths?
- Rotation: Has credential rotation been tested?
- Endpoint: Is /metrics private?
- Network: Can only approved scrapers reach the exporter?
- TLS/auth: If required, is exporter endpoint TLS or authentication configured and tested?
- Scrape config: Are interval, timeout, labels, and job names deliberate?
- Default metrics: Has the team reviewed default metric names, labels, query cost, and grants?
- Custom metrics: Is every custom SQL metric owned, reviewed, bounded, and useful?
- Labels: Are sensitive and high-cardinality labels removed, aggregated, or restricted before rollout?
- Prometheus/Collector: Is the scrape path clear and tested?
- Alerts: Do alerts represent actionable failures?
- Dashboards: Do dashboards support decisions and avoid exposing sensitive labels?
- Runbooks: Does every paging alert have a runbook?
- Retention: Are metrics retention and remote-write destinations approved?
- Access: Are dashboard and metrics-backend permissions appropriate?
- Ownership: Are deployment, grants, metrics, alerts, dashboards, and upgrades assigned to named teams?
- Upgrade plan: Is there a version review and rollback process?
- Rollback: Can a broken custom metric, scrape config, or exporter version be disabled quickly?
The expected result is a rollout decision:
Approved for production.
Approved with restrictions.
Deferred pending grants review.
Deferred pending label review.
Deferred pending network controls.
Deferred pending runbooks or ownership.
Rejected until the design changes.
A checklist does not replace technical review. It makes the review concrete. If the team cannot name the identity, grants, scrape path, labels, alerts, dashboards, runbooks, owners, upgrade process, and rollback plan, the rollout is not ready.
The practical takeaway
A production exporter rollout is ready when the team can name the database identity, list the approved grants, protect the credentials, explain who can scrape /metrics, defend every high-risk label, show actionable alerts and dashboards, link to runbooks, identify owners, and describe how upgrades are reviewed.
That is the difference between a local demo and an operated production component.
The practical decision is not whether an exporter is “better” than a database management platform. The practical decision is which operating model your team can actually sustain. If your platform already runs Prometheus-compatible observability, Oracle Database Metrics Exporter can be a strong fit. If your DBA team already operates Enterprise Manager or OCI Database Management as the production control plane, use the exporter as a complementary signal or do not add it until ownership is clear.
Schedule the readiness review before broad rollout. Bring the exact exporter version, image tag, /metrics output, enabled metric list, custom SQL, grant list, scrape configuration, dashboard draft, alert rules, runbook links, retention plan, and owner map.
Do not roll out broadly until each trust boundary has an owner.
GRAPH_TABLE is the right first tool because it demonstrates the core idea: match a graph pattern and return rows.
But graph work often goes one step further. You may want to rank important accounts, find connected groups, trace weighted paths, or visualize the network so a human can inspect it.
Article 3 used GRAPH_TABLE to find hubs, chains, cycles, and customer-risk joins. This article adds computed graph signals and visual exploration. You do not need a separate graph database after pattern matching; in-database algorithms and Graph Studio are the natural next step once you understand the graph.
Oracle gives you two practical next steps after basic pattern matching:
in-database graph algorithms with DBMS_OGA;
visual and notebook workflows in Graph Studio, backed by PGX for in-memory analytics.
Start With In-Database Algorithms
DBMS_OGA exposes graph algorithm functions that you can call from SQL. The function returns a temporary graph with computed properties, and GRAPH_TABLE reads those computed properties back as rows.
The syntax can look unusual at first, but the workflow is simple:
call the algorithm on BANK_GRAPH;
write the algorithm output to a vertex property;
query that output property with GRAPH_TABLE;
join the result to your relational tables.
Run the examples in this article in Autonomous Database Serverless on OCI if you want the algorithm examples to execute. At the time of writing, FreeSQL does not expose DBMS_OGA / DBMS_GAF, so the algorithm examples in this article will not work there.
If you are using ADB-S, run the setup from earlier articles first. Then check whether your database exposes the graph algorithm packages:
SELECT owner, object_name, object_type, status
FROM all_objects
WHERE object_name IN('DBMS_OGA','DBMS_GAF')
ORDERBY owner, object_name, object_type;
If that query does not show both packages, skip the DBMS_OGA examples in this article and move to the Graph Studio section.
The algorithm path depends on DBMS_OGA and DBMS_GAF. If those packages are not visible in your environment, keep using the earlier SQL property graph and GRAPH_TABLE examples, then move to the Graph Studio section.
If DBMS_OGA is visible, run demo/04_bank_graph_algorithms.sql. (The code is at end of this article). The script checks package visibility first, skips cleanly where the packages are absent, and uses the current schema when it builds the Bellman-Ford seed vertex JSON.
Treat these algorithm outputs as features, not verdicts. A PageRank score, component id, or path distance can help you prioritize work, but it should not be presented as proof of fraud. That distinction keeps the article technically useful without overstating what graph analytics can decide on its own.
The useful part is not just that the algorithm runs. It is that the computed graph output can come back through GRAPH_TABLE, become SQL rows, and join to the same relational context you already use.
PageRank: Which Accounts Are Structurally Important?
PageRank is a useful first centrality measure because it gives every vertex an importance score based on the graph structure.
In a bank-transfer graph, PageRank is not a fraud model. It is a signal. An account with high graph importance may deserve attention when combined with other context such as transfer amount, customer risk, device history, or known investigation seeds.
SELECT gt.account_id,
ba.account_name,
gt.pagerank_score
FROM GRAPH_TABLE(
DBMS_OGA.PAGERANK(
bank_graph,
PROPERTY(VERTEX OUTPUT pagerank_score),
20,
0.0001,
0.85,
TRUE
)
MATCH(a IS account)
COLUMNS (
a.account_id AS account_id,
a.pagerank_score AS pagerank_score
)
) gt
JOIN bank_accounts ba
ON ba.account_id = gt.account_id
ORDERBY gt.pagerank_score DESC;
Notice the same pattern from earlier articles. Even though an algorithm ran first, the output becomes SQL rows. That makes the score easy to sort, filter, join, or save.
You can also join the result back to customer or account data. For example, sort by PageRank and then show each account’s customer risk tier. The algorithm provides graph structure; SQL adds business context.
Weakly Connected Components: Which Accounts Belong Together?
Connected components find groups of vertices connected by edges. Weakly connected components treat directed edges as connected regardless of direction.
In a fraud investigation workflow, WCC is useful because it gives you a compact way to describe account clusters.
SELECT gt.component_id,
COUNT(*)AS account_count,
LISTAGG(ba.account_name,', ') WITHIN GROUP(ORDERBY ba.account_id)AS accounts
FROM GRAPH_TABLE(
DBMS_OGA.WCC(
bank_graph,
PROPERTY(VERTEX OUTPUT component_id)
)
MATCH(a IS account)
COLUMNS (
a.account_id AS account_id,
a.component_id AS component_id
)
) gt
JOIN bank_accounts ba
ON ba.account_id = gt.account_id
GROUPBY gt.component_id
ORDERBY account_count DESC, gt.component_id;
On a small graph, this may return one large component. That is fine. The useful pattern is the method: compute a graph-level grouping, then summarize it with SQL.
On a larger graph, connected components can help split an investigation into smaller clusters. They can also help you find isolated groups of accounts that do not interact with the rest of the network.
PageRank, Weakly Connected Components, and Bellman-Ford produce different kinds of graph features. Treat them as investigation inputs, not final decisions.
Bellman-Ford: Weighted Path Exposure
Bellman-Ford is a shortest-path algorithm. In a road network, the edge weight might be distance. In a transfer graph, the edge weight might be amount.
In this demo, a minimum cumulative transfer amount is not automatically a risk score. It is a path-based measure that may help investigation when paired with business rules.
The Bellman-Ford example starts from account 101 and uses transfer amount as the edge weight.
The seed vertex is account 101. The edge input property is amount, so the algorithm treats transfer amount as the path weight. The output property is minimum_transfer_exposure, which is then projected by GRAPH_TABLE and joined to bank_accounts.
This output is a weighted path signal from one starting account. In an investigation workflow, it could help you understand which accounts are reachable from a seed account under this weighting rule.
Move Into Graph Studio
SQL is enough to create and query the graph, but visualization helps beginners see what they built.
Graph Studio gives you a managed place to create, query, analyze, and visualize graphs in Autonomous AI Database. In this series, it is most useful for three things:
show BANK_GRAPH as a real graph object;
visualize a small subgraph around account 101;
run one notebook or query-panel example that mirrors the SQL article.
In Graph Studio, open the graph list, select BANK_GRAPH, and start with a small preview rather than the whole graph. Filter around account 101 or the 3-hop cycle so the visual reinforces the pattern queries instead of turning into a dense tangle.
If you use a notebook, run the same one-hop query from account 101 that you used in SQL. That helps connect the visual result back to the query:
SELECT*
FROM GRAPH_TABLE(
bank_graph
MATCH(src IS account)-[t IS transfer]->(dst IS account)
WHERE src.account_id =101
COLUMNS (
src.account_name AS source_account,
t.amount AS amount,
t.channel AS channel,
dst.account_name AS destination_account
)
)
ORDERBY amount DESC;
Where PGX Fits
PGX is Oracle’s in-memory graph analytics layer. It matters when you need richer algorithms, interactive analysis, notebook workflows, or high-performance graph analytics beyond the first SQL examples.
The practical path is incremental: start in SQL, add in-database algorithm signals when the packages are available, and move into Graph Studio or PGX when visualization or deeper analytics becomes useful.
That sequence keeps the mental model clean:
SQL property graph: define the graph over database objects.
GRAPH_TABLE: match graph patterns and return rows.
Graph Studio and PGX: visualize and analyze with a richer graph environment.
Where To Go Next
You now know how to recognize a graph-shaped problem, define a SQL property graph over relational tables, query connected patterns with SQL, join graph results to business context, and choose when to move into algorithms or visualization.
That is enough to start using graph features in real Oracle database work.
Try The Demo
This article has two demo paths.
For the algorithm path, use your own ADB-S instance. Run the setup steps below, then run the availability check and algorithm script. The script computes PageRank, Weakly Connected Components, and Bellman-Ford outputs, then returns the results as SQL rows.
For the FreeSQL path, stop before the algorithm script. FreeSQL is still useful for the table, graph, and GRAPH_TABLE pattern examples, but FreeSQL does not expose DBMS_OGA / DBMS_GAF.
After the SQL algorithm step, open Graph Studio in the same ADB-S environment and visualize a small part of BANK_GRAPH. Start with a filtered view around account 101 or the seeded transfer cycle. The goal is not to draw every relationship at once; it is to connect the SQL results back to a graph a person can inspect.
Here’s the code referenced above, which you can copy and paste into your environment. Please note that you need to run the set up steps 1 and 2 first, then step 4 – all are included below. If you followed the previous articles in this series, steps 1 and 2 are exactly the same code you ran in those articles. The code below contains comments that explains what it is doing.
-- Oracle AI Database 26ai graph demo
-- Step 1: create a tiny bank-fraud dataset.
--
-- Run this as a schema user that can create tables.
-- Start by removing objects from a previous run. The graph depends on the
-- tables, so it must be dropped before the tables can be recreated.
BEGIN
EXECUTEIMMEDIATE'DROP PROPERTY GRAPH bank_graph';
EXCEPTION
WHEN OTHERS THEN
-- Ignore cleanup errors so the script can be rerun from a fresh schema or
-- after dependencies have already been removed.
NULL;
END;
/
-- Drop the edge table before the vertex tables because it has foreign keys to
-- bank_accounts. ORA-00942 means the table did not exist, which is safe here.
This article uses Oracle Database Metrics Exporter 2.3.1, checked against Oracle’s documentation on 2026-05-10. Container image: container-registry.oracle.com/database/observability-exporter:2.3.1. Default listen address: :9161. Default metrics path: /metrics.
In the first article in this series, we looked at why application telemetry stops at the service boundary — and why a slow RAG endpoint or AI request often cannot be diagnosed without knowing what Oracle Database was doing at the same moment. This article turns that idea into a working local setup.
We’ll run Oracle Database Metrics Exporter against a development Oracle Database, scrape it with Prometheus, connect Grafana, and build the starter dashboard. By the end you’ll have panels showing exporter health, database reachability, sessions, activity, wait categories, and top SQL — all in the same time window as your application metrics.
Oracle’s docs and GitHub materials sometimes use Oracle AI Database Metrics Exporter where the product page uses Oracle Database Metrics Exporter. You may also see observability-exporter in the container image path. In this article, those references point to the same exporter.
The stack
The exporter sits between Oracle Database and Prometheus. It connects to the database, runs the metric queries, and exposes the results at /metrics. Prometheus pulls those metrics on a schedule. Grafana queries Prometheus to build panels.
Grafana does not connect directly to Oracle Database in this setup. That distinction matters when something goes wrong: if a panel is empty, the problem is almost always earlier in the chain. We’ll verify each link before moving to the next.
What you need
A development Oracle Database — local container, OCI Always Free, or a private lab instance
SQL admin access to create a monitoring user in the target PDB or service
With Oracle Database on the same Docker network, the exporter can reach it by container name. If Oracle Database is running on the host machine instead, use host.docker.internal as the hostname in the exporter connect string.
Create a monitoring user
The exporter needs its own database identity. Connect to the PDB or service you want to monitor and run:
SELECT_CATALOG_ROLE covers most of the default metrics without requiring a long list of individual view grants. It is a broad grant — convenient for a local demo, but not least-privilege for a shared environment. Narrowing this down is the main topic of the next article in this series.
With Oracle Database Free, the default PDB is FREEPDB1. Make sure you create the user in the service the exporter will connect to.
Configure the exporter
Create a local file with the connection details. Keep this file out of source control.
# exporter.env
DB_USERNAME=exporter_demo
DB_PASSWORD=<your_password>
DB_CONNECT_STRING=<host>:1521/<service-name>
Replace <host> with the address the exporter container can reach. If Oracle Database is another container on oracle-obs-demo, use its container name: for example, oracle-db:1521/FREEPDB1. If it’s running on the host, use host.docker.internal:1521/FREEPDB1.
That 1 means the exporter connected to Oracle Database and ran its queries successfully. If you see 0, the HTTP server is running but the database connection is not — check the connect string, service name, credentials, and whether the user exists in the right PDB. If the endpoint is unreachable, check docker ps and docker logs oracle-exporter.
Before moving on, take a quick look at what the exporter actually exposes:
curl-s http://localhost:9161/metrics | grep"^# HELP oracledb_" | head -20
This lists the metric names and descriptions for version 2.3.1. The PromQL queries in the sections below assume the default metric names from this version.
Add Prometheus
Create the scrape configuration:
# prometheus.yml
global:
scrape_interval: 15s
scrape_configs:
-job_name: oracle-exporter
metrics_path: /metrics
static_configs:
-targets:
- oracle-exporter:9161
The target is oracle-exporter:9161 — the exporter’s container name on the Docker network. Prometheus will resolve that name from inside the network. Using localhost:9161 would point at the Prometheus container itself.
Open http://localhost:9090 and go to Status → Targets. The oracle-exporter job should show UP.
Confirm both signals in the query UI before continuing:
up{job="oracle-exporter"}
oracledb_up
up{job="oracle-exporter"} is whether Prometheus can scrape the exporter. oracledb_up is whether the exporter can reach Oracle Database. Both should return 1 before building any panels.
Add Grafana
docker run -d
--name grafana
--network oracle-obs-demo
-p127.0.0.1:3000:3000
"$GRAFANA_IMAGE"
Open http://localhost:3000 and log in. Default credentials for a fresh Grafana container are admin / admin.
Add Prometheus as a data source: Connections → Data sources → Add new data source → Prometheus. Set the server URL to:
http://prometheus:9090
Not http://localhost:9090 — Grafana is running inside Docker, where localhost means the Grafana container. Use the container name.
Save and test. When it shows connected, you’re ready to build panels.
Build the starter dashboard
Create a new dashboard. The panels below build from monitoring-chain health outward to database behavior. Start at the top and work down.
Is the monitoring chain working?
Two stat panels, side by side.
Prometheus → Exporter:
up{job="oracle-exporter"}
1 means Prometheus can scrape the exporter. 0 means the exporter is down or unreachable.
Exporter → Oracle Database:
oracledb_up
1 means the exporter has a working database connection. 0 means the exporter is alive but cannot reach the database.
These two panels separate the first layer of any debugging conversation: is the monitoring chain itself broken, or is the database the problem?
Exporter collection health
oracledb_exporter_last_scrape_error
Healthy value is 0. A nonzero value means the last collection had errors — often a permissions issue, a query timeout, or a broken custom metric configuration.
oracledb_exporter_last_scrape_duration_seconds
How long collection takes per scrape. If this climbs toward the scrape interval, the exporter is struggling to keep up — usually a slow query or high database load.
oracledb_exporter_scrapes_total
A running total of completed scrapes. Useful for confirming the exporter is actively collecting and not stalled.
Sessions
sum by (status, type) (oracledb_sessions_value)
Database sessions are not the same as active users. They include connection pool slots, background processes, and idle sessions. The useful question is not the absolute value — it is whether the shape changes at the same time as an application event. Did sessions spike when the slow request hit? Did inactive sessions accumulate after a deployment?
Activity
oracledb_activity_execute_count
oracledb_activity_parse_count_total
oracledb_activity_user_commits
oracledb_activity_user_rollbacks
These show whether the workload shape changed. A jump in execute or parse count often corresponds to a new query pattern, a connection pool opening more sessions, or a batch job starting. Compare them against application request volume on the same time axis.
The default Oracle activity metrics are gauges, not counters. Graph them directly. Only apply rate() after checking the # TYPE line for your exporter output.
Wait categories
oracledb_wait_time_user_io
oracledb_wait_time_system_io
oracledb_wait_time_concurrency
oracledb_wait_time_application
oracledb_wait_time_commit
oracledb_wait_time_network
Wait categories show where database time is going. They do not explain why a wait increased, but they tell you which direction to investigate next. User I/O rising with application latency is a different path than concurrency rising after a schema change.
The wait metrics are counters. For a rate-of-change view:
rate(oracledb_wait_time_commit[5m])
Top SQL
topk(10, oracledb_top_sql_elapsed)
This surfaces the SQL statements accumulating the most elapsed time. In a development setup it is the fastest way to find which query to look at next. The label set includes SQL identifiers and may include SQL text — useful locally, but worth reviewing before this panel goes into a shared dashboard.
Reading the dashboard
Once the panels are in place, run a real request through your application and watch the time window in Grafana.
Make a call to your RAG endpoint or Spring Boot AI service. During that window:
Did oracledb_sessions_value spike? Session pressure often shows up alongside connection pool wait time.
Did execute or parse count shift? More SQL activity usually means a different query pattern or a job running in parallel.
Did a wait category increase? User I/O rising with latency points toward reads. Concurrency rising points toward lock contention.
Did anything appear at the top of oracledb_top_sql_elapsed? That is the query to look at next.
The demo works. The question now is what it takes to run this reliably alongside a real application.
In the next article, we look at what productionizing the exporter actually requires: replacing the broad SELECT_CATALOG_ROLE grant with object-level grants for the metrics you enable, keeping the exporter endpoint private, reviewing which labels leave the database boundary, and making sure someone owns the component before it becomes a dependency nobody maintains.
We have tables. We have a SQL property graph named BANK_GRAPH. Now we can ask graph-shaped questions.
The key tool is GRAPH_TABLE.
GRAPH_TABLE is a SQL table expression. You give it a graph and a pattern to match, and it returns rows. That last part matters. Once the graph match becomes rows, you can use the rest of SQL: joins, filters, grouping, ordering, and common table expressions.
Article 2 created BANK_GRAPH. This article uses that graph to show why pattern matching is worth learning: it lets you express fraud questions about hubs, chains, and cycles more directly than equivalent join-heavy SQL, while still returning rows that join to customer risk data.
In this article, we will query four fraud-style patterns:
inbound hubs;
two-hop transfer chains;
transfer cycles;
graph results joined to customer risk data.
Before you start, you’ll need to run the setup code from Article 2. I have copied that here so you can do it all in the one place. Make sure you use the “Run Script” button so it runs all of the statements, not just the first one.
Then run this code:
Ok, now you have everything that we did in the previous two articles back in place!
Now here’s the code for this example:
If you prefer to run everything in ADB-S, copy those three scripts into the same schema and run them in order. SQLcl and SQL*Plus users can run the files directly from the command line.
Before looking at each query, notice the repeated shape. The graph pattern lives inside GRAPH_TABLE, and the rest of the statement is ordinary SQL. That split is the core habit to learn: use graph syntax to find connected structures, then use SQL to summarize and enrich the results.
Pattern 1: Inbound Hubs
A simple first fraud question is: which accounts receive the most transfers?
In graph terms, we are looking for transfer edges that point into the same destination account.
SELECT gt.account_id,
ba.account_name,
COUNT(*)AS inbound_transfers,
SUM(gt.amount)AS inbound_amount
FROM GRAPH_TABLE(
bank_graph
MATCH(src IS account)-[t IS transfer]->(dst IS account)
MATCH(src IS account)-[t IS transfer]->(dst IS account)
Read it from left to right: find an account vertex, follow a directed transfer edge, and land on another account vertex.
The COLUMNS clause chooses which graph properties become SQL columns. Here we project the destination account id and transfer amount. The outer SQL query groups those rows to find inbound hubs.
For the demo data, account 105 should stand out because several transfers point into it. That result is not automatically suspicious, but it is a useful triage signal. In a real system, you would combine it with time windows, amount thresholds, customer risk, and historical behavior.
Pattern 2: Accounts In The Middle
A second question is more interesting: which accounts sit in the middle of two-hop money movement?
This is where graph syntax starts to feel natural. You are not mentally building a chain of self-joins. You are drawing the relationship pattern and asking Oracle to return the matches.
This query also shows why aliases matter. src, mid, and dst are not table names. They are pattern variables. The same account label appears three times, but each occurrence has a different role in the path. Good variable names make graph patterns much easier to explain on screen.
Pattern 3: Transfer Cycles
Fraud investigations often care about circular movement. Money leaves one account, moves through other accounts, and eventually returns.
The demo data has a simple cycle:
101 -> 102 -> 103 -> 101
In graph terms, we want to start at an account and return to the same account after a few transfer hops.
SELECT gt.account_id,
ba.account_name,
COUNT(*)AS cycle_count
FROM GRAPH_TABLE(
bank_graph
MATCH(a IS account)-[IS transfer]->{3,5}(a)
COLUMNS (
a.account_id AS account_id
)
) gt
JOIN bank_accounts ba
ON ba.account_id = gt.account_id
GROUPBY gt.account_id, ba.account_name
ORDERBY cycle_count DESC, gt.account_id;
The {3,5} quantifier is the part to notice. It asks for paths of three to five repetitions of the transfer step. The final (a) says the path must return to the same starting account.
The graph query expresses the business question almost exactly: “find accounts that can reach themselves through three to five transfers.”
Pattern 4: Join Graph Results To Customer Risk
Graph results are useful by themselves, but they become much more useful when you join them to business context.
For example, a cycle involving a low-risk account may be less urgent than a cycle involving a high-risk customer. Because GRAPH_TABLE returns rows, we can join cycle results to bank_accounts and customers.
This is one of the strongest reasons to learn graph inside Oracle Database. You do not have to choose between graph analysis and relational context. You can combine them in one SQL workflow.
That is the practical payoff of this article. GRAPH_TABLE does not trap you in a graph-only world. It gives you a row source. Once you have rows, the rest of your SQL skills still apply.
What To Watch For
Graph syntax makes connected questions easier to read, but it does not remove the need for careful modeling.
Use stable keys for vertices and edges. Expose only the properties readers or applications need. Keep labels clear. Make edge direction explicit. If your business question treats relationships as undirected, say that clearly and choose the right query or algorithm path.
Next Step
Pattern matching answers many practical questions. But sometimes you want a computed signal: an importance score, a connected group id, or a weighted distance from a seed account.
That is where in-database graph algorithms come in. The next article uses DBMS_OGA in ADB-S, because the tested FreeSQL environment does not expose the algorithm packages. It then moves the same graph into Graph Studio for visualization.
That is also where edge weights become concrete. In the next article, the amount property on a transfer edge becomes the weight for a path example, instead of remaining an abstract graph term.
Try The Demo
The hands-on part of this article is the pattern-query script. It uses the BANK_GRAPH you created in Article 2 and runs the four query families you just learned:
inbound hubs;
two-hop transfer chains;
transfer cycles;
graph results joined to customer risk.
Use the embedded FreeSQL runner near the top of the article for the quick path. If you prefer ADB-S, run the same script in the schema where BANK_GRAPH already exists.
When the script finishes, spend a minute with the output instead of rushing ahead. Look for account 105 in the inbound hub result, the middle-account counts in the two-hop result, and the medium/high-risk customers in the hybrid query. Those outputs are the bridge between graph pattern matching and the algorithm examples in the next article.
By the end of Episode 2, the assistant could answer policy questions from Oracle Vector Store and remember conversations across requests. That combination already felt more like a real application than a bare LLM wrapper.
But it still could not act on anything.
Ask it about order ORD-1002 and it would explain the shipping policy. It could not look up the actual order. Ask it to return a product and it would describe the return process. It could not initiate the return. The knowledge was there. The memory was there. The connection to the backend was not.
Before looking at the code, it is worth being precise about what tool calling is and what it is not.
The model does not get direct access to the database. It does not write SQL. It does not update rows. What it gets is the ability to request that the backend perform a specific named action. The backend then decides whether to allow it, validates the input against business rules, performs the operation if allowed, and returns a result. The model reads that result and generates a response.
This is the difference between AI orchestration and AI autonomy. The model orchestrates. The backend decides. That distinction is what makes the system trustworthy rather than unpredictable.
What changes
The architecture from Episode 2 carries forward unchanged. The memory advisor, the vector-store advisor, and the Oracle-backed persistence are all still there. The only additions are one tool class, one new entity, and one line in ChatClientConfig.
defaultTools(agentTools) registers the AgentTools bean with the chat client. From that point, the model knows those tools exist and can request them during any conversation.
The system prompt was also updated to tell the model how to use the tools and to reinforce that tool results are authoritative:
app:
assistant:
system-prompt:>
You are ShopAssist, a concise and practical support assistant for a demo
electronics store. Use retrieved policy context when it is available.
Use tools for order status lookup, return initiation, and support ticket
creation. Treat tool results as the source of truth for business actions
and explain validation failures clearly. Do not invent order details.
Do not invent action outcomes.
The tool class
AgentTools.java contains three methods, each annotated with @Tool. That annotation is what exposes the method to the model. Spring AI reads the description and parameter annotations at startup and includes them in the model’s context so it knows what each tool does and when to use it.
@Tool(description="Look up the current status and basic details for a ShopAssist order.")
@Transactional(readOnly=true)
publicStringlookupOrderStatus(
@ToolParam(description="The ShopAssist order ID, for example ORD-1002.")StringorderId
){ ... }
@Tool(description="Initiate a return for an eligible delivered ShopAssist order after backend validation.")
@Transactional
publicStringinitiateReturn(
@ToolParam(description="The ShopAssist order ID, for example ORD-1001.")StringorderId,
@ToolParam(description="The customer's reason for the return.")Stringreason
){ ... }
@Tool(description="Create a support ticket for an existing ShopAssist order.")
@Transactional
publicStringcreateSupportTicket(
@ToolParam(description="The ShopAssist order ID, for example ORD-1002.")StringorderId,
@ToolParam(description="A short description of the customer issue.")Stringissue,
@ToolParam(description="Ticket priority such as LOW, NORMAL, HIGH, or URGENT.",required=false)Stringpriority
){ ... }
The return types are all String. That is intentional — plain text results are easy for the model to summarise naturally without needing to parse structured data.
lookupOrderStatus is marked @Transactional(readOnly = true). initiateReturn and createSupportTicket are @Transactional because they write to Oracle. The transaction boundary sits inside the tool method, so a validation failure exits cleanly without any partial database change.
How a tool call actually flows
When the model decides a tool is needed, it does not generate a response immediately. Instead it emits a structured tool-call request back to the application. Spring AI intercepts that, invokes the matching method, and sends the result back to the model as a tool message. The model then generates its final response with the result in context.
The full round trip for a return request looks like this: the user message arrives, the memory advisor loads conversation history, the vector-store advisor retrieves relevant policy documents, the model sees all of that and decides to call initiateReturn, the backend validates and either updates Oracle or returns a failure reason, and the model then writes the response to the user based on what the backend actually said.
Business rules in the backend
The return validation is worth looking at directly because it is the clearest example of the backend owning business policy:
return"Return initiated for order %s. The order status is now %s."
.formatted(order.getOrderId(),order.getStatus());
}
Four checks in sequence: already returning, not yet delivered, outside the 30-day window, then success. Every failure exits early with a plain-text reason and touches nothing in the database. Only the last path calls customerOrderRepository.save().
The markPreparingReturn() call is a domain method on CustomerOrder. The status change lives on the entity, not scattered across service or tool layers.
The support ticket entity
SupportTicket is a simple JPA entity backed by the SUPPORT_TICKET table. Episode 3 adds it with Hibernate’s ddl-auto=update creating the table on first startup — no migration script.
The key ID uses GenerationType.SEQUENCE with an explicit @SequenceGenerator, which is the recommended Oracle pattern. Priority and status are plain strings rather than enums — the tool normalises whatever the model passes and defaults anything unrecognised to NORMAL.
Trying it
The application seeds four deterministic demo orders on startup: ORD-1001 (delivered, within the 30-day return window), ORD-1002 (shipped), ORD-1003 (delivered, outside the return window), and ORD-1004 (processing).
Order lookup:
curl-s-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"What is the status of order ORD-1002?"}' | jq
The assistant calls lookupOrderStatus, reads the live status from Oracle, and returns it. It does not guess or interpolate from the conversation.
Successful return:
curl-s-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"I want to return order ORD-1001 because the product was defective."}' | jq
The backend finds the order, confirms it is delivered and within the return window, calls markPreparingReturn(), and saves to Oracle. The status in CUSTOMER_ORDER is now PREPARING_RETURN.
Failed return:
curl-s-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"I want to return order ORD-1003 because the product was defective."}' | jq
ORD-1003 was delivered 45 days ago. The backend rejects it at the return-window check and returns a clear reason. Nothing is written to the database. The model relays the rejection to the user.
This failure case is as important as the success case. It demonstrates that the model cannot bypass business rules by asking politely.
Support ticket:
curl-s-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"Create a high-priority support ticket for ORD-1002 because shipping is stuck."}' | jq
The backend verifies the order exists, normalises the priority to HIGH, sets status to OPEN, and inserts a row into SUPPORT_TICKET.
Inspecting Oracle directly
After those four requests, you can see the backend state directly:
SELECT ORDER_ID, STATUS FROM CUSTOMER_ORDER;
ORD-1001 now shows PREPARING_RETURN. The other three are unchanged.
SELECT TICKET_ID, ORDER_ID, PRIORITY, STATUS FROM SUPPORT_TICKET;
One row: the high-priority ticket for ORD-1002.
The assistant is useful because it is connected to real application state. Not because the model became more capable.
Oracle is now storing four kinds of data
Episode 1 added relational order data and vector embeddings. Episode 2 added conversational state. Episode 3 adds support ticket rows written by tool calls.
All four live in the same Oracle database: CUSTOMER_ORDER, the vector store table, SPRING_AI_CHAT_MEMORY, and SUPPORT_TICKET. Still one database connection from the application. No additional infrastructure.
What is next
The application can now retrieve knowledge, remember conversations, and perform backend actions. But everything still happens synchronously inside a single HTTP request. The model calls a tool, waits for the result, and then responds.
That works for simple actions. It does not work for anything that takes time, depends on external systems, or needs to be retried. Episode 4 starts moving toward event-driven processing — work that can happen outside the request cycle and feed back into the conversation when it completes.
Freshness note: At the time of writing, the current exporter release is 2.3.1, the default listen address is :9161, the default telemetry path is /metrics, and container examples use container-registry.oracle.com/database/observability-exporter:2.3.1. Treat these as version-specific details and recheck current documentation before relying on copy/paste commands.
Application telemetry stops too early
A Spring Boot AI service can look healthy in the places developers check first. The HTTP layer reports normal error rates. Application metrics show request latency, JVM behavior, token timing, and maybe HikariCP connection pool usage. Then a retrieval-augmented generation, or RAG, endpoint slows down, and the only obvious clue is that “the database call took longer.”
That clue is too thin when Oracle Database is part of the user experience. The database may hold vectors, JSON documents, relational records, queues, conversation state, ingestion metadata, and application state. A slow answer might come from a code change in the service, a different retrieval query, an ingestion job competing with search traffic, a session spike, a wait-class shift, or a SQL hotspot.
Application telemetry is necessary, but it often stops at the application boundary. Database-side metrics fill a different gap: they show what the database was doing during the same time window as application latency, pool pressure, model calls, deployments, and workload changes.
Oracle Database Metrics Exporter exposes Oracle Database metrics so observability tools can scrape, store, visualize, and alert on database behavior alongside application behavior. The payoff is not another dashboard for its own sake. The payoff is correlation by workflow: when a RAG endpoint slows down, developers and platform teams can inspect database-side signals in the same timeline as the service that depends on them.
What a database exporter does
A database exporter is a small service that connects to a database, collects selected metrics, and exposes those metrics through an endpoint that monitoring systems can scrape.
That definition matters because an exporter is a bridge, not the destination. It is not the database. It is not Grafana. It is not an application metrics library. It is not a trace collector. Its job is to turn database-side signals into input for the observability pipeline you already use.
In a Prometheus-style setup, the monitoring system usually follows a pull model. Prometheus or another compatible scraper periodically requests metrics from a target endpoint. The exporter is that target. It connects to Oracle Database, collects configured metrics, and exposes them over HTTP in a scrapeable format.
That changes the debugging conversation. Instead of asking for a separate database view after the fact, the application team can start with a shared timeline: request latency rose at 10:05, connection pool wait time rose at 10:06, sessions increased at 10:06, and database wait behavior changed at 10:07. Metrics do not finish the diagnosis, but they tell everyone where to look next.
What Oracle Database Metrics Exporter is
Tthe exporter is a bridge between Oracle Database instances and observability tools. The exporter exposes metrics in a format compatible with Prometheus-style scraping. metrics are OpenTelemetry and Prometheus-compatible. This article focuses on the Prometheus path because it is the most direct path to reason about: the exporter exposes metrics over HTTP, Prometheus or a compatible scraper collects and stores them, and Grafana visualizes them through a Prometheus-compatible data source.
In an OpenTelemetry-oriented platform, a common pattern is to use the OpenTelemetry Collector as part of the metrics pipeline, for example by scraping the exporter’s Prometheus-format endpoint with the Collector’s Prometheus receiver and forwarding metrics to a backend. That does not mean the exporter emits traces, performs automatic application-to-database trace correlation, or natively pushes OTLP metrics to every backend.
The exporter project is free and open source. It is available under your choice of the Universal Permissive License v1.0, or the MIT license . That statement applies to the exporter project itself; database, infrastructure, cloud service, managed backend, and SaaS observability costs are separate.
The exporter works across on-premises, cloud, Kubernetes, single-instance, clustered, and Autonomous Database deployments. Check the current docs for exact version, network, authentication, wallet, and privilege requirements in your environment.
The exporter can provide default metrics and custom SQL-based metrics defined in YAML or TOML. Custom metrics are powerful, but they need design review. Each query needs appropriate database grants and should be evaluated for execution cost, scrape interval, timeout behavior, label cardinality, and sensitive data exposure.
The path from Oracle Database to Grafana
Grafana is not collecting from Oracle Database directly in this path. The usual flow looks like this:
The default endpoint details are version-specific. In version 2.3.1, the exporter defaults to --web.listen-address=:9161 and --web.telemetry-path="/metrics".
The networking detail that trips up many first demos is point of view. Use the address reachable from Prometheus, not necessarily the address reachable from your terminal. For a same-host local test, this might be localhost:9161. In Docker Compose, localhost usually means the Prometheus container itself, so the target is usually the exporter service name. In Kubernetes, it would normally be a Service DNS name.
For a same-host local test where Prometheus can reach the exporter on localhost, the scrape target might look like this:
scrape_configs:
-job_name:"oracle-database"
static_configs:
-targets:["localhost:9161"]
For a Docker Compose-style setup where exporter is the service name reachable from the Prometheus container, the target would look more like this:
scrape_configs:
-job_name:"oracle-database"
static_configs:
-targets:["exporter:9161"]
These snippets show the shape of the scrape configuration, not a complete production Prometheus setup. The correct target depends on where Prometheus runs, how the exporter is exposed, and what network rules sit between the scraper and the exporter.
A first useful local check
The smallest useful demo is not a full dashboard. It is proving that the exporter can connect to a development database and expose one expected metric.
Oracle provides a local Docker Compose test stack with Oracle AI Database, the exporter, Prometheus, and Grafana, and the docs state that it provisions a dashboard. Treat that stack as local testing only. Do not expose sample database listener ports on shared or public hosts.
You can start up the demo system with a command like this – make sure you choose a password, don’t copy the example:
cd docker-compose
DB_PASSWORD='<choose-a-local-demo-password>' docker-compose up -d
Once the exporter is running and reachable from your shell, the first check is simple:
This proves that the exporter can observe the database from its own connection path. It does not prove application health, production readiness, or that every metric query succeeded. If oracledb_up is missing or zero, check exporter logs, database connectivity, credentials, privileges, service name, and metric configuration before trusting deeper panels.
You can log into the Grafana instance at http://localhost:3000 and you will need to enter “admin” as both the user and the password. It will ask you to create a new password. Then you can nagivate to “Dashboards” and open the “Oracle Dashboard”. You’ll see something like this:
What to look at first
Start with exporter health before interpreting database behavior. The default metrics include oracledb_exporter_last_scrape_error, oracledb_exporter_last_scrape_duration_seconds, and oracledb_exporter_scrapes_total.
A healthy oracledb_exporter_last_scrape_error value is 0. A nonzero value indicates exporter scrape errors, not necessarily a database outage. The cause might be credentials, network reachability, a timeout, a permissions problem, a custom metric query, or database load.
Then add a few database-side panels that help explain application symptoms.
oracledb_up is the basic availability signal from the exporter’s perspective:
oracledb_up
For sessions, Oracle’s 2.3.1 sample output shows oracledb_sessions_value with labels such as status and type, so this is a reasonable first query after validating your own output:
sum by (status, type) (oracledb_sessions_value)
Do not treat this as a direct count of active users. Database background sessions and configuration can make the result noisy. The useful question is whether the shape changes during the incident window: did sessions rise when the endpoint slowed down, did inactive sessions pile up, or did work shift after a deployment or ingestion job?
Activity metrics such as oracledb_activity_execute_count, oracledb_activity_parse_count_total, oracledb_activity_user_commits, and oracledb_activity_user_rollbacks can show whether workload changed. Graph activity metrics directly first. Only use rate() after confirming the metric type and behavior in the exporter output for your version, because the docs expose some cumulative-looking database activity values as gauges.
Wait-time metrics provide another useful triage layer. Default metrics include categories such as user I/O, system I/O, concurrency, application, commit, and network:
oracledb_wait_time_user_io
oracledb_wait_time_system_io
oracledb_wait_time_concurrency
oracledb_wait_time_application
oracledb_wait_time_commit
oracledb_wait_time_network
In the demo environment, there is probably not enough workload to see waits. In a real system, the value is in watching whether wait behavior changes during the same window as application latency.
The default metrics also include oracledb_top_sql_elapsed. Use it as a hotspot signal, not as a complete tuning workflow. In the sample, this metric includes labels such as SQL ID and truncated SQL text. That can be useful for triage, but it also raises label-cardinality and sensitive-text concerns. Review whether those labels are appropriate for your monitoring backend before enabling broad retention or alerting on them. You may not want all your SQL statement being stored in your monitoring backend!
Why AI applications make database metrics part of user experience
AI applications often make database behavior more visible to users, even when the application code hides the details. A RAG endpoint might perform vector search, filter JSON metadata, fetch relational records, update conversation memory, write audit rows, and call a model in a single request path. If one of those database steps slows down, the user experiences it as a slow answer.
Application metrics can show that the endpoint got slower. They may also show that the connection pool waited longer or that model latency was stable. But they usually cannot answer database-side questions by themselves:
Did sessions rise during ingestion? Did execute or parse activity change after a new retrieval strategy? Did waits move toward I/O, concurrency, commit, network, or application categories? Did a new SQL shape become a hotspot? Did the exporter itself start reporting scrape errors?
Oracle Database Metrics Exporter helps answer those questions in the same observability workspace used for service metrics. That does not make the exporter an AI-specific tool. It makes it useful for AI systems because AI request paths often combine application code, database retrieval, background ingestion, and model calls in ways that are difficult to debug from one layer alone.
Security basics before copy/paste becomes habit
The exporter needs database connectivity and credentials. For demos, environment variables are acceptable placeholders. For production-like setups, use configuration files, password files, vault integrations, wallet-based authentication, or platform secret management as appropriate.
There are several credential and authentication options, including username/password, environment-variable substitution, password files, OCI Vault, Azure Vault, HashiCorp Vault, Oracle Wallet with TLS or mTLS, Secure External Password Store, and external authentication. For mTLS connections, including Autonomous Database with mTLS, Oracle Wallet is required.
Privileges deserve the same care. For default metrics, the documentation lists SELECT_CATALOG_ROLE as one option and also lists object-level grants on specific catalog and dynamic performance views. Do not treat SELECT_CATALOG_ROLE as automatically least privilege. In production-like environments, create a dedicated monitoring user and grant the narrowest permissions that support the metrics you actually collect. For custom metrics, grant access only to the objects queried by those metrics.
Tune scrape interval and query timeout for your environment. Metric collection is database work. Its cost depends on the default metrics you enable, custom SQL, scrape frequency, query timeout, number of databases, network latency, and current database load. You should be aware that running the exporter is not “zero overhead” or “low overhead” – it depends on the metrics you define and how often you collect them. You should perform measurement in your own environment.
Network placement also matters. The exporter must reach the database listener. Prometheus or the OpenTelemetry Collector must reach the exporter endpoint. Grafana must reach the metrics backend, not necessarily the exporter. In production, keep database listeners and exporter endpoints inside the intended network boundary and expose only what your monitoring architecture requires.
How it fits with tools you already have
Oracle Database Metrics Exporter complements application telemetry; it does not replace it. Spring Boot Actuator and Micrometer show how the application experiences the database: connection pool wait time, request latency, error rates, JVM behavior, and application-specific counters. The exporter shows database-side signals. The useful workflow is to compare both in the same time window.
It also does not replace Oracle Enterprise Manager or deeper Oracle diagnostic workflows. Enterprise Manager remains a broader Oracle management and monitoring platform. Exporter metrics are a different integration primitive: a bridge into Prometheus-compatible developer and SRE metrics workflows.
For detailed SQL tuning, plan changes, workload analysis, and deeper database diagnosis, use the appropriate Oracle diagnostic tooling and DBA workflows, subject to your edition and licensing. Metrics can tell you where to look. They do not automatically explain every query plan, locking issue, schema design problem, or workload pattern.
The exporter path is most relevant when Prometheus-compatible metrics are the standard integration layer, or when platform teams want Oracle Database behavior available in the same metrics pipeline as the rest of their services.
Custom SQL collectors can also be useful, especially for application-specific signals. The tradeoff is that you own the query design, metric naming, labels, metric types, privileges, scrape endpoint, and operational behavior. A maintained database exporter reduces the need to build every one of those pieces from scratch.
The practical mental model
Think of Oracle Database Metrics Exporter as a narrow but valuable bridge:
Oracle Database behavior
→ exporter metric queries
→ scrapeable metrics endpoint
→ Prometheus-compatible storage
→ Grafana panels, alerts, and shared incident timelines
That bridge is useful because database behavior often explains application symptoms. It is especially useful for AI applications, where one slow response may involve retrieval queries, JSON filters, vector search, ingestion load, connection pools, and model calls.
Start small. Prove oracledb_up. Check exporter scrape health. Add one sessions panel. Add one activity or wait panel. Compare those panels against application latency during a real workload. Once the database appears in the same timeline as the application, the conversation changes from “the database was slow” to “here is what changed, when it changed, and where we should investigate next.”
The first article introduced the graph idea: vertices are things, edges are relationships, and properties describe both.
Now we will build one.
The easiest way to learn SQL property graphs in Oracle AI Database 26ai is to start with ordinary tables. That keeps the graph concrete. You can see the rows first, then see how the graph definition gives those rows a connected shape.
The point of this article is simple: you can create a queryable fraud graph on top of existing Oracle tables in one SQL statement, without duplicating the underlying data. By the end, you will have run the seed SQL, created BANK_GRAPH, and inspected the metadata Oracle stores for the graph.
In this article, we will create a small bank dataset and define a graph named BANK_GRAPH.
Start With Tables, Not Theory
Our fraud example has three tables:
customers, which stores customer names and risk tiers;
bank_accounts, which stores account records;
bank_transfers, which stores money movement from one account to another.
The transfer table is the key. Every transfer has a source account and a destination account. That is already an edge hiding in relational form.
This is why the example works well for a first graph. We do not have to invent a relationship. The relationship already exists in the schema; the graph definition just makes it explicit and queryable as a path.
The demo seed data includes a few useful patterns:
a 3-hop cycle: 101 -> 102 -> 103 -> 101;
a fan-in pattern where several accounts send money to account 105;
a two-hop chain where account 105 sends to 108, and 108 sends to 106;
customer risk tiers that we can join back to graph results later.
First, create the three tables, load the small dataset, and print row counts. Make sure you flip the switch to enable read/write so you can create new objects! If you do not, you will get an “insufficient privileges” error.
If you prefer to work in your own Autonomous Database Serverless instance, copy the same script into SQL Worksheet and run it there. SQLcl and SQL*Plus users can run the file directly from the command line.
When it finishes, you should see row counts for customers, accounts, and transfers.
This first tutorial deliberately avoids Object Storage and bulk-loading APIs. Those matter for production data, but they add noise to a beginner lesson whose real goal is graph modeling.
Define The Graph Vocabulary
The graph definition answers four questions:
Which table contains vertices?
Which table contains edges?
Which columns identify each vertex and edge?
Which properties should graph queries expose?
For this demo, bank_accounts becomes the vertex table and bank_transfers becomes the edge table.
The VERTEX TABLES clause says that rows in bank_accounts are account vertices. The KEY clause says account_id uniquely identifies each vertex.
The EDGE TABLES clause says that rows in bank_transfers are transfer edges. The SOURCE KEY and DESTINATION KEY clauses make the edge direction explicit.
That direction matters. A transfer from account 101 to account 102 is not the same as a transfer from 102 to 101.
The PROPERTIES clauses are equally important. They decide which table columns are visible to graph queries. If you leave a column out, it remains available to ordinary SQL, but it will not be projected as a graph property in the graph pattern. For a teaching demo, keep the exposed properties small and obvious.
Next, create BANK_GRAPH and run the metadata checks shown below.
If you are using ADB-S instead, copy the code into the same schema where you ran the seed script.
Inspect The Graph
After the graph is created, inspect the database metadata.
The first check is simple:
SELECT graph_name
FROM user_property_graphs
WHERE graph_name ='BANK_GRAPH';
That confirms that Oracle Database knows about the graph object.
You should see the account vertex element and the transfer edge element.
Then inspect labels and properties. This is useful when you are teaching or debugging because it shows the vocabulary that graph queries can use.
SELECT graph_name, label_name, property_name
FROM user_pg_label_properties
WHERE graph_name ='BANK_GRAPH'
ORDERBY label_name, property_name;
The output should include properties such as ACCOUNT_ID, ACCOUNT_NAME, BALANCE, AMOUNT, CHANNEL, and TRANSFER_TS.
This metadata check is more than a sanity test. If a later graph query cannot see amount or account_name, the metadata views tell you whether the graph definition exposed those properties in the first place.
What The Graph Does Not Do
Creating BANK_GRAPH does not change the purpose of the base tables. Applications can still use customers, bank_accounts, and bank_transfers as ordinary relational tables.
The graph also does not decide what is fraudulent. It gives you a way to ask connected-data questions. In later articles, those questions become signals: cycles, hub accounts, accounts in the middle of transfer chains, PageRank scores, and connected groups. A real fraud workflow would combine those signals with business rules, investigation history, KYC data, device information, and human review.
The graph helps you see connection patterns. It does not magically replace risk modeling.
A Quick Check Before Moving On
Before you continue, make sure three things are true.
First, the base tables have rows. Second, USER_PROPERTY_GRAPHS shows BANK_GRAPH. Third, USER_PG_LABEL_PROPERTIES shows the properties you expect to query later, especially ACCOUNT_ID, ACCOUNT_NAME, AMOUNT, CHANNEL, and TRANSFER_TS.
Those checks are small, but they save time. Most beginner graph-query mistakes come from one of two places: the graph object was not created in the schema you are querying, or the property you want was not exposed in the graph definition.
If those checks pass, we can focus on patterns instead of setup. That is exactly where graph starts to feel different from ordinary table queries in practice.
Why This Matters
The graph is now ready, but the original data has not stopped being relational data.
That is the point. You can still query customers, bank_accounts, and bank_transfers with ordinary SQL. The graph adds a new query surface for connected patterns.
In the next article, we will use GRAPH_TABLE to ask questions like:
Which accounts receive money from many sources?
Which accounts sit in the middle of transfer chains?
Which accounts are part of transfer cycles?
Which suspicious graph results belong to high-risk customers?
Those questions are where the graph model starts to pay off.
Next step: move to Article 3, where the graph stops being metadata and starts answering fraud questions.
Try The Demo
This article had two runnable moments that you already saw above!
First, load the small bank dataset. That gives you ordinary relational tables with customers, accounts, and transfers. Second, create BANK_GRAPH over those tables and inspect the graph metadata.
Use the embedded FreeSQL runners above if you want the fastest path. Step 1 loads the data; Step 2 creates and checks the graph. If you prefer to use your own ADB-S instance, copy the same SQL into SQL Worksheet and run the scripts in the same order.
When both steps finish, you should have a graph named BANK_GRAPH in your schema. Keep it there for the next article. We will use that same graph to query hubs, two-hop chains, cycles, and customer-risk joins.
The first version of this application could already answer support questions against real policy data. Oracle Vector Store was retrieving the right documents before generation, and the responses were grounded in what was actually in the database rather than in whatever the model happened to know from training.
But every request started from zero.
Send “My name is Maya” and then immediately ask “What’s my name?” and the assistant had no idea. Not because retrieval failed. Because the application itself was stateless. Conversation history was never stored anywhere. Each request arrived with only the current message and the policy documents that matched the similarity search. The moment the response was sent, everything about the exchange was gone.
That is the gap between a search engine with a chat interface and something that actually behaves like a conversation. Episode 2 closes it. Watch it here,
RAG and memory are different problems
It is worth being precise about this before looking at the code, because the two ideas are easy to conflate.
RAG — retrieval-augmented generation — gives the model access to information it would not otherwise have. When a customer asks about the refund policy, the application runs a similarity search against Oracle Vector Store, finds the relevant policy document, and includes it in the prompt. The model can then answer from real data rather than from training knowledge.
Conversational memory is a different thing entirely. It is not about what the model knows. It is about what the application remembers. Without memory, each request is independent. With memory, the application accumulates context across exchanges — facts the user has shared, questions already asked, decisions already made — and makes that context available to the model on every subsequent request.
Both matter. RAG makes individual answers accurate. Memory makes the conversation coherent. Episode 1 solved the first problem. Episode 2 solves the second.
What changes
The overall architecture stays the same. Spring Boot, Spring AI, Oracle Vector Store, and OpenAI are all unchanged from Episode 1. The only addition is persistent chat memory backed by a JDBC repository in Oracle.
MessageWindowChatMemory keeps a sliding window of the most recent messages per conversation. The window size is maxMessages, set to 100 in application.yml. As the conversation grows, older messages fall off the window so the prompt does not grow without bound.
The @ConditionalOnMissingBean annotation is worth noting. Tests substitute an in-memory ChatMemory implementation in TestSupportConfiguration, so the test suite can load the full application context without requiring a live database. The @ConditionalOnMissingBean is what makes that substitution work — if a test has already registered a ChatMemory bean, this bean definition is skipped.
Adding memory to the chat client
Episode 1 already had a QuestionAnswerAdvisor wired into the ChatClient for RAG retrieval. Episode 2 adds MessageChatMemoryAdvisor in front of it:
The order matters, and it is worth understanding why.
Advisors in Spring AI form a chain that wraps around the model call. Each advisor can modify the request before it reaches the model and can inspect or modify the response on the way back out. They execute in registration order on the way in, and in reverse order on the way out.
Registering MessageChatMemoryAdvisor first means it runs before QuestionAnswerAdvisor. On the way in, the memory advisor loads the stored conversation history for the current conversation ID and adds those messages to the request. By the time QuestionAnswerAdvisor runs its similarity search, the full conversation context is already present. The retrieved policy documents are then appended on top of that. By the time the model sees the request, it has both the conversation history and the relevant policy knowledge.
If the advisors were in the wrong order — retrieval first, memory second — the memory advisor would add conversation history after retrieval had already run. That would still work for basic memory, but the similarity search would only have the current message to work with, not the full conversation context. For most queries that does not matter, but it can affect retrieval quality for follow-up questions where the topic is implicit from prior context rather than stated explicitly in the current message.
Scoping memory by conversation
Without scoping, all users would share the same memory, which would be disastrous. The chat endpoint now requires an X-Conversation-Id header:
The caller is responsible for supplying and maintaining the conversation ID. The application does not generate or track session identifiers — that is intentional. For this demo, any string works. In production you would tie the conversation ID to an authenticated user session, but that is out of scope here.
The conversation ID is passed into the advisor chain at call time:
Spring AI reads that parameter, loads the matching rows from SPRING_AI_CHAT_MEMORY, prepends them to the prompt, and writes the new exchange back to the table when the call completes. The scoping is entirely key-based. Two requests with different conversation IDs read and write completely separate sets of rows. There is no shared state between them.
Trying it
Start the application and send two messages with the same conversation ID:
curl-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"My name is Maya."}'
curl-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-1"
-d'{"message":"What'''s my name?"}'
The second request should come back with “Maya”. Now try the same question with a different conversation ID:
curl-X POST http://localhost:8080/api/v1/agent/chat
-H"Content-Type: application/json"
-H"X-Conversation-Id: demo-2"
-d'{"message":"What'''s my name?"}'
The assistant should not know. The rows for demo-1 are invisible to a request arriving with demo-2. Memory does not leak between conversations.
Inspecting what Oracle is storing
There is a debug endpoint that reads directly from ChatMemory:
{"role":"assistant","text":"Nice to meet you, Maya! How can I help you today?"}
]
You can also go directly to the table:
SELECT*FROM SPRING_AI_CHAT_MEMORY;
These are ordinary rows. Conversation ID, role, content, timestamp. There is nothing opaque about the storage format. This also means the memory is durable — restart the application and the conversation history is still there, because it is in Oracle and not in process memory.
The model did not become smarter between episodes. The application stopped throwing away state.
Oracle is now storing three kinds of data
After Episode 1, Oracle was already handling two things: relational order data in CUSTOMER_ORDER, and vectorized policy knowledge in the Spring AI Oracle Vector Store table.
Episode 2 adds a third: conversational state in SPRING_AI_CHAT_MEMORY.
All three live in the same database. There is no separate vector store running alongside, no in-memory session cache to manage, no additional infrastructure to operate. The application connects to one Oracle instance and that instance holds everything the assistant needs — business data, knowledge, and conversation history.
That consolidation is not just operationally convenient. It also opens up possibilities that would be harder to achieve across separate systems. If you later wanted to run a query that joined conversation history with relational order data — for example, to find conversations where a customer mentioned an order that was then returned — all the data is in the same place. You do not have to reconcile records across multiple stores.
What is next
The assistant can now retrieve knowledge and remember conversations. But it still cannot act on any of the backend data. If a customer asks about the status of order ORD-1002, the assistant can only explain what the return policy says. It cannot look up the actual order.
Episode 3 changes that. Spring AI tool calling lets the assistant call narrow backend methods — order lookup, return initiation, support ticket creation — with the backend owning all the validation logic. The model handles the conversation. The backend handles the rules.
If you have ever asked, “How is this thing connected to that thing?”, you have already asked a graph question.
A customer sends money to an account. That account sends money to two other accounts. One of those accounts sends money back to the original customer. In ordinary SQL, you can query each transfer. But when the real question is about chains, loops, hubs, and hidden intermediaries, the shape of the question becomes graph-shaped.
Oracle AI Database 26ai makes that graph shape part of the database instead of a separate side project. You can model a property graph over existing relational tables, query it with SQL graph syntax, join the results back to relational data, and then move into Graph Studio or PGX when you need visualization or heavier analytics.
Did you know that graphs were added to the SQL standard? Read about it here.
This series starts from zero. You do not need graph theory. You do not need a separate graph database. You need a basic comfort with SQL and a practical problem where relationships matter.
There are two ways you can try the runnable examples in this series. The first path is to use FreeSQL and this will work just fine for most of the examples: creating the tables, creating the graph, and querying patterns with GRAPH_TABLE. The second path is your own Autonomous Database Serverless instance on OCI for the full analytics chapter, because the DBMS_OGA algorithm examples require packages that are not available in the FreeSQL environment.
By the end of this article, you should be able to explain five ideas well enough to understand the graph DDL in the next article: vertex, edge, label, property, and directed relationship. You should also understand why Oracle’s 26ai graph approach is accessible from ordinary SQL instead of starting in a separate graph-only tool.
That is the only conceptual load for now. We will save weighted paths, algorithms, PGX, and natural-language querying for later articles, when those ideas have a working graph underneath them.
The Graph Mental Model
A graph has two main parts: vertices and edges.
A vertex is a thing. In a banking example, a vertex might be an account. In a social network, it might be a person. In a supply chain, it might be a warehouse, supplier, shipment, or part.
An edge is a relationship between things. In the banking example, a transfer from account 101 to account 102 is an edge. The edge has direction because money moved from one account to another.
Both vertices and edges can have properties. An account vertex can have an account id, customer id, account name, and balance. A transfer edge can have an amount, timestamp, channel, and note.
Labels give those things names in the graph model. In this series, account vertices use the label account, and transfer edges use the label transfer. Labels make graph patterns readable because you can ask for accounts connected by transfers instead of thinking only in table and column names.
That gives us a simple model:
account is a vertex;
transfer is an edge;
transfer direction goes from source account to destination account;
transfer amount and timestamp are edge properties;
customer id and balance are vertex properties.
The point is not to replace tables. The point is to describe relationships over data you already have.
That distinction helps avoid a common beginner mistake. You are not deciding whether the account data is “relational” or “graph.” It is relational data that also has a graph interpretation. Oracle lets you keep both views of the same facts.
Why Graph Questions Feel Different
Relational databases are excellent at storing and querying structured facts. A transfer table can tell you that account 101 sent 500 dollars to account 102. A customer table can tell you that account 102 belongs to a medium-risk customer.
Graph questions start when the relationship pattern matters.
For example:
Which accounts receive money from many different sources?
Which accounts sit in the middle of two-hop transfer chains?
Which accounts participate in round-trip transfer cycles?
Which high-risk customers are connected to suspicious paths?
You can answer some of these with ordinary joins, especially when the path length is fixed and short. But the queries become harder to read as soon as you want to ask about paths, cycles, or repeated relationship patterns.
That is where GRAPH_TABLE becomes useful. It lets you describe a graph pattern and return the matches as rows. Once the graph match is back in row form, you can use normal SQL again: filter it, aggregate it, join it, and sort it.
What Changed In 26ai
In Oracle AI Database 26ai, SQL property graphs are native database objects. You create them with CREATE PROPERTY GRAPH, and you query them with SQL graph syntax such as GRAPH_TABLE.
The important beginner idea is this: a SQL property graph is metadata over database objects. You do not have to copy all of your rows into a separate graph store just to start asking graph questions. The graph definition says which tables provide vertices, which tables provide edges, how the keys connect, which labels to use, and which properties to expose.
For the bank example in this series, the relational data is still stored in ordinary tables:
customers
bank_accounts
bank_transfers
The graph object simply gives those tables graph meaning:
bank_accounts becomes account vertices;
bank_transfers becomes transfer edges;
src_account_id and dst_account_id define edge direction.
That is why graph in 26ai is a good fit for developers and DBAs who already work with Oracle Database. You can start with SQL, keep the data where it is, and add graph-shaped queries where they help.
Why Not RDF?
Oracle supports RDF graphs too, but this series is about property graphs. RDF is the better fit when the work centers on formal semantics, ontologies, inferencing, and standards-based knowledge representation. Property graphs are the better starting point here because the question is operational and concrete: how are these accounts connected, and what suspicious patterns do those connections form?
The Bank Fraud Story We Will Use
The rest of this series uses a small bank-fraud style example. It is intentionally tiny so you can understand every row.
The demo has customers, accounts, and transfers. Some transfers form a simple cycle. Other transfers create a fan-in pattern where several accounts send money to the same destination. Another chain places a high-risk customer in the middle.
That gives us enough data to teach useful graph ideas without hiding the lesson inside a huge dataset.
Here is the shape of the data we will use throughout this series:
customers bank_accounts bank_transfers
---------- ------------- --------------
customer_id -> customer_id transfer_id
risk_tier account_id -> src_account_id
account_name -> dst_account_id
balance amount
transfer_ts
channel
The SQL property graph gives those tables this connected shape:
(account)-[transfer]->(account)
That one pattern, account connected to account by transfer, is enough to teach all four articles’ worth of graph concepts: paths, cycles, hubs, chains, ranking, connected groups, and hybrid SQL joins.
Where This Series Is Going
The next article builds the graph. We will create the tables, load seed data, define BANK_GRAPH, and inspect the graph metadata.
After that, we will query fraud patterns with GRAPH_TABLE. Then, in ADB-S, we will add in-database algorithms with DBMS_OGA and use Graph Studio to visualize the same graph.
The goal is not to memorize every graph feature. The goal is to build a practical mental model:
start with relational data;
define a SQL property graph;
query graph patterns with SQL;
join graph results to ordinary data;
move into Graph Studio or PGX when you need visualization or deeper analytics.
That is the zero-to-hero path.
Try The Demo
The fastest way to make the ideas concrete is to run the tiny bank demo yourself. The first runnable step creates three ordinary relational tables:
customers
bank_accounts
bank_transfers
Those tables hold the facts. The next step creates BANK_GRAPH, a SQL property graph over the account and transfer rows. Nothing is copied into a separate graph store; the graph definition gives the existing rows a connected shape.
Note: this will ask you to login with your oracle.com account since it writes to the database, not just reads. It’s totally free. I recommend you read through the SQL first to understand what it does, then click on the “Run Script” button to execute it, and then scroll through the output to see what happened. Feel free to play around with it and change it however you like!
In the next article, you’ll load the seed data and create the graph. If you prefer to use your own Autonomous Database Serverless instance, copy the same SQL into your SQL Worksheet and run it there. Either way, keep the data small at first. The goal is to see the graph model clearly before adding larger datasets, algorithms, or visualization.
Once BANK_GRAPH exists, the later examples will use the same graph to find inbound hubs, transfer chains, cycles, PageRank scores, connected groups, and weighted paths.
1,360,390 people have been kind enough to visit our humble blog. Others get our posts by RSS or email or through syndicators. We hope you took away something of value. Please come again!
Copyright 2009-2026 Mark Nelson and other contributors. All Rights Reserved. The views expressed in this blog are our own and do not necessarily reflect the views of Oracle Corporation. All content is provided on an ‘as is’ basis, without warranties or conditions of any kind, either express or implied, including, without limitation, any warranties or conditions of title, non-infringement, merchantability, or fitness for a particular purpose. You are solely responsible for determining the appropriateness of using or redistributing and assume any risks.
You must be logged in to post a comment.