Building a Spring Boot Starter for Oracle Spatial

Hi everyone!

Over the last few days I have been working on a new Spring Boot starter for Oracle Spatial. I did this work with two other real developers and three AI coding assistants, and I thought it would be interesting to write up the story of how it came together.

This was a good example of how building a starter is not just about getting code to compile or getting a sample app to run. It is also about API shape, developer expectations, reviewer feedback, naming, documentation, tests, and all of those little choices that decide whether something feels natural or awkward once another developer actually tries to use it.

We ended up in a much better place than where we started, but not because the first design was perfect. We got there because the reviews were good, the feedback was honest, and we were willing to change direction once it was clear the API could be better.

So this post is a bit of a behind-the-scenes look at that process.

What we were trying to build

The goal sounded simple enough on paper:

  • create a Spring Boot starter for Oracle Spatial
  • make it easy to work with SDO_GEOMETRY
  • keep the programming model GeoJSON-first
  • provide a sample application that shows realistic use

The kinds of queries we had in mind were the kinds of queries almost everyone reaches for first:

  • store a point or polygon
  • fetch it back as GeoJSON
  • find landmarks near a point
  • find landmarks within or interacting with a polygon

So from the beginning this was going to involve Oracle Spatial operators such as:

  • SDO_UTIL.FROM_GEOJSON
  • SDO_UTIL.TO_GEOJSON
  • SDO_FILTER
  • SDO_RELATE
  • SDO_WITHIN_DISTANCE
  • SDO_NN
  • SDO_GEOM.SDO_DISTANCE

We also wanted a sample app that felt approachable. We used a small San Francisco landmark dataset so the sample would be easy to understand and a little bit fun to play with.

Where we started

The first version of the starter leaned in the direction that I think a lot of us would naturally start with: helper utilities.

We had one piece focused on converting GeoJSON to and from Oracle Spatial, and another piece focused on generating the bits of SQL you need for common spatial predicates and projections.

On one level, that worked.

It absolutely solved real problems:

  • you did not have to remember the exact SDO_UTIL.FROM_GEOJSON(...) call
  • you did not have to hand-type the common predicate shapes every time
  • you could centralize the default SRID and distance unit

If you already like building SQL with JdbcClient, it was useful.
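
For context, here is roughly the kind of raw JdbcClient call those helpers were saving you from hand-writing. This is a hedged sketch (imports and the surrounding class omitted); the SDO_UTIL call and the SRID match the landmarks schema used later in this post, and the rest is illustrative:

 // the "no starter" baseline: all of the Oracle-specific SQL is typed by hand
 jdbcClient.sql("""
         insert into landmarks (id, name, category, geometry)
         values (:id, :name, :category, sdo_util.from_geojson(:geojson, null, 4326))
         """)
         .param("id", 1)
         .param("name", "Ferry Building")
         .param("category", "MARKET")
         .param("geojson", "{\"type\":\"Point\",\"coordinates\":[-122.3933,37.7955]}")
         .update();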

But there was a catch. The public API was still basically returning strings. And once that got called out in review, it was hard to ignore.

The review comment that changed everything

The most important reviewer comment was not about a syntax error or a missing test. It was about the shape of the API.

The feedback was basically: if this is a Spring Boot starter, why is the main experience a SQL string builder?

That was the right question.

It is one thing to have Spring-managed beans. It is another thing entirely to have a Spring-native programming model.

At that point the starter was Spring-managed, but not really Spring-native. Yes, you could inject the beans, but what you got back were still string fragments that you had to splice together yourself.

That triggered the main redesign.

The moment that made the redesign easier

One thing that helped a lot is that the original API had not been released yet.

That is a huge advantage.

It meant we did not need to protect old method names or preserve an API shape just because it already existed. We were free to ask a better question: if we were designing this from scratch for Spring developers, what should it look like?

Once we reframed it that way, the answer became much clearer.

What we changed

The big change was moving away from public string-builder beans and toward one main Spring JDBC integration bean:

  • OracleSpatialJdbcOperations

Instead of treating Oracle Spatial as a collection of string helper methods, we moved to a design where application code injects one bean and then creates typed spatial query parts:

  • SpatialGeometry
  • SpatialExpression
  • SpatialPredicate
  • SpatialRelationMask

This ended up being a much better fit for the way Spring JDBC code is usually written.

You still write SQL. We did not try to invent a whole DSL. But the spatial pieces now carry more meaning and stay connected to the bind process instead of floating around as anonymous fragments.

That was the key design improvement.

Why this felt better almost immediately

One of the things I liked about the redesign is that the sample application got better almost immediately once the API got better.

That is usually a good sign.

When an API is awkward, the sample tends to look awkward too. You can hide that for a little while, but not for long.

With the redesigned API, the service code started to read much more naturally. In the sample’s findNear flow, for example, we now create:

  • a SpatialGeometry
  • a distance expression
  • a within-distance predicate

and then build the SQL around those named pieces, as sketched below.
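
To make that concrete, here is a hedged sketch of what that flow can look like. The selection(...), clause(), and bind(...) calls mirror the pattern shown later in this post; the factory methods used to create the geometry, the distance expression, and the predicate are illustrative names rather than the starter's confirmed API, and imports are omitted:

 @Service
 public class LandmarkService {

     private final JdbcClient jdbcClient;
     private final OracleSpatialJdbcOperations spatial;

     public LandmarkService(JdbcClient jdbcClient, OracleSpatialJdbcOperations spatial) {
         this.jdbcClient = jdbcClient;
         this.spatial = spatial;
     }

     public List<Map<String, Object>> findNear(String geoJsonPoint, double meters, int limit) {
         // typed, per-query spatial parts (these factory method names are hypothetical)
         SpatialGeometry point = spatial.fromGeoJson(geoJsonPoint);
         SpatialExpression distance = spatial.distanceTo(point);
         SpatialPredicate within = spatial.withinDistance(point, meters);

         String sql = "select id, name, " + distance.selection("distance")
                 + " from landmarks where " + within.clause()
                 + " order by distance fetch first :limit rows only";

         // assumes bind(...) returns the bound StatementSpec so the query can be chained
         return spatial.bind(jdbcClient.sql(sql).param("limit", limit), distance, within)
                 .query()
                 .listOfRows();
     }
 }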

That may sound like a small thing, but it makes a big difference when someone is reading the sample for the first time and trying to understand the intended usage pattern.

Instead of “here is some mysterious Oracle SQL,” the code reads more like “here is the geometry we are searching around, here is the distance we want to project, and here is the predicate we want to apply.”

That is a much better teaching story.

The sample app became part of the design process

I think sometimes people talk about sample applications as though they are just an afterthought. In practice, for a starter like this, the sample is part of the design process.

It answers questions like:

  • does the API feel natural in a real service?
  • are the method names understandable?
  • can we explain this to somebody without too much ceremony?
  • does the code read like Spring, or does it read like a thin wrapper over raw SQL?

Our sample app is a simple REST service built around landmarks in San Francisco.

It has endpoints for:

  • creating a landmark
  • fetching a landmark by id
  • finding nearby landmarks
  • finding landmarks inside or interacting with a polygon

We also spent some time improving the seed data because I wanted the sample to be a little more recognizable and useful. So the sample now includes landmarks like:

  • Ferry Building
  • Union Square
  • Golden Gate Park
  • Oracle Park
  • Salesforce Tower
  • Transamerica Pyramid
  • Coit Tower

That may not be the most “architectural” part of the work, but it helps make the sample feel real instead of abstract.

One small enum that mattered more than I expected

Another design decision that turned out to matter a lot was replacing raw relation-mask strings with an enum:

  • SpatialRelationMask

This came directly out of review feedback.

The problem with raw strings is obvious once someone points it out: a developer can type something like "INTERSECTS" and everything looks fine until runtime, when Oracle tells them that is not a valid mask.

That is exactly the kind of thing a good starter should help with.

By introducing an enum, we made that part of the API:

  • safer
  • easier to discover
  • harder to misuse

It is a small API element, but it made the whole thing feel more deliberate.
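
As a small illustration of the difference, here is a hedged sketch; the constant name and the valueOf-style mapping are assumptions based on the mask values used later in this post, not a copy of the starter's exact API:

 // before: a raw string that only fails at runtime, inside Oracle
 String mask = "INTERSECTS";   // looks plausible, but is not a valid SDO_RELATE mask

 // after: the enum fails fast and is discoverable in the IDE
 // (requestMask is the raw value from the HTTP request; constant name assumed)
 SpatialRelationMask anyInteract = SpatialRelationMask.ANYINTERACT;
 SpatialRelationMask fromRequest = SpatialRelationMask.valueOf(requestMask); // throws immediately on typos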

Distance became first-class because it had to

One of the early reviews also pointed out that if we wanted to serve real spatial use cases, we needed first-class support for distance calculations.

That was absolutely right.

A lot of real applications want some version of:

  • find nearby things
  • return the distance
  • order by distance

If the starter handled filtering but not distance calculation, it would always feel like it stopped just short of the most useful scenario.

So SDO_GEOM.SDO_DISTANCE became part of the main API design rather than something developers would have to improvise themselves.

I think that made the starter much more credible for real use.
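
For readers who have not written this by hand before, here is a hedged illustration of the plain Oracle SQL that pattern corresponds to, written as a Java text block the way it would appear in a JdbcClient call. The tolerance and unit values are typical choices, not the starter's defaults:

 // SDO_WITHIN_DISTANCE filters the rows, SDO_GEOM.SDO_DISTANCE projects an orderable distance
 String proximitySql = """
     select l.id, l.name,
            sdo_geom.sdo_distance(l.geometry,
                sdo_util.from_geojson(:geojson, null, 4326), 0.005, 'unit=METER') as distance
       from landmarks l
      where sdo_within_distance(l.geometry,
                sdo_util.from_geojson(:geojson, null, 4326),
                'distance=2000 unit=METER') = 'TRUE'
      order by distance
     """;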

Documentation ended up being more important than I expected

I do not mean that in a generic “docs matter” way. I mean that for this starter in particular, the documentation had to explain a mental model, not just list methods.

The most useful documentation change we made was adding a clear distinction between:

  • what gets injected as a Spring bean
  • what gets created per query

That turned out to be the right way to explain the design.

You inject:

  • OracleSpatialJdbcOperations

And then per query you create:

  • SpatialGeometry
  • SpatialExpression
  • SpatialPredicate

Once we started explaining it that way, the docs got much easier to follow.

We also added concrete query pattern examples for:

  • inserts
  • GeoJSON projection
  • filter + relate
  • distance-ordered proximity queries

Those examples matter because they show how the design is actually supposed to be used. For a starter, that is every bit as important as the Javadoc.

Some of the polish came from the less glamorous review comments

Not all of the useful review feedback was high-level architecture. Some of it was the kind of practical feedback that makes software much more solid.

A few examples:

The sample needed better error handling

At one point the sample app would throw an exception and return a 500 if a caller passed an invalid spatial relation mask.

That is a perfectly believable bug in a sample. It is also exactly the kind of thing a reviewer should call out, because the sample is part of the product story.

We fixed it so that invalid masks now produce a proper 400 Bad Request with a useful message.
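
As an aside, one common Spring way to get that behavior is a small controller advice. This is a hedged sketch rather than the sample's actual implementation; IllegalArgumentException is what Enum.valueOf throws when a caller sends an unknown mask name:

 import java.util.Map;
 import org.springframework.http.ResponseEntity;
 import org.springframework.web.bind.annotation.ExceptionHandler;
 import org.springframework.web.bind.annotation.RestControllerAdvice;

 @RestControllerAdvice
 class SpatialRequestErrorAdvice {

     // map "unknown mask" style errors to a 400 with a readable message
     @ExceptionHandler(IllegalArgumentException.class)
     ResponseEntity<Map<String, String>> handleInvalidMask(IllegalArgumentException ex) {
         return ResponseEntity.badRequest().body(Map.of("error", ex.getMessage()));
     }
 }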

The tests needed to be more Spring-native too

We got feedback to use @ServiceConnection in the Testcontainers-based tests, and that was good advice. It made the tests more consistent with current Spring Boot style and reduced some manual wiring.

We also adjusted the integration test to run as the app user rather than system, which is a much better representation of how the code should actually work.
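
For anyone who has not used it yet, a @ServiceConnection-based test looks roughly like the sketch below. The container image and the org.testcontainers oracle-free module are assumptions on my part, not a copy of the starter's actual test classes:

 import org.junit.jupiter.api.Test;
 import org.springframework.boot.test.context.SpringBootTest;
 import org.springframework.boot.testcontainers.service.connection.ServiceConnection;
 import org.testcontainers.junit.jupiter.Container;
 import org.testcontainers.junit.jupiter.Testcontainers;
 import org.testcontainers.oracle.OracleContainer;

 @SpringBootTest
 @Testcontainers
 class LandmarkIntegrationTest {

     @Container
     @ServiceConnection // Spring Boot derives the datasource URL, username, and password from the container
     static OracleContainer oracle = new OracleContainer("gvenzl/oracle-free:23-slim");

     @Test
     void findsNearbyLandmarks() {
         // exercise the /landmarks/near flow against the containerized database
     }
 }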

The SQL setup needed to be repeatable in CI

This is one of those things that only becomes obvious once CI starts yelling at you.

We hit issues with setup SQL being run more than once and colliding with existing objects or metadata. That led to a round of cleanup to make the test setup more idempotent and more robust.

That is not the glamorous part of building a starter, but it is absolutely part of shipping one.

What we deliberately did not do

There were also some things we chose not to do in this round of work.

I think that is worth talking about because saying “not yet” is often part of good design.

We did not try to eliminate SQL entirely

Even after the redesign, application code still assembles SQL statements with JdbcClient.

That was intentional.

We wanted to make the API more Spring-native, but not disappear into a custom DSL or pretend SQL no longer exists. This is still Spring JDBC. SQL is still the right abstraction level. The important improvement was to stop making the public API itself a string-builder API.

There is still room to improve the ergonomics in the future, especially around making it harder to forget a bind contributor. Reviewers called that out too, and I think they are right. But that felt like a v2 refinement, not something we needed to solve before this version was useful.

We did not add every possible spatial operation

Another good review point was that buffer generation would be useful too. I agree.

But once again, that felt like feature expansion rather than a cleanup item for this iteration.

There is always a temptation to keep adding one more thing once the codebase is open and fresh in your mind. In this case I think the right move was to get the core API shape right, get the sample and docs into good condition, and leave some room for future work.

What I think we ended up with

At the end of all of this, what we have is not just a set of helper methods. It is a small but coherent Spring Boot story for Oracle Spatial.

The final result includes:

  • a starter that auto-configures a Spring JDBC-oriented spatial bean
  • a typed API for spatial query parts
  • safer handling of SDO_RELATE masks
  • first-class distance support
  • a sample REST app that demonstrates realistic usage
  • docs that explain the mental model, not just the method list
  • integration tests that run against Oracle AI Database 26ai Free with Testcontainers

And maybe the biggest thing for me is that it now feels like something I would want another Spring developer to pick up and try.

That was not as true of the first version.

What we are already thinking about for v2

Even though I feel good about where this landed, we have also been pretty careful to write down the things that did not belong in this iteration.

There are at least two clear areas for a possible v2.

The first is improving the ergonomics around binding and query composition.

Right now the design is much better than the original string-builder approach, but there is still a pattern like this:

spatial.bind(
        jdbcClient.sql("select ... " + distance.selection("distance")
                + " from landmarks where " + within.clause()),
        distance, within)

That is a reasonable place to be for a JDBC-oriented starter, but it is not hard to imagine a future version that tightens that up further and makes it harder to forget a bind contributor.

The second is expanding the supported spatial operations.

One of the most obvious candidates there is buffer generation with SDO_GEOM.SDO_BUFFER. That came up during review, and I think it is a very good candidate for a future enhancement. There are also broader questions about whether repository-style integrations or richer mapping options might make sense once we have real experience with how people use the current API.

But I want to be careful here. Just because we can imagine a v2 does not mean we should rush into it. What I would really like at this point is to get feedback from actual users first.

I would much rather learn from people who try the starter on real spatial workloads than guess too aggressively about what the next abstraction should be. Maybe the next thing we need is buffer support. Maybe it is better row mapping. Maybe it is a tighter JdbcClient integration. Maybe it is something we have not even thought about yet.

So yes, we do have a v2 plan taking shape. But before we take that next step, I would love to hear from users and see how this first version holds up in practice.

Working with AI coding assistants

Like many people today, I am doing more and more work with AI coding assistants. For this piece of work, these are the participants and the roles they played:

  • me: collaborated with GPT to write the specification, set the standards, conventions, etc., guided the whole process, read and reviewed code myself, and helped make (or made) the design decisions
  • GPT-5.4: acted as the primary developer, wrote most of the code, reflected on its own work, helped write and refine the specification, processed review comments and planned, helped with the design
  • Claude Code: acted as a “senior architect” who reviewed the code with a focus on technical aspects and gave detailed feedback and recommendations
  • Gemini (3 thinking/pro): acted as the “product manager” who reviewed the project from the point of view of how well it addressed the need, whether it exposed the right features and capabilities, and how useful it would be for a spatial user
  • two human colleagues: acted as reviewers and developers who provided valuable feedback on the code and the design

As we worked through the project together, I saved a lot of the generated plans and reviews in the .ai directory because I think those are valuable artifacts.  We’re obviously using AI more, and we have no intention of hiding it, so why not save these for future use?  Seems like the right thing to do.  We did have a conversation about this before coming to the conclusion that we should save them and put them in the public repo.

I also want to mention that there were several cycles in the project.  We started discussing the requirements and writing a plan.  We reviewed this and iterated on it two or three times before we ever started writing any code.  By “we”, I mean me and the three AIs.  It was interesting to see the slightly different opinions of the AIs and the different areas they focused their attention on.  Just like human developers, the different opinions and insights were useful to help us get to a better result.

After we started development, we continued to cycle through reviews, feedback, updates.  At one point, as discussed above, we decided to redesign – this was a result of human feedback.  The AIs certainly did a good job, and they are getting better all the time, but I do still feel that they are not as good at doing novel things as they are at doing things that have been done before, and they perhaps don’t consider (or at least mention) the implications of architectural decisions as much as some human developers do, at least in my experience to date.  But I am sure they will continue to evolve rapidly.

I guess I’d just say – if you have not tried working in a team with multiple AI coding assistants yet, give it a go!

Why I wanted to write this up

I wanted to tell this story because I think it is a pretty normal and healthy example of how good API work actually happens.

You start with an idea.

You build something that is useful but not yet quite right.

Somebody asks a question that exposes the weakness in the design.

You resist it for a minute.

Then you realize they are right.

Then the real design work starts.

That is basically what happened here.

And I think the result is better precisely because we were willing to let the reviews change the direction of the code rather than just polish the original design.

Wrapping up

I am really happy with how this turned out.

Not because it is finished forever. It is not. There are still good future directions here. But the core design now feels solid, and the sample, tests, and docs all tell a coherent story.

That is what I wanted from this starter.

If you are working with Spring Boot and Oracle Database, and you have spatial use cases in mind, I think this gives you a pretty nice starting point. And if you are designing your own starter or library, maybe the bigger lesson here is that reviewer feedback is not just something to “address.” Sometimes it is the thing that helps you find the real design.

More on this soon.


Building a Simple Spatial Web App with the New Spring Boot Starter for the Spatial Features in Oracle AI Database 26ai

Hi everyone!

In this post, I will show how to run the new Spring Boot starter for the Spatial features in Oracle AI Database 26ai by using the sample application and connecting it to a simple web application front end.

The starter makes it easier to build Spring Boot applications that work with Oracle Spatial data and expose that data through familiar REST APIs. For this example, I am using Oracle AI Database 26ai Free, the Spatial sample application from the Spring Cloud Oracle repository, and a small React front end that displays landmark data on a map.

The app will do three things:

  • load landmarks inside an initial San Francisco polygon
  • find the nearest landmarks to a point you click on the map
  • search the landmarks inside the current visible map area

The Spring Boot starter for the Spatial features in Oracle AI Database 26ai is a v1 release, and I want to say that clearly up front. If you build spatial applications with Spring Boot and Oracle Database, I would really love feedback on what works well, what feels awkward, and what you would want to see next.

A few important things to note before we get into the code:

  • the work is merged into main
  • the new starter artifacts may not have reached a published Maven Central release yet
  • for this walkthrough, I am running the sample application from a source checkout of the spring-cloud-oracle repository

That last point matters because I want to give you steps you can run today, not steps that depend on a release that may not have shipped yet.

Before You Begin

For this walkthrough, you will need:

  • Java 21 or newer
  • Maven
  • Node.js and npm
  • Docker
  • a local checkout of the spring-cloud-oracle repository
  • a local checkout of this frontend companion repository

You will also need Oracle AI Database 26ai Free running locally.

If you already have Oracle AI Database 26ai Free running and you already have a user/schema you want to use for testing, you can adapt the setup below. I am going to show the exact steps that worked for me from a clean local environment.

Start Oracle AI Database 26ai Free

Before you run the container, sign in to Oracle Container Registry and accept the Oracle terms for the Oracle AI Database 26ai Free image. If you skip that step, the image pull may fail with an authentication or authorization error.

Then start Oracle AI Database 26ai Free in a container:

 docker run -d \
   --name oracle-free \
   -p 1521:1521 \
   -e ORACLE_PWD=Welcome1 \
   container-registry.oracle.com/database/free:latest

The first startup will take a few minutes. You can check the logs to see when startup is complete:

docker logs -f oracle-free

Wait until you see the message DATABASE IS READY TO USE before continuing.

Create the Sample User

For local testing, the sample worked most reliably when I created a dedicated user for it and granted the permissions needed for the spatial objects and index.

Connect as the system user and run:

 create user spatialsample identified by spatialsample;
 grant create session to spatialsample;
 grant create table to spatialsample;
 grant create view to spatialsample;
 grant create sequence to spatialsample;
 grant create procedure to spatialsample;
 grant unlimited tablespace to spatialsample;
 grant create indextype to spatialsample;
 grant create operator to spatialsample;
 grant execute on mdsys.sdo_geometry to spatialsample;
 grant execute on mdsys.sdo_util to spatialsample;
 grant execute on mdsys.sdo_geom to spatialsample;
 grant execute on mdsys.sdo_cs to spatialsample;
 grant execute on mdsys.spatial_index_v2 to spatialsample;

I am calling this out explicitly because this is one of the places where a setup detail can quietly derail the rest of the sample. Creating the user and grants up front made the rest of the walkthrough much more predictable.

The explicit grant execute on mdsys.spatial_index_v2 may look a little unusual at first glance, but it is there for a reason. Oracle’s Spatial index documentation calls out the need for EXECUTE privilege on the index type and its implementation type when creating a spatial index.

Create the Schema and Seed Data

Next, connect as the new spatialsample user and set up the schema and initial sample data:

create table if not exists landmarks (
    id        number primary key,
    name      varchar2(200) not null,
    category  varchar2(100) not null,
    geometry  mdsys.sdo_geometry not null
);

delete from user_sdo_geom_metadata
 where table_name = 'LANDMARKS'
   and column_name = 'GEOMETRY';

insert into user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
values (
    'LANDMARKS',
    'GEOMETRY',
    mdsys.sdo_dim_array(
        mdsys.sdo_dim_element('LONG', -180, 180, 0.005),
        mdsys.sdo_dim_element('LAT', -90, 90, 0.005)
    ),
    4326
);

create index if not exists landmarks_spatial_idx
    on landmarks (geometry)
    indextype is mdsys.spatial_index_v2;

delete from landmarks;

insert into landmarks (id, name, category, geometry)
values
    (1, 'Ferry Building', 'MARKET', sdo_util.from_geojson('{"type":"Point","coordinates":[-122.3933,37.7955]}', null, 4326)),
    (2, 'Union Square', 'PLAZA', sdo_util.from_geojson('{"type":"Point","coordinates":[-122.4074,37.7879]}', null, 4326)),
    (3, 'Golden Gate Park', 'PARK', sdo_util.from_geojson('{"type":"Polygon","coordinates":[[[-122.511,37.771],[-122.454,37.771],[-122.454,37.768],[-122.511,37.768],[-122.511,37.771]]]}', null, 4326)),
    (4, 'Oracle Park', 'STADIUM', sdo_util.from_geojson('{"type":"Point","coordinates":[-122.3893,37.7786]}', null, 4326)),
    (5, 'Salesforce Tower', 'SKYSCRAPER', sdo_util.from_geojson('{"type":"Point","coordinates":[-122.3969,37.7897]}', null, 4326));

commit;

At this point, I have:

  • a landmarks table
  • spatial metadata in USER_SDO_GEOM_METADATA
  • a spatial index
  • seeded landmarks to query

This is one of those places where I think being explicit helps. It is very easy to gloss over schema setup in a sample, but for Oracle Spatial this part is important:

  • the table exists
  • the geometry metadata exists
  • the spatial index exists
  • the sample data is already in place

Without those pieces, the rest of the walkthrough is much harder to follow.

I also like this version of the setup script because it is easier to rerun while you are testing locally. It clears the metadata entry and seed data before recreating the parts of the sample that need to be there.

Run the Official Sample Application

The backend for this walkthrough is the official sample application from the Spring Cloud Oracle repository.

The command that worked reliably for me was:

cd ~/spring-cloud-oracle/database/starters/oracle-spring-boot-starter-samples/oracle-spring-boot-sample-spatial
mvn spring-boot:run \
-Dspring-boot.run.arguments="--spring.datasource.url=jdbc:oracle:thin:@//localhost:1521/FREEPDB1
--spring.datasource.username=spatialsample --spring.datasource.password=spatialsample"

Note that I split that string for readability, but you need to put it all on one line!

There are two details here that are worth calling out:

  • I ran the command from the sample module directory
  • passing the datasource properties through spring-boot.run.arguments was the most reliable local approach

That is worth preserving exactly. I tried a couple of other ways to pass the datasource values, and this was the one that consistently worked for me.

If that starts cleanly, the sample backend should now be listening on http://localhost:9002.

What the Sample Already Gives Us

Before I write any frontend code, it is worth looking at what the sample already gives us.

The REST API is small and very easy to explain:

  • POST /landmarks
  • GET /landmarks/{id}
  • GET /landmarks/near
  • POST /landmarks/within

That is a useful shape for a tutorial because it is already enough to demonstrate:

  • GeoJSON in and GeoJSON out
  • nearest-neighbor style point searches
  • polygon-based area searches

Add Local CORS for the Frontend

When I first tried to connect the frontend to the sample application, I hit a CORS problem. For local development, the fix was to add @CrossOrigin to LandmarkController.java in the sample application.

Add this import:

 import org.springframework.web.bind.annotation.CrossOrigin;

Then annotate the controller:

 @CrossOrigin(origins = "http://localhost:5173")
 @RestController
 public class LandmarkController {
 // ...
 }
 

This is just for local development. I would not copy this unchanged into a production application.

Also, if you skip this and go straight to the frontend, the browser error is not especially helpful. You will usually just see a generic “failed to fetch” style error when the request to /landmarks/within gets blocked.

Smoke-Test the REST API

Before building the frontend, I like to make sure the API is behaving the way I expect.

Let’s start with a few direct calls to the sample backend.

A simple lookup by id:

curl http://localhost:9002/landmarks/1

A nearest-neighbor style request using a compact GeoJSON point in the query string:


 curl --get "http://localhost:9002/landmarks/near" \
 --data-urlencode 'geometry={"type":"Point","coordinates":[-122.3933,37.7955]}' \
 --data-urlencode 'distance=2000' \
 --data-urlencode 'limit=3'

A polygon search:


 curl -X POST http://localhost:9002/landmarks/within \
 -H "Content-Type: application/json" \
 -d '{
 "geometry":"{\"type\":\"Polygon\",\"coordinates\":[[[-122.515,37.75],[-122.35,37.75],[-122.35,37.808],[-122.515,37.808],[-122.515,37.75]]]}",
 "mask":"ANYINTERACT"
 }'
 

At this point, it is useful to verify that the backend is returning the kind of data we expect before moving on to the browser. These calls show the API working directly from the command line.

If these calls work, the rest of the article gets a lot simpler. At that point, we know the database is up, the sample app is talking to it, and the spatial endpoints are behaving before the browser gets involved.

The Frontend Project

For the UI, I used a separate React application built with Vite, TypeScript, and React Leaflet.

Here is the full layout of the web directory before I walk through the main pieces:


 web/
   src/
     App.tsx
     components/
       MapView.tsx
     lib/
       api.ts
       geo.ts
 

I kept this frontend separate from the sample app because it makes the browser-side behavior easier to explain, and it is a very common setup that most people will recognize immediately.

I also like this split for demo applications because it keeps the browser code and the backend code easy to reason about independently.

Start the Frontend

In this frontend project, start the development server like this:


 cd ~/spatial-starter-v1-blog/web
 VITE_API_BASE_URL=http://localhost:9002 npm run dev
 

By default, Vite runs on http://localhost:5173, which is why that is the origin I allowed in the local CORS annotation above.

The Initial Map Load

For the initial screen, I did not want a generic “list all landmarks” endpoint. Since this is a spatial application, I wanted the initial load to already be spatially scoped.

So instead, I defined an opening polygon for a useful part of San Francisco:

  • north: Fisherman’s Wharf
  • east: Oracle Park
  • south: Mission
  • west: the western edge of Golden Gate Park

In the frontend, that polygon is just a small GeoJSON object:

const INITIAL_WEST = -122.515;
const INITIAL_EAST = -122.35;
const INITIAL_SOUTH = 37.75;
const INITIAL_NORTH = 37.808;
export const initialSearchPolygon = {
  type: "Feature",
  properties: {
    label: "Initial San Francisco search area"
  },
  geometry: {
    type: "Polygon",
    coordinates: [[
      [INITIAL_WEST, INITIAL_SOUTH],
      [INITIAL_EAST, INITIAL_SOUTH],
      [INITIAL_EAST, INITIAL_NORTH],
      [INITIAL_WEST, INITIAL_NORTH],
      [INITIAL_WEST, INITIAL_SOUTH]
    ]]
  }
};

Then I post that polygon to /landmarks/within with the ANYINTERACT mask.

That gives the initial screen a much more natural map behavior. Instead of “show me everything,” the app starts with “show me the landmarks in this area.”

One small helper worth showing here is geometryToString. The sample accepts the geometry field as a JSON string, not as a nested object, so the frontend needs to serialize the GeoJSON before sending it.

export function geometryToString(feature: Feature): string {
  return JSON.stringify(feature.geometry);
}

With that helper in place, the API call stays pretty small:

export async function fetchLandmarksWithin(feature: Feature<Geometry>, mask: string): Promise<Landmark[]> {
  const response = await fetch(`${API_BASE_URL}/landmarks/within`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      geometry: geometryToString(feature),
      mask
    })
  });
  if (!response.ok) {
    throw new Error(await response.text());
  }
  return response.json() as Promise<Landmark[]>;
}

And the initial load in the app looks like this:

useEffect(() => {
  void loadInitialArea();
}, []);

async function loadInitialArea() {
  setLoading(true);
  setError(null);
  setActiveFeature(initialSearchPolygon as Feature<Geometry>);
  setNearestSearchRadius(null);
  try {
    const landmarks = await fetchLandmarksWithin(initialSearchPolygon as Feature<Geometry>, "ANYINTERACT");
    setSearchResults(makeResultSet("initial-area", landmarks));
    mapRef.current?.fitBounds([
      [37.75, -122.515],
      [37.808, -122.35]
    ]);
  } finally {
    setLoading(false);
  }
}

Once the initial polygon query is wired up, the application starts with a focused view of San Francisco and only the landmarks that fall inside that opening search area.

Rendering Points and Polygons

One of the nice things about the sample is that the API boundary stays GeoJSON-first. That means the frontend does not need a special Oracle representation. It can just parse the returned GeoJSON and render it.

I render polygon landmarks with GeoJSON, and point landmarks with CircleMarker.

{areaFeatures.map((landmark) => (
  <GeoJSON
    key={landmark.id}
    data={parseLandmarkGeometry(landmark.geometry) as Polygon}
    style={() => ({
      color: "#0f766e",
      weight: 2,
      fillColor: "#14b8a6",
      fillOpacity: 0.22
    })}
  />
))}
{pointFeatures.map((landmark) => {
  const geometry = parseLandmarkGeometry(landmark.geometry) as Point;
  const [longitude, latitude] = geometry.coordinates;
  return (
    <CircleMarker
      key={landmark.id}
      center={[latitude, longitude]}
      radius={9}
      pathOptions={{
        color: "#ffffff",
        weight: 2,
        fillColor: "#ea580c",
        fillOpacity: 0.95
      }}
    />
  );
})}

That is the core value proposition here. The backend gets to use Oracle Spatial, and the browser gets to stay in familiar GeoJSON territory.

For a web application, that is exactly the split I want.

Nearest Landmarks from a Clicked Point

The next feature is the one I wanted most for the demo: click on the map, send a point to the backend, and get the nearest landmarks back.

On click, I convert the map location into a GeoJSON point:

export function pointFeatureFromCoordinates(longitude: number, latitude: number): Feature<Geometry> {
  return {
    type: "Feature",
    properties: {
      label: "Selected point"
    },
    geometry: {
      type: "Point",
      coordinates: [longitude, latitude]
    }
  };
}

Then I send that point to /landmarks/near with distance and limit values:

export async function fetchNearbyLandmarks(point: Feature<Geometry>, distance: number, limit: number): Promise<Landmark[]> {
  const params = new URLSearchParams({
    geometry: geometryToString(point),
    distance: String(distance),
    limit: String(limit)
  });
  const response = await fetch(`${API_BASE_URL}/landmarks/near?${params.toString()}`);
  if (!response.ok) {
    throw new Error(await response.text());
  }
  return response.json() as Promise<Landmark[]>;
}

One small UX detail I liked here was keeping the clicked point visible and drawing a circle for the search radius.

{nearestSearchRadius ? (
  <Circle
    center={[activePoint.geometry.coordinates[1], activePoint.geometry.coordinates[0]]}
    radius={nearestSearchRadius}
    pathOptions={{
      color: "#f97316",
      weight: 2,
      fillColor: "#fdba74",
      fillOpacity: 0.12,
      dashArray: "10 6"
    }}
  />
) : null}
<CircleMarker
  center={[activePoint.geometry.coordinates[1], activePoint.geometry.coordinates[0]]}
  radius={8}
  pathOptions={{
    color: "#ffffff",
    weight: 3,
    fillColor: "#c2410c",
    fillOpacity: 1
  }}
/>

That makes the search feel much more obvious when you use the map.

It is a small detail, but it makes the demo much easier to understand visually. You can immediately see both the point you picked and the radius you asked the backend to search within.

The screenshot below shows that interaction in context, with the selected point on the map and the search radius drawn around it.

Search the Visible Area

The other interaction I wanted was the ability to search the current map area.

For that, I turn the map bounds into a polygon and post it to /landmarks/within:

export function boundsToPolygon(bounds: LatLngBounds): Feature<Polygon> {
  const west = bounds.getWest();
  const east = bounds.getEast();
  const south = bounds.getSouth();
  const north = bounds.getNorth();
  return {
    type: "Feature",
    properties: {
      label: "Visible map area"
    },
    geometry: {
      type: "Polygon",
      coordinates: [[
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south]
      ]]
    }
  };
}

Then the app sends that polygon with the currently selected mask:

async function handleVisibleAreaSearch() {
  if (!visibleBounds) {
    return;
  }
  const polygon = boundsToPolygon(visibleBounds) as Feature<Geometry>;
  setActiveFeature(polygon);
  setNearestSearchRadius(null);
  const landmarks = await fetchLandmarksWithin(polygon, mask);
  setSearchResults(makeResultSet("area-search", landmarks));
}

I kept the mask choices simple for this version:

  • ANYINTERACT
  • INSIDE

ANYINTERACT returns any landmark that touches or overlaps the search polygon, which is useful when a geometry sits on the edge of the visible area. INSIDE returns only landmarks whose geometry falls entirely within the polygon.

After panning the map and running the visible-area query, the UI updates to reflect the current viewport instead of the original startup polygon.

Why the GeoJSON Boundary Matters

There are a few reasons I like this sample as a starting point.

First, the sample stays GeoJSON-first at the API boundary. That is exactly what I want for browser-facing map work.

Second, the backend makes the spatial queries feel very normal from a Spring developer perspective. You are still building a simple REST application, not learning a whole new framework just to use Oracle Spatial.

Third, the v1 scope feels very reasonable. It gives me enough to build real point and polygon workflows, and it gives me something concrete to build on without trying to solve every spatial use case at once.

That said, this is also why I would really like feedback from people who actively use Oracle Spatial and Spring Boot. The best next steps will probably come from the people doing this work for real.

Wrapping Up

For this example, I used the new Oracle Spatial Spring Boot sample as the backend, then layered a small React + Leaflet application on top of it to:

  • load a meaningful initial area
  • search for nearby landmarks from a clicked point
  • query landmarks inside the current visible map area

The part I like most is that nothing about the frontend had to become Oracle-specific. The browser stayed in GeoJSON and map primitives, which is exactly where I want it.

For a Spring developer, that is one of the most useful parts of this v1 starter. It makes it much easier to work with spatial data in a normal Spring Boot application without having to invent the basic plumbing first.

If you are using Oracle Spatial with Spring Boot, I would genuinely love feedback on what works well, what feels awkward, and what you would want to see next.



Exploring some new Helidon features – Data Repositories and SE Declarative

With Helidon 4.4.0 right around the corner, I’ve been spending some time playing with the latest milestone release (4.4.0-M2). If you haven’t been following along, Helidon 4 was a major milestone because it was the first framework built from the ground up on Java 21 virtual threads.

Now, with 4.4, we are seeing some really cool incubating features becoming more stable, including Helidon Data Repositories, Helidon SE Declarative, and Helidon AI. I am working on a sample project called Helidon-Eats to show how these work together. In this installment, I am looking at the first two.

The “No-Magic” Power of Helidon SE Declarative

If you’ve used Helidon MP (MicroProfile), or if you are more familiar with Spring Boot’s Inversion of Control approach (like me), then you’re used to the convenience of dependency injection and annotations. Helidon SE, on the other hand, has always focused on transparency and avoiding “magic.” While that’s great for performance, it usually meant writing more boilerplate code to register routes and manage services manually.

Helidon SE Declarative changes that. It gives you an annotation-driven model like MP, but here is the trick: it does everything at build-time. Using Java annotation processors, Helidon generates service descriptors during compilation. This means you get the clean, injectable code you want, but without the runtime reflection overhead that slows down startup and eats memory. Benchmarks have even shown performance gains of up to +295% over traditional reflection-based models on modern JDKs. (see this article)

Now, to be completely fair, I will say that I am not completely sold on the build-time piece yet. For example, the @GenerateBinding annotation (if I am not mistaken) causes an ApplicationBinding class to be generated at build time, and that lives in your target/classes directory. I found during refactoring that you have to be careful to mvn clean each time to make sure it stays in sync with your code; just doing a mvn compile or package could get you into trouble. And I am not sure I am happy with it not being checked into the source code repository. But I’ll withhold judgment until I have worked with it a bit more!

Simplifying Persistence with Helidon Data

The Helidon Data Repository is another big addition. It’s a high-level abstraction that acts as a compile-time alternative to heavy runtime ORM frameworks. Instead of writing JDBC code, you define a Java interface, and Helidon’s annotation processor generates the implementation for you.

It supports standard patterns like CrudRepository and PageableRepository, which I used in this project to handle the recipe collection. The framework can even derive queries directly from your method names (like Spring Data does) – so a method like findById is automatically turned into the correct SQL at build-time.

The Backend: Oracle AI Database 26ai

For this sample, I’m using Oracle AI Database 26ai Free. I sourced some public domain recipe data from Kaggle that comes as a line-by-line JSON file (LDJSON).

Normally, if you want to store hierarchical JSON in relational tables, you have to write complex mapping logic in your application. But I wanted to try a more novel approach using JSON Relational Duality Views (DV).

Duality Views are a game-changer because they decouple how data is stored from how it is accessed. My data is stored in three normalized tables (RECIPE, INGREDIENT, and DIRECTION) which ensures ACID consistency and no data duplication. The database can surface this data to applications as a single, hierarchical JSON document. I am not using that feature in this post, but I will in the future!

GraphQL-Based View Creation and Loading

One of the coolest parts of Oracle AI Database 26ai is that you can define these views using a GraphQL-based syntax. The database engine automatically figures out the joins based on the foreign key relationships.

Here is how I defined the recipe_dv:

CREATE OR REPLACE JSON RELATIONAL DUALITY VIEW recipe_dv AS
recipe @insert @update @delete
{
  recipeId: id,
  recipeTitle: title,
  description: description,
  category: category,
  subcategory: subcategory,
  ingredients: ingredient @insert @update @delete
  [
    {
      id: id,
      item: item
    }
  ],
  directions: direction @insert @update @delete
  [
    {
      id: id,
      step: step
    }
  ]
};


Isn’t that just the cleanest piece of SQL that deals with JSON that you’ve ever seen?

Because the view is “updatable” (@insert, @update), I used it to actually load the data. Instead of a complex ETL process, my startup script just reads the LDJSON file line-by-line and does a simple SQL insert directly into the view. The database engine takes that single JSON object and automatically decomposes it into rows for the three underlying tables.
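
Here is a hedged sketch of what that line-by-line load can look like with plain JDBC. The view name matches the DDL above; the file name, variable names, and surrounding wiring are illustrative rather than copied from my actual startup script:

// assumes a javax.sql.DataSource is available; imports omitted
try (Connection conn = dataSource.getConnection();
     PreparedStatement ps = conn.prepareStatement("insert into recipe_dv values (?)");
     BufferedReader reader = Files.newBufferedReader(Path.of("recipes.ldjson"))) {
    conn.setAutoCommit(false);
    String line;
    while ((line = reader.readLine()) != null) {
        if (!line.isBlank()) {
            ps.setString(1, line);   // one complete recipe JSON document per line
            ps.addBatch();           // the database decomposes it into the three underlying tables
        }
    }
    ps.executeBatch();
    conn.commit();
}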

Modeling the Service

On the Java side, I modeled the Recipe entity to handle the parent-child relationships using standard @OneToMany collections.

One detail I want to highlight is the use of @JsonbTransient. When you build a REST API, you often have internal metadata like database primary keys or sort ordinals that you don’t want messing up the JSON that the end user gets to see. By annotating those fields with @JsonbTransient, they are excluded from the final JSON response. This keeps the API response clean and focused only on the recipe data.

@Entity
@Table(name = "RECIPE")
public class Recipe {
    @Id
    @Column(name = "ID")
    private Long recipeId;
    
    private String recipeTitle;

    @JsonbTransient
    private Long internalId; // Hidden from the API
    
    @OneToMany(mappedBy = "recipe")
    private List<Ingredient> ingredients;
    
    //...
}


In the repository object, you can use the method naming conventions to automatically create queries (like Spring Data) and you can also write your own JPQL (not SQL) queries, as I did in this case (also like Spring Data):

package com.github.markxnelson.helidoneats.recipes.model;

import java.util.Optional;

import io.helidon.data.Data;

@Data.Repository
public interface RecipeRepository extends Data.CrudRepository<Recipe, Long> {

    @Data.Query("SELECT DISTINCT r FROM Recipe r "
        + "LEFT JOIN FETCH r.ingredients "
        + "LEFT JOIN FETCH r.directions "
        + "WHERE r.recipeId = :recipeId")
    Optional<Recipe> findByRecipeIdWithDetails(Long recipeId);

}

Wiring and Startup

The configuration is handled in the application.yaml, where I point Helidon to the Oracle instance using syntax that again is very reminiscent of what I’d do in Spring Boot.

server:
  port: 8080
  host: 0.0.0.0
app:
    greeting: "Hello"

data:
  sources:
    sql:
      - name: "food"
        provider.hikari:
          username: "food"
          password: "Welcome12345##"
          url: "jdbc:oracle:thin:@//localhost:1521/freepdb1"
          jdbc-driver-class-name: "oracle.jdbc.OracleDriver"
  persistence-units:
    jakarta:
      - name: "recipe"
        data-source: "food"
        properties:
          hibernate.dialect: "org.hibernate.dialect.OracleDialect"
          jakarta.persistence.schema-generation.database.action: "none"

With Declarative SE, injecting the repository into my endpoint is simple. I just use @Service.Inject on the constructor, which allows me to keep my fields private final.

@Service.Singleton
@Http.Path("/recipe")
public class RecipeEndpoint {
    private final RecipeRepository repository;

    @Service.Inject
    public RecipeEndpoint(RecipeRepository repository) {
        this.repository = repository;
    }

    @Http.GET
    @Http.Path("/{id}")
    public Optional<Recipe> getRecipe(Long id) {
        return repository.findById(id);
    }
}


Finally, the Main class uses @Service.GenerateBinding. This tells the annotation processor to generate the “wiring” code that starts the server and initializes the service registry without needing to scan the classpath at runtime.

@Service.GenerateBinding
public class Main {
    public static void main(String[] args) {
        LogConfig.configureRuntime();
        ServiceRegistryManager.start(ApplicationBinding.create());
    }
}


In this context, the “service registry” is something in Helidon that keeps track of the services in the application and handles injection and so on. It’s a lot like the way Spring Boot scans for beans and wires/injects them where needed.

The Result

When you hit the service, you get a clean, well structured JSON response that masks all the complexity of the underlying three-table relational join.

Example Response for http://localhost:8080/recipe/22387:

{
  "category": "Appetizers And Snacks",
  "description": "I came up with this rhubarb salsa while trying to figure out what to do with an over-abundance of rhubarb...",
  "directions":,
  "ingredients": [
    "2 cups thinly sliced rhubarb",
    "1 small red onion, coarsely chopped",
    "3 roma (plum) tomatoes, finely diced"
  ],
  "recipeId": 22387,
  "recipeTitle": "Tangy Rhubarb Salsa",
  "subcategory": "Salsa"
}


Wrap Up

Helidon 4.4 is making the SE flavor feel a lot more like a high-productivity framework without sacrificing performance. By shifting the data transformation logic to the database with Duality Views and using build-time code generation for injection, we can build services that are both incredibly fast and easy to maintain.

Now, you may have noticed that I said “like Spring” a lot in this post – and that’s because of two things – I do happen to use Spring a lot more than I use Helidon, and I like it. So I am very happy that Helidon is looking more like Spring, it makes it a lot easier to switch between the two, and I think it lowers the barrier to entry for people who are coming from the Spring world.

Grab the code from the Helidon-Eats repo and let me know what you think – and stay tuned for the next steps as I explore Helidon AI!


Using Reflection to Help LLMs Write Better SQL

Getting LLMs to write good SQL can be tricky. Sure, they can generate syntactically correct queries, but do those queries actually answer the question you asked? Sometimes an LLM might give you technically valid SQL that doesn’t quite capture what you’re really looking for.

I wanted to experiment with the reflection pattern to see if we could get better results. The idea is simple: after the LLM generates SQL and executes it, have it reflect on whether the query actually answers the original question. If not, let it try again with the benefit of seeing both the question and the initial results.

Let me show you how this works.

Setting up the database

I used an Oracle Autonomous Database on Oracle Cloud for this experiment. First, I created a user with the necessary permissions. Connect as ADMIN and run this:

create user moviestream identified by <password>;
grant connect, resource, unlimited tablespace to moviestream;
grant execute on dbms_cloud to moviestream;
grant execute on dbms_cloud_repo to moviestream;
grant create table to moviestream;
grant create view to moviestream;
grant all on directory data_pump_dir to moviestream;
grant create procedure to moviestream;
grant create sequence to moviestream;
grant create job to moviestream;

Next, let’s load the sample dataset. Still as ADMIN, run this:

declare 
    l_uri varchar2(500) := 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/c4u04/b/building_blocks_utilities/o/setup/workshop-setup.sql';
begin
    dbms_cloud_repo.install_sql(
        content => to_clob(dbms_cloud.get_object(object_uri => l_uri))
    );
end;
/

Then connect as the moviestream user and run this to load the rest of the dataset:

BEGIN
    workshop.add_dataset(tag => 'end-to-end');
END;
/

This takes a few minutes to complete, after which we have a database with customer and sales data to work with.

The approach

The reflection pattern works like this:

  1. Give the LLM the database schema and a natural language question
  2. LLM generates SQL (v1)
  3. Execute the SQL and get results
  4. LLM reflects: “Does this SQL actually answer the question?”
  5. Generate improved SQL (v2) based on the reflection
  6. Execute v2 and provide the final answer

The key insight here is that by seeing the actual results, the LLM can judge whether it interpreted the question correctly. For example, if you ask “who are our top customers?”, the LLM might initially think “highest income” when you actually meant “highest spending”. Seeing the results helps it course-correct.

Setting up the Python environment

I used a Jupyter notebook for this experiment. First, let’s install the libraries we need:

%pip install aisuite oracledb 

I’m using Andrew Ng’s aisuite for a unified interface to different LLM providers, and oracledb to connect to the database.

Now let’s import aisuite:

import aisuite as ai

Connecting to Oracle Autonomous Database

For Oracle Autonomous Database, you’ll need to download the wallet and set up the connection. Here’s how I connected:

import oracledb

username = "moviestream"
password = "<password>"
dsn = "<connection_string>"
wallet = '<path_to_wallet>'

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn,
        config_dir=wallet,
        wallet_location=wallet,
        wallet_password='<wallet_password>')
    print("Connection successful!")
except Exception as e:
    print(e)
    print("Connection failed!")

And set the TNS_ADMIN environment variable:

import os
os.environ['TNS_ADMIN'] = wallet

Configuring the LLM client

Let’s set up the AI client. I used GPT-4o for this experiment:

client = ai.Client()
os.environ['OPENAI_API_KEY']='<your_api_key>'

models = ['openai:gpt-4o']

Getting the database schema

For the LLM to write good SQL, it needs to know what tables and columns are available. Let’s write a function to introspect the schema:

def get_schema():
    stmt = f'''
    SELECT 
        utc.table_name,
        utc.column_name,
        utc.data_type,
        utc.data_length,
        utc.nullable,
        utc.column_id,
        ucc.comments AS column_comment,
        utab.comments AS table_comment
    FROM 
        user_tab_columns utc
    LEFT JOIN 
        user_col_comments ucc 
        ON utc.table_name = ucc.table_name 
        AND utc.column_name = ucc.column_name
    LEFT JOIN 
        user_tab_comments utab 
        ON utc.table_name = utab.table_name
    ORDER BY 
        utc.table_name, 
        utc.column_id
    '''

    cursor = connection.cursor()
    cursor.execute(stmt)
    rows = cursor.fetchall()

    # Convert to one long string
    result_string = '\n'.join([str(row) for row in rows])

    cursor.close()

    return result_string

This function queries the Oracle data dictionary to get information about all tables and columns, including any comments. It returns everything as a single string that we can pass to the LLM.

Generating SQL from natural language

Now let’s write the function that takes a natural language question and generates SQL:

def generate_sql(question: str, schema: str, model: str):
    prompt = f'''
    You are an SQL assistant for Oracle Database.
    You create Oracle SQL statements to help answer user questions.
    Given the user's question and the schema information, write an SQL
    query to answer the question.

    Schema:
    {schema}

    User question:
    {question}

    Respond with the SQL only.  Do not add any extra characters or delimiters.
    '''
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

This function takes the question, the schema information, and the model to use. It constructs a prompt that gives the LLM the context it needs and asks for just the SQL query.

Executing SQL queries

We need a function to actually run the generated SQL:

def execute_sql(stmt):
    cursor = connection.cursor()
    cursor.execute(stmt)
    rows = cursor.fetchall()

    # Convert to one long string
    result_string = '\n'.join([str(row) for row in rows])

    cursor.close()

    return result_string

This executes the query and returns the results as a string.

The reflection step

Here’s where it gets interesting – the function that reviews the SQL and results, and potentially generates improved SQL:

import json

def refine_sql(question, sql_query, output, schema, model):
    prompt = f'''
    You are a SQL reviewer and refiner. 

    User asked:
    {question}

    Original SQL:
    {sql_query}

    SQL Output:
    {output}

    Schema:
    {schema}

    Step 1: Evaluate if the SQL OUTPUT fully answers the user's question.
    Step 2: If improvement is needed, provide a refined SQL query for Oracle.
    If the original SQL is already correct, return it unchanged.

    Return a strict JSON object with two fields:
    - "feedback": brief evaluation and suggestions
    - "refined_sql": the final SQL to run

    Return ONLY the actual JSON document.
    Do NOT add any extra characters or delimiters outside of the actual JSON itself.
    In particular do NOT include backticks before and after the JSON document.
    '''

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )

    content = response.choices[0].message.content
    try:
        obj = json.loads(content)
        feedback = str(obj.get("feedback", "")).strip()
        refined_sql = str(obj.get("refined_sql", sql_query)).strip()
        if not refined_sql:
            refined_sql = sql_query
    except Exception:
        # Fallback if model doesn't return valid JSON
        feedback = content.strip()
        refined_sql = sql_query

    return feedback, refined_sql

This is the heart of the reflection pattern. The function:

  1. Shows the LLM the original question, the generated SQL, and the actual results
  2. Asks it to evaluate whether the SQL output really answers the question
  3. If not, asks for an improved query
  4. Returns both the feedback and the refined SQL as JSON

The JSON format makes it easy to parse the response and extract both pieces of information. I had to be fairly pedantic to get gpt-4o to give me just JSON!

Providing a final answer

Finally, let’s write a function to convert the query results into a natural language answer:

def provide_final_answer(question, output, model):
    prompt = f'''
    You are a helpful assistant.
    Given a user's question, and the results of a database query
    which has been created, evaluated, improved and executed already
    in order to get the provided output, you should provide an
    answer to the user's question.

    User question:
    {question}

    Query results:
    {output}
    '''
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

This takes the final query results and turns them into a friendly, natural language response for the user.

Putting it all together

Now let’s create the main function that orchestrates the entire process:

def generate_and_reflect(question: str):
    
    schema = get_schema()
    print('SCHEMA')
    print(schema)
    print()

    sql_v1 = generate_sql(question, schema, models[0])
    print("SQL V1")
    print(sql_v1)
    print()

    output_v1 = execute_sql(sql_v1)
    print("SQL V1 output")
    print(output_v1)
    print()

    feedback, sql_v2 = refine_sql(question, sql_v1, output_v1, schema, models[0])
    print("FEEDBACK")
    print(feedback)
    print()
    print("SQL V2")
    print(sql_v2)
    print()

    output_v2 = execute_sql(sql_v2)
    print("SQL V2 output")
    print(output_v2)
    print()

    final_answer = provide_final_answer(question, output_v2, models[0])
    print("FINAL ANSWER")
    print(final_answer)
    print()

This function:

  1. Gets the database schema
  2. Generates the first SQL query
  3. Executes it and prints the results
  4. Sends everything to the reflection function for evaluation
  5. Generates and executes the refined SQL
  6. Converts the final results into a natural language answer

Running the experiment

Let’s try it out with a question that could be interpreted multiple ways:

generate_and_reflect('who are our top customers?')

The results

Here’s what happened when I ran this:

First attempt (SQL V1):

SELECT CUST_ID, FIRST_NAME, LAST_NAME, EMAIL, INCOME
FROM CUSTOMER
ORDER BY INCOME DESC
FETCH FIRST 10 ROWS ONLY;

The LLM interpreted “top customers” as customers with the highest income. It returned folks with incomes around $187,000:

(1138797, 'Haruru', 'Takahashi', 'haruru.takahashi6@oraclemail.com', 187168.8)
(1007335, 'Eddie', 'Crawford', 'eddie.crawford@oraclemail.com', 187145.4)
(1404002, 'Yuuto', 'Arai', 'yuuto.arai3@oraclemail.com', 187136.04)
...

Reflection:

The original SQL query retrieves the top 10 customers based on income, which may not 
fully answer the question of 'top customers' as it could be interpreted in terms of 
sales or transactions. To better answer the question, we should consider the total 
sales or transactions made by each customer.

Great! The LLM recognized that “top customers” probably means customers who spend the most, not customers who earn the most.

Second attempt (SQL V2):

SELECT C.CUST_ID, C.FIRST_NAME, C.LAST_NAME, C.EMAIL, SUM(S.ACTUAL_PRICE) AS TOTAL_SALES 
FROM CUSTOMER C 
JOIN CUSTSALES S ON C.CUST_ID = S.CUST_ID 
GROUP BY C.CUST_ID, C.FIRST_NAME, C.LAST_NAME, C.EMAIL 
ORDER BY TOTAL_SALES DESC 
FETCH FIRST 10 ROWS ONLY;

Much better! Now it’s joining with the sales data and calculating total spending per customer.

Final results:

(1234517, 'Tsubasa', 'Nakajima', 'tsubasa.nakajima2@oraclemail.com', 2356.049999999997)
(1280887, 'Steffi', 'Bielvenstram', 'steffi.bielvenstram@oraclemail.com', 2334.7299999999996)
(1017254, 'Guadalupe', 'Zamora', 'guadalupe.zamora@oraclemail.com', 2329.7599999999998)
...

The top customer is Tsubasa Nakajima with $2,356.05 in total sales, followed by Steffi Bielvenstram with $2,334.73, and so on. These are very different customers from the high-income list we got in the first attempt!

Natural language answer:

Our top customers, based on the provided data, are:

1. Tsubasa Nakajima - Email: tsubasa.nakajima2@oraclemail.com, Total: $2356.05
2. Steffi Bielvenstram - Email: steffi.bielvenstram@oraclemail.com, Total: $2334.73
3. Guadalupe Zamora - Email: guadalupe.zamora@oraclemail.com, Total: $2329.76
...

These customers have the highest total amounts associated with them.

What I learned

This reflection approach really does help. The LLM is pretty good at recognizing when its initial SQL doesn’t quite match the intent of the question – especially when it can see the actual results.

The pattern of generate → execute → reflect → regenerate is more expensive (two LLM calls instead of one for generation, plus one more for the final answer), but the quality improvement is noticeable. For production use, you might want to:

  • Cache schema information instead of fetching it every time (see the sketch after this list)
  • Add more sophisticated error handling for SQL errors
  • Consider running both queries in parallel and comparing results
  • Track which types of questions benefit most from reflection
  • Use the reflection feedback to build a dataset for fine-tuning
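
For example, the schema lookup is an easy one to cache, since the data dictionary rarely changes between questions. Here is a minimal sketch using functools.lru_cache, wrapping the get_schema() function we wrote earlier:

from functools import lru_cache

@lru_cache(maxsize=1)
def get_schema_cached():
    # Run the data dictionary query once and reuse the result
    # for every subsequent question.
    return get_schema()

Then generate_and_reflect() can simply call get_schema_cached() instead of get_schema().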

The approach is straightforward to implement and the results speak for themselves – the reflection step caught a subtle but important misinterpretation that would have given technically correct but unhelpful results.

Give it a try with your own database and questions – I think you’ll find the reflection step catches a lot of these subtle misinterpretations that would otherwise lead to valid but wrong answers.

What next? I am going to experiment with some more complex questions, and then compare the performance of a number of different LLMs to see how they go with and without reflection. Stay tuned 🙂


Using Multiple Datasources with Spring Boot and Spring Data JPA

Hi everyone! Today I want to show you how to configure multiple datasources in a Spring Boot application using Spring Data JPA and the Oracle Spring Boot Starter for Universal Connection Pool (UCP).

This is a pattern you’ll need when you have a single application that needs to connect to multiple databases. Maybe you have different domains in separate databases, or you’re working with legacy systems, or you need to separate read and write operations across different database instances. Whatever the reason, Spring Boot makes this pretty straightforward once you understand the configuration pattern.

I’ve put together a complete working example on GitHub at https://github.com/markxnelson/spring-multiple-jpa-datasources, and in this post I’ll walk you through how to build it from scratch.

The Scenario

For this example, we’re going to build a simple application that manages two separate domains:

  • Customers – stored in one database
  • Products – stored in a different database

Each domain will have its own datasource, entity manager, and transaction manager. We’ll use Spring Data JPA repositories to interact with each database, and we’ll show how to use both datasources in a REST controller.

I am assuming you have a database with two users called customer and product and some tables. Here’s the SQL to set that up:

$ sqlplus sys/Welcome12345@localhost:1521/FREEPDB1 as sysdba

alter session set container=freepdb1;
create user customer identified by Welcome12345;
create user product identified by Welcome12345;
grant connect, resource, unlimited tablespace to customer;
grant connect, resource, unlimited tablespace to product;
commit;

$ sqlplus customer/Welcome12345@localhost:1521/FREEPDB1

create table customer (id number, name varchar2(64));
insert into customer (id, name) values (1, 'mark');
commit;

$ sqlplus product/Welcome12345@localhost:1521/FREEPDB1

create table product (id number, name varchar2(64));
insert into product (id, name) values (1, 'coffee machine');
commit;

Step 1: Dependencies

Let’s start with the Maven dependencies. Here’s what you’ll need in your pom.xml:

<dependencies>
    <!-- Spring Boot Starter Web for REST endpoints -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- Spring Boot Starter Data JPA -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>

    <!-- Oracle Spring Boot Starter for UCP -->
    <dependency>
        <groupId>com.oracle.database.spring</groupId>
        <artifactId>oracle-spring-boot-starter-ucp</artifactId>
        <version>25.3.0</version>
    </dependency>

</dependencies>

The key dependency here is the oracle-spring-boot-starter-ucp, which provides autoconfiguration for Oracle’s Universal Connection Pool. UCP is Oracle’s high-performance connection pool implementation that provides features like connection affinity, Fast Connection Failover, and Runtime Connection Load Balancing.

Step 2: Configure the Datasources in application.yaml

Now let’s configure our two datasources in the application.yaml file. We’ll define connection properties for both the customer and product databases:

spring:
  application:
    name: demo

  jpa:
    customer:
      properties:
        hibernate.dialect: org.hibernate.dialect.OracleDialect
        hibernate.hbm2ddl.auto: validate
        hibernate.format_sql: true
        hibernate.show_sql: true
    product:
      properties:
        hibernate.dialect: org.hibernate.dialect.OracleDialect
        hibernate.hbm2ddl.auto: validate
        hibernate.format_sql: true
        hibernate.show_sql: true

  datasource:
    customer:
        url: jdbc:oracle:thin:@localhost:1521/freepdb1
        username: customer
        password: Welcome12345
        driver-class-name: oracle.jdbc.OracleDriver
        type: oracle.ucp.jdbc.PoolDataSourceImpl
        oracleucp:
          connection-factory-class-name: oracle.jdbc.pool.OracleDataSource
          connection-pool-name: CustomerConnectionPool
          initial-pool-size: 15
          min-pool-size: 10
          max-pool-size: 30
          shared: true
    product:
        url: jdbc:oracle:thin:@localhost:1521/freepdb1
        username: product
        password: Welcome12345
        driver-class-name: oracle.jdbc.OracleDriver
        type: oracle.ucp.jdbc.PoolDataSourceImpl
        oracleucp:
          connection-factory-class-name: oracle.jdbc.pool.OracleDataSource
          connection-pool-name: ProductConnectionPool
          initial-pool-size: 15
          min-pool-size: 10
          max-pool-size: 30
          shared: true

Notice that we’re using custom property prefixes (spring.datasource.customer and spring.datasource.product) instead of the default spring.datasource. This is because Spring Boot’s autoconfiguration will only create a single datasource by default. When you need multiple datasources, you need to create them manually and use custom configuration properties.

In this example, both datasources happen to point to the same database server but use different schemas (users). In a real-world scenario, these would typically point to completely different database instances.

Step 3: Configure the Customer Datasource

Now we need to create the configuration classes that will set up our datasources, entity managers, and transaction managers. Let’s start with the customer datasource.

Create a new package called customer and add a configuration class called CustomerDataSourceConfig.java:

package com.example.demo.customer;

import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.orm.jpa.EntityManagerFactoryBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(entityManagerFactoryRef = "customerEntityManagerFactory", transactionManagerRef = "customerTransactionManager", basePackages = {
        "com.example.demo.customer" })
public class CustomerDataSourceConfig {

    @Bean(name = "customerProperties")
    @ConfigurationProperties("spring.datasource.customer")
    public DataSourceProperties customerDataSourceProperties() {
        return new DataSourceProperties();
    }

    /**
     * Creates and configures the customer DataSource.
     *
     * @param properties the customer datasource properties
     * @return configured DataSource instance
     */
    @Primary
    @Bean(name = "customerDataSource")
    @ConfigurationProperties(prefix = "spring.datasource.customer")
    public DataSource customerDataSource(@Qualifier("customerProperties") DataSourceProperties properties) {
        return properties.initializeDataSourceBuilder().build();
    }

    /**
     * Reads customer JPA properties from application.yaml.
     *
     * @return Map of JPA properties
     */
    @Bean(name = "customerJpaProperties")
    @ConfigurationProperties("spring.jpa.customer.properties")
    public java.util.Map<String, String> customerJpaProperties() {
        return new java.util.HashMap<>();
    }

    /**
     * Creates and configures the customer EntityManagerFactory.
     *
     * @param builder the EntityManagerFactoryBuilder
     * @param dataSource the customer datasource
     * @param jpaProperties the JPA properties from application.yaml
     * @return configured LocalContainerEntityManagerFactoryBean
     */
    @Primary
    @Bean(name = "customerEntityManagerFactory")
    public LocalContainerEntityManagerFactoryBean customerEntityManagerFactory(EntityManagerFactoryBuilder builder,
            @Qualifier("customerDataSource") DataSource dataSource,
            @Qualifier("customerJpaProperties") java.util.Map<String, String> jpaProperties) {

        return builder.dataSource(dataSource)
                .packages("com.example.demo.customer")
                .persistenceUnit("customers")
                .properties(jpaProperties)
                .build();
    }

    @Bean
    @ConfigurationProperties("spring.jpa.customer")
    public PlatformTransactionManager customerTransactionManager(
            @Qualifier("customerEntityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
        return new JpaTransactionManager(entityManagerFactory.getObject());
    }

}

Let’s break down what’s happening here:

  1. @EnableJpaRepositories – This tells Spring Data JPA where to find the repositories for this datasource. We specify the base package (com.example.demo.customer), and we reference the entity manager factory and transaction manager beans by name.
  2. @Primary – We mark the customer datasource as the primary one. This means it will be used by default when autowiring a datasource, entity manager, or transaction manager without a @Qualifier. You must have exactly one primary datasource when using multiple datasources.
  3. customerDataSource() – This creates the datasource bean using Spring Boot’s DataSourceBuilder. The @ConfigurationProperties annotation binds the properties from our application.yaml (with the spring.datasource.customer prefix) to the datasource configuration.
  4. customerEntityManagerFactory() – This creates the JPA entity manager factory, which is responsible for creating entity managers. We configure it to scan for entities in the customer package and set up Hibernate properties.
  5. customerTransactionManager() – This creates the transaction manager for the customer datasource. The transaction manager handles transaction boundaries and ensures ACID properties.

Step 4: Configure the Product Datasource

Now let’s create the configuration for the product datasource. Create a new package called product and add ProductDataSourceConfig.java:

package com.example.demo.product;

import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.orm.jpa.EntityManagerFactoryBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(entityManagerFactoryRef = "productEntityManagerFactory", transactionManagerRef = "productTransactionManager", basePackages = {
        "com.example.demo.product" })
public class ProductDataSourceConfig {

    @Bean(name = "productProperties")
    @ConfigurationProperties("spring.datasource.product")
    public DataSourceProperties productDataSourceProperties() {
        return new DataSourceProperties();
    }

    @Bean(name = "productDataSource")
    @ConfigurationProperties(prefix = "spring.datasource.product")
    public DataSource productDataSource(@Qualifier("productProperties") DataSourceProperties properties) {
        return properties.initializeDataSourceBuilder().build();
    }

    /**
     * Reads product JPA properties from application.yaml.
     *
     * @return Map of JPA properties
     */
    @Bean(name = "productJpaProperties")
    @ConfigurationProperties("spring.jpa.product.properties")
    public java.util.Map<String, String> productJpaProperties() {
        return new java.util.HashMap<>();
    }

    /**
     * Creates and configures the product EntityManagerFactory.
     *
     * @param builder the EntityManagerFactoryBuilder
     * @param dataSource the product datasource
     * @param jpaProperties the JPA properties from application.yaml
     * @return configured LocalContainerEntityManagerFactoryBean
     */
    @Bean(name = "productEntityManagerFactory")
    public LocalContainerEntityManagerFactoryBean productEntityManagerFactory(@Autowired EntityManagerFactoryBuilder builder,
            @Qualifier("productDataSource") DataSource dataSource,
            @Qualifier("productJpaProperties") java.util.Map<String, String> jpaProperties) {

        return builder.dataSource(dataSource)
                .packages("com.example.demo.product")
                .persistenceUnit("products")
                .properties(jpaProperties)
                .build();
    }

    @Bean
    @ConfigurationProperties("spring.jpa.product")
    public PlatformTransactionManager productTransactionManager(
            @Qualifier("productEntityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
        return new JpaTransactionManager(entityManagerFactory.getObject());
    }

}

The product configuration is almost identical to the customer configuration, with a few key differences:

  1. No @Primary annotations – Since we already designated the customer datasource as primary, we don’t mark the product beans as primary.
  2. Different package – The @EnableJpaRepositories points to the product package, and the entity manager factory scans the product package for entities.
  3. Different bean names – All the beans have different names (productDataSource, productEntityManagerFactory, productTransactionManager) to avoid conflicts.

Step 5: Create the Domain Models

Now let’s create the JPA entities for each datasource. First, in the customer package, create Customer.java:

package com.example.demo.customer;

import jakarta.persistence.Entity;
import jakarta.persistence.Id;

@Entity
public class Customer {
    @Id
    public int id;
    public String name;

    public Customer() {
        this.id = 0;
        this.name = "";
    }
}

And in the product package, create Product.java:

package com.example.demo.product;

import jakarta.persistence.Entity;
import jakarta.persistence.Id;

@Entity
public class Product {
    @Id
    public int id;
    public String name;

    public Product() {
        this.id = 0;
        this.name = "";
    }
}

Step 6: Create the Repositories

Now let’s create Spring Data JPA repositories for each entity. In the customer package, create CustomerRepository.java:

package com.example.demo.customer;

import org.springframework.data.jpa.repository.JpaRepository;

public interface CustomerRepository extends JpaRepository<Customer, Integer> {

}

And in the product package, create ProductRepository.java:

package com.example.demo.product;

import org.springframework.data.jpa.repository.JpaRepository;

public interface ProductRepository extends JpaRepository<Product, Integer> {

}

Step 7: Create a REST Controller

Finally, let’s create a REST controller that demonstrates how to use both datasources. Create a controller package and add CustomerController.java:

package com.example.demo.controllers;

import java.util.List;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import com.example.demo.customer.Customer;
import com.example.demo.customer.CustomerRepository;

@RestController
public class CustomerController {

    final CustomerRepository customerRepository;

    public CustomerController(CustomerRepository customerRepository) {
        this.customerRepository = customerRepository;
    }

    @GetMapping("/customers")
    public List<Customer> getCustomers() {
        return customerRepository.findAll();
    }

}

A few important things to note about the controllers:

  1. Transaction Managers – When you have multiple datasources, you need to explicitly tell Spring which transaction manager to use for write operations, for example @Transactional("customerTransactionManager") or @Transactional("productTransactionManager"), since there is more than one transaction manager in the application context.
  2. Repository Autowiring – The repositories are autowired normally. Spring knows which datasource each repository uses based on the package they’re in, which we configured in our datasource configuration classes.
  3. Cross-datasource Operations – You can work with both datasources from a single method, but keep in mind that those operations are not part of a distributed transaction: if one write fails, the other won’t automatically roll back. If you need distributed transactions across multiple databases, you would need to use JTA (Java Transaction API).

Let’s also create ProductController.java:

package com.example.demo.controllers;

import java.util.List;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import com.example.demo.product.Product;
import com.example.demo.product.ProductRepository;

@RestController
public class ProductController {

    final ProductRepository productRepository;

    public ProductController(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @GetMapping("/products")
    public List<Product> getProducts() {
        return productRepository.findAll();
    }

}

Testing the Application

Now you can run your application! Make sure you have two Oracle database users created (customer and product), or adjust the configuration to point to your specific databases.

Start the application:

mvn spring-boot:run

Then you can test it with some curl commands:

# Get all customers
$ curl http://localhost:8080/customers
[{"id":1,"name":"mark"}]

# Get all products
$ curl http://localhost:8080/products
[{"id":1,"name":"coffee machine"}]

Wrapping Up

And there you have it! We’ve successfully configured a Spring Boot application with multiple datasources using Spring Data JPA and Oracle’s Universal Connection Pool. The key points to remember are:

  1. Custom configuration properties – Use custom prefixes for each datasource in your application.yaml
  2. Manual configuration – Create configuration classes for each datasource with beans for the datasource, entity manager factory, and transaction manager
  3. Primary datasource – Designate one datasource as primary using @Primary
  4. Package organization – Keep entities and repositories for each datasource in separate packages
  5. Explicit transaction managers – Specify which transaction manager to use for write operations with @Transactional

This pattern works great when you need to connect to multiple databases, whether they’re different types of databases or different instances of the same database. Oracle’s Universal Connection Pool provides excellent performance and reliability for your database connections.

I hope this helps you work with multiple datasources in your Spring Boot applications! The complete working code is available on GitHub at https://github.com/markxnelson/spring-multiple-jpa-datasources.

Happy coding!


Custom vector distance functions in Oracle (using JavaScript)

In case you missed it, Oracle Database 26ai was announced last week at Oracle AI World, with a heap of new AI features and capabilities like hybrid vector search, MCP server support, acceleration with NVIDIA and much more – check the link for details.

Of course, I wanted to check it out, and I was thinking about what to do first. I remembered this LinkedIn post from Anders Swanson about implementing custom vector distance functions in Oracle using the new JavaScript capabilities, and I thought that could be something interesting to do, so I am going to show you how to implement and use Jaccard distance for dense vector embeddings for similarity searches.

Now, this is a slightly contrived example, because I am more interested in showing you how to add a custom metric than in the actual metric itself. I chose Jaccard because the actual implementation is pretty compact.

Now, Oracle does already include Jaccard distance, but only for the BINARY data type, which is where Jaccard is mostly used. But there is a version that can be used for continuous/real-valued vectors as well (this version is for dense vectors), and that is what we will implement.

This is the formula for Jaccard similarity for continuous vectors. This is also known as the Tanimoto coefficient. It is the intersection divided by the union (or zero if the union is zero):
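
In symbols, for vectors A = (a₁, …, aₙ) and B = (b₁, …, bₙ), that is:

J(A, B) = ( Σᵢ min(aᵢ, bᵢ) ) / ( Σᵢ max(aᵢ, bᵢ) )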

To get the Jaccard distance, we just subtract the Jaccard similarity from one.

Before we start, let’s look at a two-dimensional example to get a feel for how it works. Of course, the real vectors created by embedding models have many more dimensions, but it is hard for us to visualize more than two or three dimensions without also introducing techniques like dimensionality reduction and projection.

Here we have two vectors A [5 8] and B [7 4]:

The intersection is calculated using the element-wise minimums, as you see in the formula above, so in this example it is min(5, 7) + min(8, 4) = 5 + 4 = 9. The union is calculated using the element-wise maximums, so it is max(5, 7) + max(8, 4) = 7 + 8 = 15.

So in this example, the Jaccard similarity is 9 / 15 = 0.6

And so the Jaccard distance is 1 – 0.6 = 0.4
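
If you want to double-check that arithmetic, here is a tiny Python sketch (purely for intuition; it is not part of the database setup):

# Quick sanity check of the 2D example above
a = [5, 8]
b = [7, 4]

intersection = sum(min(x, y) for x, y in zip(a, b))  # 5 + 4 = 9
union = sum(max(x, y) for x, y in zip(a, b))         # 7 + 8 = 15

similarity = intersection / union                    # 0.6
print("Jaccard similarity:", similarity)
print("Jaccard distance:", 1 - similarity)           # 0.4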

Ok, now that we have some intuition about how this distance metric works, let’s implement it in Oracle.

Start up an Oracle Database

First, let’s fire up Oracle Database Free 26ai in a container:

docker run -d --name db26ai \
    -p 1521:1521 \
    -e ORACLE_PWD=Welcome12345 \
    -v db26ai-volume:/opt/oracle/oradata \
    container-registry.oracle.com/database/free:latest

This will pull the latest image, which at the time of writing is 26ai (version tag 23.26.0.0). You can check the logs to see when startup is complete; you’ll see the message “DATABASE IS READY TO USE”:

docker logs -f db26ai

Let’s create a user called vector with the necessary privileges:

docker exec -i db26ai sqlplus sys/Welcome12345@localhost:1521/FREEPDB1 as sysdba <<EOF
alter session set container=FREEPDB1;
create user vector identified by vector;
grant connect, resource, unlimited tablespace, create credential, create procedure, create mle, create any index to vector;
commit;
EOF

Now you can connect with your favorite client. I am going to use Oracle SQL Developer for VS Code. See the link for install instructions.

Implement the custom distance function

Open up an SQL Worksheet, or run this in your tool of choice:


create or replace function jaccard_distance("a" vector, "b" vector)
return binary_double 
deterministic parallel_enable
as mle language javascript pure {{
    // check the vectors are the same length
    if (a.length !== b.length) {
        throw new Error('Vectors must have same length');
    }

    let intersection = 0;
    let union = 0;

    for (let i = 0; i < a.length; i++) { 
        intersection += Math.min(a[i], b[i]);
        union += Math.max(a[i], b[i]);
    }

    // handle the case where union is zero (all-zero vectors)
    if (union === 0) {
        return 0;
    }

    const similarity = intersection / union;
    return 1 - similarity;

}};
/

Let’s walk through this. First, you see that we are creating a function called jaccard_distance which accepts two vectors (a and b) as input and returns a binary_double. This function signature is required for distance functions. Next we must include the deterministic keyword, and we have also included the parallel_enable keyword so that this function could be used with HNSW vector indexes. For the purposes of this example, you can just ignore those or assume that they are just needed as part of the function signature.

Next you see that we mention this will be an MLE function written in JavaScript, and we added the pure keyword to let the database know that this is a pure function – meaning it has no side effects, it will not update any data, and its output will always be the same for a given set of inputs (i.e., that it is memoizable).

Then we have the actual implementation of the function. First, we check that the vectors have the same length (i.e., the same number of dimensions) which is required for this calculation to be applicable.

Then we work through the vectors and collect the minimums and maximums to calculate the intersection and the union.

Next, we check if the union is zero, and if so we return zero to handle that special case. And finally, we calculate the similarity, then subtract it from one to get the distance and return that.

Using our custom distance function

Great, so let’s test our function. We can start by creating a table t1 to store some vectors:

create table t1 (
    id number,
    v vector(2, float32)
);

And let’s add a couple of vectors, including the one we saw in the example above [5 8]:

insert into t1 (id, v) values 
(1, vector('[5, 8]')),
(2, vector('[1, 2]'));

You can run a simple select statement to see the contents of the table:

select * from t1;

This will give these results:

ID     V
1      [5.0E+000,8.0E+000]
2      [1.0E+000,2.0E+000]

Now let’s use our function to see the Jaccard distance for each vector in our table t1 from the other vector we used in the example above [7 4]:

select 
    v,
    jaccard_distance(v, vector('[7, 4]')) distance
from t1
order by distance; 

This returns these results:

V                       DISTANCE
[5.0E+000,8.0E+000]     0.4
[1.0E+000,2.0E+000]     0.7272727272727273

As you can see, the Jaccard distance from [5 8] to [7 4] is 0.4, as we calculated in the example above, and [1 2] to [7 4] is 0.72…

Let’s see how it works with large embeddings

Ok, two-dimensional vectors are good for simple visualization, but let’s try this out with some ‘real’ vectors.

I am using Visual Studio Code with the Python and Jupyter extensions from Microsoft installed.

Create a new Jupyter Notebook using File > New File… then choose Jupyter Notebook as the type of file, and save your new file as jaccard.ipynb.

First, we need to set up the Python runtime environment. Click on the Select Kernel button (it’s at the top right). Select Python Environment then Create Python Environment. Select the option to create a Venv (Virtual Environment) and choose your Python interpreter. I recommend using at least Python 3.11. This will download all the necessary files and will take a minute or two.

Now, let’s install the libraries we will need – enter this into a cell and run it:

%pip install oracledb sentence-transformers

Now, connect to the same Oracle database (again, enter this into a cell and run it):

import oracledb

username = "vector"
password = "vector"
dsn = "localhost:1521/FREEPDB1"

try:
    connection = oracledb.connect(
        user=username,
        password=password,
        dsn=dsn)
    print("Connection successful!")
except Exception as e:
    print(f"Connection failed: {e}")

Let’s create a table to hold 1024 dimension vectors that we will create with the mxbai-embed-large-v1 embedding model. Back in your SQL Worksheet, run this statement:

create table t2 (
    id number,
    v vector(1024, float32)
);

Ok, now let’s create some embeddings. Back in your notebook, create a new cell with this code:

import oracledb
from sentence_transformers import SentenceTransformer

# Initialize the embedding model
print("Loading embedding model...")
model = SentenceTransformer('mixedbread-ai/mxbai-embed-large-v1')

# Your text data
texts = [
    "The quick brown fox jumps over the lazy dog",
    "Machine learning is a subset of artificial intelligence",
    "Oracle Database 23ai supports vector embeddings",
    "Python is a popular programming language",
    "Embeddings capture semantic meaning of text"
]

# Generate embeddings
print("Generating embeddings...")
embeddings = model.encode(texts)

Let’s discuss what we are doing in this code. First, we are going to download the embedding model using the SentenceTransformer. Then, we define a few simple texts that we can use for this example and use the embedding model to create the vector embeddings for those texts.

If you want to see what the embeddings look like, just enter “embeddings” in a cell and run it. In the output you can see the shape is 5 (rows) with 1024 dimensions and the type is float32.
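
Or, a bit more explicitly, you can print the shape and type (embeddings here is the NumPy array returned by model.encode):

print(embeddings.shape, embeddings.dtype)   # (5, 1024) float32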

Now, let’s insert the embeddings into our new table t2:

import array 
cursor = connection.cursor()

# Insert data
for i in range(len(embeddings)):
    cursor.execute("""
        INSERT INTO t2 (id, v)
        VALUES (:1, :2)
    """, [i, array.array('f', embeddings[i].tolist())])

connection.commit()
print(f"Successfully inserted {len(texts)} records")

You can take a look at the vectors with this simple query (back in your SQL Worksheet):

select * from t2;

Which will show you something like this:

And, now let’s try our distance function with these vectors. Back in your notebook, run this cell. I’ve included the built-in cosine distance as well, just for comparison purposes:

query = array.array('f', model.encode("Antarctica is the driest continent").tolist())

cursor = connection.cursor()
cursor.execute("""
    select 
        id,
        jaccard_distance(v, :1),
        vector_distance(v, :2, cosine)
    from t2
    order by id
""", [query, query])

for row in cursor:
    print(f"id: {row[0]} has jaccard distance: {row[1]} and cosine distance: {row[2]}")

cursor.close()

Your output will look something like this:

id: 0 has jaccard distance: 2.0163214889484307 and cosine distance: 0.7859490566650003
id: 1 has jaccard distance: 2.0118706751976925 and cosine distance: 0.6952327173906239
id: 2 has jaccard distance: 2.0152858933816775 and cosine distance: 0.717824211314015
id: 3 has jaccard distance: 2.0216149035530537 and cosine distance: 0.6455277387099003
id: 4 has jaccard distance: 2.0132575761281766 and cosine distance: 0.6962028121886988

Well, there you go! We implemented and used a custom vector distance function. Enjoy!


Let’s make a simple MCP tool for Oracle AI Vector Search

In this earlier post, we created a vector store in our Oracle Database 23ai and populated it with some content from Moby Dick. Since MCP is very popular these days, I thought it might be interesting to look at how to create a very simple MCP server to expose the similarity search as an MCP tool.

Let’s jump right into it. First we are going to need a requirements.txt file with a list of the dependencies we need:

mcp>=1.0.0
oracledb
langchain-community
langchain-huggingface
sentence-transformers
pydantic

And then go ahead and install these by running:

pip install -r requirements.txt

Note: I used Python 3.12 and a virtual environment.

Now let’s create a file called mcp_server.py and get to work! Let’s start with some imports:

import asyncio
import oracledb
from mcp.server import Server
from mcp.types import Tool, TextContent
from pydantic import BaseModel
from langchain_community.vectorstores import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_huggingface import HuggingFaceEmbeddings

And we are going to need the details of the database so we can connect to that, so let’s define some variables to hold those parameters:

# Database connection parameters for Oracle Vector Store
DB_USERNAME = "vector"
DB_PASSWORD = "vector"
DB_DSN = "localhost:1521/FREEPDB1" 
TABLE_NAME = "moby_dick_500_30"  

Note: These match the database and vector store used in the previous post.

Let’s create a function to connect to the database, and set up the embedding model and the vector store.

# Global variables for database connection and embedding model
# These are initialized once on server startup for efficiency
embedding_model = None  # HuggingFace sentence transformer model
vector_store = None     # LangChain OracleVS wrapper for vector operations
connection = None       # Oracle database connection

def initialize_db():
    """
    Initialize database connection and vector store

    This function is called once at server startup to establish:
    1. Connection to Oracle database
    2. HuggingFace embedding model (sentence-transformers/all-mpnet-base-v2)
    3. LangChain OracleVS wrapper for vector similarity operations

    The embedding model converts text queries into 768-dimensional vectors
    that can be compared against pre-computed embeddings in the database.
    """
    global embedding_model, vector_store, connection

    # Connect to Oracle database using oracledb driver
    connection = oracledb.connect(
        user=DB_USERNAME,
        password=DB_PASSWORD,
        dsn=DB_DSN
    )

    # Initialize HuggingFace embeddings model
    # This model converts text to 768-dimensional vectors
    # Same model used to create the original embeddings in the database
    embedding_model = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2"
    )

    # Initialize vector store wrapper
    # OracleVS provides convenient interface for vector similarity operations
    vector_store = OracleVS(
        client=connection,
        table_name=TABLE_NAME,
        embedding_function=embedding_model,
        # Use cosine similarity for comparison
        distance_strategy=DistanceStrategy.COSINE,  
    )

Again, note that I am using the same embedding model that we used to create the vectors in this vector store. This is important because we need to create embedding vectors for the queries using the same model, so that similarity comparisons will be valid. It’s also important that we use the right distance strategy – for text data, cosine is generally agreed to be the best option. For performance reasons, if we had created a vector index, we’d also want to use the same distance metric the index was built with, so the index would be used when performing the search. Oracle will fall back to an “exact” (full) search if there is no index, or if the distance metric does not match the index.

Now, let’s add a function to perform a query in our Moby Dick vector store, we’ll include a top-k parameter so the caller can specify how many results they want:

def search_moby_dick(query: str, k: int = 4) -> list[dict]:
    """
    Perform vector similarity search on the moby_dick_500_30 table

    This function:
    1. Converts the query text to a vector using the embedding model
    2. Searches the database for the k most similar text chunks
    3. Returns results ranked by similarity (cosine distance)

    Args:
        query: The search query text (natural language)
        k: Number of results to return (default: 4)

    Returns:
        List of dictionaries containing rank, content, and metadata for each result
    """
    if vector_store is None:
        raise RuntimeError("Vector store not initialized")

    # Perform similarity search
    # The query is automatically embedded and compared against database vectors
    docs = vector_store.similarity_search(query, k=k)

    # Format results into structured dictionaries
    results = []
    for i, doc in enumerate(docs):
        results.append({
            "rank": i + 1,  # 1-indexed ranking by similarity
            "content": doc.page_content,  # The actual text chunk
            "metadata": doc.metadata  # Headers from the original HTML structure
        })

    return results

As you can see, this function returns a dictionary containing the rank, the content (chunk) and the metadata.
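
Before wiring this into MCP, you can sanity-check the search on its own from a throwaway script or notebook cell. The query text here is just an example; any question about the book will do:

from mcp_server import initialize_db, search_moby_dick

initialize_db()
for result in search_moby_dick("Why does Ishmael decide to go to sea?", k=2):
    print(result["rank"], result["metadata"])
    print(result["content"][:200], "...")
    print()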

Ok, now let’s turn this into an MCP server! First let’s create the server instance:

# Create MCP server instance
# The server name "moby-dick-search" identifies this server in MCP client connections
app = Server("moby-dick-search")

Now we want to provide a list-tools method so that MCP clients can find out what kinds of tools this server provides. We are just going to have our search tool, so let’s define that:

@app.list_tools()
async def list_tools() -> list[Tool]:
    """
    MCP protocol handler: returns list of available tools

    Called by MCP clients to discover what capabilities this server provides.
    This server exposes a single tool: search_moby_dick

    Returns:
        List of Tool objects with names, descriptions, and input schemas
    """
    return [
        Tool(
            name="search_moby_dick",
            description="Search the Moby Dick text using vector similarity. Returns relevant passages based on semantic similarity to the query.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query text"
                    },
                    "k": {
                        "type": "integer",
                        "description": "Number of results to return (default: 4)",
                        "default": 4
                    }
                },
                "required": ["query"]
            }
        )
    ]

And now, the part we’ve all been waiting for – let’s define the actual search tool (and a class to hold the arguments)!

class SearchArgs(BaseModel):
    """
    Arguments for the vector search tool

    Attributes:
        query: The natural language search query
        k: Number of most similar results to return (default: 4)
    """
    query: str
    k: int = 4

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """
    MCP protocol handler: executes tool calls

    Called when an MCP client wants to use one of the server's tools.
    Validates the tool name, parses arguments, performs the search,
    and returns formatted results.

    Args:
        name: Name of the tool to call
        arguments: Dictionary of tool arguments

    Returns:
        List of TextContent objects containing the formatted search results
    """
    # Validate tool name
    if name != "search_moby_dick":
        raise ValueError(f"Unknown tool: {name}")

    # Parse and validate arguments using Pydantic model
    args = SearchArgs(**arguments)

    # Perform the vector similarity search
    results = search_moby_dick(args.query, args.k)

    # Format response as human-readable text
    response_text = f"Found {len(results)} results for query: '{args.query}'\n\n"

    for result in results:
        response_text += f"--- Result {result['rank']} ---\n"
        response_text += f"Metadata: {result['metadata']}\n"
        response_text += f"Content: {result['content']}\n\n"

    # Return as MCP TextContent type
    return [TextContent(type="text", text=response_text)]

That was not too bad. Finally, let’s set up a main function to start up everything and handle the requests:

async def main():
    """
    Main entry point for the MCP server

    This function:
    1. Initializes the database connection and embedding model
    2. Sets up stdio transport for MCP communication
    3. Runs the server event loop to handle requests

    The server communicates via stdio (stdin/stdout), which allows
    it to be easily spawned by MCP clients as a subprocess.
    """
    # Initialize database connection and models
    initialize_db()

    # Import stdio server transport
    from mcp.server.stdio import stdio_server

    # Run the server using stdio transport
    # The server reads MCP protocol messages from stdin and writes responses to stdout
    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            app.create_initialization_options()
        )

if __name__ == "__main__":
    asyncio.run(main())

Ok, that’s it! We can run this with the command:

python mcp_server.py

Now, to test it, we’re going to need a client! MCP Inspector is the logical place to start; you can get it from here, or (assuming you have node installed) by just running this command:

 npx @modelcontextprotocol/inspector python3.12 mcp_server.py

That’s going to start up a UI that looks like this:

Click on the connect button, and you should see an updated screen in a few seconds that looks like this:

Go ahead and click on List Tools and you will see our Search Moby Dick Tool show up – click on it to try it out.

You should see some results like this:

There you go, it works great! And that’s a super simple, basic MCP server and tool! Enjoy.


Exploring securing vector similarity searches with Real Application Security

In this post, I want to explore how you can use Real Application Security to provide access controls for vectors in a vector store in Oracle Database 23ai.

I’m going to use the vector store we created in the last post as an example. If you want to follow along, you should follow that one first to create and populate your vector store, then come back here.

You should have a vector store table called MOBY_DICK_500_30 that you created in that previous post. You can connect to Oracle using SQLcl or SQL*Plus or whatever tool you prefer and check the structure of that table:

SQL> describe moby_dick_500_30

Name         Null?       Type
____________ ___________ ____________________________
ID           NOT NULL    RAW(16 BYTE)
TEXT                     CLOB
METADATA                 JSON
EMBEDDING                VECTOR(768,FLOAT32,DENSE)

Notice that the metadata column contains the document structure information from the loaders that we used. If we filter for Chapter 12, we can see there are 13 vectors associated with that chapter:

SQL> select metadata from moby_dick_500_30 where metadata like '%CHAPTER 12.%';

METADATA
__________________________________________________________________________________
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}

13 rows selected.

We are going to use this metadata to filter access to the vectors.

Set up permissions

Let’s start by setting up the necessary permissions. You will need to run this as the SYS user:

alter session set container=freepdb1;
grant create session, xs_session_admin to vector;
exec sys.xs_admin_util.grant_system_privilege('provision', 'vector', sys.xs_admin_util.ptype_db);
grant create role to vector;
exec sys.xs_admin_util.grant_system_privilege('admin_sec_policy', 'vector', sys.xs_admin_util.ptype_db);
exec sys.xs_admin_util.grant_system_privilege('ADMIN_ANY_SEC_POLICY', 'vector', sys.xs_admin_util.ptype_db);

Great! Now let’s set up Real Application Security. We will run the rest of these commands as the VECTOR user.

Let’s start by creating a RAS role named role1:

exec sys.xs_principal.create_role(name => 'role1', enabled => true);

Now, we will create a user named user1 and grant them role1 and connect privileges:

exec  sys.xs_principal.create_user(name => 'user1', schema => 'vector');
exec  sys.xs_principal.set_password('user1', 'pwd1');
exec  sys.xs_principal.grant_roles('user1', 'XSCONNECT');
exec  sys.xs_principal.grant_roles('user1', 'role1');

Let’s also create a regular database role and give it access to the vector store table:

create role db_emp;
grant select, insert, update, delete on vector.moby_dick_500_30 to db_emp; 

Grant DB_EMP to the application role, so it has the required object privileges to access the table:

grant db_emp to role1;

Next, we want to create a security class, and include the predefined DML security class:

begin
  sys.xs_security_class.create_security_class(
    name        => 'moby_privileges',
    parent_list => xs$name_list('sys.dml'),
    priv_list   => xs$privilege_list(xs$privilege('view_moby_dick')));
end;

Now we can create an ACL (access control list) which will grant the privileges for the policy that we will define in a moment:

declare 
  aces xs$ace_list := xs$ace_list(); 
begin
  aces.extend(1);
  aces(1) := xs$ace_type(
     privilege_list => xs$name_list('select'),
     principal_name => 'USER1');
  
  sys.xs_acl.create_acl(
    name  => 'moby_acl',
    ace_list  => aces,
    sec_class => 'moby_privileges');
end;

Ok, nearly there! Finally, let’s define the security policy and apply it to the table:

declare
  realms xs$realm_constraint_list := xs$realm_constraint_list();      
begin  
  realms.extend(1);
 
  -- Filter based on column value
  realms(1) := xs$realm_constraint_type(
    realm    => 'metadata LIKE ''%CHAPTER 12.%''',
    acl_list => xs$name_list('moby_acl'));

  sys.xs_data_security.create_policy(
    name                   => 'moby_policy',
    realm_constraint_list  => realms);
    
  sys.xs_data_security.apply_object_policy(
    policy => 'moby_policy',
    schema => 'vector',
    object =>'moby_dick_500_30');
end;

Ok, that’s it!

Now, you may have noticed we did not give ourselves any permissions, so if we try to query that vector store table now, you’ll see it appears empty!

SQL> select count(*) from moby_dick_500_30;

   COUNT(*)
___________
          0

But, if we reconnect with the application user (user1) that we defined, and do the same query, we will see those 13 records for Chapter 12:

SQL> connect user1/pwd1
Connected.
SQL> select count(*) from moby_dick_500_30;

   COUNT(*)
___________
         13

So there you have it! We can define policies to easily control access to vectors. In this example we used the metadata to create the filtering rules, of course you could create whatever kind of rules you need.

This allows you to have a vector store which can be easily filtered for different users (or roles), essentially creating a virtual private vector store. You might want to allow a ‘customer-support’ role to access a certain subset of the vectors, for example, but allow your ‘supervisor’ role to access a larger set (or all) of the vectors.

What’s great about this is that the security is enforced in the database itself. When an AI Assistant, chatbot, MCP client, etc., performs a vector search, it will only ever be able to get back results from the vectors that the user is allowed to see. The database will never send vectors to users who are not allowed to see them. So you don’t have to worry about trusting the LLM not to make a mistake and give out the wrong data, because it will literally never see the data in the first place.
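
To see this end to end, here is a minimal sketch of a similarity search run while connected as user1, in the same Python style as the earlier posts. It assumes user1 can log in directly (we gave it a password and XSCONNECT above), that the embedding model from the earlier post is available locally, and that the query text is just an example; adjust the details to match your setup:

import array
import oracledb
from sentence_transformers import SentenceTransformer

oracledb.defaults.fetch_lobs = False  # return the TEXT CLOB as a plain string

# Connect as the RAS application user, not the vector schema owner
connection = oracledb.connect(user="user1", password="pwd1",
                              dsn="localhost:1521/FREEPDB1")

# Same embedding model that was used to build the vector store
model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
query = array.array("f", model.encode("Queequeg's life as a harpooneer").tolist())

cursor = connection.cursor()
cursor.execute("""
    select text
    from vector.moby_dick_500_30
    order by vector_distance(embedding, :1, cosine)
    fetch first 4 rows only
""", [query])

for (text,) in cursor:
    # Only the Chapter 12 chunks are visible to user1, so only they can be returned
    print(text[:120], "...")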


Basic Retrieval Augmented Generation with Oracle Vector Store in LangChain

In this earlier post, we learned how to create a Vector Store in Oracle from LangChain, the hugely popular Python library for working with Generative AI and Large Language Models. We populated it with a little data and then performed some simple vector similarity searches.

In this post, let’s expand on that to implement basic Retrieval Augmented Generation!

First, let’s talk about some concepts – if you already know this, feel free to jump ahead!

Generative AI – This is a type of Artificial Intelligence (AI) that uses a specialized form of machine learning model, called a Large Language Model (or “LLM”), to create (“generate”) new content based on a prompt from a user. It works by looking at the “tokens” that it received in the input (the “prompt”) and then figuring out what is the most probable next token in the sequence. What is a token? Well, it may be a word, or a part of a word, but we use the word “token” because it could also be part of an audio file or an image, since some of these models support other types of data, not just text. Note that it only generates one token at a time – you have to “run” the model again for every subsequent token.

Training – these models are “trained” by exposing them to very large amounts of data. Usually the data is publicly available information, collected from the Internet and/or other repositories. Training a model is very expensive, both in terms of time, and in terms of the cost of running the specialized GPU hardware needed to perform the training. You may see a model described as having “70 billion parameters” or something like that. Training is basically a process of tuning the values of each of these parameters based on the data the model is exposed to.

When a model sees a prompt like “My Husky is a very good” it will use those probabilities to determine what comes next. In this example, “dog” would have a very high probability of being the next “token”.

Hyper-parameters – models also have extra parameters that control how they behave. These so-called “hyper-parameters” include things like “temperature” which controls how creative the model will be, “top-K” which controls how many options the model will consider when choosing the next token, and various kinds of “frequency penalties” that will make the model more or less likely to reuse/repeat tokens. Of course these are just a few examples.

Knowledge cut-off – an important property of LLMs is that they have not been exposed to any information that was created after their training ended. So a model trained in 2023 would not know who won an election held in 2024, for example.

Hallucination – LLMs tend to “make up an answer” if they do not “know” the answer. Now, obviously they don’t really know anything in the same sense that we know things, they are working with probabilities. But if we can anthropomorphize them for a moment, they tend to “want” to be helpful, and they are very likely to offer you a very confident but completely incorrect answer if they do not have the necessary information to answer a question.


Now, of course a lot of people want to use this exciting new technology to implement solutions to help their customers or users. Chatbots are a prime example of something that is frequently implemented using Generative AI these days. But of course no new technology is a silver bullet; they all come with their own challenges and issues. Let’s consider some common challenges when attempting to implement a project with Generative AI:

  • Which model to use? There are many models available, and they have all been trained differently. Some are specialized models that are trained to perform a particular task, for example summarizing a document. Other models are general purpose and can perform different tasks. Some models understand only one language (like English) and others understand many. Some models only understand text, others only images, others video, and others still are multi-modal and understand several different types of data. Models also have different licensing requirements. Some models are provided as a service, like a utility, where you typically pay a very small amount per request. Other models can be self-hosted, that is, run on your own hardware.
  • Privacy. Very often the data that you need for your project is non-public data, and very often you do not want to share that data with a third-party organization for privacy reasons, or even regulatory reasons, depending on your industry. People are also very wary about a third-party organization using their non-public data to train future models.
  • How to “tune” the model’s hyper-parameters. As we discussed earlier, the hyper-parameters control how the model behaves. The settings of these parameters can have a significant impact on the quality of the results that are produced.
  • Dealing with knowledge cut-off. Giving the model access to information that is newer than its training data is also a key challenge. Probably the most obvious way to do this is to continue the model’s training by exposing it to this newer information. This is known as “fine-tuning”. The key challenge is that this is an extremely expensive undertaking, requiring specialized GPU hardware and very highly skilled people to plan and run the training.

Enter “Retrieval Augmented Generation” (or “RAG”), first introduced by Patrick Lewis et al., from Meta, in the 2020 paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”. RAG is a technique that gives LLMs access to non-public information and/or information created after their training, and it is orders of magnitude less expensive than fine-tuning.

The essence of RAG is to provide the LLM with the information that it needs to answer a question by “stuffing” that information into the LLM’s “context window” or “prompt” along with the actual question. It’s a bit like an open-book test. Imagine you get a question like this:

How much does a checking account from (some bank) cost? 

And let’s assume that information is not readily available on the Internet. How would you come up with the answer? You likely could not.

But if the question was more like this:

How much does a checking account from (some bank) cost? 

To answer this question, consult these DOCUMENTS:

(here there would be the actual content of those documents that provide the information necessary to answer that question)

Much easier right? That’s basically what RAG is. It provides the most relevant information to the LLM so that it can answer the question.
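
In code, that “stuffing” can be as simple as formatting the retrieved text into the prompt alongside the question. This is just a minimal sketch with made-up wording; later in this post we will use a ready-made prompt from the LangChain hub instead:

question = "How much does a checking account from (some bank) cost?"
relevant_docs = "(the actual content of the relevant documents would go here)"

# Stuff the retrieved documents into the prompt along with the question.
prompt = f"""Answer the question using only the DOCUMENTS provided below.
If the DOCUMENTS do not contain the answer, say that you don't know.

Question: {question}

DOCUMENTS:
{relevant_docs}
"""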

So now, the obvious questions are – where does it get this information from, and how does it know which parts are the most relevant?

This is where our Vector Store comes in!

The set of information, the non-public data that we want the LLM to use, we call that the “corpus”. Very often the corpus will be so large that there is no reasonable way for us to just give the LLM the whole thing. Now, as I am writing this in May 2025, there are models that have very large “context windows” and could be given a large amount of data. Llama 4 was just released, as I write this, and has a context window size of 10 million tokens! So you could in fact give it a large amount of information. But models that were released as recently as six or twelve months ago have much smaller context windows.

So the approach we use is to take the corpus and split it up into small pieces, called “chunks”, and we create an “embedding vector” for each of these chunks. This vector is basically an n-dimensional numerical representation of the semantic meaning of the chunk. Chunks with similar meanings will have similar (i.e., close) vectors. Chunks with different meanings will have vectors that are further apart.

Now, visualizing an n-dimensional vector is challenging. But if n=2, it’s a lot easier. So let’s do that! Remember, in real models, n is much more likely to be in the thousands or tens of thousands, but the concepts are the same. Consider the diagram below:

In this diagram, we have only two dimensions. The vertical dimension is “largeness” – how large (or small) the thing is. The horizontal dimension is “dog-ness” – how much the thing is (or is not) a dog.

Notice that both the Saint Bernard and the German Shepherd (I hope I got those breeds right!) are large dogs. So the vectors for both of them are high on both axes, and their vectors are very close together, because in this two-dimensional world, they are indeed very, very similar. The wolf is also large, but it is not actually a dog. Dogs are related to (descended from) wolves, so it is somewhat dog-like, but its vector is quite a distance away from the actual large dogs.

Now, look at the tennis ball! It is not large, and it is not a dog, so its vector is almost in the complete opposite direction to the large dogs.

Now, consider the question “Is a Husky a large dog?”

What we do in RAG, is we take that question, and turn that into a vector, using the exact same “embedding model” that we used to create those vectors we just looked at above, and then we see what other vectors are close to it.

Notice that the resulting vector, represented by the red dot, ended up very close to those two large dogs! So if we did a similarity search, that is, if we found the closest vectors to our question vector, what we would get back is the vectors for the Saint Bernard and the German Shepherd.
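
If you want to play with that two-dimensional example yourself, here is a small sketch. The (dog-ness, largeness) coordinates are completely made up for illustration; the point is just that cosine similarity ranks the large dogs closest to the question vector:

import math

# Made-up (dog-ness, largeness) vectors, purely for illustration.
vectors = {
    "Saint Bernard":   (0.95, 0.90),
    "German Shepherd": (0.95, 0.80),
    "wolf":            (0.30, 0.80),
    "tennis ball":     (-0.90, -0.80),
}
question_vector = (0.90, 0.85)   # "Is a Husky a large dog?"

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norms

# The two large dogs score close to 1.0; the tennis ball scores close to -1.0.
for name, vec in vectors.items():
    print(f"{name}: {cosine_similarity(question_vector, vec):.2f}")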

Here’s a diagram of the RAG process:

So we take the question from the user, we turn it into a vector, we find the closest vectors to it in our corpus, and then we get the actual content that those vectors were created from and give that information to the LLM to allow it to answer the question. Remember, in real life there are many more dimensions, and they are not going to be some concept that we can neatly label, like “largeness”. The actual dimensions are things that are learned by the model over many billions of iterations of weight adjustments as it was exposed to vast amounts of data. The closest (non-mathematical) analogy I can think of is Isaac Asimov’s “positronic brain” in his Robots, Empire and Foundation series, which he described as learning through countless small adjustments of uncountable numbers of weights.


Wow! That was a lot of theory! Let’s get back to some code, please!

In the previous post, we populated our vector store with just three very small quotes from Moby Dick. Now, let’s use the entire text!

Here’s the plain text version: https://www.gutenberg.org/cache/epub/2701/pg2701.txt

Here’s the same book in HTML, with some basic structure like H2 tags for the chapter headings: https://www.gutenberg.org/cache/epub/2701/pg2701-images.html

Let’s create a new notebook. If you followed along in the previous post, you can just create a new notebook in the same project and choose the same environment/kernel. If not, create a new project, then create a notebook, for example basic-rag.ipynb and create a kernel:

Click on the Select Kernel button (it’s on the top right). Select Python Environment then Create Python Environment. Select the option to create a Venv (Virtual Environment) and choose your Python interpreter. I recommend using at least Python 3.11. This will download all the necessary files and will take a minute or two.

If you created a new environment, install the necessary packages by creating and running a cell with this content. Note that you can run this even if you have a pre-existing environment; it won’t do any harm:

%pip install -qU "langchain[openai]"
%pip install oracledb
%pip install langchain-community langchain-huggingface

Now, create and run this cell to set your OpenAI API key:

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini", model_provider="openai")

model.invoke("Hello, world!")

Paste your key in when prompted (see the previous post if you need to know how to get one) and confirm you got the expected response from the model.

Note: You could, of course, use a different model if you wanted to. See the LangChain model documentation for options.

Now, let’s connect to the database by creating and running this cell (this assumes that you started the database container and created the user as described in the previous post!)

import oracledb

username = "vector"
password = "vector"
dsn = "localhost:1521/FREEPDB1"

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn)
    print("Connection successful!")
except Exception as e:
    print("Connection failed!")

Ok, now we are ready to read that document and create our vector embeddings. But how? In the previous post we manually created some excerpts, but now we want to read the whole document.

Enter Document Loaders! Take a look at that page; LangChain has hundreds of different document loaders that understand all kinds of document formats.

Let’s try the basic web loader, create and run this cell to install it:

%pip install -qU langchain_community beautifulsoup4

Now create and run this cell to initialize the document loader:

from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://www.gutenberg.org/cache/epub/2701/pg2701-images.html")

Now load the documents by running this cell:

docs = loader.load()

If you’d like, take a look at the result, by running this cell:

docs[0]

Well, that is just one big document, which is not so helpful. We want to split it up into smaller chunks so we can create vectors for each smaller part. Let’s use a document splitter instead.

Install a splitter by running this cell:

%pip install -qU langchain-text-splitters

Note: Check out this page for more information about the available splitters. We are going to use the HTMLHeaderTextSplitter. Run this cell:

from langchain_text_splitters import HTMLHeaderTextSplitter

url = "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html"

headers_to_split_on = [
    ("h1", "Header 1"),
    ("h2", "Header 2"),
    ("h3", "Header 3"),
    ("h4", "Header 4"),
]

html_splitter = HTMLHeaderTextSplitter(headers_to_split_on)

html_header_splits = html_splitter.split_text_from_url(url)

Let’s see what that did, run this cell:

html_header_splits

You’ll see a long list of Documents and if you look carefully, you can see that it has maintained the structure information.
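
If you want to see that structure, you can print how many splits were produced and peek at the metadata of the first few. The metadata keys correspond to the header names we configured above:

print(len(html_header_splits))
for split in html_header_splits[:5]:
    print(split.metadata)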

Great! That’s a lot better.

Now, let’s suppose we wanted to constrain the size of the chunks. Some of those might be too big; we might want to split them even further. We can do that with a RecursiveCharacterTextSplitter.

Let’s say we wanted chunks no bigger than 500 characters, with an overlap of 30. Now this might not be a good idea, but just for the sake of the example, let’s do it by running this cell:

from langchain_text_splitters import RecursiveCharacterTextSplitter

chunk_size = 500
chunk_overlap = 30
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size, chunk_overlap=chunk_overlap
)

# Split
splits = text_splitter.split_documents(html_header_splits)

You can take a look at a few of the chunks by running this cell:

splits[80:85]
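
And if you’d like a quick sanity check, this cell prints how many chunks we ended up with and the length of the longest one, which should be at (or very close to) our 500 character limit:

print(len(splits))
print(max(len(doc.page_content) for doc in splits))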

Ok, great! Next, we need to create our embeddings and populate our vector store.

Install the dependencies, if you have not already, by running this cell:

%pip install langchain-community langchain-huggingface

And let’s create the vector store! Run this cell:

from langchain_community.vectorstores import oraclevs
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_store = OracleVS.from_documents(
    splits,
    embedding_model,
    client=connection,
    table_name="moby_dick_500_30",
    distance_strategy=DistanceStrategy.COSINE,
)

We are using the same embedding model as we did in the previous post, but now we are passing in the splits we just created – the 500-character chunks of the larger, structure-aware splits from the HTML document. And we named our vector store table moby_dick_500_30 to make it a little easier to remember what we put in there (and which chunk size and overlap we used).

After that cell has finished (it might take a few minutes), you can take a look to see what is in the vector store by running this command in your terminal window:

docker exec -i db23ai sqlplus vector/vector@localhost:1521/FREEPDB1 <<EOF
select table_name from user_tables;
describe moby_dick_500_30;
column id format a20;
column text format a30;
column metadata format a30;
column embedding format a30;
set linesize 150;
select * from moby_dick_500_30
fetch first 3 rows only;
EOF

You should get something similar to this:

Let’s try our searches again, run this cell:

query = 'Where is Rokovoko?'
print(vector_store.similarity_search(query, 1))

query2 = 'What does Ahab like to do after breakfast?'
print(vector_store.similarity_search(query2, 1))

You can change that 1 to a larger number now, since you have many more vectors, to see what you get!
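
If you would also like to see how close each match was, OracleVS, like most LangChain vector stores, offers similarity_search_with_score, which returns a distance along with each document (for cosine distance, smaller means closer):

# Show the three closest chunks along with their distances.
for doc, score in vector_store.similarity_search_with_score(query, 3):
    print(round(score, 4), doc.page_content[:80])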

Ok, now we have all the pieces we need and we are ready to implement the RAG!

The most basic way to implement RAG is to use a “retriever” – we can grab one from our vector store like this:

retriever = vector_store.as_retriever()
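
By default the retriever returns the top few matches. If you want to control how many documents it returns, you can pass search_kwargs, which is standard LangChain retriever configuration, for example:

# Optional: a retriever that returns only the two closest chunks.
top2_retriever = vector_store.as_retriever(search_kwargs={"k": 2})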

Try it out by asking a question:

docs = retriever.invoke("Where is Rokovoko?")

docs

You’ll get something like this:

Nearly there!

Now, we want to give the LLM a good prompt to tell it what to do, and include the retrieved documents. Let’s use a standard prompt for now:

from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_messages = prompt.invoke(
    {"context": "(context goes here)", "question": "(question goes here)"}
).to_messages()

assert len(example_messages) == 1
print(example_messages[0].content)

The prompt looks like this:

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: (question goes here) 
Context: (context goes here) 
Answer:

Ok, now to put it all together. In real life we’d probably want to use LangGraph at this point, and we’d want to think about including things like memory, ranking the results from the vector search, citations/references (“grounding” the answer), and streaming the output. But that’s all for another post! For now, let’s just do the most basic implementation:

question = "..."

retrieved_docs = vector_store.similarity_search(question)
docs_content = "\n\n".join(doc.page_content for doc in retrieved_docs)
prompt_val = prompt.invoke({"question": question, "context": docs_content})
answer = model.invoke(prompt_val)

answer

You should get an answer similar to this:

AIMessage(content='Rokovoko is an island located far away to the West and South, as mentioned in relation to Queequeg, a native of the island. It is not found on any maps, suggesting it may be fictional.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 402, 'total_tokens': 448, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_54eb4bd693', 'id': 'chatcmpl-BaW3g2omlCY0l6LwDkC9Ub8Ls3V88', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--86f14be2-9c8f-43c9-ae89-259db1c640bd-0', usage_metadata={'input_tokens': 402, 'output_tokens': 46, 'total_tokens': 448, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

That’s a pretty good answer!
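
If you just want the generated text, without all of the metadata, print the content of the message:

print(answer.content)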

Well, there you go, we covered a lot of ground in this post, but that’s just a very basic RAG. Stay tuned to learn about implementing a more realistic RAG in the next post!


Getting started with Oracle Vector Store support in LangChain

In this post, I would like to show you the basics of how to use the Oracle Vector Store support in LangChain. I am using Visual Studio Code with the Python and Jupyter extensions from Microsoft installed. I will show more detailed usage in future posts!

Prefer to watch a video? Check it out here:

To get started, create a new project in Visual Studio Code, and then create a new Jupyter Notebook using File > New File… then choose Jupyter Notebook as the type of file, and save your new file as getting_started.ipynb.

First, we need to set up the Python runtime environment. Click on the Select Kernel button (it’s on the top right). Select Python Environment then Create Python Environment. Select the option to create a Venv (Virtual Environment) and choose your Python interpreter. I recommend using at least Python 3.11. This will download all the necessary files and will take a minute or two.

In this example, we will use OpenAI for our chat model. You’ll need to get an API Key from OpenAI, which you can do by logging into https://platform.openai.com/settings/organization/api-keys and creating a key. Of course you could use a different model, including a self-hosted model so that you don’t have to send your data outside your organization. I’ll cover that in future posts, stay tuned!

In the first cell, check that the type is Python and enter this code:

%pip install -qU "langchain[openai]"

Press Shift+Enter or click on the Run icon to run this code block. This will also take a minute or so to install the LangChain library for OpenAI.

Now create a second cell and paste in this code:

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini", model_provider="openai")

Run this block, and when it prompts you for your key, paste in the key; it will start with something like sk-proj and have a long string of mostly letters and numbers after that. This will save your key in the environment so that you don’t have to keep entering it each time.

Now, we are ready to talk to the LLM! Let’s try a simple prompt. Create a new cell and enter this code:

model.invoke("Hello, world!")

Run this cell and observe the output. It should look something like this:

Great, now we are ready to connect to a vector store. If you don’t already have one, start up an instance of Oracle Database 23ai in a container on your machine. Run this command in a terminal window (not the notebook):

docker run -d --name db23ai \
  -p 1521:1521 \
  -e ORACLE_PWD=Welcome12345 \
  -v db23ai-volume:/opt/oracle/oradata \
  container-registry.oracle.com/database/free:latest

This will start up an Oracle Database 23ai Free instance in a container. It will have a PDB called FREEPDB1 and the password for PDBADMIN (and SYS and SYSTEM) will be Welcome12345.
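
The database takes a little while to initialize, especially on the first startup. You can follow the container logs and wait until you see a line like “DATABASE IS READY TO USE!” before continuing:

docker logs -f db23ai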

Now, run the following command to create an Oracle user with appropriate permissions to create a vector store:

docker exec -i db23ai sqlplus sys/Welcome12345@localhost:1521/FREEPDB1 as sysdba <<EOF
alter session set container=FREEPDB1;
create user vector identified by vector;
grant connect, resource, unlimited tablespace, create credential, create procedure, create any index to vector;
commit;
EOF

Let’s connect to the database! First we’ll install the oracledb library. Create a new cell and enter this code:

%pip install oracledb

Run this code block to install the library.

Now create a new code block with this code:

import oracledb

username = "vector"
password = "vector"
dsn = "localhost:1521/FREEPDB1"

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn)
    print("Connection successful!")
except Exception as e:
    print("Connection failed!")

Run this code block. You should see the output “Connection successful!”

Now, let’s install the dependencies we will need to load some documents into the vector store. Create a new cell with this code and run it:

%pip install langchain-community langchain-huggingface

Now, import the things we will need by creating a new cell with this code and running it:

from langchain_community.vectorstores import oraclevs
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

We are going to need some documents to load into the vector store, so let’s define some to use for an example. In real life, you’d probably want to use your own non-public documents to load a vector store if you were building a chatbot or using retrieval augmented generation. Create and run a new cell with this code:

documents_json_list = [
    {
        "id": "moby_dick_2701_P1",
        "text": "Queequeg was a native of Rokovoko, an island far away to the West and South. It is not down in any map; true places never are.",
        "link": "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0012",
    },
    {
        "id": "moby_dick_2701_P2",
        "text": "It was not a great while after the affair of the pipe, that one morning shortly after breakfast, Ahab, as was his wont, ascended the cabin-gangway to the deck. There most sea-captains usually walk at that hour, as country gentlemen, after the same meal, take a few turns in the garden.",
        "link": "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0036",
    },
    {
        "id": "moby_dick_2701_P3",
        "text": "Now, from the South and West the Pequod was drawing nigh to Formosa and the Bashee Isles, between which lies one of the tropical outlets from the China waters into the Pacific. And so Starbuck found Ahab with a general chart of the oriental archipelagoes spread before him; and another separate one representing the long eastern coasts of the Japanese islands—Niphon, Matsmai, and Sikoke. ",
        "link": "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0109",
    },
]

Now, let’s load them into a LangChain documents list with some metadata. Create and run a cell with this code:

# Create Langchain Documents

documents_langchain = []

for doc in documents_json_list:
    metadata = {"id": doc["id"], "link": doc["link"]}
    doc_langchain = Document(page_content=doc["text"], metadata=metadata)
    documents_langchain.append(doc_langchain)

Ok, great. Now we can create a vector store and load those documents. Create and run a cell with this code:

# Use a separate variable for the embedding model so we don't overwrite the chat model we created earlier.
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_store = OracleVS.from_documents(
    documents_langchain,
    embedding_model,
    client=connection,
    table_name="Documents_COSINE",
    distance_strategy=DistanceStrategy.COSINE,
)

Let’s have a look in the database and see what was created. Run this code in your terminal:

docker exec -i db23ai sqlplus vector/vector@localhost:1521/FREEPDB1 <<EOF
select table_name from user_tables;
describe documents_cosine;
column id format a20;
column text format a30;
column metadata format a30;
column embedding format a30;
set linesize 150;
select * from documents_cosine;
EOF

You should see output similar to this:

SQL>
TABLE_NAME
--------------------------------------------------------------------------------
DOCUMENTS_COSINE

SQL>  Name                                         Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL RAW(16)
 TEXT                                               CLOB
 METADATA                                           JSON
 EMBEDDING                                          VECTOR(768, FLOAT32)

SQL> SQL> SQL> SQL> SQL> SQL>
ID                   TEXT                           METADATA                       EMBEDDING
-------------------- ------------------------------ ------------------------------ ------------------------------
957B602A0B55C487     Now, from the South and West t {"id":"moby_dick_2701_P3","lin [9.29364376E-003,-5.70030287E-
                     he Pequod was drawing nigh to  k":"https://www.gutenberg.org/ 002,-4.62282933E-002,-1.599499
                     Formosa and the Bash           cache/epub/2701/pg27           58E-002,

A8A71597D56432FD     Queequeg was a native of Rokov {"id":"moby_dick_2701_P1","lin [4.28722538E-002,-8.80071707E-
                     oko, an island far away to the k":"https://www.gutenberg.org/ 003,3.56001826E-003,6.765306E-
                      West and South. It            cache/epub/2701/pg27           003,

E7675836CF07A695     It was not a great while after {"id":"moby_dick_2701_P2","lin [1.06763924E-002,3.91203648E-0
                      the affair of the pipe, that  k":"https://www.gutenberg.org/ 04,-1.01576066E-002,-3.5316135
                     one morning shortly            cache/epub/2701/pg27           7E-002,

Now, let’s do a vector similarity search. Create and run a cell with this code:

query = 'Where is Rokovoko?'
print(vector_store.similarity_search(query, 1))

query2 = 'What does Ahab like to do after breakfast?'
print(vector_store.similarity_search(query2, 1))

This will find the one (1) nearest match in each case. You should get an answer like this:

[Document(metadata={'id': 'moby_dick_2701_P1', 'link': 'https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0012'}, page_content='Queequeg was a native of Rokovoko, an island far away to the West and South. It is not down in any map; true places never are.')]

[Document(metadata={'id': 'moby_dick_2701_P2', 'link': 'https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0036'}, page_content='It was not a great while after the affair of the pipe, that one morning shortly after breakfast, Ahab, as was his wont, ascended the cabin-gangway to the deck. There most sea-captains usually walk at that hour, as country gentlemen, after the same meal, take a few turns in the garden.')]

Well, there you go, that’s the most basic example of creating a vector store, loading some documents into it and doing a simple similarity search. Stay tuned to learn about more advanced features!
