Using Reflection to Help LLMs Write Better SQL

Getting LLMs to write good SQL can be tricky. Sure, they can generate syntactically correct queries, but do those queries actually answer the question you asked? Sometimes an LLM might give you technically valid SQL that doesn’t quite capture what you’re really looking for.

I wanted to experiment with the reflection pattern to see if we could get better results. The idea is simple: after the LLM generates SQL and executes it, have it reflect on whether the query actually answers the original question. If not, let it try again with the benefit of seeing both the question and the initial results.

Let me show you how this works.

Setting up the database

I used an Oracle Autonomous Database on Oracle Cloud for this experiment. First, I created a user with the necessary permissions. Connect as ADMIN and run this:

create user moviestream identified by <password>;
grant connect, resource, unlimited tablespace to moviestream;
grant execute on dbms_cloud to moviestream;
grant execute on dbms_cloud_repo to moviestream;
grant create table to moviestream;
grant create view to moviestream;
grant all on directory data_pump_dir to moviestream;
grant create procedure to moviestream;
grant create sequence to moviestream;
grant create job to moviestream;

Next, let’s load the sample dataset. Still as ADMIN, run this:

declare 
    l_uri varchar2(500) := 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/c4u04/b/building_blocks_utilities/o/setup/workshop-setup.sql';
begin
    dbms_cloud_repo.install_sql(
        content => to_clob(dbms_cloud.get_object(object_uri => l_uri))
    );
end;
/

Then connect as the moviestream user and run this to load the rest of the dataset:

BEGIN
    workshop.add_dataset(tag => 'end-to-end');
END;
/

This takes a few minutes to complete, after which we have a database with customer and sales data to work with.

The approach

The reflection pattern works like this:

  1. Give the LLM the database schema and a natural language question
  2. LLM generates SQL (v1)
  3. Execute the SQL and get results
  4. LLM reflects: “Does this SQL actually answer the question?”
  5. Generate improved SQL (v2) based on the reflection
  6. Execute v2 and provide the final answer

The key insight here is that by seeing the actual results, the LLM can judge whether it interpreted the question correctly. For example, if you ask “who are our top customers?”, the LLM might initially think “highest income” when you actually meant “highest spending”. Seeing the results helps it course-correct.

Setting up the Python environment

I used a Jupyter notebook for this experiment. First, let’s install the libraries we need:

%pip install aisuite oracledb 

I’m using Andrew Ng’s aisuite for a unified interface to different LLM providers, and oracledb to connect to the database.

Now let’s import aisuite:

import aisuite as ai

Connecting to Oracle Autonomous Database

For Oracle Autonomous Database, you’ll need to download the wallet and set up the connection. Here’s how I connected:

import oracledb

username = "moviestream"
password = "<password>"
dsn = "<connection_string>"
wallet = '<path_to_wallet>'

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn,
        config_dir=wallet,
        wallet_location=wallet,
        wallet_password='<wallet_password>')
    print("Connection successful!")
except Exception as e:
    print(e)
    print("Connection failed!")

And set the TNS_ADMIN environment variable:

import os
os.environ['TNS_ADMIN'] = wallet

Configuring the LLM client

Let’s set up the AI client. I used GPT-4o for this experiment:

client = ai.Client()
os.environ['OPENAI_API_KEY']='<your_api_key>'

models = ['openai:gpt-4o']

Getting the database schema

For the LLM to write good SQL, it needs to know what tables and columns are available. Let’s write a function to introspect the schema:

def get_schema():
    stmt = f'''
    SELECT 
        utc.table_name,
        utc.column_name,
        utc.data_type,
        utc.data_length,
        utc.nullable,
        utc.column_id,
        ucc.comments AS column_comment,
        utab.comments AS table_comment
    FROM 
        user_tab_columns utc
    LEFT JOIN 
        user_col_comments ucc 
        ON utc.table_name = ucc.table_name 
        AND utc.column_name = ucc.column_name
    LEFT JOIN 
        user_tab_comments utab 
        ON utc.table_name = utab.table_name
    ORDER BY 
        utc.table_name, 
        utc.column_id;
    '''

    cursor = connection.cursor()
    cursor.execute(stmt)
    rows = cursor.fetchall()

    # Convert to one long string
    result_string = '\n'.join([str(row) for row in rows])

    cursor.close()

    return result_string

This function queries the Oracle data dictionary to get information about all tables and columns, including any comments. It returns everything as a single string that we can pass to the LLM.

Generating SQL from natural language

Now let’s write the function that takes a natural language question and generates SQL:

def generate_sql(question: str, schema: str, model: str):
    prompt = f'''
    You are an SQL assistant for Oracle Database.
    You create Oracle SQL statements to help answer user questions.
    Given the user's question and the schema information, write an SQL
    query to answer the question.

    Schema:
    {schema}

    User question:
    {question}

    Respond with the SQL only.  Do not add any extra characters or delimiters.
    '''
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

This function takes the question, the schema information, and the model to use. It constructs a prompt that gives the LLM the context it needs and asks for just the SQL query.

Executing SQL queries

We need a function to actually run the generated SQL:

def execute_sql(stmt):
    cursor = connection.cursor()
    cursor.execute(stmt)
    rows = cursor.fetchall()

    # Convert to one long string
    result_string = '\n'.join([str(row) for row in rows])

    cursor.close()

    return result_string

This executes the query and returns the results as a string.

The reflection step

Here’s where it gets interesting – the function that reviews the SQL and results, and potentially generates improved SQL:

import json

def refine_sql(question, sql_query, output, schema, model):
    prompt = f'''
    You are a SQL reviewer and refiner. 

    User asked:
    {question}

    Original SQL:
    {sql_query}

    SQL Output:
    {output}

    Schema:
    {schema}

    Step 1: Evaluate if the SQL OUTPUT fully answers the user's question.
    Step 2: If improvement is needed, provide a refined SQL query for Oracle.
    If the original SQL is already correct, return it unchanged.

    Return a strict JSON object with two fields:
    - "feedback": brief evaluation and suggestions
    - "refined_sql": the final SQL to run

    Return ONLY the actual JSON document.
    Do NOT add any extra characters or delimiters outside of the actual JSON itself.
    In particular do NOT include backticks before and after the JSON document.
    '''

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )

    content = response.choices[0].message.content
    try:
        obj = json.loads(content)
        feedback = str(obj.get("feedback", "")).strip()
        refined_sql = str(obj.get("refined_sql", sql_query)).strip()
        if not refined_sql:
            refined_sql = sql_query
    except Exception:
        # Fallback if model doesn't return valid JSON
        feedback = content.strip()
        refined_sql = sql_query

    return feedback, refined_sql

This is the heart of the reflection pattern. The function:

  1. Shows the LLM the original question, the generated SQL, and the actual results
  2. Asks it to evaluate whether the SQL output really answers the question
  3. If not, asks for an improved query
  4. Returns both the feedback and the refined SQL as JSON

The JSON format makes it easy to parse the response and extract both pieces of information. I had to be fairly pedantic to get gpt-4o to give me just JSON!
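
For example, a response from the refine step might look something like this (the values here are illustrative, based on the run shown later in this post):

{
    "feedback": "The original query ranks customers by income, but 'top customers' is better measured by total spending, so the query should aggregate sales per customer.",
    "refined_sql": "SELECT C.CUST_ID, SUM(S.ACTUAL_PRICE) AS TOTAL_SALES FROM CUSTOMER C JOIN CUSTSALES S ON C.CUST_ID = S.CUST_ID GROUP BY C.CUST_ID ORDER BY TOTAL_SALES DESC FETCH FIRST 10 ROWS ONLY"
}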

Providing a final answer

Finally, let’s write a function to convert the query results into a natural language answer:

def provide_final_answer(question, output, model):
    prompt = f'''
    You are a helpful assistant.
    Given a user's question, and the results of a database query
    which has been created, evaluated, improved and executed already
    in order to get the provided output, you should provide an
    answer to the user's question.

    User question:
    {question}

    Query results:
    {output}
    '''
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

This takes the final query results and turns them into a friendly, natural language response for the user.

Putting it all together

Now let’s create the main function that orchestrates the entire process:

def generate_and_reflect(question: str):
    
    schema = get_schema()
    print('SCHEMA')
    print(schema)
    print()

    sql_v1 = generate_sql(question, schema, models[0])
    print("SQL V1")
    print(sql_v1)
    print()

    output_v1 = execute_sql(sql_v1)
    print("SQL V1 output")
    print(output_v1)
    print()

    feedback, sql_v2 = refine_sql(question, sql_v1, output_v1, schema, models[0])
    print("FEEDBACK")
    print(feedback)
    print()
    print("SQL V2")
    print(sql_v2)
    print()

    output_v2 = execute_sql(sql_v2)
    print("SQL V2 output")
    print(output_v2)
    print()

    final_answer = provide_final_answer(question, output_v2, models[0])
    print("FINAL ANSWER")
    print(final_answer)
    print()

This function:

  1. Gets the database schema
  2. Generates the first SQL query
  3. Executes it and prints the results
  4. Sends everything to the reflection function for evaluation
  5. Generates and executes the refined SQL
  6. Converts the final results into a natural language answer

Running the experiment

Let’s try it out with a question that could be interpreted multiple ways:

generate_and_reflect('who are our top customers?')

The results

Here’s what happened when I ran this:

First attempt (SQL V1):

SELECT CUST_ID, FIRST_NAME, LAST_NAME, EMAIL, INCOME
FROM CUSTOMER
ORDER BY INCOME DESC
FETCH FIRST 10 ROWS ONLY;

The LLM interpreted “top customers” as customers with the highest income. It returned folks with incomes around $187,000:

(1138797, 'Haruru', 'Takahashi', 'haruru.takahashi6@oraclemail.com', 187168.8)
(1007335, 'Eddie', 'Crawford', 'eddie.crawford@oraclemail.com', 187145.4)
(1404002, 'Yuuto', 'Arai', 'yuuto.arai3@oraclemail.com', 187136.04)
...

Reflection:

The original SQL query retrieves the top 10 customers based on income, which may not 
fully answer the question of 'top customers' as it could be interpreted in terms of 
sales or transactions. To better answer the question, we should consider the total 
sales or transactions made by each customer.

Great! The LLM recognized that “top customers” probably means customers who spend the most, not customers who earn the most.

Second attempt (SQL V2):

SELECT C.CUST_ID, C.FIRST_NAME, C.LAST_NAME, C.EMAIL, SUM(S.ACTUAL_PRICE) AS TOTAL_SALES 
FROM CUSTOMER C 
JOIN CUSTSALES S ON C.CUST_ID = S.CUST_ID 
GROUP BY C.CUST_ID, C.FIRST_NAME, C.LAST_NAME, C.EMAIL 
ORDER BY TOTAL_SALES DESC 
FETCH FIRST 10 ROWS ONLY;

Much better! Now it’s joining with the sales data and calculating total spending per customer.

Final results:

(1234517, 'Tsubasa', 'Nakajima', 'tsubasa.nakajima2@oraclemail.com', 2356.049999999997)
(1280887, 'Steffi', 'Bielvenstram', 'steffi.bielvenstram@oraclemail.com', 2334.7299999999996)
(1017254, 'Guadalupe', 'Zamora', 'guadalupe.zamora@oraclemail.com', 2329.7599999999998)
...

The top customer is Tsubasa Nakajima with $2,356.05 in total sales, followed by Steffi Bielvenstram with $2,334.73, and so on. These are very different customers from the high-income list we got in the first attempt!

Natural language answer:

Our top customers, based on the provided data, are:

1. Tsubasa Nakajima - Email: tsubasa.nakajima2@oraclemail.com, Total: $2356.05
2. Steffi Bielvenstram - Email: steffi.bielvenstram@oraclemail.com, Total: $2334.73
3. Guadalupe Zamora - Email: guadalupe.zamora@oraclemail.com, Total: $2329.76
...

These customers have the highest total amounts associated with them.

What I learned

This reflection approach really does help. The LLM is pretty good at recognizing when its initial SQL doesn’t quite match the intent of the question – especially when it can see the actual results.

The pattern of generate → execute → reflect → regenerate is more expensive (two LLM calls instead of one for generation, plus one more for the final answer), but the quality improvement is noticeable. For production use, you might want to:

  • Cache schema information instead of fetching it every time (see the sketch after this list)
  • Add more sophisticated error handling for SQL errors
  • Consider running both queries in parallel and comparing results
  • Track which types of questions benefit most from reflection
  • Use the reflection feedback to build a dataset for fine-tuning
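
As a minimal sketch of the first two ideas (assuming the get_schema and execute_sql functions defined earlier), you could cache the schema lookup and feed SQL errors back to the model instead of raising them. The names get_schema_cached and execute_sql_safe are just for illustration:

from functools import lru_cache

@lru_cache(maxsize=1)
def get_schema_cached():
    # The schema rarely changes, so fetch it once and reuse the result
    return get_schema()

def execute_sql_safe(stmt):
    # Return the error text instead of raising, so the reflection step
    # can see what went wrong and suggest a corrected query
    try:
        return execute_sql(stmt)
    except Exception as e:
        return f"SQL ERROR: {e}"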

The approach is straightforward to implement and the results speak for themselves – the reflection step caught a subtle but important misinterpretation that would have given technically correct but unhelpful results.

Give it a try with your own database and questions – I think you’ll find the reflection step catches a lot of these subtle misinterpretations that would otherwise lead to valid but wrong answers.

What next? I am going to experiment with some more complex questions, and then compare the performance of a number of different LLMs to see how they go with and without reflection. Stay tuned 🙂


Using Multiple Datasources with Spring Boot and Spring Data JPA

Hi everyone! Today I want to show you how to configure multiple datasources in a Spring Boot application using Spring Data JPA and the Oracle Spring Boot Starter for Universal Connection Pool (UCP).

This is a pattern you’ll need when you have a single application that needs to connect to multiple databases. Maybe you have different domains in separate databases, or you’re working with legacy systems, or you need to separate read and write operations across different database instances. Whatever the reason, Spring Boot makes this pretty straightforward once you understand the configuration pattern.

I’ve put together a complete working example on GitHub at https://github.com/markxnelson/spring-multiple-jpa-datasources, and in this post I’ll walk you through how to build it from scratch.

The Scenario

For this example, we’re going to build a simple application that manages two separate domains:

  • Customers – stored in one database
  • Products – stored in a different database

Each domain will have its own datasource, entity manager, and transaction manager. We’ll use Spring Data JPA repositories to interact with each database, and we’ll show how to use both datasources in a REST controller.

I am assuming you have a database with two users called customer and product and some tables. Here’s the SQL to set that up:

$ sqlplus sys/Welcome12345@localhost:1521/FREEPDB1 as sysdba

alter session set container=freepdb1;
create user customer identified by Welcome12345;
create user product identified by Welcome12345;
grant connect, resource, unlimited tablespace to customer;
grant connect, resource, unlimited tablespace to product;
commit;

$ sqlplus customer/Welcome12345@localhost:1521/FREEPDB1

create table customer (id number, name varchar2(64));
insert into customer (id, name) values (1, 'mark');
commit;

$ sqlplus product/Welcome12345@localhost:1521/FREEPDB1

create table product (id number, name varchar2(64));
insert into product (id, name) values (1, 'coffee machine');
commit;

Step 1: Dependencies

Let’s start with the Maven dependencies. Here’s what you’ll need in your pom.xml:

<dependencies>
    <!-- Spring Boot Starter Web for REST endpoints -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- Spring Boot Starter Data JPA -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>

    <!-- Oracle Spring Boot Starter for UCP -->
    <dependency>
        <groupId>com.oracle.database.spring</groupId>
        <artifactId>oracle-spring-boot-starter-ucp</artifactId>
        <version>25.3.0</version>
    </dependency>

</dependencies>

The key dependency here is the oracle-spring-boot-starter-ucp, which provides autoconfiguration for Oracle’s Universal Connection Pool. UCP is Oracle’s high-performance connection pool implementation that provides features like connection affinity, Fast Connection Failover, and Runtime Connection Load Balancing.

Step 2: Configure the Datasources in application.yaml

Now let’s configure our two datasources in the application.yaml file. We’ll define connection properties for both the customer and product databases:

spring:
  application:
    name: demo

  jpa:
    customer:
      properties:
        hibernate.dialect: org.hibernate.dialect.OracleDialect
        hibernate.hbm2ddl.auto: validate
        hibernate.format_sql: true
        hibernate.show_sql: true
    product:
      properties:
        hibernate.dialect: org.hibernate.dialect.OracleDialect
        hibernate.hbm2ddl.auto: validate
        hibernate.format_sql: true
        hibernate.show_sql: true

  datasource:
    customer:
        url: jdbc:oracle:thin:@localhost:1521/freepdb1
        username: customer
        password: Welcome12345
        driver-class-name: oracle.jdbc.OracleDriver
        type: oracle.ucp.jdbc.PoolDataSourceImpl
        oracleucp:
          connection-factory-class-name: oracle.jdbc.pool.OracleDataSource
          connection-pool-name: CustomerConnectionPool
          initial-pool-size: 15
          min-pool-size: 10
          max-pool-size: 30
          shared: true
    product:
        url: jdbc:oracle:thin:@localhost:1521/freepdb1
        username: product
        password: Welcome12345
        driver-class-name: oracle.jdbc.OracleDriver
        type: oracle.ucp.jdbc.PoolDataSourceImpl
        oracleucp:
          connection-factory-class-name: oracle.jdbc.pool.OracleDataSource
          connection-pool-name: ProductConnectionPool
          initial-pool-size: 15
          min-pool-size: 10
          max-pool-size: 30
          shared: true

Notice that we're using custom property prefixes (spring.datasource.customer and spring.datasource.product) instead of the default spring.datasource. This is because Spring Boot's autoconfiguration only creates a single datasource by default. When you need multiple datasources, you have to create them manually and bind them to custom configuration properties.

In this example, both datasources happen to point to the same database server but use different schemas (users). In a real-world scenario, these would typically point to completely different database instances.

Step 3: Configure the Customer Datasource

Now we need to create the configuration classes that will set up our datasources, entity managers, and transaction managers. Let’s start with the customer datasource.

Create a new package called customer and add a configuration class called CustomerDataSourceConfig.java:

package com.example.demo.customer;

import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.orm.jpa.EntityManagerFactoryBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(entityManagerFactoryRef = "customerEntityManagerFactory", transactionManagerRef = "customerTransactionManager", basePackages = {
        "com.example.demo.customer" })
public class CustomerDataSourceConfig {

    @Bean(name = "customerProperties")
    @ConfigurationProperties("spring.datasource.customer")
    public DataSourceProperties customerDataSourceProperties() {
        return new DataSourceProperties();
    }

    /**
     * Creates and configures the customer DataSource.
     *
     * @param properties the customer datasource properties
     * @return configured DataSource instance
     */
    @Primary
    @Bean(name = "customerDataSource")
    @ConfigurationProperties(prefix = "spring.datasource.customer")
    public DataSource customerDataSource(@Qualifier("customerProperties") DataSourceProperties properties) {
        return properties.initializeDataSourceBuilder().build();
    }

    /**
     * Reads customer JPA properties from application.yaml.
     *
     * @return Map of JPA properties
     */
    @Bean(name = "customerJpaProperties")
    @ConfigurationProperties("spring.jpa.customer.properties")
    public java.util.Map<String, String> customerJpaProperties() {
        return new java.util.HashMap<>();
    }

    /**
     * Creates and configures the customer EntityManagerFactory.
     *
     * @param builder the EntityManagerFactoryBuilder
     * @param dataSource the customer datasource
     * @param jpaProperties the JPA properties from application.yaml
     * @return configured LocalContainerEntityManagerFactoryBean
     */
    @Primary
    @Bean(name = "customerEntityManagerFactory")
    public LocalContainerEntityManagerFactoryBean customerEntityManagerFactory(EntityManagerFactoryBuilder builder,
            @Qualifier("customerDataSource") DataSource dataSource,
            @Qualifier("customerJpaProperties") java.util.Map<String, String> jpaProperties) {

        return builder.dataSource(dataSource)
                .packages("com.example.demo.customer")
                .persistenceUnit("customers")
                .properties(jpaProperties)
                .build();
    }

    @Bean
    @ConfigurationProperties("spring.jpa.customer")
    public PlatformTransactionManager customerTransactionManager(
            @Qualifier("customerEntityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
        return new JpaTransactionManager(entityManagerFactory.getObject());
    }

}

Let’s break down what’s happening here:

  1. @EnableJpaRepositories – This tells Spring Data JPA where to find the repositories for this datasource. We specify the base package (com.example.demo.customer), and we reference the entity manager factory and transaction manager beans by name.
  2. @Primary – We mark the customer datasource as the primary one. This means it will be used by default when autowiring a datasource, entity manager, or transaction manager without a @Qualifier. You must have exactly one primary datasource when using multiple datasources.
  3. customerDataSource() – This creates the datasource bean using Spring Boot's DataSourceBuilder. The @ConfigurationProperties annotation binds the properties from our application.yaml (with the spring.datasource.customer prefix) to the datasource configuration.
  4. customerEntityManagerFactory() – This creates the JPA entity manager factory, which is responsible for creating entity managers. We configure it to scan for entities in the customer package and set up Hibernate properties.
  5. customerTransactionManager() – This creates the transaction manager for the customer datasource. The transaction manager handles transaction boundaries and ensures ACID properties.

Step 4: Configure the Product Datasource

Now let’s create the configuration for the product datasource. Create a new package called product and add ProductDataSourceConfig.java:

package com.example.demo.product;

import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.orm.jpa.EntityManagerFactoryBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(entityManagerFactoryRef = "productEntityManagerFactory", transactionManagerRef = "productTransactionManager", basePackages = {
        "com.example.demo.product" })
public class ProductDataSourceConfig {

    @Bean(name = "productProperties")
    @ConfigurationProperties("spring.datasource.product")
    public DataSourceProperties productDataSourceProperties() {
        return new DataSourceProperties();
    }

    @Bean(name = "productDataSource")
    @ConfigurationProperties(prefix = "spring.datasource.product")
    public DataSource productDataSource(@Qualifier("productProperties") DataSourceProperties properties) {
        return properties.initializeDataSourceBuilder().build();
    }

    /**
     * Reads product JPA properties from application.yaml.
     *
     * @return Map of JPA properties
     */
    @Bean(name = "productJpaProperties")
    @ConfigurationProperties("spring.jpa.product.properties")
    public java.util.Map<String, String> productJpaProperties() {
        return new java.util.HashMap<>();
    }

    /**
     * Creates and configures the product EntityManagerFactory.
     *
     * @param builder the EntityManagerFactoryBuilder
     * @param dataSource the product datasource
     * @param jpaProperties the JPA properties from application.yaml
     * @return configured LocalContainerEntityManagerFactoryBean
     */
    @Bean(name = "productEntityManagerFactory")
    public LocalContainerEntityManagerFactoryBean productEntityManagerFactory(@Autowired EntityManagerFactoryBuilder builder,
            @Qualifier("productDataSource") DataSource dataSource,
            @Qualifier("productJpaProperties") java.util.Map<String, String> jpaProperties) {

        return builder.dataSource(dataSource)
                .packages("com.example.demo.product")
                .persistenceUnit("products")
                .properties(jpaProperties)
                .build();
    }

    @Bean
    @ConfigurationProperties("spring.jpa.product")
    public PlatformTransactionManager productTransactionManager(
            @Qualifier("productEntityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
        return new JpaTransactionManager(entityManagerFactory.getObject());
    }

}

The product configuration is almost identical to the customer configuration, with a few key differences:

  1. No @Primary annotations – Since we already designated the customer datasource as primary, we don’t mark the product beans as primary.
  2. Different package – The @EnableJpaRepositories points to the product package, and the entity manager factory scans the product package for entities.
  3. Different bean names – All the beans have different names (productDataSource, productEntityManagerFactory, productTransactionManager) to avoid conflicts.

Step 5: Create the Domain Models

Now let’s create the JPA entities for each datasource. First, in the customer package, create Customer.java:

package com.example.demo.customer;

import jakarta.persistence.Entity;
import jakarta.persistence.Id;

@Entity
public class Customer {
    @Id
    public int id;
    public String name;

    public Customer() {
        this.id = 0;
        this.name = "";
    }
}

And in the product package, create Product.java:

package com.example.demo.product;

import jakarta.persistence.Entity;
import jakarta.persistence.Id;

@Entity
public class Product {
    @Id
    public int id;
    public String name;

    public Product() {
        this.id = 0;
        this.name = "";
    }
}

Step 6: Create the Repositories

Now let’s create Spring Data JPA repositories for each entity. In the customer package, create CustomerRepository.java:

package com.example.demo.customer;

import org.springframework.data.jpa.repository.JpaRepository;

public interface CustomerRepository extends JpaRepository<Customer, Integer> {

}

And in the product package, create ProductRepository.java:

package com.example.demo.product;

import org.springframework.data.jpa.repository.JpaRepository;

public interface ProductRepository extends JpaRepository<Product, Integer> {

}

Step 7: Create a REST Controller

Finally, let’s create a REST controller that demonstrates how to use both datasources. Create a controller package and add CustomerController.java:

package com.example.demo.controllers;

import java.util.List;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import com.example.demo.customer.Customer;
import com.example.demo.customer.CustomerRepository;

@RestController
public class CustomerController {

    final CustomerRepository customerRepository;

    public CustomerController(CustomerRepository customerRepository) {
        this.customerRepository = customerRepository;
    }

    @GetMapping("/customers")
    public List<Customer> getCustomers() {
        return customerRepository.findAll();
    }

}

A few important things to note about the controller:

  1. Transaction Managers – When you have multiple datasources, you need to tell Spring which transaction manager to use for write operations, for example with @Transactional("customerTransactionManager") or @Transactional("productTransactionManager"). If you don't specify one, Spring uses the primary transaction manager (customer) by default. The read-only endpoint above doesn't need this, but see the sketch after this list for what a write operation might look like.
  2. Repository Autowiring – The repositories are autowired normally. Spring knows which datasource each repository uses based on the package it lives in, which we configured in our datasource configuration classes.
  3. Cross-datasource Operations – If a single method writes to both datasources, those operations are not in a distributed transaction – if one fails, the other won't automatically roll back. If you need distributed transactions across multiple databases, you would need to use JTA (Java Transaction API).
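
For example, a write endpoint on the customer side might look like the sketch below (this controller and its saveCustomer method are illustrative, not part of the code shown above):

package com.example.demo.controllers;

import org.springframework.transaction.annotation.Transactional;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import com.example.demo.customer.Customer;
import com.example.demo.customer.CustomerRepository;

@RestController
public class CustomerWriteController {

    final CustomerRepository customerRepository;

    public CustomerWriteController(CustomerRepository customerRepository) {
        this.customerRepository = customerRepository;
    }

    // Explicitly name the transaction manager for the customer datasource
    @Transactional("customerTransactionManager")
    @PostMapping("/customers")
    public Customer saveCustomer(@RequestBody Customer customer) {
        return customerRepository.save(customer);
    }

}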

Let’s also create ProductController.java:

package com.example.demo.controllers;

import java.util.List;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import com.example.demo.product.Product;
import com.example.demo.product.ProductRepository;

@RestController
public class ProductController {

    final ProductRepository productRepository;

    public ProductController(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @GetMapping("/products")
    public List<Product> getProducts() {
        return productRepository.findAll();
    }

}

Testing the Application

Now you can run your application! Make sure you have two Oracle database users created (customer and product), or adjust the configuration to point to your specific databases.

Start the application:

mvn spring-boot:run

Then you can test it with some curl commands:

# Get all customers
$ curl http://localhost:8080/customers
[{"id":1,"name":"mark"}]

# Get all products
$ curl http://localhost:8080/products
[{"id":1,"name":"coffee machine"}]

Wrapping Up

And there you have it! We’ve successfully configured a Spring Boot application with multiple datasources using Spring Data JPA and Oracle’s Universal Connection Pool. The key points to remember are:

  1. Custom configuration properties – Use custom prefixes for each datasource in your application.yaml
  2. Manual configuration – Create configuration classes for each datasource with beans for the datasource, entity manager factory, and transaction manager
  3. Primary datasource – Designate one datasource as primary using @Primary
  4. Package organization – Keep entities and repositories for each datasource in separate packages
  5. Explicit transaction managers – Specify which transaction manager to use for write operations with @Transactional

This pattern works great when you need to connect to multiple databases, whether they’re different types of databases or different instances of the same database. Oracle’s Universal Connection Pool provides excellent performance and reliability for your database connections.

I hope this helps you work with multiple datasources in your Spring Boot applications! The complete working code is available on GitHub at https://github.com/markxnelson/spring-multiple-jpa-datasources.

Happy coding!


Custom vector distance functions in Oracle (using JavaScript)

In case you missed it, Oracle Database 26ai was announced last week at Oracle AI World, with a heap of new AI features and capabilities like hybrid vector search, MCP server support, acceleration with NVIDIA and much more – check the link for details.

Of course, I wanted to check it out, and I was thinking about what to do first. I remembered this LinkedIn post from Anders Swanson about implementing custom vector distance functions in Oracle using the new JavaScript capabilities, and I thought that could be something interesting to do, so I am going to show you how to implement and use Jaccard distance for dense vector embeddings for similarity searches.

Now, this is a slightly contrived example, because I am more interested in showing you how to add a custom metric than in the actual metric itself. I chose Jaccard because the actual implementation is pretty compact.

Now, Oracle does already include Jaccard distance, but only for the BINARY data type, which is where Jaccard is mostly used. But there is a version that can be used for continuous/real-valued vectors as well (this version is for dense vectors), and that is what we will implement.

This is the formula for Jaccard similarity for continuous vectors. This is also known as the Tanimoto coefficient. It is the intersection divided by the union (or zero if the union is zero):
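
In symbols, for vectors a and b of equal length:

$$J(a, b) = \frac{\sum_{i} \min(a_i, b_i)}{\sum_{i} \max(a_i, b_i)}$$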

To get the Jaccard distance, we just subtract the Jaccard similarity from one.

Before we start, let's look at a two-dimensional example to get a feel for how it works. Of course, the real vectors created by embedding models have many more dimensions, but it is hard for us to visualize more than two or three dimensions without also introducing techniques like dimensionality reduction and projection.

Here we have two vectors A [5 8] and B [7 4]:

The union is calculated using the element-wise maximums, as you can see in the formula above, so in this example it is max(5, 7) + max(8, 4) = 7 + 8 = 15. The intersection is calculated using the element-wise minimums, so it is min(5, 7) + min(8, 4) = 5 + 4 = 9.

So in this example, the Jaccard similarity is 9 / 15 = 0.6

And so the Jaccard distance is 1 – 0.6 = 0.4

Ok, now that we have some intuition about how this distance metric works, let’s implement it in Oracle.

Start up an Oracle Database

First, let’s fire up Oracle Database Free 26ai in a container:

docker run -d --name db26ai \
    -p 1521:1521 \
    -e ORACLE_PWD=Welcome12345 \
    -v db26ai-volume:/opt/oracle/oradata \
    container-registry.oracle.com/database/free:latest

This will pull the latest image, which at the time of writing is 26ai (version tag 23.26.0.0). You can check the logs to see when startup is complete; you'll see the message "DATABASE IS READY TO USE":

docker logs -f db26ai

Let’s create a user called vector with the necessary privileges:

docker exec -i db26ai sqlplus sys/Welcome12345@localhost:1521/FREEPDB1 as sysdba <<EOF
alter session set container=FREEPDB1;
create user vector identified by vector;
grant connect, resource, unlimited tablespace, create credential, create procedure, create mle, create any index to vector;
commit;
EOF

Now you can connect with your favorite client. I am going to use Oracle SQL Developer for VS Code. See the link for install instructions.

Implement the custom distance function

Open up an SQL Worksheet, or run this in your tool of choice:


create or replace function jaccard_distance("a" vector, "b" vector)
return binary_double 
deterministic parallel_enable
as mle language javascript pure {{
    // check the vectors are the same length
    if (a.length !== b.length) {
        throw new Error('Vectors must have same length');
    }

    let intersection = 0;
    let union = 0;

    for (let i = 0; i < a.length; i++) { 
        intersection += Math.min(a[i], b[i]);
        union += Math.max(a[i], b[i]);
    }

    // handle the case where union is zero (all-zero vectors)
    if (union === 0) {
        return 0;
    }

    const similarity = intersection / union;
    return 1 - similarity;

}};
/

Let's walk through this. First, you see that we are creating a function called jaccard_distance which accepts two vectors (a and b) as input and returns a binary_double. This function signature is required for distance functions. Next, we must include the deterministic keyword, and we have also included the parallel_enable keyword so that this function could be used with HNSW vector indexes. For the purposes of this example, you can just ignore those or assume they are simply needed as part of the function signature.

Next you see that we mention this will be an MLE function written in JavaScript, and we added the pure keyword to let the database know that this is a pure function – meaning it has no side effects, it will not update any data, and its output will always be the same for a given set of inputs (i.e., that it is memoizable).

Then we have the actual implementation of the function. First, we check that the vectors have the same length (i.e., the same number of dimensions) which is required for this calculation to be applicable.

Then we work through the vectors and collect the minimums and maximums to calculate the intersection and the union.

Next, we check if the union is zero, and if so we return zero to handle that special case. And finally, we calculate the similarity, then subtract it from one to get the distance and return that.

Using our custom distance function

Great, so let’s test our function. We can start by creating a table t1 to store some vectors:

create table t1 (
    id number,
    v vector(2, float32)
);

And let’s add a couple of vectors, including the one we saw in the example above [5 8]:

insert into t1 (id, v) values 
(1, vector('[5, 8]')),
(2, vector('[1, 2]'));

You can run a simple select statement to see the contents of the table:

select * from t1;

This will give these results:

ID     V
1      [5.0E+000,8.0E+000]
2      [1.0E+000,2.0E+000]

Now let’s use our function to see the Jaccard distance for each vector in our table t1 from the other vector we used in the example above [7 4]:

select 
    v,
    jaccard_distance(v, vector('[7, 4]')) distance
from t1
order by distance; 

This returns these results:

V                       DISTANCE
[5.0E+000,8.0E+000]     0.4
[1.0E+000,2.0E+000]     0.7272727272727273

As you can see, the Jaccard distance from [5 8] to [7 4] is 0.4, as we calculated in the example above, and [1 2] to [7 4] is 0.72…

Let’s see how it works with large embeddings

Ok, two-dimensional vectors are good for simple visualization, but let's try this out with some 'real' vectors.

I am using Visual Studio Code with the Python and Jupyter extensions from Microsoft installed.

Create a new Jupyter Notebook using File > New File… then choose Jupyter Notebook as the type of file, and save your new file as jaccard.ipynb.

First, we need to set up the Python runtime environment. Click on the Select Kernel button (it's on the top right). Select Python Environment then Create Python Environment. Select the option to create a Venv (Virtual Environment) and choose your Python interpreter. I recommend using at least Python 3.11. This will download all the necessary files and will take a minute or two.

Now, let’s install the libraries we will need – enter this into a cell and run it:

%pip install oracledb sentence-transformers

Now, connect to the same Oracle database (again, enter this into a cell and run it):

import oracledb

username = "vector"
password = "vector"
dsn = "localhost:1521/FREEPDB1"

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn)
    print("Connection successful!")
except Exception as e:
    print("Connection failed!")

Let’s create a table to hold 1024 dimension vectors that we will create with the mxbai-embed-large-v1 embedding model. Back in your SQL Worksheet, run this statement:

create table t2 (
    id number,
    v vector(1024, float32)
);

Ok, now let’s create some embeddings. Back in your notebook, create a new cell with this code:

import oracledb
from sentence_transformers import SentenceTransformer

# Initialize the embedding model
print("Loading embedding model...")
model = SentenceTransformer('mixedbread-ai/mxbai-embed-large-v1')

# Your text data
texts = [
    "The quick brown fox jumps over the lazy dog",
    "Machine learning is a subset of artificial intelligence",
    "Oracle Database 23ai supports vector embeddings",
    "Python is a popular programming language",
    "Embeddings capture semantic meaning of text"
]

# Generate embeddings
print("Generating embeddings...")
embeddings = model.encode(texts)

Let's discuss what we are doing in this code. First, we download the embedding model using SentenceTransformer. Then, we define a few simple texts that we can use for this example and use the embedding model to create the vector embeddings for those texts.

If you want to see what the embeddings look like, just enter “embeddings” in a cell and run it. In the output you can see the shape is 5 (rows) with 1024 dimensions and the type is float32.
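
If you prefer, you can also check the shape and data type directly in a cell:

print(embeddings.shape)   # (5, 1024) - five texts, 1024 dimensions each
print(embeddings.dtype)   # float32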

Now, let’s insert the embeddings into our new table t2:

import array 
cursor = connection.cursor()

# Insert data
for i in range(len(embeddings)):
    cursor.execute("""
        INSERT INTO t2 (id, v)
        VALUES (:1, :2)
    """, [i, array.array('f', embeddings[i].tolist())])

connection.commit()
print(f"Successfully inserted {len(texts)} records")

You can take a look at the vectors using the simple query (back in your SQL Worksheet):

select * from t2

This will show you the five rows, each with its id and 1024-dimension embedding vector.

And, now let’s try our distance function with these vectors. Back in your notebook, run this cell. I’ve included the built-in cosine distance as well, just for comparison purposes:

query = array.array('f', model.encode("Antarctica is the driest continent").tolist())

cursor = connection.cursor()
cursor.execute("""
    select 
        id,
        jaccard_distance(v, :1),
        vector_distance(v, :2, cosine)
    from t2
    order by id
""", [query, query])

for row in cursor:
    print(f"id: {row[0]} has jaccard distance: {row[1]} and cosine distance: {row[2]}")

cursor.close()

Your output will look something like this:

id: 0 has jaccard distance: 2.0163214889484307 and cosine distance: 0.7859490566650003
id: 1 has jaccard distance: 2.0118706751976925 and cosine distance: 0.6952327173906239
id: 2 has jaccard distance: 2.0152858933816775 and cosine distance: 0.717824211314015
id: 3 has jaccard distance: 2.0216149035530537 and cosine distance: 0.6455277387099003
id: 4 has jaccard distance: 2.0132575761281766 and cosine distance: 0.6962028121886988

Well, there you go! We implemented and used a custom vector distance function. Enjoy!


Let’s make a simple MCP tool for Oracle AI Vector Search

In this earlier post, we created a vector store in our Oracle Database 23ai and populated it with some content from Moby Dick. Since MCP is very popular these days, I thought it might be interesting to look at how to create a very simple MCP server to expose the similarity search as an MCP tool.

Let’s jump right into it. First we are going to need a requirements.txt file with a list of the dependencies we need:

mcp>=1.0.0
oracledb
langchain-community
langchain-huggingface
sentence-transformers
pydantic

And then go ahead and install these by running:

pip install -r requirements.txt

Note: I used Python 3.12 and a virtual environment.

Now let’s create a file called mcp_server.py and get to work! Let’s start with some imports:

import asyncio
import oracledb
from mcp.server import Server
from mcp.types import Tool, TextContent
from pydantic import BaseModel
from langchain_community.vectorstores import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_huggingface import HuggingFaceEmbeddings

And we are going to need the details of the database so we can connect to that, so let’s define some variables to hold those parameters:

# Database connection parameters for Oracle Vector Store
DB_USERNAME = "vector"
DB_PASSWORD = "vector"
DB_DSN = "localhost:1521/FREEPDB1" 
TABLE_NAME = "moby_dick_500_30"  

Note: These match the database and vector store used in the previous post.

Let’s create a function to connect to the database, and set up the embedding model and the vector store.

# Global variables for database connection and embedding model
# These are initialized once on server startup for efficiency
embedding_model = None  # HuggingFace sentence transformer model
vector_store = None     # LangChain OracleVS wrapper for vector operations
connection = None       # Oracle database connection

def initialize_db():
    """
    Initialize database connection and vector store

    This function is called once at server startup to establish:
    1. Connection to Oracle database
    2. HuggingFace embedding model (sentence-transformers/all-mpnet-base-v2)
    3. LangChain OracleVS wrapper for vector similarity operations

    The embedding model converts text queries into 768-dimensional vectors
    that can be compared against pre-computed embeddings in the database.
    """
    global embedding_model, vector_store, connection

    # Connect to Oracle database using oracledb driver
    connection = oracledb.connect(
        user=DB_USERNAME,
        password=DB_PASSWORD,
        dsn=DB_DSN
    )

    # Initialize HuggingFace embeddings model
    # This model converts text to 768-dimensional vectors
    # Same model used to create the original embeddings in the database
    embedding_model = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2"
    )

    # Initialize vector store wrapper
    # OracleVS provides convenient interface for vector similarity operations
    vector_store = OracleVS(
        client=connection,
        table_name=TABLE_NAME,
        embedding_function=embedding_model,
        # Use cosine similarity for comparison
        distance_strategy=DistanceStrategy.COSINE,  
    )

Again, note that I am using the same embedding model that we used to create the vectors in this vector store. This is important because we need to create embedding vectors for the queries using the same model, so that similarity comparisons will be valid. It's also important that we use the right distance strategy – for text data, cosine is generally agreed to be the best option. For performance reasons, if we had created a vector index, we'd want to use the same distance metric so the index would be used when performing the search. Oracle will default to doing an "exact search" if there is no index or the metric does not match.

Now, let’s add a function to perform a query in our Moby Dick vector store, we’ll include a top-k parameter so the caller can specify how many results they want:

def search_moby_dick(query: str, k: int = 4) -> list[dict]:
    """
    Perform vector similarity search on the moby_dick_500_30 table

    This function:
    1. Converts the query text to a vector using the embedding model
    2. Searches the database for the k most similar text chunks
    3. Returns results ranked by similarity (cosine distance)

    Args:
        query: The search query text (natural language)
        k: Number of results to return (default: 4)

    Returns:
        List of dictionaries containing rank, content, and metadata for each result
    """
    if vector_store is None:
        raise RuntimeError("Vector store not initialized")

    # Perform similarity search
    # The query is automatically embedded and compared against database vectors
    docs = vector_store.similarity_search(query, k=k)

    # Format results into structured dictionaries
    results = []
    for i, doc in enumerate(docs):
        results.append({
            "rank": i + 1,  # 1-indexed ranking by similarity
            "content": doc.page_content,  # The actual text chunk
            "metadata": doc.metadata  # Headers from the original HTML structure
        })

    return results

As you can see, this function returns a list of dictionaries, each containing the rank, the content (chunk), and the metadata.
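
A single entry in that list might look something like this (the content is shortened and the values are illustrative, not actual output):

[
    {
        "rank": 1,
        "content": "Call me Ishmael. Some years ago ...",
        "metadata": {"Header 1": "MOBY-DICK; or, THE WHALE.", "Header 2": "CHAPTER 1. Loomings."}
    }
]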

Ok, now let’s turn this into an MCP server! First let’s create the server instance:

# Create MCP server instance
# The server name "moby-dick-search" identifies this server in MCP client connections
app = Server("moby-dick-search")

Now we want to provide a list-tools method so that MCP clients can find out what kinds of tools this server provides. We are just going to have our search tool, so let’s define that:

@app.list_tools()
async def list_tools() -> list[Tool]:
    """
    MCP protocol handler: returns list of available tools

    Called by MCP clients to discover what capabilities this server provides.
    This server exposes a single tool: search_moby_dick

    Returns:
        List of Tool objects with names, descriptions, and input schemas
    """
    return [
        Tool(
            name="search_moby_dick",
            description="Search the Moby Dick text using vector similarity. Returns relevant passages based on semantic similarity to the query.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query text"
                    },
                    "k": {
                        "type": "integer",
                        "description": "Number of results to return (default: 4)",
                        "default": 4
                    }
                },
                "required": ["query"]
            }
        )
    ]

And now, the part we’ve all been waiting for – let’s define the actual search tool (and a class to hold the arguments)!

class SearchArgs(BaseModel):
    """
    Arguments for the vector search tool

    Attributes:
        query: The natural language search query
        k: Number of most similar results to return (default: 4)
    """
    query: str
    k: int = 4

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """
    MCP protocol handler: executes tool calls

    Called when an MCP client wants to use one of the server's tools.
    Validates the tool name, parses arguments, performs the search,
    and returns formatted results.

    Args:
        name: Name of the tool to call
        arguments: Dictionary of tool arguments

    Returns:
        List of TextContent objects containing the formatted search results
    """
    # Validate tool name
    if name != "search_moby_dick":
        raise ValueError(f"Unknown tool: {name}")

    # Parse and validate arguments using Pydantic model
    args = SearchArgs(**arguments)

    # Perform the vector similarity search
    results = search_moby_dick(args.query, args.k)

    # Format response as human-readable text
    response_text = f"Found {len(results)} results for query: '{args.query}'\n\n"

    for result in results:
        response_text += f"--- Result {result['rank']} ---\n"
        response_text += f"Metadata: {result['metadata']}\n"
        response_text += f"Content: {result['content']}\n\n"

    # Return as MCP TextContent type
    return [TextContent(type="text", text=response_text)]

That was not too bad. Finally, let’s set up a main function to start up everything and handle the requests:

async def main():
    """
    Main entry point for the MCP server

    This function:
    1. Initializes the database connection and embedding model
    2. Sets up stdio transport for MCP communication
    3. Runs the server event loop to handle requests

    The server communicates via stdio (stdin/stdout), which allows
    it to be easily spawned by MCP clients as a subprocess.
    """
    # Initialize database connection and models
    initialize_db()

    # Import stdio server transport
    from mcp.server.stdio import stdio_server

    # Run the server using stdio transport
    # The server reads MCP protocol messages from stdin and writes responses to stdout
    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            app.create_initialization_options()
        )

if __name__ == "__main__":
    asyncio.run(main())

Ok, that’s it! We can run this with the command:

python mcp_server.py

Now, to test it, we're going to need a client! MCP Inspector is the logical place to start – you can get it from here, or (assuming you have node installed) by just running this command:

 npx @modelcontextprotocol/inspector python3.12 mcp_server.py

That's going to start up the MCP Inspector UI in your browser.

Click on the Connect button, and after a few seconds you should see that the Inspector has connected to the server.

Go ahead and click on List Tools and you will see our Search Moby Dick Tool show up – click on it to try it out.

You should see results listing the most relevant passages from Moby Dick.

There you go, it works great! And that’s a super simple, basic MCP server and tool! Enjoy.


Exploring securing vector similarity searches with Real Application Security

In this post, I want to explore how you can use Real Application Security to provide access controls for vectors in a vector store in Oracle Database 23ai.

I’m going to use the vector store we created in the last post as an example. If you want to follow along, you should follow that one first to create and populate your vector store, then come back here.

You should have a vector store table called MOBY_DICK_500_30 that you created in that previous post. You can connect to Oracle using SQLcl or SQL*Plus or whatever tool you prefer and check the structure of that table:

SQL> describe moby_dick_500_30

Name         Null?       Type
____________ ___________ ____________________________
ID           NOT NULL    RAW(16 BYTE)
TEXT                     CLOB
METADATA                 JSON
EMBEDDING                VECTOR(768,FLOAT32,DENSE)

Let's observe that the metadata column contains the document structure information from the loaders that we used. If we filter for Chapter 12, we can see there are 13 vectors associated with that chapter:

SQL> select metadata from moby_dick_500_30 where metadata like '%CHAPTER 12.%';

METADATA
__________________________________________________________________________________
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}
{"Header 1":"MOBY-DICK; or, THE WHALE.","Header 2":"CHAPTER 12. Biographical."}

13 rows selected.

We are going to use this metadata to filter access to the vectors.

Set up permissions

Let’s start by setting up the necessary permissions. You will need to run this as the SYS user:

alter session set container=freepdb1;
grant create session, xs_session_admin to vector;
exec sys.xs_admin_util.grant_system_privilege('provision', 'vector', sys.xs_admin_util.ptype_db);
grant create role to vector;
exec sys.xs_admin_util.grant_system_privilege('admin_sec_policy', 'vector', sys.xs_admin_util.ptype_db);
exec sys.xs_admin_util.grant_system_privilege('ADMIN_ANY_SEC_POLICY', 'vector', sys.xs_admin_util.ptype_db);

Great! Now let’s set up Real Application Security. We will run the rest of these commands as the VECTOR user.

Let’s start by creating a RAS role named role1:

exec sys.xs_principal.create_role(name => 'role1', enabled => true);

Now, we will create a user named user1 and grant them role1 and connect privileges:

exec sys.xs_principal.create_user(name => 'user1', schema => 'vector');
exec sys.xs_principal.set_password('user1', 'pwd1');
exec sys.xs_principal.grant_roles('user1', 'XSCONNECT');
exec sys.xs_principal.grant_roles('user1', 'role1');

Let’s also create a regular database role and give it access to the vector store table:

create role db_emp;
grant select, insert, update, delete on vector.moby_dick_500_30 to db_emp; 

Grant DB_EMP to the application role, so it has the required object privileges to access the table:

grant db_emp to role1;

Next, we want to create a security class, and include the predefined DML security class:

begin
  sys.xs_security_class.create_security_class(
    name        => 'moby_privileges',
    parent_list => xs$name_list('sys.dml'),
    priv_list   => xs$privilege_list(xs$privilege('view_moby_dick')));
end;
/

Now we can create an ACL (access control list) which will grant the privileges for the policy that we will define in a moment:

declare 
  aces xs$ace_list := xs$ace_list(); 
begin
  aces.extend(1);
  aces(1) := xs$ace_type(
     privilege_list => xs$name_list('select'),
     principal_name => 'USER1');
  
  sys.xs_acl.create_acl(
    name  => 'moby_acl',
    ace_list  => aces,
    sec_class => 'moby_privileges');
end;
/

Ok, nearly there! Finally, let’s define the security policy and apply it to the table:

declare
  realms xs$realm_constraint_list := xs$realm_constraint_list();      
begin  
  realms.extend(1);
 
  -- Filter based on column value
  realms(1) := xs$realm_constraint_type(
    realm    => 'metadata LIKE ''%CHAPTER 12.%''',
    acl_list => xs$name_list('moby_acl'));

  sys.xs_data_security.create_policy(
    name                   => 'moby_policy',
    realm_constraint_list  => realms);
    
  sys.xs_data_security.apply_object_policy(
    policy => 'moby_policy',
    schema => 'vector',
    object =>'moby_dick_500_30');
end;
/

Ok, that’s it!

Now, you may have noticed we did not give ourselves any permissions, so if we try to query that vector store table now, you’ll see it appears empty!

SQL> select count(*) from moby_dick_500_30;

   COUNT(*)
___________
          0

But, if we reconnect with the application user (user1) that we defined, and do the same query, we will see those 13 records for Chapter 12:

SQL> connect user1/pwd1
Connected.
SQL> select count(*) from moby_dick_500_30;

   COUNT(*)
___________
         13

So there you have it! We can define policies to easily control access to vectors. In this example we used the metadata to create the filtering rules; of course, you could create whatever kind of rules you need.

This allows you to have a vector store which can be easily filtered for different users (or roles), essentially creating a virtual private vector store. You might want to allow a ‘customer-support’ role to access a certain subset of vectors, for example, while allowing a ‘supervisor’ role to access a larger set (or all) of the vectors.

What’s great about this is that the security is enforced in the database itself. When an AI assistant, chatbot, MCP client, etc., performs a vector search, it will only ever get back results from the vectors that the user is allowed to see. The database will never send vectors to users who are not allowed to see them. So you don’t have to worry about trusting the LLM not to make a mistake and give out the wrong data, because it will literally never see that data in the first place.
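
If you want to sanity-check that from an application’s point of view, here’s a minimal Python sketch. It assumes the local 23ai container and connect string from the earlier posts (localhost:1521/FREEPDB1), and that direct password login for the RAS user works from python-oracledb the same way it did from SQLcl:

import oracledb

def visible_vector_count(user: str, password: str) -> int:
    """Count the rows of the vector store that this user is allowed to see."""
    connection = oracledb.connect(user=user, password=password,
                                  dsn="localhost:1521/FREEPDB1")
    try:
        with connection.cursor() as cursor:
            cursor.execute("select count(*) from vector.moby_dick_500_30")
            return cursor.fetchone()[0]
    finally:
        connection.close()

print(visible_vector_count("vector", "vector"))  # 0  - no ACL grants the schema owner SELECT
print(visible_vector_count("user1", "pwd1"))     # 13 - only the Chapter 12 vectors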


Basic Retrieval Augmented Generation with Oracle Vector Store in LangChain

In this earlier post, we learned how to create a Vector Store in Oracle from LangChain, the hugely popular Python library for working with Generative AI and Large Language Models. We populated it with a little data and then performed some simple vector similarity searches.

In this post, let’s expand on that to implement basic Retrieval Augmented Generation!

First, let’s talk about some concepts – if you already know this, feel free to jump ahead!

Generative AI – This is a type of Artificial Intelligence (AI) that uses a specialized form of machine learning model, called a Large Language Model (or “LLM”), to create (“generate”) new content based on a prompt from a user. It works by looking at the “tokens” that it received in the input (the “prompt”) and then figuring out what is the most probable next token in the sequence. What is a token? Well, it may be a word, or a part of a word, but we use the word “token” because it could also be part of an audio file or an image, since some of these models support other types of data, not just text. Note that it only generates one token at a time – you have to “run” the model again for every subsequent token.

Training – these models are “trained” by exposing them to very large amounts of data. Usually the data is publicly available information, collected from the Internet and/or other repositories. Training a model is very expensive, both in terms of time, and in terms of the cost of running the specialized GPU hardware needed to perform the training. You may see a model described as having “70 billion parameters” or something like that. Training is basically a process of tuning the probabilities of each of these parameters based on the new input.

When a model sees a prompt like “My Husky is a very good” it will use those probabilities to determine what comes next. In this example, “dog” would have a very high probability of being the next “token”.

Hyper-parameters – models also have extra parameters that control how they behave. These so-called “hyper-parameters” include things like “temperature”, which controls how creative the model will be, “top-K”, which controls how many options the model will consider when choosing the next token, and various kinds of “frequency penalties” that make the model more or less likely to reuse/repeat tokens. Of course these are just a few examples.
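(If you want to see what setting a hyper-parameter looks like in practice, here is a tiny illustrative sketch using the LangChain init_chat_model helper that we use later in this post; the temperature values are arbitrary.)

from langchain.chat_models import init_chat_model

# Low temperature: more deterministic, "safer" completions
precise_model = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=0.0)

# Higher temperature: more varied, more "creative" completions
creative_model = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=1.2)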

Knowledge cut-off – an important property of LLMs is that they have not been exposed to any information that was created after their training ended. So a model trained in 2023 would not know who won an election held in 2024, for example.

Hallucination – LLMs tend to “make up an answer” if they do not “know” the answer. Now, obviously they don’t really know anything in the same sense that we know things; they are working with probabilities. But if we can anthropomorphize them for a moment, they tend to “want” to be helpful, and they are very likely to offer you a very confident but completely incorrect answer if they do not have the necessary information to answer a question.


Now, of course a lot of people want to use this exciting new technology to implement solutions to help their customers or users. Chatbots are a prime example of something that is frequently implemented using Generative AI these days. But no new technology is a silver bullet, and they all come with their own challenges and issues. Let’s consider some common challenges when attempting to implement a project with Generative AI:

  • Which model to use? There are many models available, and they have all been trained differently. Some are specialized models that are trained to perform a particular task, for example summarizing a document. Other models are general purpose and can perform different tasks. Some models understand only one language (like English) and others understand many. Some models only understand text, others only images, others video, and others still are multi-modal and understand various different types of data. Models also have different licensing requirements. Some models are provided as a service, like a utility, where you typically pay some very small amount per request. Other models can be self-hosted and run on your own hardware.
  • Privacy. Very often the data that you need for your project is non-public data, and very often you do not want to share that data with a third-party organization for privacy reasons, or even regulatory reasons, depending on your industry. People are also very wary about a third-party organization using their non-public data to train future models.
  • How to “tune” the model’s hyper-parameters. As we discussed earlier, the hyper-parameters control how the model behaves. The settings of these parameters can have a significant impact on the quality of the results that are produced.
  • Dealing with knowledge cut-off. Giving the model access to information that is newer than its training data is also a key challenge. Probably the most obvious way to do this is to continue the model’s training by exposing it to this newer information. This is known as “fine-tuning”. The key challenge is that this is an extremely expensive undertaking, requiring specialized GPU hardware and very highly skilled people to plan and run the training.

Enter “Retrieval Augmented Generation” (or “RAG”), first introduced by Patrick Lewis et al., from Meta, in the 2020 paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”. RAG is a technique that gives LLMs access to non-public information and/or information created after their training, and it is orders of magnitude less expensive than fine-tuning.

The essence of RAG is to provide the LLM with the information that it needs to answer a question by “stuffing” that information into the LLM’s “context window” or “prompt” along with the actual question. It’s a bit like an open-book test. Imagine you get a question like this:

How much does a checking account from (some bank) cost? 

And let’s assume that information is not readily available on the Internet. How would you come up with the answer? You likely could not.

But if the question was more like this:

How much does a checking account from (some bank) cost?

To answer this question, consult these DOCUMENTS:

(here there would be the actual content of those documents that provide the information necessary to answer that question)

Much easier right? That’s basically what RAG is. It provides the most relevant information to the LLM so that it can answer the question.

So now, the obvious questions are – where does it get this information from, and how does it know which parts are the most relevant?

This is where our Vector Store comes in!

The set of information, the non-public data that we want the LLM to use, we call that the “corpus”. Very often the corpus will be so large that there is no reasonable way for us to just give the LLM the whole thing. Now, as I am writing this in May 2025, there are models that have very large “context windows” and could be given a large amount of data. Llama 4 was just released, as I write this, and has a context window size of 10 million tokens! So you could in fact give it a large amount of information. But models that were released as recently as six or twelve months ago have much smaller context windows.

So the approach we use is to take the corpus and split it up into small pieces, called “chunks”, and we create an “embedding vector” for each of these chunks. This vector is basically an n-dimensional numerical representation of the semantic meaning of the chunk. Chunks with similar meanings will have similar (i.e., close) vectors. Chunks with different meanings will have vectors that are further apart.
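
To make that concrete, here’s a small sketch using the same embedding model we’ll use later in this post, with a hand-rolled cosine similarity just for illustration:

import numpy as np
from langchain_huggingface import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

def cosine_similarity(a, b):
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

saint_bernard = embedding_model.embed_query("A Saint Bernard is a very large dog.")
german_shepherd = embedding_model.embed_query("A German Shepherd is a big dog.")
tennis_ball = embedding_model.embed_query("A tennis ball is small and fuzzy.")

# Similar meanings -> higher similarity (closer vectors)
print(cosine_similarity(saint_bernard, german_shepherd))
# Different meanings -> lower similarity (vectors further apart)
print(cosine_similarity(saint_bernard, tennis_ball))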

Now, visualizing an n-dimensional vector is challenging. But if n=2, it’s a lot easier. So let’s do that! Remember, in real models, n is much more likely to be in the thousands or tens of thousands, but the concepts are the same. Consider the diagram below:

In this diagram, we have only two dimensions: the vertical dimension is “largeness” – how large (or small) the thing is. The horizontal dimension is “dog-ness” – how much the thing is (or is not) a dog.

Notice that both the Saint Bernard and the German Shepherd (I hope I got those breeds right!) are large dogs. So the vectors for both of them are high on both axes, and they are very close together, because in this two-dimensional world, they are indeed very, very similar. The wolf is also large, but it is not actually a dog. Dogs are related to (descended from) wolves, so it is somewhat dog-like, but its vector is quite a distance away from the actual large dogs.

Now, look at the tennis ball! It is not large, and it is not a dog, so its vector points in almost the complete opposite direction to the large dogs.

Now, consider the question “Is a Husky a large dog?”

What we do in RAG, is we take that question, and turn that into a vector, using the exact same “embedding model” that we used to create those vectors we just looked at above, and then we see what other vectors are close to it.

Notice that the resulting vector, represented by the red dot, ended up very close to those two large dogs! So if we did a similarity search, that is, if we found the closest vectors to our question vector, what we would get back is the vectors for the Saint Bernard and the German Shepherd.

Here’s a diagram of the RAG process:

So we take the question from the user, we turn it into a vector, we find the closest vectors to it in our corpus, and then we take the actual content that those vectors were created from and give that information to the LLM to allow it to answer the question. Remember, in real life there are many more dimensions, and they are not going to be some concept that we can neatly label, like “largeness”. The actual dimensions are things that are learned by the model over many billions of iterations of weight adjustments as it was exposed to vast amounts of data. The closest (non-mathematical) analogy I can think of is Isaac Asimov’s “positronic brain” in his Robots, Empire and Foundation series, which he described as learning through countless small adjustments of uncountable numbers of weights.


Wow! That was a lot of theory! Let’s get back to some code, please!

In the previous post, we populated our vector store with just three very small quotes from Moby Dick. Now, let’s use the entire text!

Here’s the plain text version: https://www.gutenberg.org/cache/epub/2701/pg2701.txt

Here’s the same book in HTML, with some basic structure like H2 tags for the chapter headings: https://www.gutenberg.org/cache/epub/2701/pg2701-images.html

Let’s create a new notebook. If you followed along in the previous post, you can just create a new notebook in the same project and choose the same environment/kernel. If not, create a new project, then create a notebook, for example basic-rag.ipynb and create a kernel:

Click on the Select Kernel button (it’s on the top right). Select Python Environment then Create Python Environment. Select the option to create a Venv (Virtual Environment) and choose your Python interpreter. I recommend using at least Python 3.11. This will download all the necessary files and will take a minute or two.

If you created a new environment, install the necessary packages by creating and running a cell with this content. Note that you can run this even if you have a pre-existing environment, it won’t do any harm:

%pip install -qU "langchain[openai]"
%pip install oracledb
%pip install langchain-community langchain-huggingface

Now, create and run this cell to set your OpenAI API key:

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini", model_provider="openai")

model.invoke("Hello, world!")

Paste your key in when prompted (see the previous post if you need to know how to get one) and confirm you got the expected response from the model.

Note: You could, of course, use a different model if you wanted to. See the LangChain model documentation for options.

Now, let’s connect to the database by creating and running this cell (this assumes that you started the database container and created the user as described in the previous post!)

import oracledb

username = "vector"
password = "vector"
dsn = "localhost:1521/FREEPDB1"

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn)
    print("Connection successful!")
except Exception as e:
    print(e)
    print("Connection failed!")

Ok, now we are ready to read that document and create our vector embeddings. But how? In the previous post we manually created some excerpts, but now we want to read the whole document.

Enter Document Loaders! Take a look at that page; LangChain has hundreds of different document loaders that understand all kinds of document formats.

Let’s try the basic web loader, create and run this cell to install it:

%pip install -qU langchain_community beautifulsoup4

Now create and run this cell to initialize the document loader:

from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://www.gutenberg.org/cache/epub/2701/pg2701-images.html")

Now load the documents by running this cell:

docs = loader.load()

If you’d like, take a look at the result, by running this cell:

docs[0]

Well, that is just one big document, which is not so helpful. We want to split that document up into smaller chunks so we can create vectors for each smaller part. Let’s use a document splitter instead.

Install a splitter by running this cell:

%pip install -qU langchain-text-splitters

Note: Check out this page for more information about the available splitters. We are going to use the HTMLHeaderTextSplitter. Run this cell:

from langchain_text_splitters import HTMLHeaderTextSplitter

url = "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html"

headers_to_split_on = [
    ("h1", "Header 1"),
    ("h2", "Header 2"),
    ("h3", "Header 3"),
    ("h4", "Header 4"),
]

html_splitter = HTMLHeaderTextSplitter(headers_to_split_on)

html_header_splits = html_splitter.split_text_from_url(url)

Let’s see what that did, run this cell:

html_header_splits

You’ll see a long list of Documents and if you look carefully, you can see that it has maintained the structure information.

Great! That’s a lot better.

Now, let’s suppose we wanted to constrain the size of the chunks. Some of those might be too big, so we might want to split them even further. We can do that with a RecursiveCharacterTextSplitter.

Let’s say we wanted chunks no bigger than 500 characters, with an overlap of 30. Now this might not be a good idea, but just for the sake of the example, let’s do it by running this cell:

from langchain_text_splitters import RecursiveCharacterTextSplitter

chunk_size = 500
chunk_overlap = 30
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size, chunk_overlap=chunk_overlap
)

# Split
splits = text_splitter.split_documents(html_header_splits)

You can take a look at a few of the chunks by running this cell:

splits[80:85]

Ok, great! Next, we need to create our embeddings and populate our vector store.

Install the dependencies, if you have not already, by running this cell:

%pip install langchain-community langchain-huggingface

And let’s create the vector store! Run this cell:

from langchain_community.vectorstores import oraclevs
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_store = OracleVS.from_documents(
    splits,
    embedding_model,
    client=connection,
    table_name="moby_dick_500_30",
    distance_strategy=DistanceStrategy.COSINE,
)

We are using the same model as we did in the previous post, but now we are passing in the splits we just created: 500-character chunks derived from the larger, structure-aware chunks of the HTML document. And we called our vector store table moby_dick_500_30 to make it a little easier to remember what we put in there.

After that cell has finished (it might take a few minutes), you can take a look to see what is in the vector store by running this command in your terminal window:

docker exec -i db23ai sqlplus vector/vector@localhost:1521/FREEPDB1 <<EOF
select table_name from user_tables;
describe moby_dick_500_30;
column id format a20;
column text format a30;
column metadata format a30;
column embedding format a30;
set linesize 150;
select * from moby_dick_500_30
fetch first 3 rows only;
EOF

You should get something similar to this:

Let’s try our searches again, run this cell:

query = 'Where is Rokovoko?'
print(vector_store.similarity_search(query, 1))

query2 = 'What does Ahab like to do after breakfast?'
print(vector_store.similarity_search(query2, 1))

You can change that 1 to a larger number now, since you have many more vectors, to see what you get!

Ok, now we have all the pieces we need and we are ready to implement the RAG!

The most basic way to implement RAG is to use a “retriever” – we can grab one from our vector store like this:

retriever = vector_store.as_retriever()

Try it out by asking a question:

docs = retriever.invoke("Where is Rokovoko?")

docs

You’ll get something like this:

Nearly there!
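
By the way, if you want the retriever to return a different number of chunks, you can pass search_kwargs when you create it. A quick sketch (k=2 is just an example value):

# search_kwargs are passed through to the underlying similarity search
retriever = vector_store.as_retriever(search_kwargs={"k": 2})

docs = retriever.invoke("Where is Rokovoko?")
len(docs)  # should be 2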

Now, we want to give the LLM a good prompt to tell it what to do, and include the retrieved documents. Let’s use a standard prompt for now:

from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_messages = prompt.invoke(
    {"context": "(context goes here)", "question": "(question goes here)"}
).to_messages()

assert len(example_messages) == 1
print(example_messages[0].content)

The prompt looks like this:

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: (question goes here) 
Context: (context goes here) 
Answer:

Ok, now to put it all together. Now, in real life we’d probably want to use LangGraph at this point, and we’d want to think about including things like memory, ranking the results from the vector search, citations/references (“grounding” the answer), and streaming the output. But that’s all for another post! For now, let’s just do the most basic implementation:

question = "..."

retrieved_docs = vector_store.similarity_search(question)
docs_content = "\n\n".join(doc.page_content for doc in retrieved_docs)
prompt_val = prompt.invoke({"question": question, "context": docs_content})
answer = model.invoke(prompt_val)

answer

You should get an answer similar to this:

AIMessage(content='Rokovoko is an island located far away to the West and South, as mentioned in relation to Queequeg, a native of the island. It is not found on any maps, suggesting it may be fictional.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 402, 'total_tokens': 448, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_54eb4bd693', 'id': 'chatcmpl-BaW3g2omlCY0l6LwDkC9Ub8Ls3V88', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--86f14be2-9c8f-43c9-ae89-259db1c640bd-0', usage_metadata={'input_tokens': 402, 'output_tokens': 46, 'total_tokens': 448, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

That’s a pretty good answer!

Well, there you go, we covered a lot of ground in this post, but that’s just a very basic RAG. Stay tuned to learn about implementing a more realistic RAG in the next post!


Getting started with Oracle Vector Store support in LangChain

In this post, I would like to show you the basics of how to use the Oracle Vector Store support in LangChain. I am using Visual Studio Code with the Python and Jupyter extensions from Microsoft installed. I will show more detailed usage in future posts!

Prefer to watch a video? Check it out here:

To get started, create a new project in Visual Studio Code, and then create a new Jupyter Notebook using File > New File… then choose Jupyter Notebook as the type of file, and save your new file as getting_started.ipynb.

First, we need to set up the Python runtime environment. Click on the Select Kernel button (it’s on the top right). Select Python Environment then Create Python Environment. Select the option to create a Venv (Virtual Environment) and choose your Python interpreter. I recommend using at least Python 3.11. This will download all the necessary files and will take a minute or two.

In this example, we will use OpenAI for our chat model. You’ll need to get an API Key from OpenAI, which you can do by logging into https://platform.openai.com/settings/organization/api-keys and creating a key. Of course you could use a different model, including a self-hosted model so that you don’t have to send your data outside your organization. I’ll cover that in future posts, stay tuned!

In the first cell, check that the type is Python and enter this code:

%pip install -qU "langchain[openai]"

Press Shift+Enter or click on the Run icon to run this code block. This will also take a minute or so to install the LangChain library for OpenAI.

Now create a second cell and paste in this code:

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4o-mini", model_provider="openai")

Run this block, and when it prompts you for your key, paste it in; it will start with something like sk-proj and have a long string of mostly letters and numbers after that. This will save your key in the environment so that you don’t have to keep entering it each time.

Now, we are ready to talk to the LLM! Let’s try a simple prompt. Create a new cell and enter this code:

model.invoke("Hello, world!")

Run this cell and observe the output. It should look something like this:

Great, now we are ready to connect to a vector store. If you don’t already have one, start up an instance of Oracle Database 23ai in a container on your machine. Run this command in a terminal window (not the notebook):

docker run -d --name db23ai \
  -p 1521:1521 \
  -e ORACLE_PWD=Welcome12345 \
  -v db23ai-volume:/opt/oracle/oradata \
  container-registry.oracle.com/database/free:latest

This will start up an Oracle Database 23ai Free instance in a container. It will have a PDB called FREEPDB1 and the password for PDBADMIN (and SYS and SYSTEM) will be Welcome12345.

Now, run the following command to create an Oracle user with appropriate permissions to create a vector store:

docker exec -i db23ai sqlplus sys/Welcome12345@localhost:1521/FREEPDB1 as sysdba <<EOF
alter session set container=FREEPDB1;
create user vector identified by vector;
grant connect, resource, unlimited tablespace, create credential, create procedure, create any index to vector;
commit;
EOF

Let’s connect to the database! First we’ll install the oracledb library. Create a new cell and enter this code:

%pip install oracledb

Run this code block to install the library.

Now create a new code block with this code:

import oracledb

username = "vector"
password = "vector"
dsn = "localhost:1521/FREEPDB1"

try:
    connection = oracledb.connect(
        user=username, 
        password=password, 
        dsn=dsn)
    print("Connection successful!")
except Exception as e:
    print(e)
    print("Connection failed!")

Run this code block. You should see the output “Connection successful!”

Now, let’s install the dependencies we will need to load some documents into the vector store. Create a new cell with this code and run it:

%pip install langchain-community langchain-huggingface

Now, import the things we will need by creating a new cell with this code and running it:

from langchain_community.vectorstores import oraclevs
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

We are going to need some documents to load into the vector store, so let’s define some to use for an example. In real life, you’d probably want to use your own non-public documents to load a vector store if you were building a chatbot or using retrieval augmented generation. Create and run a new cell with this code:

documents_json_list = [
    {
        "id": "moby_dick_2701_P1",
        "text": "Queequeg was a native of Rokovoko, an island far away to the West and South. It is not down in any map; true places never are.",
        "link": "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0012",
    },
    {
        "id": "moby_dick_2701_P2",
        "text": "It was not a great while after the affair of the pipe, that one morning shortly after breakfast, Ahab, as was his wont, ascended the cabin-gangway to the deck. There most sea-captains usually walk at that hour, as country gentlemen, after the same meal, take a few turns in the garden.",
        "link": "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0036",
    },
    {
        "id": "moby_dick_2701_P3",
        "text": "Now, from the South and West the Pequod was drawing nigh to Formosa and the Bashee Isles, between which lies one of the tropical outlets from the China waters into the Pacific. And so Starbuck found Ahab with a general chart of the oriental archipelagoes spread before him; and another separate one representing the long eastern coasts of the Japanese islands—Niphon, Matsmai, and Sikoke. ",
        "link": "https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0109",
    },
]

Now, let’s load them into a LangChain documents list with some metadata. Create and run a cell with this code:

# Create Langchain Documents

documents_langchain = []

for doc in documents_json_list:
    metadata = {"id": doc["id"], "link": doc["link"]}
    doc_langchain = Document(page_content=doc["text"], metadata=metadata)
    documents_langchain.append(doc_langchain)

Ok, great. Now we can create a vector store and load those documents. Create and run a cell with this code:

model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_store = OracleVS.from_documents(
    documents_langchain,
    model,
    client=connection,
    table_name="Documents_COSINE",
    distance_strategy=DistanceStrategy.COSINE,
)

Let’s have a look in the database and see what was created. Run this code in your terminal:

docker exec -i db23ai sqlplus vector/vector@localhost:1521/FREEPDB1 <<EOF
select table_name from user_tables;
describe documents_cosine;
column id format a20;
column text format a30;
column metadata format a30;
column embedding format a30;
set linesize 150;
select * from documents_cosine;
EOF

You should see output similar to this:

SQL>
TABLE_NAME
--------------------------------------------------------------------------------
DOCUMENTS_COSINE

SQL>  Name                                         Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL RAW(16)
 TEXT                                               CLOB
 METADATA                                           JSON
 EMBEDDING                                          VECTOR(768, FLOAT32)

SQL> SQL> SQL> SQL> SQL> SQL>
ID                   TEXT                           METADATA                       EMBEDDING
-------------------- ------------------------------ ------------------------------ ------------------------------
957B602A0B55C487     Now, from the South and West t {"id":"moby_dick_2701_P3","lin [9.29364376E-003,-5.70030287E-
                     he Pequod was drawing nigh to  k":"https://www.gutenberg.org/ 002,-4.62282933E-002,-1.599499
                     Formosa and the Bash           cache/epub/2701/pg27           58E-002,

A8A71597D56432FD     Queequeg was a native of Rokov {"id":"moby_dick_2701_P1","lin [4.28722538E-002,-8.80071707E-
                     oko, an island far away to the k":"https://www.gutenberg.org/ 003,3.56001826E-003,6.765306E-
                      West and South. It            cache/epub/2701/pg27           003,

E7675836CF07A695     It was not a great while after {"id":"moby_dick_2701_P2","lin [1.06763924E-002,3.91203648E-0
                      the affair of the pipe, that  k":"https://www.gutenberg.org/ 04,-1.01576066E-002,-3.5316135
                     one morning shortly            cache/epub/2701/pg27           7E-002,

Now, let’s do a vector similarity search. Create and run a cell with this code:

query = 'Where is Rokovoko?'
print(vector_store.similarity_search(query, 1))

query2 = 'What does Ahab like to do after breakfast?'
print(vector_store.similarity_search(query2, 1))

This will find the one (1) nearest match in each case. You should get an answer like this:

[Document(metadata={'id': 'moby_dick_2701_P1', 'link': 'https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0012'}, page_content='Queequeg was a native of Rokovoko, an island far away to the West and South. It is not down in any map; true places never are.')]

[Document(metadata={'id': 'moby_dick_2701_P2', 'link': 'https://www.gutenberg.org/cache/epub/2701/pg2701-images.html#link2HCH0036'}, page_content='It was not a great while after the affair of the pipe, that one morning shortly after breakfast, Ahab, as was his wont, ascended the cabin-gangway to the deck. There most sea-captains usually walk at that hour, as country gentlemen, after the same meal, take a few turns in the garden.')]
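
One nice detail: because we stored the link in the metadata, each result can be traced back to its source. A quick sketch:

# Print the Gutenberg link and a preview of the text for the closest matches
for doc in vector_store.similarity_search("Where is Rokovoko?", 3):
    print(doc.metadata["link"])
    print(doc.page_content[:60], "...")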

Well, there you go, that’s the most basic example of creating a vector store, loading some documents into it and doing a simple similarity search. Stay tuned to learn about more advanced features!


Running Oracle Autonomous Database in a container

Did you know that you can easily run Oracle Autonomous Database in a container on your local machine? This is great for development. It’s totally free, and you don’t even need to authenticate to pull the image. It also includes Oracle REST Data Services, APEX, Database Actions and the MongoDB API, so you get a nice built-in browser-based UI to work with your database. The free version does have a 20GB limit on database size, but for development purposes, that’s fine.

Prefer to watch a video? Watch this content on YouTube instead

To start up a database, you can use this command; just replace the “xxxxxx”s with proper passwords. Note that the volume is needed so data will be persisted across container restarts; if you leave that out, you’ll get a new empty database every time you restart the container:

docker run -d \
  -p 1521:1522 \
  -p 1522:1522 \
  -p 8443:8443 \
  -p 27017:27017 \
  -e WALLET_PASSWORD=xxxxxx \
  -e ADMIN_PASSWORD=xxxxxx \
  --cap-add SYS_ADMIN \
  --device /dev/fuse \
  --volume adb-free-volume:/data \
  --name adb-free \
  container-registry.oracle.com/database/adb-free:latest-23ai

The ports listed are for the following access methods:

  • 1521 TLS
  • 1522 mTLS
  • 8443 HTTPS port for ORDS, APEX and Database Actions
  • 27017 MongoDB API

Once the database has started up, you can access the web UI using these URLs:

Here’s what the Database Actions login page looks like, you can log in with the user “admin” and the password you specified:

When you sign in, you will see the launchpad, from where you can access various tools:

For example, you could open the SQL tool and try executing a statement:

You may also want to connect to your database using other tools like Oracle SQL Developer (which is a Visual Studio Code extension) or SQLcl (which is a command line tool), or from a program. To do this, you will probably want to grab the wallet, read on!

Connecting to the database

If you want to use mTLS, you can get the wallet by copying it from the image using this command, just provide the desired destination path in the last argument:

docker cp adb-free:/u01/app/oracle/wallets/tls_wallet /path/to/wallet

Note that the address will be ‘localhost’ in the tnsnames.ora, so you will need to update that if necessary.

To use the wallet, set your TNS_ADMIN environment variable:

export TNS_ADMIN=/path/to/wallet

The following TNS aliases are provided, for mTLS:

  • myatp_medium
  • myatp_high
  • myatp_low
  • myatp_tp
  • myatp_tpurgent

And for TLS:

  • myatp_medium_tls
  • myatp_high_tls
  • myatp_low_tls
  • myatp_tp_tls
  • myatp_tpurgent_tls

Here’s an example of connecting with SQLcl:

$ TNS_ADMIN=/path/to/wallet sql admin/xxxxxx@myatp_high

SQLcl: Release 24.1 Production on Fri Apr 25 10:41:46 2025

Copyright (c) 1982, 2025, Oracle.  All rights reserved.

Last Successful login time: Fri Apr 25 2025 10:41:48 -04:00

Connected to:
Oracle Database 23ai Enterprise Edition Release 23.0.0.0.0 - Production
Version 23.6.0.24.11

SQL> select sysdate;

SYSDATE
____________
25-APR-25

Here’s an example of connecting from SQL Developer. When you create the connection, just choose the location of the wallet (tnsnames.ora file) and it will let you select the TNS name to connect to:

Enjoy!


How to read the content of a JMS message using PL/SQL

This is just a short post – but all the details are in this post from Rob Van Wijk.

Today I wanted to read the contents of a JMS Text Message sitting in a queue. I wrote a Spring Boot microservice that sends a message, and I have not written the one that receives and processes the message yet, so I wanted to look at the message on the queue to check it was correct.

So I went and did a good old “select user_data from deposits_qt” and stared at the answer: “Object”. Hmmm, not what I wanted.

After a quick bit of Googling, I found Rob’s post which told me exactly what I needed to know. Yay! Thanks Rob!

Then I changed my query to this:

select qt.user_data.text_vc from account.deposits_qt qt;

And I got exactly what I needed:

{"accountId":2,"amount":200}

Fantastic! Thanks a lot Rob!
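
And if you ever want to do the same peek from Python rather than SQLcl, here’s a minimal sketch with python-oracledb; the credentials and connect string are placeholders, and the schema and queue table names are the ones from this post:

import oracledb

connection = oracledb.connect(user="account", password="<password>",
                              dsn="<connection_string>")

with connection.cursor() as cursor:
    # text_vc holds the text payload of a JMS Text Message (see Rob's post)
    cursor.execute("select qt.user_data.text_vc from account.deposits_qt qt")
    for (payload,) in cursor:
        print(payload)  # e.g. {"accountId":2,"amount":200}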


Vote for my session at VMWare Explore (Spring track)!

Vote for my People’s Choice Session “Experiences and lessons learnt building a multi-cloud #SpringBoot backend (ID 2002)” to be featured at #VMwareExplore 2023 Las Vegas! Place your vote by May 26: https://lnkd.in/eiRi-YF7

Register for VMWare Explore here. Learn more about Oracle Backend for Spring Boot here.
