High-Performance Java Solution for Oracle Table Exports
We'll use a Secret Oracle Function to build a resilient migration tool for tables consisting of thousands to billions of rows.
This article is not about scaling. At least not in the way you’d imagine. It is more about being efficient when you are dealing with production-size data without serving it in the traditional sense.
Now, before you shout the de-abbreviated E-T-L, let me tell you that even though this utility will come in handy for the E of ETL, its purpose is not to fit into production systems as a cog, but to be a powerful wand that you can wield when needed.
Imagine a situation where your team decided to do a proof of concept on a new analytical database and asked you to copy the biggest tables in your current Oracle DB into this new database, hook your app up to it, and get some numbers. What if they wanted to compare several OLAP or OLTP databases? You would have to copy the tables over again for each performance benchmark. In such a case, exporting the tables to a reusable, universal file format makes sense.
And assuming you code in Java, how would you proceed to write code that will do this for you?
That’s what we are doing here.
This solution combines Oracle's analytical functions with Java’s threading capabilities, simplified by the Spring framework to create a resilient export utility.
Challenges in Exporting
The main challenge here is increasing throughput enough to extract hundreds of gigabytes of data on a single machine (an M4 Mac, for instance) within a reasonable time - say a workday (8 hours).
Consider a single long-running connection trying to pull out an entire table of 500 million rows. Let’s say the table consists of 10 fields (columns) with a mix of numeric and textual content; for the purposes of number crunching, let’s assume 1 KB per row.
500M rows * 1 KB/row = 500 GB of data
Over a VPN connection to an on-premise Oracle database, with real-world network conditions and Oracle's inherent overhead, we're looking at roughly 2-3 MB/s of sustained transfer for a single connection. At ~2.5 MB/s, 500 GB takes about 200,000 seconds, meaning a single-threaded approach would run for over 55 hours - clearly not feasible.
Additionally, you’d want to write the results into separate CSV files, which adds complexity, since a single 500 GB CSV is also unrealistic.
We definitely have to break the data into multiple chunks and introduce parallelism. But how?
Breaking down the Data
This is where we introduce our secret Oracle function - NTILE.

Let's explore how NTILE works and why it's perfect for our use case:
SELECT column_list,
       NTILE(8) OVER (ORDER BY ROWID) as chunk_id
FROM massive_table
This analytical function partitions our 500M rows into 8 equal chunks. Each chunk gets a numeric identifier from 1 to 8, distributed uniformly (balanced) across the dataset. The ROWID ordering ensures physical data locality, maximizing Oracle's read performance.
What makes this particularly elegant is that we can isolate any chunk:
SELECT * FROM (
    SELECT column_list,
           NTILE(8) OVER (ORDER BY ROWID) as chunk_id
    FROM massive_table
) WHERE chunk_id = 1
The uniformity of the chunk sizes, and the guarantee that chunks can be reassembled in order after extraction, now allows us to introduce parallelism. Note that we are not going to hardcode this 8; it is only here to illustrate the splitting logic. In practice, we will not limit ourselves to 8 parallel long-running threads, any of which could be interrupted at some point during the 8 hours we plan to complete this extraction in.
Understanding the Solution Architecture
All the code is available in this gist, so we won't be repeating all of it here. But we'll analyze the architectural patterns and underlying solution logic that makes this approach superior.
The solution leverages several key design patterns:
Worker Pool Pattern: Instead of a fixed number of threads, we dynamically create workers based on system capabilities
Checkpoint/Resume Pattern: Each chunk operation is atomic and restartable
Producer-Consumer Pattern: Oracle produces balanced data segments, Java workers consume them
State Management Pattern: External state tracking allows crash recovery
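Before dissecting them, it helps to have a mental model of the two small data structures these patterns operate on. The records below are a hypothetical sketch of a chunk boundary (ChunkRange) and the export plan plus its completion status (State); the actual classes in the gist may differ in naming and detail.
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch - the real classes in the gist may look different.
record ChunkRange(String startId, String endId, int id) {
    String getStartId() { return startId; }
    String getEndId()   { return endId; }
    int getId()         { return id; }
}

class State {
    private final List<ChunkRange> chunks;                                // the execution plan
    private final Set<Integer> completed = ConcurrentHashMap.newKeySet(); // the execution status

    State(List<ChunkRange> chunks) { this.chunks = chunks; }

    List<ChunkRange> getChunks()         { return chunks; }
    boolean isComplete(ChunkRange chunk) { return completed.contains(chunk.getId()); }
    boolean isComplete()                 { return completed.size() == chunks.size(); }
    void markComplete(ChunkRange chunk)  { completed.add(chunk.getId()); }
}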
Let's dissect these elements in more detail:
Dynamic Parallelism Calculation
The implementation intelligently computes the optimal degree of parallelism:
private int calculateChunkSize(long totalRows, int threads) {
    int optimalChunks = threads * 3; // 3x chunks per thread
    return (int) Math.max(1000, totalRows / optimalChunks);
}
This approach creates a work queue 3x larger than our thread pool, ensuring maximum thread utilization even when different chunks complete at different rates. No worker sits idle waiting for a single oversized chunk to finish, so throughput stays limited by the database and the network rather than by scheduling gaps.
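To make the numbers concrete, here is a small self-contained worked example using the illustrative figures from earlier (500 million rows, 12 threads); the values are assumptions for the arithmetic, not measurements.
public class ChunkMath {

    // Same logic as calculateChunkSize above.
    static int calculateChunkSize(long totalRows, int threads) {
        int optimalChunks = threads * 3; // 3x chunks per thread
        return (int) Math.max(1000, totalRows / optimalChunks);
    }

    public static void main(String[] args) {
        long totalRows = 500_000_000L; // illustrative table size from the earlier estimate
        int threads = 12;

        int chunkSize = calculateChunkSize(totalRows, threads);          // ~13.9M rows per chunk
        long chunkCount = Math.max(threads * 3L, totalRows / chunkSize); // 36 chunks

        System.out.printf("chunkSize=%,d rows, chunkCount=%d%n", chunkSize, chunkCount);
        // At ~1 KB per row, each chunk becomes a CSV of roughly 14 GB.
    }
}
The chunkCount expression mirrors the Math.max(threads * 3, totalRows / chunkSize) argument passed to NTILE in the boundary query shown later.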
Stateful Execution Engine
Let's examine the state management pattern implemented in the TableExportService:
public void export(String tableName, int threads) {
    exportDir = setupExportDirectories(tableName);

    State loaded = loadOrCreateState(exportDir);
    if (loaded.getChunks().isEmpty()) {
        loaded = initializeExport(tableName, threads);
        saveState(exportDir, loaded);
    }
    final State state = loaded; // effectively final so the lambdas below can capture it

    List<CompletableFuture<Void>> tasks = state.getChunks().stream()
            .filter(chunk -> !state.isComplete(chunk))
            .map(chunk -> CompletableFuture.runAsync(
                            () -> worker.processChunk(tableName, chunk, exportDir),
                            executor)
                    .whenComplete((result, ex) -> handleCompletion(state, chunk, ex)))
            .collect(Collectors.toList());

    CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();

    if (state.isComplete()) {
        cleanup(exportDir);
    }
}
This stateful approach separates the execution plan (what chunks exist) from execution status (which chunks are complete). If the process is interrupted at any point, restarting picks up exactly where it left off, skipping completed chunks.
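The gist's handleCompletion is where that checkpointing happens. The body below is a sketch rather than a copy of the gist; it assumes an SLF4J logger, the markComplete accessor from the earlier State sketch, and that exportDir is kept as a field on the service (as in the export method above).
// Sketch only - record progress and persist it so a restart can skip finished chunks.
private synchronized void handleCompletion(State state, ChunkRange chunk, Throwable ex) {
    if (ex != null) {
        // Leave the chunk marked incomplete; the next run will retry it.
        log.warn("Chunk {} failed, will retry on the next run", chunk.getId(), ex);
        return;
    }
    state.markComplete(chunk);
    saveState(exportDir, state); // checkpoint after every successful chunk
}
Persisting after every successful chunk keeps the checkpoint granular: a crash costs at most the chunks that were in flight at that moment.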
Dynamic ROWID Range Distribution
The most elegant aspect of the solution is how it leverages Oracle's analytical functions to determine optimal chunk boundaries:
List<ChunkRange> chunks = jdbcTemplate.query(
        "SELECT MIN(ROWID) start_id, MAX(ROWID) end_id " +
        "FROM (SELECT ROWID, NTILE(?) OVER (ORDER BY ROWID) grp " +
        "FROM " + tableName + ") t GROUP BY grp",
        (rs, row) -> new ChunkRange(
                rs.getString("start_id"),
                rs.getString("end_id"),
                row),
        Math.max(threads * 3, totalRows / chunkSize)
);
This query distributes the work optimally based on physical storage patterns. Unlike naive range-based partitioning with guessed bounds (WHERE ROWID BETWEEN X AND Y), NTILE ensures balanced chunk sizes regardless of data distribution or row sizes.
Performance Optimizations
Beyond the basic architecture, several critical optimizations enhance performance:
JDBC Tuning
The prepareStatement method optimizes JDBC for sequential reads:
PreparedStatement stmt = conn.prepareStatement(sql,
        ResultSet.TYPE_FORWARD_ONLY,  // Sequential access
        ResultSet.CONCUR_READ_ONLY    // Read-only for performance
);
stmt.setFetchSize(FETCH_SIZE);        // Control batch size
Setting TYPE_FORWARD_ONLY informs Oracle we'll be reading sequentially, enabling read-ahead optimizations. The fetch size balances memory usage against network round-trips.
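Putting the tuning together with the NTILE boundaries, a complete per-chunk prepareStatement could look roughly like the sketch below. The SQL text and the FETCH_SIZE value of 1000 (taken from the memory figures later in the article) are assumptions, not code copied from the gist.
private static final int FETCH_SIZE = 1000;

private PreparedStatement prepareStatement(Connection conn, String tableName, ChunkRange chunk)
        throws SQLException {
    // Each worker reads only the rows between its NTILE-derived boundaries.
    String sql = "SELECT * FROM " + tableName + " WHERE ROWID BETWEEN ? AND ?";

    PreparedStatement stmt = conn.prepareStatement(sql,
            ResultSet.TYPE_FORWARD_ONLY,
            ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(FETCH_SIZE);

    // ROWIDs are treated as opaque strings (see the ROWID handling note below).
    stmt.setString(1, chunk.getStartId());
    stmt.setString(2, chunk.getEndId());
    return stmt;
}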
Atomic File Operations
The ExportWorker implements a write-then-move pattern for crash resilience:
Path tmpFile = exportDir.resolve("tmp").resolve(chunk.getId() + ".csv.tmp");
Path targetFile = exportDir.resolve("chunks").resolve(chunk.getId() + ".csv");

try (var writer = new CSVWriter(Files.newBufferedWriter(tmpFile))) {
    jdbcTemplate.query(
            conn -> prepareStatement(conn, tableName, chunk),
            rs -> writeRow(rs, writer)
    );
}
// Move only after the writer has been closed and flushed, so the target
// directory never sees a partially written chunk.
Files.move(tmpFile, targetFile, StandardCopyOption.ATOMIC_MOVE);
This pattern ensures that a chunk file either exists in complete form or not at all. Partial chunk files never reach the output directory, eliminating data corruption risks.
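The writeRow callback used above is not spelled out in this article; assuming OpenCSV's CSVWriter (which the memory figures below refer to), a minimal version could look like this.
// Minimal sketch of writeRow: turn the current ResultSet row into a String[]
// and hand it to OpenCSV. Type formatting and null handling are simplified.
private void writeRow(ResultSet rs, CSVWriter writer) throws SQLException {
    int columnCount = rs.getMetaData().getColumnCount();
    String[] row = new String[columnCount];
    for (int i = 1; i <= columnCount; i++) {
        row[i - 1] = rs.getString(i); // getString renders most Oracle column types as text
    }
    writer.writeNext(row);
}
Because rows are handled one at a time through the RowCallbackHandler, memory stays bounded by the JDBC fetch size rather than by the chunk size.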
Memory Footprint Analysis
The memory consumption remains remarkably stable regardless of table size:
JDBC ResultSet: Controlled by fetch size (1000 rows per thread)
CSV Buffer: Managed by OpenCSV's BufferedWriter (8KB per thread)
Thread Pool: Fixed overhead based on thread count
State Management: Negligible (~1KB regardless of table size)
Even with 12 concurrent threads, the process consumes less than 1GB of heap memory, making it suitable for environments with memory constraints.
Implementation Challenges and Solutions
While the core approach is elegant, we encountered several challenges in the implementation:
ROWID Format Handling
Oracle's ROWID format can vary between database versions and configurations. Our solution handles this transparently:
// ROWID is treated as an opaque string identifier
stmt.setString(1, chunk.getStartId());
stmt.setString(2, chunk.getEndId());
Production Deployment Considerations
For production usage, several additional considerations apply:
Network Constraints
When dealing with remote databases, network throughput often becomes the limiting factor. Our benchmarks across different network environments showed:
LAN connection: ~25 MB/s (network-limited)
VPN connection: ~8 MB/s (VPN tunnel limited)
WAN connection: ~3 MB/s (internet limited)
To account for these limitations, the application should dynamically adjust thread counts and fetch sizes based on network conditions.
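One way to make those knobs adjustable without code changes is to externalize them as Spring configuration. The class below is a hypothetical sketch assuming Spring Boot's @ConfigurationProperties is available; the property names are made up for illustration.
import org.springframework.boot.context.properties.ConfigurationProperties;

// Hypothetical configuration holder - property names are illustrative, not from the gist.
@ConfigurationProperties(prefix = "export")
public class ExportProperties {

    /** Number of parallel workers; lower this on slow VPN/WAN links. */
    private int threads = 8;

    /** JDBC fetch size; smaller values mean lighter round-trips on constrained networks. */
    private int fetchSize = 1000;

    public int getThreads() { return threads; }
    public void setThreads(int threads) { this.threads = threads; }

    public int getFetchSize() { return fetchSize; }
    public void setFetchSize(int fetchSize) { this.fetchSize = fetchSize; }
}
With something like this in place, the same jar can run with export.threads=4 over a WAN link and export.threads=12 on a LAN, without touching the code.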
Database Load Management
To avoid impacting production database performance, the export process can be throttled:
// Add artificial delay between chunks to limit database load
try {
    Thread.sleep(delayBetweenChunks);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
This approach allows administrators to balance export throughput against database load.
Conclusion
The Oracle table export solution demonstrates how combining SQL's analytical capabilities with Java's concurrency model creates a sophisticated, resilient data processing pipeline. By leveraging NTILE for balanced work distribution and implementing a stateful execution engine, we've created a tool capable of processing hundreds of gigabytes efficiently.
The key insights from this project extend beyond this specific use case:
Recognize when complex problems can be simplified by leveraging native database capabilities
Design for resilience from the beginning, not as an afterthought
Optimize for real-world constraints (network, memory, interruptions) rather than theoretical performance
For anyone dealing with large-scale data extraction from Oracle, this approach provides a template that can be adapted to various scenarios, from migrations to backups to analytical exports.