A comprehensive Apache Arrow Flight server and client implementation in Java that demonstrates Flight RPC functionality, including PollFlightInfo for long-running queries that exceed network load balancer timeout limits.
- Basic Flight Operations: Server with sample data, client testing, flight listing
- PollFlightInfo Implementation: Non-blocking long-running query support following Flight specification
- Multiple Query Types: 1 minute to 2 hours duration for testing various scenarios
- AWS Deployment: Production-ready deployment with Network Load Balancer
- Thread Safety: Concurrent query execution with real-time progress tracking
- NLB Timeout Resilience: Handles queries exceeding load balancer timeout limits
This application can run in multiple modes:
- Server Mode: Starts a Flight server that provides sample data and long-running queries
- Client Mode: Connects to a Flight server and retrieves data
- PollFlightInfo Client: Tests long-running queries with non-blocking polling
- Java 17 or higher
- Maven 3.6 or higher
mvn clean compile
mvn clean package
This creates an executable JAR file at target/simple-flight-server-1.0-SNAPSHOT.jar (~24MB) that includes all dependencies and can run on any server with Java 17+.
Important: This project includes a .mvn/jvm.config file that automatically applies the required JVM arguments for Apache Arrow.
Start the Flight server on port 8815:
mvn exec:java -Dexec.mainClass="org.example.Main" -Dexec.args="-server"
Connect to a local server:
mvn exec:java -Dexec.mainClass="org.example.Main"
Connect to an AWS-deployed server:
mvn exec:java -Dexec.mainClass="org.example.AWSFlightClient" -Dexec.args="your-load-balancer-dns.elb.amazonaws.com"
Note: Due to Maven exec plugin configuration, you may need to use Method 2 for the AWS client.
First, compile the project:
mvn compile

java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.Main -server

java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.Main

java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.AWSFlightClient your-load-balancer-dns.elb.amazonaws.com

./run-server.sh
./run-client.sh

If you prefer to run without the .mvn/jvm.config file:
mvn exec:java -Dexec.mainClass="org.example.Main" \
-Dexec.args="-server" \
-Dexec.jvmArgs="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"

mvn exec:java -Dexec.mainClass="org.example.Main" \
-Dexec.jvmArgs="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"

After building the project with mvn clean package, you can run the application on any server with Java 17+ using the generated JAR file.
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-jar target/simple-flight-server-1.0-SNAPSHOT.jar -server

java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-jar target/simple-flight-server-1.0-SNAPSHOT.jar

java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-jar target/simple-flight-server-1.0-SNAPSHOT.jar your-load-balancer-dns.elb.amazonaws.com

Note: The main JAR only supports local client connections. For AWS connections, use the AWSFlightClient class with Method 2 above.
mvn clean compile
Test the 65-second fixed polling scenario on AWS:
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --fixed-polling your-load-balancer-dns.elb.amazonaws.com

# Method 1: Maven
mvn exec:java -Dexec.mainClass="org.example.Main" -Dexec.args="-server"
# Method 2: Java with classpath (recommended)
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.Main -server
# Method 3: Convenience script
./run-server.sh

# Method 1: Maven (local connection)
mvn exec:java -Dexec.mainClass="org.example.Main"
# Method 2: Java with classpath - Local connection
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.Main
# Method 2: Java with classpath - AWS connection
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.AWSFlightClient your-load-balancer-dns.elb.amazonaws.com
# Method 3: Convenience script (local connection)
./run-client.sh

This implementation includes comprehensive PollFlightInfo support for long-running queries that exceed network load balancer timeout limits.
| Query Type | Duration | Use Case |
|---|---|---|
| medium-query | 1 minute | Fixed polling tests |
| long-query | 2 minutes | Standard testing |
| very-long-query | 5 minutes | Extended testing |
| ultra-long-query | 2 hours | NLB timeout testing |
Tests a client that polls for exactly 65 seconds and retrieves data:
# AWS testing (recommended)
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --fixed-polling your-load-balancer-dns.elb.amazonaws.com
# Local testing
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --fixed-polling

Expected Behavior:
- Polls every 5 seconds for up to 65 seconds
- Query completes around 60 seconds
- Retrieves complete dataset (100 rows)
- Shows "Complete dataset retrieved"
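The timing in the list above can be checked with a little arithmetic. The sketch below is illustrative only: `PollSchedule` and its methods are hypothetical names, not part of this project, and it assumes the first poll is sent at 0 seconds (which matches the sample output, where poll #2 happens at 5 seconds elapsed).

```java
import java.time.Duration;

/** Hypothetical helper that models a fixed-interval polling schedule. */
final class PollSchedule {

    /** 1-based number of the poll issued at the given elapsed time (poll #1 is sent at 0s). */
    static long pollNumberAt(Duration elapsed, Duration interval) {
        return elapsed.toSeconds() / interval.toSeconds() + 1;
    }

    /** Number of polls that fit into the window, counting the initial poll at 0s. */
    static long pollsWithin(Duration window, Duration interval) {
        return window.toSeconds() / interval.toSeconds() + 1;
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofSeconds(5);
        // A query finishing around 60s is first seen as complete on poll #13.
        System.out.println(pollNumberAt(Duration.ofSeconds(60), interval));
        // A 65s window leaves room for 14 polls in total.
        System.out.println(pollsWithin(Duration.ofSeconds(65), interval));
    }
}
```

Under these assumptions the 65-second window leaves one spare poll after the query completes, which is why the test can still retrieve the full dataset.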
# 2-minute query
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --long your-load-balancer-dns.elb.amazonaws.com
# 5-minute query
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --very-long your-load-balancer-dns.elb.amazonaws.com
# 2-hour query (tests NLB timeout resilience)
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --ultra-long your-load-balancer-dns.elb.amazonaws.com

# Terminal 1: Start local server
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.Main -server
# Terminal 2: Test PollFlightInfo client
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
org.example.PollFlightClient --fixed-polling

- Non-Blocking: Client never hangs for hours during long queries
- Progress Monitoring: Real-time progress updates from 0% to 100%
- Resource Efficient: Server doesn't hold connections during execution
- NLB Compatible: Short polling requests work with any timeout configuration
- Scalable: Multiple concurrent long-running queries supported
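The non-blocking pattern behind these benefits can be sketched without any Flight dependency. The example below is a simplified model, not the project's client: `PollingLoopSketch`, `pollUntilDone`, and `simulatedQuery` are hypothetical names, and the `DoubleSupplier` stands in for whatever short request the real client sends on each poll (presumably a PollFlightInfo call).

```java
import java.util.function.DoubleSupplier;

/** Hypothetical sketch of a client-side polling loop built from short requests. */
final class PollingLoopSketch {

    /**
     * Polls until the reported progress reaches 1.0.
     * Returns the number of polls used, or -1 if maxPolls was exhausted or the thread was interrupted.
     */
    static int pollUntilDone(DoubleSupplier poll, int maxPolls, long sleepMillis) {
        for (int n = 1; n <= maxPolls; n++) {
            double progress = poll.getAsDouble(); // one short request, returns immediately
            if (progress >= 1.0) {
                return n;
            }
            try {
                Thread.sleep(sleepMillis); // no connection is held open while waiting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    /** Stand-in for a server: reports the current progress, then advances it by a fixed step. */
    static DoubleSupplier simulatedQuery(double step) {
        double[] progress = {0.0};
        return () -> {
            double current = progress[0];
            progress[0] = Math.min(1.0, current + step);
            return current;
        };
    }

    public static void main(String[] args) {
        int polls = pollUntilDone(simulatedQuery(0.25), 10, 100);
        System.out.println("Query completed after " + polls + " polls");
    }
}
```

Because every request returns quickly, the idle timeout of a load balancer between client and server never comes into play, regardless of how long the query itself runs.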
Fixed Duration Polling Test
==============================
Starting long-running query: medium-query
Initial poll response received
Progress: 0.0%
Has FlightInfo: true
Has poll descriptor: true
Polling query status (poll #2, elapsed: 5s)...
Poll response:
Progress: 8.8%
Has FlightInfo: true
Query still running, continuing to poll...
...
Query completed during polling! (at poll #13)
FlightInfo available after 65 seconds:
Schema: Schema<value: Int(32, true)>
Records: 100
Endpoints: 1
Retrieving available data...
Total: 100 rows in 1 batches
Complete dataset retrieved
Fixed duration polling test completed!
The standard client will:
- Connect to the server at localhost:8815 (or specified AWS endpoint)
- List available flights
- Get flight information
- Retrieve sample data (10 rows: 0, 10, 20, ..., 90)
- Perform an echo action
WARNING: Unknown module: org.apache.arrow.memory.core specified to --add-opens
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------< org.example:simple-flight-server >------------------
[INFO] Building simple-flight-server 1.0-SNAPSHOT
[INFO] from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- exec:3.5.1:java (default-cli) @ simple-flight-server ---
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Flight server started on port 8815
Press Ctrl+C to stop the server
WARNING: Unknown module: org.apache.arrow.memory.core specified to --add-opens
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Flight server started on port 8815
Press Ctrl+C to stop the server
WARNING: Unknown module: org.apache.arrow.memory.core specified to --add-opens
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------< org.example:simple-flight-server >------------------
[INFO] Building simple-flight-server 1.0-SNAPSHOT
[INFO] from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- exec:3.5.1:java (default-cli) @ simple-flight-server ---
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Connected to server at Location{uri=grpc+tcp://localhost:8815}
Listing flights:
Found flight: sample
Schema: Schema<value: Int(32, true)>
Got flight info for: sample
Getting data stream:
Stream schema: Schema<value: Int(32, true)>
Received batch with 10 rows:
Row 0: 0
Row 1: 10
Row 2: 20
Row 3: 30
Row 4: 40
Row 5: 50
Row 6: 60
Row 7: 70
Row 8: 80
Row 9: 90
Performing action:
Action result: Hello, Flight!
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
WARNING: Unknown module: org.apache.arrow.memory.core specified to --add-opens
Connecting to AWS Flight Server...
Host: your-load-balancer-dns.elb.amazonaws.com
Port: 8815
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Connected to AWS Flight server at Location{uri=grpc+tcp://your-load-balancer-dns.elb.amazonaws.com:8815}
Listing flights:
Found flight: sample
Schema: Schema<value: Int(32, true)>
Got flight info for: sample
Endpoints: 1
Records: 10
Getting data stream:
Stream schema: Schema<value: Int(32, true)>
Received batch with 10 rows:
Row 0: 0
Row 1: 10
Row 2: 20
Row 3: 30
Row 4: 40
Row 5: 50
Row 6: 60
Row 7: 70
Row 8: 80
Row 9: 90
Performing action:
Action result: Hello from AWS client!
AWS Flight client test completed successfully!
The --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED JVM argument is required for Apache Arrow to access Java's internal memory structures. This project includes a .mvn/jvm.config file that automatically applies this argument.
Note: You may see a warning WARNING: Unknown module: org.apache.arrow.memory.core specified to --add-opens - this is harmless and can be ignored.
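To confirm that the flag actually took effect in a given JVM, the standard `java.lang.Module` API can report whether `java.nio` is open to the calling code. This is a minimal sketch, not part of the project; `NioOpenCheck` is a hypothetical class name.

```java
/** Hypothetical check: is the java.nio package opened to the given module? */
final class NioOpenCheck {

    /** Returns true if java.base opens java.nio to the target module. */
    static boolean isNioOpenTo(Module target) {
        Module javaBase = java.nio.Buffer.class.getModule();
        return javaBase.isOpen("java.nio", target);
    }

    public static void main(String[] args) {
        // Code on the classpath lives in an unnamed module, which is what ALL-UNNAMED targets.
        Module caller = NioOpenCheck.class.getModule();
        System.out.println("java.nio open to this code: " + isNioOpenTo(caller));
    }
}
```

Running it once with and once without the `--add-opens` argument should show the difference; if it reports `false` in the environment where the server runs, the "Failed to initialize MemoryUtil" error is the likely outcome.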
Both server and client are configured to use port 8815. Make sure this port is available when running the server.
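One way to check whether the port is available before starting the server is to try binding it briefly. The sketch below uses only the JDK; `PortCheck` is a hypothetical helper and is not included in this project.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper: tests whether a TCP port can currently be bound on this host. */
final class PortCheck {

    /** Returns true if a listening socket could be opened on the port (it is closed again immediately). */
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return socket.isBound();
        } catch (IOException e) {
            return false; // typically "Address already in use"
        }
    }

    public static void main(String[] args) {
        int port = 8815; // port used by the server and client in this README
        System.out.println("Port " + port + (isPortFree(port) ? " is free" : " is already in use"));
    }
}
```

The check is inherently racy (another process can grab the port right after it returns), so treat it as a diagnostic rather than a guarantee.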
The project includes:
- .mvn/jvm.config: Automatically applies required JVM arguments
- pom.xml: Configured with all necessary Arrow dependencies
.mvn/
└── jvm.config # JVM arguments for Arrow compatibility
src/main/java/org/example/
├── Main.java # Main application with server/client modes
├── AWSFlightClient.java # AWS-specific client for testing deployments
├── PollFlightClient.java # PollFlightInfo client for long-running queries
└── SampleFlightProducer # Flight server implementation (inner class)
target/
└── simple-flight-server-1.0-SNAPSHOT.jar # Executable JAR with all dependencies (~24MB)
aws-infrastructure.yaml # CloudFormation template for AWS deployment
deploy-to-aws.sh # AWS deployment automation script
test-aws-deployment.sh # AWS deployment testing script
AWS-SETUP.md # Detailed AWS setup guide
pom.xml # Maven configuration with Arrow dependencies
run-server.sh # Convenience script to run server locally
run-client.sh # Convenience script to run client locally
| Method | Use Case | Command |
|---|---|---|
| Maven Server | Development | mvn exec:java -Dexec.mainClass="org.example.Main" -Dexec.args="-server" |
| Maven Client | Local testing | mvn exec:java -Dexec.mainClass="org.example.Main" |
| Java Server | Production/Reliable | java --add-opens=... -cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" org.example.Main -server |
| Java Local Client | Local testing | java --add-opens=... -cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" org.example.Main |
| Java AWS Client | AWS testing | java --add-opens=... -cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" org.example.AWSFlightClient <hostname> |
| PollFlightInfo Fixed | 65s polling test | java --add-opens=... -cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" org.example.PollFlightClient --fixed-polling your-hostname |
| PollFlightInfo Long | Long query test | java --add-opens=... -cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" org.example.PollFlightClient --ultra-long your-hostname |
| JAR Server | Deployment | java --add-opens=... -jar target/simple-flight-server-1.0-SNAPSHOT.jar -server |
| JAR Client | Deployment | java --add-opens=... -jar target/simple-flight-server-1.0-SNAPSHOT.jar |
- Development: Use Maven methods for quick testing
- Production/AWS: Use Java with classpath for reliability
- Deployment: Use JAR files for standalone execution
The project uses the following Apache Arrow dependencies:
- flight-core (17.0.0): Flight server/client functionality
- arrow-memory-core (17.0.0): Memory management
- arrow-vector (17.0.0): Vector operations
- arrow-memory-netty (17.0.0): Runtime memory implementation
Java Version: Compiled for Java 17 (compatible with Java 17+)
- Build the JAR: mvn clean package
- Copy target/simple-flight-server-1.0-SNAPSHOT.jar to your server
- Run: java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -jar simple-flight-server-1.0-SNAPSHOT.jar -server
The JAR file (~24MB) contains all dependencies and only requires Java 17+ on the target server.
The same JAR can be used as a client by omitting the -server argument:
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -jar simple-flight-server-1.0-SNAPSHOT.jar

This project includes complete AWS infrastructure automation to deploy the Flight server on EC2 behind a Network Load Balancer.
- Set up AWS access (see AWS-SETUP.md for detailed instructions):
  aws configure
- Create an EC2 key pair:
  aws ec2 create-key-pair --key-name flight-server-key --query 'KeyMaterial' --output text > flight-server-key.pem
  chmod 400 flight-server-key.pem
- Deploy to AWS:
  ./deploy-to-aws.sh
- Test the deployment:
  ./test-aws-deployment.sh
- EC2 Instance: Amazon Linux 2023 with Java 17 and your Flight server
- Network Load Balancer: Internet-facing NLB for gRPC traffic on port 8815
- Connection Timeout: 6000 seconds (100 minutes, maximum allowed)
- Health Checks: TCP health checks on port 8815
- VPC & Networking: Complete networking setup with security groups
- Auto-start Service: Systemd service that automatically starts your Flight server
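The 6000-second connection timeout explains why only the longest test query needs polling to survive the load balancer. The sketch below compares the query durations listed in this README against that limit; `NlbTimeoutCheck` is a hypothetical name, and it assumes a blocking call would leave the connection idle for the whole query duration.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical check: which test queries outlast the NLB connection timeout? */
final class NlbTimeoutCheck {

    /** Connection timeout configured for the load balancer in this deployment. */
    static final Duration NLB_IDLE_TIMEOUT = Duration.ofSeconds(6000);

    /** True if a connection left idle for the whole query would be dropped before the query ends. */
    static boolean exceedsIdleTimeout(Duration queryDuration) {
        return queryDuration.compareTo(NLB_IDLE_TIMEOUT) > 0;
    }

    public static void main(String[] args) {
        Map<String, Duration> queries = new LinkedHashMap<>();
        queries.put("medium-query", Duration.ofMinutes(1));
        queries.put("long-query", Duration.ofMinutes(2));
        queries.put("very-long-query", Duration.ofMinutes(5));
        queries.put("ultra-long-query", Duration.ofHours(2));

        System.out.println("NLB timeout: " + NLB_IDLE_TIMEOUT.toMinutes() + " minutes");
        queries.forEach((name, duration) ->
                System.out.printf("%-16s %4d min  exceeds timeout: %b%n",
                        name, duration.toMinutes(), exceedsIdleTimeout(duration)));
    }
}
```

Only `ultra-long-query` (120 minutes) exceeds the 100-minute limit, which is consistent with its "NLB timeout testing" use case.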
Test your AWS deployment with the included AWS client:
mvn exec:java -Dexec.mainClass="org.example.AWSFlightClient" -Dexec.args="your-load-balancer-dns.elb.amazonaws.com"

For detailed AWS setup instructions, see AWS-SETUP.md.
If the client shows "Connection refused", make sure the server is running first.
If you see "Failed to initialize MemoryUtil", ensure you're using the required JVM arguments.
If port 8815 is already in use, you'll need to stop the existing process or modify the port in the code.
If you get "JAR file not found", run mvn clean package first to build the executable JAR.
If you encounter compilation errors:
# Clean and rebuild
mvn clean compile
# Or clean and package
mvn clean package

If the Maven exec plugin doesn't work correctly:
- Use Method 2 (Java with classpath) instead
- The classpath method is more reliable for running different main classes
If the AWS client shows connection errors:
- Verify the load balancer DNS name is correct
- Ensure the server is running and healthy
- Check that port 8815 is accessible
- Use the AWSFlightClient class with Method 2 (Java with classpath)
If you see the wrong client running (e.g., local client instead of AWS client):
- Use Method 2 (Java with classpath) which allows explicit main class specification
- The Maven exec plugin may have configuration conflicts
If you get "ClassNotFoundException":
# Regenerate the classpath
mvn dependency:build-classpath -Dmdep.outputFile=classpath.txt
cat classpath.txt
# Use the explicit classpath
java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED \
-cp "target/classes:$(cat classpath.txt)" \
org.example.Main -server

To modify the server behavior, edit the SampleFlightProducer class in Main.java. The producer implements:
- listFlights(): Returns available flights
- getFlightInfo(): Provides flight metadata (blocking for long queries)
- pollFlightInfo(): Non-blocking long-running query support (new)
- getStream(): Serves data streams
- doAction(): Handles custom actions
The server includes a complete PollFlightInfo implementation that:
- Follows the Apache Arrow Flight specification exactly
- Supports concurrent long-running queries with real-time progress tracking
- Provides non-blocking query execution for queries exceeding NLB timeout limits
- Returns proper FlightInfo evolution (empty endpoints → complete FlightInfo)
- Handles query expiration and cleanup automatically
Key implementation details:
- Query State Management: Thread-safe concurrent query tracking
- Background Execution: Separate threads for query simulation
- Progress Updates: Real-time progress from 0% to 100%
- Specification Compliance: Proper PollInfo responses with FlightDescriptor management
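The first three points can be illustrated with plain JDK concurrency utilities. This is a simplified sketch under stated assumptions, not the code in SampleFlightProducer: `QueryTrackerSketch` and all of its methods are hypothetical, and a real implementation would return Flight PollInfo objects rather than a bare progress value.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: thread-safe query tracking with background execution. */
final class QueryTrackerSketch {

    /** Progress per query id, from 0.0 to 1.0; safe to read while workers update it. */
    private final Map<String, Double> progress = new ConcurrentHashMap<>();
    private final ExecutorService workers = Executors.newCachedThreadPool();

    /** Starts a simulated query in the background and returns its id immediately. */
    String start(int steps, long stepMillis) {
        String id = UUID.randomUUID().toString();
        progress.put(id, 0.0);
        workers.submit(() -> {
            for (int i = 1; i <= steps; i++) {
                try {
                    Thread.sleep(stepMillis); // stands in for real query work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                progress.put(id, (double) i / steps);
            }
        });
        return id;
    }

    /** Non-blocking status check: returns the latest recorded progress. */
    double poll(String id) {
        Double current = progress.get(id);
        if (current == null) {
            throw new IllegalArgumentException("Unknown query: " + id);
        }
        return current;
    }

    /** Stops accepting new queries and waits for running ones; returns false on timeout or interrupt. */
    boolean shutdownAndAwait(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        QueryTrackerSketch tracker = new QueryTrackerSketch();
        String id = tracker.start(5, 200);
        double current;
        while ((current = tracker.poll(id)) < 1.0) {
            System.out.printf("progress: %.0f%%%n", current * 100);
            Thread.sleep(100);
        }
        System.out.println("query complete");
        tracker.shutdownAndAwait(1000);
    }
}
```

Keeping state in a concurrent map keyed by query id lets any number of polls read progress without blocking the worker threads. Expiration and cleanup of finished queries are omitted here for brevity.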