
[BUG]: Ingestion hangs with errors in nv-ingest pod logs #181

@sanzende

Description

Version

v2.3.0

Describe the bug.

I brought up the RAG pipeline with the command

helm upgrade --install rag -n rag https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-rag-v2.3.0.tgz \
  --username '$oauthtoken' \
  --password "${NGC_API_KEY}" \
  --set imagePullSecret.password=$NGC_API_KEY \
  --set ngcApiSecret.password=$NGC_API_KEY

All pods went into the Running state, but when I try to ingest/upload a file from the UI, the upload hangs. At the same time, errors appear in the rag-nv-ingest pod logs.

Full pod log attached: rag-nv-ingest.txt

Minimum reproducible example

helm upgrade --install rag -n rag https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-rag-v2.3.0.tgz \
--username '$oauthtoken' \
--password "${NGC_API_KEY}" \
--set imagePullSecret.password=$NGC_API_KEY \
--set ngcApiSecret.password=$NGC_API_KEY
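
The `pthread_create failed: Resource temporarily unavailable` errors further down usually mean the container ran out of threads/processes rather than memory, so the pod's cgroup limits are probably useful context for this report. A sketch of how to collect them, assuming cgroup v2 and `oc exec` access (the `deploy/rag-nv-ingest` name is a guess based on my pod name; adjust for your install):

```shell
# Inspect the pid/thread and CPU quotas the container actually sees.
# (Needs a live cluster, so commented out here.)
# oc exec deploy/rag-nv-ingest -- cat /sys/fs/cgroup/pids.max /sys/fs/cgroup/cpu.max

# Helper that turns a cgroup v2 cpu.max line ("<quota> <period>" or "max <period>")
# into effective cores -- the same arithmetic nv-ingest logs (1600000 / 100000 = 16.00).
effective_cores() {
  echo "$1" | awk '{ if ($1 == "max") print "unlimited"; else printf "%.2f\n", $1 / $2 }'
}

effective_cores "1600000 100000"   # -> 16.00
```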

Relevant log output

oc logs rag-nv-ingest-75ddbc5d96-dp2mz -f
Defaulted container "nv-ingest" out of: nv-ingest, verify-tmpdir-permissions (init)
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
INFO:nv_ingest_api.util.system.hardware_info:Detected 128 logical cores via psutil.
INFO:nv_ingest_api.util.system.hardware_info:Detected 64 physical cores via psutil.
INFO:nv_ingest_api.util.system.hardware_info:Cgroup v2 quota detected: 1600000 us / 100000 us = 16.00 effective cores
INFO:nv_ingest_api.util.system.hardware_info:Detected 128 cores via os.sched_getaffinity.
INFO:nv_ingest_api.util.system.hardware_info:Raw CPU limit determined: 16.00 (Method: cgroup_v2_quota)
INFO:nv_ingest_api.util.system.hardware_info:Effective CPU core limit determined: 16.00 (Method: cgroup_v2_quota)
[2025-12-23 07:57:11,175 E 4180 4180] logging.cc:118: Unhandled exception: N5boost10wrapexceptINS_6system12system_errorEEE. what(): thread: Resource temporarily unavailable [system:11]
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1766476631.172295    6109 thd.cc:160] pthread_create failed: Resource temporarily unavailable
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
Traceback (most recent call last):
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/workers/default_worker.py", line 240, in <module>
    node = ray._private.node.Node(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/node.py", line 368, in __init__
    node_info = ray._private.services.get_node(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/services.py", line 476, in get_node
    return global_state.get_node(node_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    self._check_connected()
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/state.py", line 46, in _check_connected
    self._really_init_global_state()
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/state.py", line 77, in _really_init_global_state
    self.global_state_accessor = GlobalStateAccessor(self.gcs_options)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python/ray/includes/global_state_accessor.pxi", line 39, in ray._raylet.GlobalStateAccessor.__cinit__
RuntimeError: Resource temporarily unavailable
[2025-12-23 07:57:11,289 E 4169 4169] logging.cc:125: Stack trace: 
 /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x1437d28) [0x7fc4f59edd28] ray::operator<<()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x143ba76) [0x7fc4f59f1a76] ray::TerminateHandler()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(+0xbb6bb) [0x7fc4f447f6bb] __cxxabiv1::__terminate()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(_ZSt10unexpectedv+0) [0x7fc4f44790e3] std::unexpected()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(__cxa_rethrow+0) [0x7fc4f447f8be] __cxa_rethrow
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x57f33d) [0x7fc4f4b3533d] boost::asio::detail::do_throw_error()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x1415954) [0x7fc4f59cb954] boost::asio::detail::do_throw_error()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x14162b0) [0x7fc4f59cc2b0] boost::asio::detail::posix_thread::start_thread()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x1416667) [0x7fc4f59cc667] boost::asio::thread_pool::thread_pool()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0xaf03fb) [0x7fc4f50a63fb] ray::rpc::(anonymous namespace)::_GetServerCallExecutor()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray3rpc21GetServerCallExecutorEv+0x9) [0x7fc4f50a6489] ray::rpc::GetServerCallExecutor()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x7d9475) [0x7fc4f4d8f475] std::_Function_handler<>::_M_invoke()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8498b5) [0x7fc4f4dff8b5] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x869a98) [0x7fc4f4e1fa98] ray::core::InboundRequest::Accept()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x85761d) [0x7fc4f4e0d61d] ray::core::NormalSchedulingQueue::ScheduleRequests()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0xc2f33a) [0x7fc4f51e533a] EventTracker::RecordExecution()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0xc04132) [0x7fc4f51ba132] std::_Function_handler<>::_M_invoke()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x86c8ce) [0x7fc4f4e228ce] boost::asio::detail::executor_op<>::do_complete()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x14132b2) [0x7fc4f59c92b2] boost::asio::detail::scheduler::do_run_one()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x14149a2) [0x7fc4f59ca9a2] boost::asio::detail::scheduler::run()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x14150e1) [0x7fc4f59cb0e1] boost::asio::io_context::run()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core10CoreWorker20RunTaskExecutionLoopEv+0x138) [0x7fc4f4cf6f98] ray::core::CoreWorker::RunTaskExecutionLoop()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core21CoreWorkerProcessImpl26RunWorkerTaskExecutionLoopEv+0x46) [0x7fc4f4d5f386] ray::core::CoreWorkerProcessImpl::RunWorkerTaskExecutionLoop()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core17CoreWorkerProcess20RunTaskExecutionLoopEv+0x16) [0x7fc4f4d65916] ray::core::CoreWorkerProcess::RunTaskExecutionLoop()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fc4f4b5d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
ray::PDFExtractorStage(PyObject_Vectorcall+0x2e) [0x55adffaeadde] PyObject_Vectorcall
ray::PDFExtractorStage(+0x112b23) [0x55adff9ebb23] _PyEval_EvalFrameDefault.cold
ray::PDFExtractorStage(PyEval_EvalCode+0xa1) [0x55adffb891a1] PyEval_EvalCode
ray::PDFExtractorStage(+0x2ea8ca) [0x55adffbc38ca] run_eval_code_obj
ray::PDFExtractorStage(+0x2e5585) [0x55adffbbe585] run_mod
ray::PDFExtractorStage(+0x2e2620) [0x55adffbbb620] pyrun_file
ray::PDFExtractorStage(_PyRun_SimpleFileObject+0x1ce) [0x55adffbbb2be] _PyRun_SimpleFileObject
ray::PDFExtractorStage(_PyRun_AnyFileObject+0x44) [0x55adffbbafe4] _PyRun_AnyFileObject
ray::PDFExtractorStage(Py_RunMain+0x3a2) [0x55adffbb7eb2] Py_RunMain
ray::PDFExtractorStage(Py_BytesMain+0x37) [0x55adffb73247] Py_BytesMain
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fc4f6753d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7fc4f6753e40] __libc_start_main
ray::PDFExtractorStage(+0x29a0ed) [0x55adffb730ed]

*** SIGABRT received at time=1766476631 on cpu 90 ***
PC: @     0x7fc4f67c09fc  (unknown)  pthread_kill
    @             0x3039  (unknown)  (unknown)
[2025-12-23 07:57:11,292 E 4169 4169] logging.cc:474: *** SIGABRT received at time=1766476631 on cpu 90 ***
[2025-12-23 07:57:11,292 E 4169 4169] logging.cc:474: PC: @     0x7fc4f67c09fc  (unknown)  pthread_kill
Fatal Python error: Aborted

Stack (most recent call first):
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/worker.py", line 984 in main_loop

Extension modules: msgpack._cmsgpack, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, yaml._yaml, _brotli, zstandard.backend_c, uvloop.loop, ray._raylet, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pyarrow.lib, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, PIL._imaging, multidict._multidict, yarl._quoting_c, propcache._helpers_c, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket.mask, aiohttp._websocket.reader_c, frozenlist._frozenlist, grpc._cython.cygrpc, rapidjson, _cffi_backend, pyarrow._json (total: 77)
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(+0xbb6bb) [0x7fc658a8f6bb] __cxxabiv1::__terminate()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8498b5) [0x7fc65940f8b5] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fc65916d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fc65ad66d90]
ray::PDFExtractorStage(+0x29a0ed) [0x561a79c4c0ed]



[2025-12-23 07:57:11,359 C 4055 4055] (raylet) worker_pool.cc:684: Failed to start worker with return value system:11: Resource temporarily unavailable
*** StackTrace Information ***
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x102340a) [0x559dcfe1940a] ray::RayLog::~RayLog()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x3fa77e) [0x559dcf1f077e] ray::raylet::WorkerPool::StartProcess()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x406c0c) [0x559dcf1fcc0c] ray::raylet::WorkerPool::StartWorkerProcess()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x407243) [0x559dcf1fd243] ray::raylet::WorkerPool::StartNewWorker()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x40d3d9) [0x559dcf2033d9] ray::raylet::WorkerPool::StartNewWorker()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x3f8e30) [0x559dcf1eee30] ray::raylet::WorkerPool::TryPendingStartRequests()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x402581) [0x559dcf1f8581] ray::raylet::WorkerPool::PushWorker()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x38d533) [0x559dcf183533] ray::raylet::NodeManager::HandleWorkerAvailable()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x38d6df) [0x559dcf1836df] ray::raylet::NodeManager::ProcessAnnounceWorkerPortMessageImpl()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x3be407) [0x559dcf1b4407] ray::raylet::NodeManager::ProcessClientMessage()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x603eae) [0x559dcf3f9eae] ray::ClientConnection::ProcessMessage()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x6047d5) [0x559dcf3fa7d5] boost::asio::detail::binder2<>::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x605e60) [0x559dcf3fbe60] boost::asio::detail::reactive_socket_recv_op<>::do_complete()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x2ba220) [0x559dcf0b0220] main
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f714530cd90]
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x2d5ee0) [0x559dcf0cbee0]

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1766476631.385917    5190 chttp2_transport.cc:1182] ipv4:10.128.2.66:46285: Got goaway [2] err=UNAVAILABLE:GOAWAY received; Error code: 2; Debug Text: Cancelling all calls {grpc_status:14, http2_error:2, created_time:"2025-12-23T07:57:11.385911334+00:00"}
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(+0xbb6bb) [0x7ff99be236bb] __cxxabiv1::__terminate()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8498b5) [0x7ff99c7a38b5] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7ff99c5013ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff99e0f7d90]
ray::PDFExtractorStage(+0x29a0ed) [0x555e0b3e70ed]



/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(+0xbb6bb) [0x7f1d6245f6bb] __cxxabiv1::__terminate()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8498b5) [0x7f1d62ddf8b5] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f1d62b3d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f1d64736d90]
ray::PDFExtractorStage(+0x29a0ed) [0x55c3f2b010ed]



/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(+0xbb6bb) [0x7ff20ffab6bb] __cxxabiv1::__terminate()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8498b5) [0x7ff21092b8b5] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7ff2106893ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff21227fd90]
ray::PDFExtractorStage(+0x29a0ed) [0x5608679b30ed]



/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/../../../libstdc++.so.6(+0xbb6bb) [0x7fcaeb0196bb] __cxxabiv1::__terminate()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8498b5) [0x7fcaeb9998b5] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fcaeb6f73ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fcaed2f0d90]
ray::PDFExtractorStage(+0x29a0ed) [0x55f5bb7a50ed]



2025-12-23 07:57:11,373	ERROR worker.py:984 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::image_extractor_a6bf2ac8-617a-4eaa-be3c-ea6ba33589f7:ImageExtractorStage.__init__() (pid=5222, ip=10.128.2.66, actor_id=c5d60362068042d8669d510501000000, repr=<nv_ingest.framework.orchestration.ray.stages.extractors.image_extractor.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7fed655706e0>)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The actor with name ImageExtractorStage failed to import on the worker. This may be because needed library dependencies are not installed in the worker environment:

ray::image_extractor_a6bf2ac8-617a-4eaa-be3c-ea6ba33589f7:ImageExtractorStage.__init__() (pid=5222, ip=10.128.2.66, actor_id=c5d60362068042d8669d510501000000, repr=<nv_ingest.framework.orchestration.ray.stages.extractors.image_extractor.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x7fed655706e0>)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    from nv_ingest_api.internal.extract.image.image_helpers.common import unstructured_image_extractor
    from nv_ingest_api.internal.primitives.nim.model_interface.yolox import (
    from nv_ingest_api.internal.primitives.nim.model_interface.decorators import multiprocessing_cache
    manager = Manager()
              ^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/multiprocessing/context.py", line 57, in Manager
    m.start()
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/multiprocessing/managers.py", line 562, in start
    self._process.start()
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/multiprocessing/context.py", line 282, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
    self._launch(process_obj)
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/multiprocessing/popen_fork.py", line 66, in _launch
    self.pid = os.fork()
               ^^^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable job_id=01000000 worker_id=27296f027456e9210ad8f88af26db7386f084e08168f1242f06350fa node_id=273acc807bd6909977631487b5a37c0aeb510ac0584931daa864ced8 actor_id=c5d60362068042d8669d510501000000 task_id=ffffffffffffffffc5d60362068042d8669d510501000000 task_name=image_extractor_a6bf2ac8-617a-4eaa-be3c-ea6ba33589f7:ImageExtractorStage.__init__ task_func_name=nv_ingest.framework.orchestration.ray.stages.extractors.image_extractor.ImageExtractorStage.__init__ actor_name=image_extractor_a6bf2ac8-617a-4eaa-be3c-ea6ba33589f7 timestamp_ns=1766476631373630690
[2025-12-23 07:57:11,479 C 4172 4172] core_worker.cc:2868:  An unexpected system state has occurred. You have likely discovered a bug in Ray. Please report this issue at https://github.com/ray-project/ray/issues and we'll work with you to fix it. Check failed: GetAndPinArgsForExecutor(task_spec, &args, &arg_refs, &borrowed_ids) Status not OK: IOError: Broken pipe 
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core10CoreWorker11ExecuteTaskERKNS_17TaskSpecificationESt8optionalISt13unordered_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorISt4pairIldESaISF_EESt4hashISC_ESt8equal_toISC_ESaISE_IKSC_SH_EEEEPSD_ISE_INS_8ObjectIDESt10shared_ptrINS_9RayObjectEEESaISV_EESY_PSD_ISE_ISR_bESaISZ_EEPN6google8protobuf16RepeatedPtrFieldINS_3rpc20ObjectReferenceCountEEEPbPSC_+0x25dc) [0x7f12f3ef869c] ray::core::CoreWorker::ExecuteTask()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x8494ab) [0x7f12f3fb34ab] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core20ActorSchedulingQueue31AcceptRequestOrRejectIfCanceledENS_6TaskIDERNS0_14InboundRequestE+0x131) [0x7f12f3fb8621] ray::core::ActorSchedulingQueue::AcceptRequestOrRejectIfCanceled()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x850449) [0x7f12f3fba449] ray::core::ActorSchedulingQueue::ExecuteRequest()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x85399d) [0x7f12f3fbd99d] ray::core::ActorSchedulingQueue::ScheduleRequests()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core20ActorSchedulingQueue3AddEllSt8functionIFvRKNS_17TaskSpecificationES2_IFvNS_6StatusES2_IFvvEES8_EEEES2_IFvS5_RKS6_SA_EESA_S3_+0x875) [0x7f12f3fbf735] ray::core::ActorSchedulingQueue::Add()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(_ZN3ray4core12TaskReceiver10HandleTaskENS_3rpc15PushTaskRequestEPNS2_13PushTaskReplyESt8functionIFvNS_6StatusES6_IFvvEES9_EE+0x1108) [0x7f12f3fb6788] ray::core::TaskReceiver::HandleTask()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x78281e) [0x7f12f3eec81e] ray::core::CoreWorker::HandlePushTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f12f3d113ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f12f590ad90]
ray::PDFExtractorStage(+0x29a0ed) [0x55c2f778c0ed]

[2025-12-23 07:57:11,571 C 4173 4173] task_receiver.cc:151:  An unexpected system state has occurred. You have likely discovered a bug in Ray. Please report this issue at https://github.com/ray-project/ray/issues and we'll work with you to fix it. Check failed: actor_creation_task_done_() Status not OK: IOError: Broken pipe 
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f0b34160be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f0b33ebd3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f0b35ab3d90]
ray::PDFExtractorStage(+0x29a0ed) [0x55b0cdcd80ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f4262ad8be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f42628353ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f426442bd90]
ray::PDFExtractorStage(+0x29a0ed) [0x55fa45e410ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f6128b42be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f612889f3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f612a498d90]
ray::PDFExtractorStage(+0x29a0ed) [0x564af615f0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f4ee177abe0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f4ee14d73ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4ee30d0d90]
ray::AudioExtractorStage(+0x29a0ed) [0x5654d85940ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f5a53868be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f5a535c53ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f5a551bed90]
ray::AudioExtractorStage(+0x29a0ed) [0x55b88c5be0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f316c080be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f316bddd3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f316d9d6d90]
ray::ImageExtractorStage(+0x29a0ed) [0x55ab2b87b0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f9d9837ebe0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f9d980db3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f9d99cd4d90]
ray::DocxExtractorStage(+0x29a0ed) [0x55ef607a30ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7ff00ccc0be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7ff00ca1d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff00e616d90]
ray::ImageDedupStage(+0x29a0ed) [0x55d4499010ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f1d0a286be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f1d09fe33ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f1d0bbd9d90]
ray::PDFExtractorStage(+0x29a0ed) [0x55bb4090e0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7ff8a9db0be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7ff8a9b0d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff8ab706d90]
ray::PPTXExtractorStage(+0x29a0ed) [0x55686b6bd0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f744f32abe0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f744f0873ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f7450c7dd90]
ray::HtmlExtractorStage(+0x29a0ed) [0x55cd8c0c20ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7fb88abc8be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fb88a9253ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fb88c51bd90]
ray::ImageFilterStage(+0x29a0ed) [0x5651dfdbe0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f68545c8be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f68543253ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f6855f1ed90]
ray::PDFExtractorStage(+0x29a0ed) [0x5565e29e80ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7fe65d9b0be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fe65d70d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fe65f306d90]
ray::DocxExtractorStage(+0x29a0ed) [0x55e6e38820ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7fb634e3abe0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fb634b973ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fb63678dd90]
ray::PPTXExtractorStage(+0x29a0ed) [0x56402da8f0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f2fca4a8be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f2fca2053ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f2fcbdfed90]
ray::HtmlExtractorStage(+0x29a0ed) [0x5643ca33b0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f7ae78d0be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f7ae762d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f7ae9226d90]
ray::ImageCaptionTransformStage(+0x29a0ed) [0x5600ad2530ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7fb33dd6cbe0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fb33dac93ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fb33f6c2d90]
ray::TableExtractorStage(+0x29a0ed) [0x561c500950ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f5844572be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f58442cf3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f5845ec5d90]
ray::TableExtractorStage(+0x29a0ed) [0x557a7d8290ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f18405dcbe0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f18403393ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f1841f2fd90]
ray::TableExtractorStage(+0x29a0ed) [0x55b018a2d0ed]

(raylet) WARNING: 64 PYTHON worker processes have been started on node: 273acc807bd6909977631487b5a37c0aeb510ac0584931daa864ced8 with address: 10.128.2.66. This could be a result of using a large number of actors, or due to tasks blocked in ray.get() calls (see https://github.com/ray-project/ray/issues/3644 for some discussion of workarounds).
(raylet) Raylet is terminated. Termination is unexpected. Possible reasons include: (1) SIGKILL by the user or system OOM killer, (2) Invalid memory access from Raylet causing SIGSEGV or SIGBUS, (3) Other termination signals. Last 20 lines of the Raylet logs:
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x407243) [0x559dcf1fd243] ray::raylet::WorkerPool::StartNewWorker()::{lambda()#1}::operator()()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x40d3d9) [0x559dcf2033d9] ray::raylet::WorkerPool::StartNewWorker()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x3f8e30) [0x559dcf1eee30] ray::raylet::WorkerPool::TryPendingStartRequests()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x402581) [0x559dcf1f8581] ray::raylet::WorkerPool::PushWorker()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x38d533) [0x559dcf183533] ray::raylet::NodeManager::HandleWorkerAvailable()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x38d6df) [0x559dcf1836df] ray::raylet::NodeManager::ProcessAnnounceWorkerPortMessageImpl()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x3be407) [0x559dcf1b4407] ray::raylet::NodeManager::ProcessClientMessage()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x324c3b) [0x559dcf11ac3b] std::_Function_handler<>::_M_invoke()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x603eae) [0x559dcf3f9eae] ray::ClientConnection::ProcessMessage()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x7e7d3a) [0x559dcf5ddd3a] EventTracker::RecordExecution()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x6047d5) [0x559dcf3fa7d5] boost::asio::detail::binder2<>::operator()()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x605e60) [0x559dcf3fbe60] boost::asio::detail::reactive_socket_recv_op<>::do_complete()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0xffae72) [0x559dcfdf0e72] boost::asio::detail::scheduler::do_run_one()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0xffcd82) [0x559dcfdf2d82] boost::asio::detail::scheduler::run()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0xffd2c1) [0x559dcfdf32c1] boost::asio::io_context::run()
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x2ba220) [0x559dcf0b0220] main
    /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f714530cd90]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f714530ce40] __libc_start_main
    /opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet(+0x2d5ee0) [0x559dcf0cbee0]
    

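A note on the crash above: `thread: Resource temporarily unavailable [system:11]` is EAGAIN during thread creation, and together with the warning about 64 Python workers on a pod with only 16 effective cores it suggests the container is hitting a thread/PID limit rather than memory pressure. A minimal sketch for checking this from inside the nv-ingest pod, assuming the standard cgroup v2 layout that the startup log reports (whether your runtime mounts these files at `/sys/fs/cgroup` is an assumption):

```python
# Sketch: read the cgroup v2 PID limit and current usage from inside the
# container. pids.max / pids.current are the standard cgroup v2 files; they
# may be absent or mounted elsewhere depending on the container runtime.
from pathlib import Path


def read_cgroup_value(path: str) -> str:
    """Return the file's trimmed contents, or 'unavailable' if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "unavailable"


if __name__ == "__main__":
    print("pids.max:    ", read_cgroup_value("/sys/fs/cgroup/pids.max"))
    print("pids.current:", read_cgroup_value("/sys/fs/cgroup/pids.current"))
```

If `pids.current` is near `pids.max` while the pipeline is starting, raising the pod's PID limit (or reducing Ray's worker count) would be the first thing to try.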
2025-12-23 07:57:12,658 - ERROR - nv_ingest.framework.orchestration.ray.primitives.ray_pipeline - [Build-WaitWiring] Error during wiring confirmation: The actor c5d60362068042d8669d510501000000 is unavailable: The actor is temporarily unavailable: RpcError: RPC Error message: Cancelling all calls; RPC Error details:  rpc_code: 14. The task may or may not have been executed on the actor.
Traceback (most recent call last):
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/framework/orchestration/ray/primitives/ray_pipeline.py", line 451, in _wait_for_wiring
    ray.get(wiring_refs)
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/worker.py", line 2882, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/worker.py", line 970, in get_objects
    raise value
ray.exceptions.ActorUnavailableError: The actor c5d60362068042d8669d510501000000 is unavailable: The actor is temporarily unavailable: RpcError: RPC Error message: Cancelling all calls; RPC Error details:  rpc_code: 14. The task may or may not have been executed on the actor.
2025-12-23 07:57:12,658	ERROR ray_pipeline.py:454 -- [Build-WaitWiring] Error during wiring confirmation: The actor c5d60362068042d8669d510501000000 is unavailable: The actor is temporarily unavailable: RpcError: RPC Error message: Cancelling all calls; RPC Error details:  rpc_code: 14. The task may or may not have been executed on the actor.
Traceback (most recent call last):
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/framework/orchestration/ray/primitives/ray_pipeline.py", line 451, in _wait_for_wiring
    ray.get(wiring_refs)
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/worker.py", line 2882, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_private/worker.py", line 970, in get_objects
    raise value
ray.exceptions.ActorUnavailableError: The actor c5d60362068042d8669d510501000000 is unavailable: The actor is temporarily unavailable: RpcError: RPC Error message: Cancelling all calls; RPC Error details:  rpc_code: 14. The task may or may not have been executed on the actor. job_id=01000000 worker_id=01000000ffffffffffffffffffffffffffffffffffffffffffffffff node_id=273acc807bd6909977631487b5a37c0aeb510ac0584931daa864ced8 timestamp_ns=1766476632658757171
2025-12-23 07:57:12,660 - CRITICAL - nv_ingest.framework.orchestration.ray.primitives.ray_pipeline - Pipeline build failed: Build failed: error confirming initial wiring
2025-12-23 07:57:12,660	CRITICAL ray_pipeline.py:647 -- Pipeline build failed: Build failed: error confirming initial wiring job_id=01000000 worker_id=01000000ffffffffffffffffffffffffffffffffffffffffffffffff node_id=273acc807bd6909977631487b5a37c0aeb510ac0584931daa864ced8 timestamp_ns=1766476632660446000
2025-12-23 07:57:12,664 - ERROR - nv_ingest.framework.orchestration.ray.primitives.ray_pipeline - Cannot start: Pipeline not built or has no actors.
2025-12-23 07:57:12,664	ERROR ray_pipeline.py:1539 -- Cannot start: Pipeline not built or has no actors. job_id=01000000 worker_id=01000000ffffffffffffffffffffffffffffffffffffffffffffffff node_id=273acc807bd6909977631487b5a37c0aeb510ac0584931daa864ced8 timestamp_ns=1766476632664161963
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7ff2f79d6be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7ff2f77333ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff2f9329d90]
ray::TableExtractorStage(+0x29a0ed) [0x55ad55cf00ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f800f914be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f800f6713ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f8011267d90]
ray::ChartExtractorStage(+0x29a0ed) [0x55c3698cf0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f7f77194be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f7f76ef13ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f7f78ae7d90]
ray::ChartExtractorStage(+0x29a0ed) [0x55bd913550ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f270bf04be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f270bc613ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f270d857d90]
ray::InfographicExtractorStage(+0x29a0ed) [0x555636dcd0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f7f64cf0be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f7f64a4d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f7f66643d90]
ray::ChartExtractorStage(+0x29a0ed) [0x555e46cee0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7f7ad9400be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7f7ad915d3ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f7adad53d90]
ray::ChartExtractorStage(+0x29a0ed) [0x55fe861ba0ed]

/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x84abe0) [0x7fe65b774be0] ray::core::TaskReceiver::HandleTask()::{lambda()#1}::operator()()
/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/ray/_raylet.so(+0x5a73ba) [0x7fe65b4d13ba] __pyx_pw_3ray_7_raylet_10CoreWorker_5run_task_loop()
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fe65d0cad90]
ray::InfographicExtractorStage(+0x29a0ed) [0x55df6a0a00ed]

Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otel-collector:4317, retrying in 0.83s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otel-collector:4317, retrying in 1.99s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otel-collector:4317, retrying in 4.15s.
Failed to export traces to otel-collector:4317, error code: StatusCode.UNAVAILABLE
Redis/Connection error during non-destructive poll for '81a9cbec-8529-d7c9-9c4d-43616938ebf2': Error while reading from rag-redis-master:6379 : (104, 'Connection reset by peer'). Propagating up.
Redis/Connection error during non-destructive poll for '623fc576-ab7f-1a1c-507b-4b6648cc5728': Error while reading from rag-redis-master:6379 : (104, 'Connection reset by peer'). Propagating up.
fetch_message(mode=NON_DESTRUCTIVE, channel='81a9cbec-8529-d7c9-9c4d-43616938ebf2'): Redis/Connection error (ConnectionError): Error while reading from rag-redis-master:6379 : (104, 'Connection reset by peer'). Attempt 1/3
fetch_message(mode=NON_DESTRUCTIVE, channel='623fc576-ab7f-1a1c-507b-4b6648cc5728'): Redis/Connection error (ConnectionError): Error while reading from rag-redis-master:6379 : (104, 'Connection reset by peer'). Attempt 1/3
Ping failed due to RedisError: Error 111 connecting to rag-redis-master:6379. Connection refused.. Invalidating client.
Failed to reconnect to Redis: Re-allocated client failed to ping.
fetch_message(mode=NON_DESTRUCTIVE, channel='623fc576-ab7f-1a1c-507b-4b6648cc5728'): Non-retryable error during fetch: (RuntimeError) Failed to establish or re-establish connection to Redis.
Unexpected fetch error for job 623fc576-ab7f-1a1c-507b-4b6648cc5728: Failed to establish or re-establish connection to Redis.
Traceback (most recent call last):
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/api/v1/ingest.py", line 207, in fetch_job
    job_response = await ingest_service.fetch_job(job_id)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/framework/util/service/impl/ingest/redis_ingest_service.py", line 268, in fetch_job
    raise e
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/framework/util/service/impl/ingest/redis_ingest_service.py", line 255, in fetch_job
    message = await asyncio.to_thread(
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 708, in fetch_message
    raise e
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 659, in fetch_message
    fetch_result = self._fetch_fragments_non_destructive(channel_name, timeout)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 458, in _fetch_fragments_non_destructive
    client = self.get_client()
             ^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 226, in get_client
    raise RuntimeError("Failed to establish or re-establish connection to Redis.")
RuntimeError: Failed to establish or re-establish connection to Redis.
Ping failed due to RedisError: Error 111 connecting to rag-redis-master:6379. Connection refused.. Invalidating client.
Failed to reconnect to Redis: Re-allocated client failed to ping.
fetch_message(mode=NON_DESTRUCTIVE, channel='81a9cbec-8529-d7c9-9c4d-43616938ebf2'): Non-retryable error during fetch: (RuntimeError) Failed to establish or re-establish connection to Redis.
Unexpected fetch error for job 81a9cbec-8529-d7c9-9c4d-43616938ebf2: Failed to establish or re-establish connection to Redis.
Traceback (most recent call last):
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/api/v1/ingest.py", line 207, in fetch_job
    job_response = await ingest_service.fetch_job(job_id)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/framework/util/service/impl/ingest/redis_ingest_service.py", line 268, in fetch_job
    raise e
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest/framework/util/service/impl/ingest/redis_ingest_service.py", line 255, in fetch_job
    message = await asyncio.to_thread(
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 708, in fetch_message
    raise e
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 659, in fetch_message
    fetch_result = self._fetch_fragments_non_destructive(channel_name, timeout)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 458, in _fetch_fragments_non_destructive
    client = self.get_client()
             ^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/nv_ingest_runtime/lib/python3.12/site-packages/nv_ingest_api/util/service_clients/redis/redis_client.py", line 226, in get_client
    raise RuntimeError("Failed to establish or re-establish connection to Redis.")
RuntimeError: Failed to establish or re-establish connection to Redis.
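The `Connection reset by peer` followed by `Error 111 ... Connection refused` pattern above usually means the Redis pod itself went down (e.g. restarted or OOM-killed) rather than a client-side bug, so checking `rag-redis-master` restarts and basic reachability is a reasonable next step. A minimal TCP reachability probe, assuming the in-cluster service name and port taken from the log (`rag-redis-master:6379`); run it from a debug pod in the same namespace:

```python
# Sketch: plain TCP probe to distinguish "service down / connection refused"
# from application-level Redis errors. No redis client library required.
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host/port are the values from the log output above.
    print(can_connect("rag-redis-master", 6379))
```

If this returns False while the Redis pod shows as Running, inspecting `oc describe pod` for restarts and OOM events on `rag-redis-master` would be the next check.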

Full env printout

Other/Misc.

No response

Code of Conduct

  • I agree to follow THIS PROJECT's Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report

Metadata

Assignees: no one assigned
Labels: bug ("Something isn't working")
Milestone: none