Opened 6 years ago

Closed 5 years ago

#1850 closed defect (fixed)

rasdapy fails on any query resulting in more than 4MB data

Reported by: Dimitar Misev
Owned by: ahambasan
Priority: critical
Milestone: 9.7
Component: rasdapy
Version: development
Keywords:
Cc:
Complexity: Medium

Description (last modified by Bang Pham Huu)

In #1849 an import script is attached that imports a 6.6 GB array into rasdaman. The query in that ticket should be checked against rasdapy with the latest rasdaman.

The solution is described in http://rasdaman.org/ticket/1850#comment:4, but that is C++; Python needs something like this: https://stackoverflow.com/a/42655487/2028440

However, the gRPC version currently used by rasdapy is still the beta API, in which the options parameter does not exist. Hence, gRPC needs to be updated for rasdapy.

The rasdapy folder is at rasdaman/applications/rasdapy. The first step is to install gRPC for Python:

sudo pip install grpcio-tools
pip install -U protobuf (version 3.6)

There is a Python script for generating the gRPC files at rasdapy/scripts/stub_generator.py, but it depends on a locally built gRPC tool, so don't use it. Use this new content instead: https://pastebin.com/Ha2EXLbk, which generates the gRPC files using the Python dependencies installed above.
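
As a rough illustration of generating the stubs from the pip-installed tooling, the sketch below calls the protoc compiler bundled with grpcio-tools. It is not the content of the pastebin script; the helper names `protoc_args` and `generate_stubs`, and any directory or file names passed to them, are hypothetical.

```python
import os


def protoc_args(proto_dir, out_dir, proto_files):
    """Build the argument list for grpc_tools.protoc.main()."""
    args = [
        "grpc_tools.protoc",             # argv[0] placeholder
        "-I" + proto_dir,                # where imports between .proto files are resolved
        "--python_out=" + out_dir,       # message classes (*_pb2.py)
        "--grpc_python_out=" + out_dir,  # client/server stubs (*_pb2_grpc.py)
    ]
    args.extend(os.path.join(proto_dir, name) for name in proto_files)
    return args


def generate_stubs(proto_dir, out_dir, proto_files):
    """Run the protoc compiler that ships with the grpcio-tools package."""
    # Imported lazily so the module loads even before grpcio-tools is installed.
    from grpc_tools import protoc
    exit_code = protoc.main(protoc_args(proto_dir, out_dir, proto_files))
    if exit_code != 0:
        raise RuntimeError("protoc failed with exit code %d" % exit_code)
```

A call would look like `generate_stubs("protos", "stubs", ["example.proto"])`, where all three arguments are placeholders for the actual rasdapy locations.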

After that, figure out what changed between the current rasdapy source code, which uses beta gRPC, and the newly generated gRPC files.

Finally, rasql.py should be able to run any rasql query whose result is larger than 4 MB.

Change History (6)

comment:1 by Dimitar Misev, 5 years ago

Priority: minor → critical

I tested it and got this:

$ python rasql.py -q 'select encode(c, "jpeg") from test_coll_7GB_rgb as c' --out file
rasql done.
Traceback (most recent call last):
  File "rasql.py", line 268, in <module>
    main.execute()
  File "rasql.py", line 130, in execute
    res = self.query_executor.execute_read(self.validator.query)
  File "/home/dimitar/rasdaman/community/src/applications/rasdapy/rasdapy/query_executor.py", line 50, in execute_read
    res = self.ras_oqlquery.execute()
  File "/home/dimitar/rasdaman/community/src/applications/rasdapy/rasdapy/ras_oqlquery.py", line 107, in execute
    res = query.execute_read()
  File "/home/dimitar/rasdaman/community/src/applications/rasdapy/rasdapy/cores/core.py", line 495, in execute_read
    return self._get_collection_result()
  File "/home/dimitar/rasdaman/community/src/applications/rasdapy/rasdapy/cores/core.py", line 561, in _get_collection_result
    arr_temp = self.__get_array_result_mdd()
  File "/home/dimitar/rasdaman/community/src/applications/rasdapy/rasdapy/cores/core.py", line 524, in __get_array_result_mdd
    self.transaction.database.connection.session.clientId)
  File "/home/dimitar/rasdaman/community/src/applications/rasdapy/rasdapy/cores/remote_procedures.py", line 273, in rassrvr_get_next_tile
    _QUERY_TIMEOUT_SECONDS)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
    self._request_serializer, self._response_deserializer)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.RESOURCE_EXHAUSTED, details="Received message larger than max (4194338 vs. 4194304)")

I tried a smaller subset, and it still failed with the same error:

python rasql.py -q 'select encode(c[0:3000,0:3000], "jpeg") from test_coll_7GB_rgb as c' --out file

[0:1000,0:1000] worked fine.

Does that mean rasdapy cannot receive an array larger than 4 MB from rasdaman?
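
For reference, the two numbers in the error message can be checked with a few lines of Python: the "max" is exactly 4 MiB, and the rejected message was only slightly over it. The constant and helper names below are illustrative and not part of rasdapy or gRPC.

```python
# 4 MiB, the "max" value reported in the error message (4194304 bytes).
DEFAULT_GRPC_MAX_MESSAGE_BYTES = 4 * 1024 * 1024


def bytes_over_limit(message_bytes, limit=DEFAULT_GRPC_MAX_MESSAGE_BYTES):
    """Return by how many bytes a message exceeds the limit (0 if it fits)."""
    return max(0, message_bytes - limit)


# 4194338 is the message size reported in the traceback.
print(bytes_over_limit(4194338))  # prints 34
```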

comment:2 by Dimitar Misev, 5 years ago

Summary: test rasdapy with export of large array → rasdapy fails on any query resulting in more than 4MB data

comment:3 by Bang Pham Huu, 5 years ago

Yes, gRPC sets the maximum message size to 4 MB: https://github.com/googleapis/google-cloud-node/issues/1991. The question is how to fix it (does the rasdaman server already support chunking?). I think it is already done, so rasj can query data larger than 4 MB and return it to Petascope?

Last edited 5 years ago by Bang Pham Huu

comment:4 by Dimitar Misev, 5 years ago

I think it's chunked in rasdaman, otherwise the failure would be in the rasserver logs, not the Python ones. But I'm not sure what the chunk size is in rasdaman. We set the message send/receive size to unlimited in rasdaman, so maybe something like this can be set in Python as well?

	  grpc::ChannelArguments args;
	  args.SetMaxReceiveMessageSize(-1); // unlimited
	  args.SetMaxSendMessageSize(-1); // unlimited
	  return args;
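
A possible Python counterpart of the C++ snippet above, assuming a non-beta grpcio release whose `grpc.insecure_channel` accepts an `options` argument. The helper names `unlimited_message_options` and `open_channel` are hypothetical and not taken from rasdapy; this is a sketch, not the fix that was eventually committed.

```python
def unlimited_message_options():
    """Channel arguments lifting gRPC's message size limits (-1 means unlimited)."""
    return [
        ("grpc.max_send_message_length", -1),
        ("grpc.max_receive_message_length", -1),
    ]


def open_channel(host, port):
    """Open an insecure channel that accepts messages of any size."""
    # Imported lazily; the beta API used by rasdapy at the time has no options parameter.
    import grpc
    target = "%s:%d" % (host, port)
    return grpc.insecure_channel(target, options=unlimited_message_options())
```

The returned channel would then be passed to the generated client stub in place of the channel created through the beta API.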

comment:5 by Bang Pham Huu, 5 years ago

Description: modified (diff)
Owner: changed from Bang Pham Huu to ahambasan
Status: new → assigned

comment:6 by Dimitar Misev, 5 years ago

Resolution: fixed
Status: assigned → closed