Opened 5 years ago

Closed 4 years ago

#2197 closed enhancement (invalid)

WCST_Import daemon problem when waiting for importing big files

Reported by: Bang Pham Huu
Owned by: Bang Pham Huu
Priority: major
Milestone: 10.0
Component: wcst_import
Version: 9.8
Keywords:
Cc: Dimitar Misev, Vlad Merticariu, Peter Baumann
Complexity: Medium

Description

Currently, rasdaman can import only one file at a time; subsequent requests to update collections are queued.

By default, each daemon checks its directory for new files every hour; if a file is not listed in CoverageId.resume.json, the daemon sends an UpdateCoverage request for it.

However, multiple requests can be sent at the same time, and importing a big file can take longer than an hour. In that case the daemon process sends duplicate UpdateCoverage requests at the next hourly check. In other words, it pollutes the rasdaman server with identical requests and increases waiting times for other processes for nothing.

wcst_import should be enhanced to record the requests it has sent in a log file; as long as a request is not marked done / error, the same request should not be sent again, even in subsequent intervals.
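A minimal sketch of such a request log, for illustration only. It is not taken from wcst_import: the state file layout and the names `load_state`, `save_state` and `files_to_request` are hypothetical, and retrying requests marked as error is an assumption rather than something decided in this ticket.

```python
import json
import os
import tempfile

# Hypothetical status values; the ticket only names "done" and "error".
PENDING, DONE, ERROR = "pending", "done", "error"


def load_state(path):
    """Return the recorded {file_path: status} mapping, or {} if no log exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def save_state(path, state):
    """Write the mapping to a temporary file and rename it over the log,
    so a crash cannot leave a half-written log behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def files_to_request(candidates, state):
    """Keep only files with no request in flight and no finished request.

    Assumption: files whose last request ended in error are retried.
    """
    return [c for c in candidates if state.get(c) not in (PENDING, DONE)]
```

The daemon would mark a file as pending before sending its UpdateCoverage request and switch it to done or error once the request returns, so a later interval sees the pending entry and skips the file.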

Change History (2)

comment:1 by Peter Baumann, 4 years ago

Logging into 1 file for all imports obviously constitutes a bottleneck. In the old importortho script, for each MYFILE.tif dropped into the directory I created an individual empty

  • lock file, such as MYFILE.tif.lock, once processing started (and removed when done)
  • "done" marker, such as MYFILE.tif.done, to avoid double inspection (could also contain the log, so could also be called MYFILE.tif.log)

That should fix the bug described above.
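The per-file marker scheme could be sketched roughly as follows. This is not the importortho code: `try_claim`, `process_once` and the `import_func` callback are hypothetical names, and storing the log text inside the `.done` file follows the option mentioned in the second bullet.

```python
import os


def try_claim(data_file):
    """Atomically create data_file + '.lock'; return False if it already exists."""
    try:
        # O_EXCL makes creation fail if another process already holds the lock.
        fd = os.open(data_file + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def process_once(data_file, import_func):
    """Run import_func(data_file) unless the file is already done or being processed."""
    if os.path.exists(data_file + ".done"):
        return "skipped-done"
    if not try_claim(data_file):
        return "skipped-locked"
    try:
        log_text = import_func(data_file)  # hypothetical import callback
        # The "done" marker doubles as the per-file log.
        with open(data_file + ".done", "w") as f:
            f.write(log_text or "")
        return "imported"
    finally:
        # The lock is removed whether the import succeeded or raised.
        os.remove(data_file + ".lock")
```

If `import_func` raises, no `.done` marker is written, so the file is picked up again at the next inspection. A lock left behind by a killed process would block the file until it is removed by hand; this sketch does not handle that case.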

Also, we may want to make the reinspection interval a parameter, with a default smaller than 1h.

comment:2 by Bang Pham Huu, 4 years ago

Resolution: invalid
Status: new → closed

I tested the scenario (importing big files with a short daemon interval) and the problem described in the ticket does not occur. The wcst_import daemon does not start a new import while the previous import process is still running. Hence, no overlapping requests can be sent to petascope, as I had worried.
