Opened 11 years ago

Closed 11 years ago

#266 closed defect (fixed)

Petascope use of sdom

Reported by: Dimitar Misev Owned by: Piero Campalani
Priority: minor Milestone: 8.4
Component: petascope Version: 8.3
Keywords: Cc:
Complexity: Medium

Description

I see this in the logs:

select sdom(c[0:180258,-29295:-27372]) from haiti_vnir AS c

This is an obvious inefficiency: there's no need to call sdom, since the domain is evidently already known if this subset is applied: c[0:180258,-29295:-27372]
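As a rough illustration of the point above, the cell domain can be read straight off the subset text when every bound is a concrete integer. This is only a sketch: `SubsetDomain` and `fromSubset` are hypothetical names, not Petascope code, and it assumes the requested bounds lie inside the stored domain of the coverage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not part of Petascope): derives the cell domain of a
// trimmed rasql operand from the subset text, so no sdom() query is needed.
final class SubsetDomain {
    private SubsetDomain() {}

    /**
     * Returns one {lo, hi} pair per axis if every bound in a trim such as
     * "c[0:180258,-29295:-27372]" is a concrete integer. Returns empty if any
     * bound is "*", an axis is sliced, or the text is malformed; in those
     * cases the caller would still have to obtain the domain some other way.
     */
    static Optional<List<long[]>> fromSubset(String operand) {
        int open = operand.indexOf('[');
        int close = operand.lastIndexOf(']');
        if (open < 0 || close != operand.length() - 1 || close <= open + 1) {
            return Optional.empty();
        }
        List<long[]> axes = new ArrayList<>();
        for (String axis : operand.substring(open + 1, close).split(",")) {
            String[] bounds = axis.trim().split(":");
            if (bounds.length != 2) {
                return Optional.empty(); // slice or malformed axis
            }
            try {
                long lo = Long.parseLong(bounds[0].trim());
                long hi = Long.parseLong(bounds[1].trim());
                if (lo > hi) {
                    return Optional.empty();
                }
                axes.add(new long[] {lo, hi});
            } catch (NumberFormatException e) {
                return Optional.empty(); // "*" or a non-literal bound
            }
        }
        return Optional.of(axes);
    }
}
```

For the query in the log, `SubsetDomain.fromSubset("c[0:180258,-29295:-27372]")` yields two axes, so the sdom round trip could be skipped; for something like `c[0:10,*:*]` it returns empty and the domain has to come from elsewhere.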

Change History (11)

comment:1 by Piero Campalani, 11 years ago

I see.
I can look at this in January, if not extremely urgent.
It is mainly about adding a check in the WCS setBounds() function probably.

comment:2 by Dimitar Misev, 11 years ago

Yes sure; the offending place is probably AbstractFormatExtension:68

comment:3 by Dimitar Misev, 11 years ago

But we should definitely optimize this somehow, it's doubling evaluation time more or less, e.g. 5s for the sdom and 5s for getting the actual data.

comment:4 by Piero Campalani, 11 years ago

I agree with you.

I believe the sdom request should actually reside in the DbMetadataSource.read() method, where the cellDomain objects are created, whereas setting the bounds (setBounds()) shouldn't actually be needed since:

  • asterisks are allowed in the rasql queries, in case the subsets in the W*S request apply only on a subset of the coverage's dimensions (e.g. mean_summer_airtemp[0:10,*:*]);
  • rasql does not break in case the pixel bounds are outside the range (e.g. mean_summer_airtemp[-10:10,-10:10]=mean_summer_airtemp[0:10,0:10]);
  • Petascope anyway knows about the grid-domain ranges (now by means of ps_cellDomain, in new implementations by means of sdom).

This way an sdom would still be issued for each request, but we might build a cache of coverage metadata that is checked at every DbMetadataSource.read() and refreshed e.g. when the domain extents (ps_domain currently) change?
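A minimal sketch of what such a cache might look like, purely for discussion. `CoverageMetadataCache` and the `extentsVersion` argument are hypothetical: the latter stands for any cheap fingerprint of the coverage's domain extents (for instance a serialization of its ps_domain rows), and the loader stands for whatever currently builds the metadata, including the sdom request.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache (not Petascope's actual code): keeps one metadata object
// per coverage and reloads it only when the extents fingerprint changes.
final class CoverageMetadataCache<M> {

    // Cached metadata together with the extents fingerprint it was built for.
    private static final class Entry<M> {
        final String extentsVersion;
        final M metadata;

        Entry(String extentsVersion, M metadata) {
            this.extentsVersion = extentsVersion;
            this.metadata = metadata;
        }
    }

    private final Map<String, Entry<M>> entries = new ConcurrentHashMap<>();
    private final Function<String, M> loader;

    CoverageMetadataCache(Function<String, M> loader) {
        this.loader = Objects.requireNonNull(loader);
    }

    /**
     * Returns the cached metadata for the coverage, invoking the loader only
     * on the first request or when extentsVersion differs from the cached one.
     */
    M read(String coverageName, String extentsVersion) {
        Objects.requireNonNull(extentsVersion);
        Entry<M> entry = entries.compute(coverageName, (name, old) ->
                old != null && old.extentsVersion.equals(extentsVersion)
                        ? old
                        : new Entry<>(extentsVersion, loader.apply(name)));
        return entry.metadata;
    }
}
```

With this shape, repeated requests against an unchanged coverage would hit the loader (and hence sdom) once; the cost moves to computing the fingerprint, which is only worthwhile if that is cheaper than the sdom call itself.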

comment:5 by Dimitar Misev, 11 years ago

Yeah that's all good, but also we need to push some optimizations to rasdaman as the problem is that the sdom() function requires evaluation of its arguments in order to compute the result.

E.g. sdom(c) alone is instant, but sdom(c[*:*,*:*,..]) requires loading c from the database and that's very inefficient.

comment:6 by Dimitar Misev, 11 years ago

Ok forget about my talk above, it's a bit wrong :) But your post is valid.

comment:7 by Piero Campalani, 11 years ago

Ah, I see.. but if sdom(c) is instant, then the extents of the mdd are somewhere in some table right?
Then sdom(c[*:*,...]) should also just look there when it detects that there are "no operations" to be done, as in this case. Whereas it must load the mdd in the remaining cases (e.g. sdom(scale(...)))?

comment:8 by abeccati, 11 years ago

Priority: critical → minor

Does not look so critical, unless it slows the show down considerably. Setting prio to minor for now.

comment:9 by Dimitar Misev, 11 years ago

A 2x slow-down is considerable I'd say. Rasdaman enterprise is not affected because it has some query optimizations, but for community we need to fix petascope.

comment:10 by abeccati, 11 years ago

Milestone: 8.4

comment:11 by Piero Campalani, 11 years ago

Resolution: fixed
Status: new → closed

We can fix this:

commit fc3a97ba0e43a1546eeb01da543e200c43560797
Author: Piero Campalani <cmppri@unife.it>
Date:   Fri Jan 25 18:16:04 2013 +0100

    Avoid sdom request when updating GetCoverage metadata (ticket #266).