Opened 8 years ago

Closed 8 years ago

#1148 closed defect (fixed)

Failed to connect to rasserver via rasnet when inserting data concurrently

Reported by: Bang Pham Huu
Owned by: Alex Toader
Priority: critical
Milestone: 9.2
Component: rasserver
Version: development
Keywords: rasserver error
Cc: Dimitar Misev
Complexity: Medium

Description

The error is below (it should be an exception message as in RNP, e.g. "Write transaction is locked, please try again later."):

Executing insert query...[ERROR] - The client failed to connect to rasserver.
terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)
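
The "Aborted (core dumped)" line means the client was killed by SIGABRT rather than exiting with a handled error. A minimal sketch of how a wrapper script could tell the two cases apart from the exit status, assuming a POSIX shell, where a command terminated by signal N is reported as status 128+N (so SIGABRT, signal 6 on Linux, shows up as 134). The helper name `classify_exit` is hypothetical and not part of rasdaman; note that a program can also exit with a status above 128 on its own, so this is a heuristic.

```shell
# Hypothetical helper: describe an exit status as reported by the shell in $?.
classify_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        # Shell convention: 128 + signal number for signal-terminated commands.
        echo "killed by signal $((status - 128))"
    else
        echo "error exit $status"
    fi
}

# Usage (run any command, then classify its status):
#   rasql -q "..." --out string; classify_exit $?
```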

How to test: open 2 terminals and run the query below in both at the same time.

rasql -q 'insert into test_001 values decode($1)' -f /home/rasdaman/images/multi_cubs/0_backup.tiff --user rasadmin --passwd rasadmin

The error then appears in the terminal that started later (use a big file, ~50 MB, and you will see it).
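
The two-terminal test can also be scripted. The following is a sketch only, assuming a POSIX shell; the function name `run_pair` and the log file names are made up for illustration. It starts the same command twice in the background, waits for both, and reports each exit status, so a crash in the later run is visible without watching two terminals.

```shell
# Hypothetical helper: run the given command twice in parallel.
# Usage: run_pair CMD [ARGS...]
run_pair() {
    log_dir=$(mktemp -d) || return 1

    # Start both runs in the background, each with its own log file.
    "$@" > "$log_dir/first.log" 2>&1 &
    pid1=$!
    "$@" > "$log_dir/second.log" 2>&1 &
    pid2=$!

    # Collect the exit status of each run separately.
    wait "$pid1"; status1=$?
    wait "$pid2"; status2=$?

    echo "logs: $log_dir"
    echo "first exit: $status1, second exit: $status2"

    # Succeed only if both runs succeeded.
    [ "$status1" -eq 0 ] && [ "$status2" -eq 0 ]
}

# Usage with the query from this ticket:
#   run_pair rasql -q 'insert into test_001 values decode($1)' \
#       -f /home/rasdaman/images/multi_cubs/0_backup.tiff \
#       --user rasadmin --passwd rasadmin
```

The two background jobs start almost simultaneously, which should be close enough to "the same time" for a large input file; the logs directory keeps the output of each run apart.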

Attachments (4)

0.tiff (10.0 MB) - added by Bang Pham Huu 8 years ago.
input file
0_backup.tiff.tar.gz (765.2 KB) - added by Bang Pham Huu 8 years ago.
George, use this file, thanks.
log3.tar.gz (30.6 MB) - added by Bang Pham Huu 8 years ago.
rasmgr_error
log_5.tar.gz (53.6 KB) - added by Bang Pham Huu 8 years ago.
log for patch

Change History (16)

comment:1 by Alex Dumitru, 8 years ago

Priority: major → critical

comment:2 by George Merticariu, 8 years ago

Can you please provide the file so we can reproduce the error?

by Bang Pham Huu, 8 years ago

Attachment: 0.tiff added

input file

by Bang Pham Huu, 8 years ago

Attachment: 0_backup.tiff.tar.gz added

George, use this file, thanks.

comment:3 by Bang Pham Huu, 8 years ago

I've uploaded the file for you, George.

comment:4 by Alex Toader, 8 years ago

Owner: changed from Alex Toader to George Merticariu
Status: new → assigned

comment:5 by Bang Pham Huu, 8 years ago

Another case: while a script is running wcst_import.sh to import data, running a select query in rasql gives the same error (I thought a select is not a write transaction?).

 rasql -q "select c[0:500, 0:500] + 5 from multiple_cov_01 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gf4a50b8 -- generated on 18.01.2016 08:06:30.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)

by Bang Pham Huu, 8 years ago

Attachment: log3.tar.gz added

rasmgr_error

comment:6 by Bang Pham Huu, 8 years ago

After a few attempts with the rasql query above, stop_rasdaman.sh cannot stop rasdaman: 2 rasserver processes show their memory as "N/A", and the script has been hanging for a few minutes so far.

[rasdaman@gonzo multi_cov]$ stop_rasdaman.sh 
stop_rasdaman.sh: terminating all rasdaman servers

comment:7 by Alex Toader, 8 years ago

Owner: changed from George Merticariu to Alex Toader
Status: assigned → accepted

comment:8 by Bang Pham Huu, 8 years ago

@AToader: even when I run stop_rasdaman.sh, start rasdaman again, and just run a normal query without running wcst_import.sh, I see this kind of error (so you can narrow down the scope of the problem).

rasql -q "select c[0:500, 0:500] from multiple_cov_001 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gf4a50b8 -- generated on 18.01.2016 08:06:30.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)

comment:9 by Alex Toader, 8 years ago

Resolution: fixed
Status: accepted → closed

comment:10 by Bang Pham Huu, 8 years ago

Resolution: fixed
Status: closed → reopened

@AToader: The problem still seems to be here (I've pulled, reinstalled, and restarted the servers).

rasql -q "select c[0:500, 0:500] from multiple_cov_001 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gc9085a3 -- generated on 19.01.2016 07:51:15.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...rasdaman error 0: General error received from the server.
aborting transaction...E0119 08:16:15.128748962   12906 tcp_client_posix.c:171]     failed to connect to 'ipv4:10.70.11.237:7002': socket error: connection refused
E0119 08:16:16.130202929   12906 tcp_client_posix.c:171]     failed to connect to 'ipv4:10.70.11.237:7002': socket error: connection refused
ok

by Bang Pham Huu, 8 years ago

Attachment: log_5.tar.gz added

log for patch

comment:11 by Alex Toader, 8 years ago

Bang, as you can see the client is not crashing anymore. This ticket is about the crash.
From what I can see in the logs, your system is in an inconsistent state. Please kill any running rasmgr and rasserver processes and remove the data/TRANSACTION folder.
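
The cleanup described above could be scripted roughly as follows. This is a sketch under assumptions: the function name `reset_rasdaman_state` is hypothetical, the location of the data directory is not stated in this ticket and must be passed in, and `pkill` (from procps) is assumed to be available. It sends the default SIGTERM; processes that ignore it would still need to be killed by hand.

```shell
# Hypothetical helper: kill leftover rasdaman processes and remove the
# TRANSACTION folder under the given data directory.
# Usage: reset_rasdaman_state DATA_DIR
reset_rasdaman_state() {
    data_dir=$1
    if [ -z "$data_dir" ]; then
        echo "usage: reset_rasdaman_state DATA_DIR" >&2
        return 2
    fi

    # pkill exits non-zero when no process matched; that is fine here.
    pkill -x rasmgr || true
    pkill -x rasserver || true

    # Remove only the TRANSACTION folder, nothing else in DATA_DIR.
    rm -rf -- "$data_dir/TRANSACTION"
}
```

Taking the data directory as an explicit argument avoids guessing the installation path and keeps the `rm -rf` limited to a single, named subfolder.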

comment:12 by Bang Pham Huu, 8 years ago

Resolution: fixed
Status: reopened → closed

AToader, I've done as you suggested (removed the TRANSACTION folder, killed all rasmgr and rasserver processes) and started again, and the error is still here. However, as you said it is not related to this ticket, I will close it.

-q "select c[0:500, 0:500] from multiple_cov_001 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gf4a50b8 -- generated on 18.01.2016 08:06:30.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...rasdaman error 237: Exception: Client communication failure
aborting transaction...E0118 13:08:39.628931307   22984 tcp_client_posix.c:171]     failed to connect to 'ipv4:10.70.11.237:7002': socket error: connection refused
ok
rasql done.
