KBA1224 Delphix Fails to Ingest Database Dump With an Alert Indicating An Earlier System Termination Left LOAD DATABASE Incomplete
Applicable Delphix Versions
Major Release |
All Sub Releases |
5.1 |
5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1 |
5.0 |
5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4 |
4.3 |
4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0 |
4.2 |
4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0 , 4.2.5.0, 4.2.5.1 |
4.1 |
4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0 |
4.0 |
4.0.0.0, 4.0.0.1, 4.0.1.0, 4.0.2.0, 4.0.3.0, 4.0.4.0, 4.0.5.0, 4.0.6.0, 4.0.6.1 |
3.2 |
3.2.0.0, 3.2.1.0, 3.2.2.0, 3.2.2.1, 3.2.3.0, 3.2.4.0, 3.2.4.1, 3.2.4.2, 3.2.5.0, 3.2.5.1, 3.2.6.0, 3.2.7.0, 3.2.7.1 |
3.1 |
3.1.0.1, 3.1.1.0, 3.1.2.0, 3.1.2.1, 3.1.3.0 , 3.1.3.1, 3.1.3.2, 3.1.4.0, 3.1.5.0, 3.1.6.0 |
3.0 |
3.0.0.3, 3.0.0.4, 3.0.1.0, 3.0.1.1, 3.0.1.2, 3.0.1.3, 3.0.2.0, 3.0.2.1, 3.0.3.0, 3.0.3.1, 3.0.4.0, 3.0.4.1, 3.0.5.0, 3.0.6.0, 3.0.6.1 |
Issue
Repeated attempts to ingest a database or transaction log dump fail with an alert similar to the following:
Error After failing to unmount database "dxt8d35X91835Z2tT5_history" on target host "ase_server.delphix.com" instance "ASE_STG_157", failed to drop the database.","Make sure the host is online, the environment user "delphixp" has permissions to unmount the database NFS mount point, the database user "sybase" has the required permissions to query the instance, address all exceptions, and try the operation again.","DB_SYNC","ASE_SRC_157/history","Unmount SQLException: Database 'dxt8d35X91835Z2tT5_history' cannot be opened because either an earlier system termination left LOAD DATABASE incomplete or the database is created with 'for load' option. Load the database or contact a user with System Administrator (SA) role.nDrop SQLException: An attempt was made to fetch logical page '448' from cache 'default data cache'. Page belongs to database 'dxt8d35X91835Z2tT5_history' (14242), object 'systabstats' (23), index 'csystabstats' (2), partition 'csystabstats_23' (23) and not to database 'dxt8d35X91835Z2tT5_history' (14242), object 'sysreferences' (16), index 'sysreferences' (0), partition 'sysreferences_16' (16). Suggested Action Make sure the host is online, the environment user "delphixp" has permissions to unmount the database NFS mount point, the database user "sybase/ase_server.delphix.com@is1.morgan" has the required permissions to query the instance, address all exceptions, and try the operation again.
The Delphix debug logs may have more helpful information including ASE error 930 and 605:
[2017-11-02 11:46:15,103][DEBUG][ase.target.ASEBaseTargetDatabase#unmountDatabaseException:398][Worker-2696|JOB-1403|DB_SYNC(ASE_DB_CONTAINER-6)][ACTION-23593] Unmounting dxt8d35X91835Z2tT5_history [2017-11-02 11:46:15,104][INFO][ase.dbc.ASEJDBCAccessorImpl#messageHandler:1053][Worker-2696|JOB-1403|DB_SYNC(ASE_DB_CONTAINER-6)][ACTION-23593] ASE message 930 : Database 'dxt8d35X91835Z2tT5_history' cannot be opened because either an earlier system termination left LOAD DATABASE incomplete or the database is created with 'for load' option. Load the database or contact a user with System Administrator (SA) role. [2017-11-02 11:46:15,126][INFO][ase.dbc.ASEJDBCAccessorImpl#messageHandler:1053][Worker-2696|JOB-1403|DB_SYNC(ASE_DB_CONTAINER-6)][ACTION-23593] ASE message 605 : An attempt was made to fetch logical page '448' from cache 'default data cache'. Page belongs to database 'dxt8d35X91835Z2tT5_history' (14242), object 'systabstats' (23), index 'csystabstats' (2), partition 'csystabstats_23' (23) and not to database 'dxt8d35X91835Z2tT5_history' (14242), object 'sysreferences' (16), index 'sysreferences' (0), partition 'sysreferences_16' (16).
The ASE error log will also have detailed information about error 605 possibly including a stack trace:
00:0002:00000:00045:2017/11/03 09:15:59.21 server Error: 605, Severity: 21, State: 1 00:0002:00000:00045:2017/11/03 09:15:59.21 server An attempt was made to fetch logical page '448' from cache 'default data cache'. Page belongs to database 'dx81PffvZ5Qj1V9reVT_trade_db' (27957), object 'systabstats' (23), index 'systabstats' (0), partition 'systabstats_23' (23) and not to database 'dx81PffvZ5Qj1V9reVT_trade_db' (27957), object 'sysreferences' (16), index 'sysreferences' (0), partition 'sysreferences_16' (16). 00:0002:00000:00045:2017/11/03 10:01:28.25 server Starting diagnostics for read failure: 00:0002:00000:00045:2017/11/03 10:01:28.26 server Device Information: Device number = 184 Phyname = /delphix/42075db8-fdde-fe51-9a65-a0841c23c0d3-staging-25/datafile/dx9O2d8vu0UpvwXvPP_rade_dev1 00:0002:00000:00045:2017/11/03 10:01:28.26 server Buffer Information: Buf addr = 0x0x2aab702c28a0, Mass addr = 0x0x2aab702c28a0 Buf pageno = 448, Mass pageno = 448, dbid = 27957 00:0002:00000:00045:2017/11/03 10:01:28.26 server Buf virtpg = 448, Mass virtpg = 448 Buf stat = 0x1, Mass stat = 0x4001008 Mass keep = 1, Mass awaited = 0 00:0002:00000:00045:2017/11/03 10:01:28.26 server PAGE DEBUG ELEMENT: buffer state = 3; prior to io ptnid = 0, page number =0; page timestamp lo=0 hi=0; latch requested mode=4, type=0; latch result = 0; number of times searched = 0. 00:0002:00000:00045:2017/11/03 10:01:28.26 server Page Information from first read attempt: Page read from cache ppageno = 448, pptnid = 23, pindid = 0 pnextpg = 449, pprevpg = 431 plevel = 0, pstat = 0x881 pts_hi = 22, pts_lo = 1308518244, tail ts = 26468 00:0002:00000:00045:2017/11/03 10:01:28.26 server Page Information from second read attempt: Page read from disk ppageno = 448, pptnid = 23, pindid = 0 pnextpg = 449, pprevpg = 431 plevel = 0, pstat = 0x881 pts_hi = 22, pts_lo = 1308518244, tail ts = 26468 00:0002:00000:00045:2017/11/03 10:01:28.26 server SDES Information: dbid = 27957, objid = 16, sptnid = 16 scur.pageid = 448 sstat = 0x1, sstat2 = 0x0 suid = 1, cacheid = 0 00:0002:00000:00045:2017/11/03 10:01:28.26 server PSS Information: pstat = 0x10000, pcurdb = 27957, pspid = 45 p2stat = 0x40001012, p3stat = 0x800 plasterror = 46909632806912, preverror = 46909632806912, pattention = 0 00:0002:00000:00045:2017/11/03 10:01:28.26 server End diagnostics for read failure: 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x000000000162545b pcstkwalk+0x3a3() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000001624f33 ucstkgentrace+0x263() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000001620e9e ucbacktrace+0x50() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000000ada1b3 wrongpage__print_diagnostic+0x3a5() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000000ad93c1 wrongpage+0x3be() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000000ad6177 getpage_with_validation+0x1747() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000000e40ebb pg_oamscan_startscan+0xfb() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000000b165ee dol__startscan+0x4ae() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000000b11418 startptnscan+0x4e8() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel [Handler pc: 0x0x0000000001281b11 sdes_scan_error_hdlr installed by the following function:-] 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x00000000012743e7 sdes_scan_setup+0x4a7() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel [Handler pc: 0x0x0000000001402030 droperr installed by the following function:-] 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000001403a6b dropd+0xe49() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000001402bb0 dropdb+0x50() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x00000000011c34d3 s_execute+0x4a83() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel [Handler pc: 0x0x000000000126adb0 hdl_stack installed by the following function:-] 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel [Handler pc: 0x0x00000000012006a0 s_handle installed by the following function:-] 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x0000000001205161 sequencer+0x2401() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x00000000007cd0f9 tdsrecv_language+0x1c9() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel [Handler pc: 0x0x000000000157a830 ut_handle installed by the following function:-] 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel pc: 0x00000000007ee4c8 conn_hdlr+0x10d8() 00:0002:00000:00045:2017/11/03 10:01:28.27 kernel end of stack trace, spid 45, kpid 1774780804, suid 8 00:0002:00000:00045:2017/11/03 10:01:28.27 server Run the DBCC command 'dbcc tablealloc' for both the objects, object id = 16, partition id = 23. 00:0002:00000:00045:2017/11/03 10:01:28.27 server Error: 605, Severity: 21, State: 1 00:0002:00000:00045:2017/11/03 10:01:28.27 server An attempt was made to fetch logical page '448' from cache 'default data cache'. Page belongs to database 'dx81PffvZ5Qj1V9reVT_trade_db' (27957), object 'systabstats' (23), index 'systabstats' (0), partition 'systabstats_23' (23) and not to database 'dx81PffvZ5Qj1V9reVT_trade_db' (27957), object 'sysreferences' (16), index 'sysreferences' (0), partition 'sysreferences_16' (16).
Troubleshooting
The important thing to focus on here is the ASE error 605 "An attempt was made to fetch logical page" is thrown every time. In some versions of Delphix, the error number is not displayed, just the text of the message so reviewing the debug logs will be necessary. According to SAP documentation on error message 605, this can be raised due to "hard" corruption which is written to disk or "transient" corruption in memory:
"Usually this error is detected after the corruption has been written to the database on disk (hard 605), but it can also occur entirely in cache without the damage ever being written to disk (transient 605), and is not associated with data corruption."
Check the Delphix support bundle for signs of physical corruption:
- In the support bundle, there will be a file named "zpool_status_-v" which will record any READ/WRITE/CKSUM errors. It will also display corrupted files.
- There is another file in the support bundle named "fmdump_-eV" which will show when the last disk error was raised and whether or not it recovered.
Resolution
If there is no physical corruption reported on the Delphix engine, the quickest and easiest thing to try is restarting the ASE instance that raised the 605 error. If the error is "transient" as described in the supporting SAP documentation, restarting the instance will clear in in-memory corruption. After the ASE instance is restarted, try ingesting the database or transaction log dump again.
If restarting the ASE instance does not help, follow the additional steps in the ASE documentation for recovering from a "hard" failure.
If the steps for recovering from a "hard" failure do not provide relief, you may need to consider deleting the dSource (if there are no child VDBs) or detach the dSource from the source database and link a new dSource (Detaching a dSource).
Additional Information