Skip to main content
Delphix

Resolving BOM Errors in SQL Pre- and Post-Scripts (KBA1579)

 

 

This article is archived as functionality has been added to the product to automatically remove the BOM characters, if they are present, in versions newer than 5.3.

 

When importing text files, the Delphix Masking Engine supports files saved in ASCII or UTF-8 format without the Byte Order Mark (BOM). If the lookup file contains foreign alphabet characters, in order to read the Unicode text correctly, you must save the file in UTF-8 format. The BOM is usually added by Windows text editors.

The BOM causes different issues and the error message differs depending on the BOM code. This topic describes SQL Pre and Post Scripts.

There are five different BOM codes:

Byte Order Mark Description
EF BB BF UTF-8
FF FE UTF-16, little endian
FE FF UTF-16, big endian
FF FE 00 00 UTF-32, little endian
00 00 FE FF UTF-32, big-endian

Troubleshooting Pre/Post Scripts in Job Configuration 

The Pre/Post Scripts are SQL scripts executed before and after the transformation (masking) job. The script file is selected in the Job Configuration screen and ingested into the masking job. 

Error caused by UTF-8 and UTF-16  

The UTF-8 and UTF-16 BOM codes cause the ingestion process to include the BOM and an 'invisible' semicolon before the SQL statement (or row of statements).

This empty SQL causes "Incorrect syntax near ''."

grep -a1 "Couldn't execute SQL" joblog

2017/07/06 13:27:54 - Execute_Prescript.0 - ERROR (version 6.0.1.0-386, build 1 from 2015-12-03 11.37.25 by buildguy) : An error occurred, processing will be stopped:
2017/07/06 13:27:54 - Execute_Prescript.0 - Couldn't execute SQL: update dbo.OneCol set last_name ='David'
2017/07/06 13:27:54 - Execute_Prescript.0 -
2017/07/06 13:27:54 - Execute_Prescript.0 - Incorrect syntax near ''.

Error caused by UTF-32 

The UTF-32 BOM codes cause the ingestion process to include the BOM and an 'invisible' semicolon before the SQL statement (or row of statements).

This BOM includes an invalid XML (Unicode: 0x0), which results in  "[Fatal Error] :xxx:xx: An invalid XML character (Unicode: 0x0) was found in the element content of the document."

2017-07-06 04:12:13,392 INFO  ...RunServiceImpl - [JOB_ID_39_513] INFO  06-07 04:12:13,391 - PRESCRIPT TRANSFORMATION - Loading transformation from XML file [/var/delphix/dmsuite//output/MSSQL/DMSApplicator/LongRun/39/KETTLE_PRESCRIPT_TRANSFORMATION_39.xml]
2017-07-06 04:12:13,417 INFO  ...RunServiceImpl - [JOB_ID_39_513] [Fatal Error] :173:13: An invalid XML character (Unicode: 0x0) was found in the element content of the document.
2017-07-06 04:12:13,422 INFO  ...RunServiceImpl - [JOB_ID_39_513] ERROR 06-07 04:12:13,422 - 39_513 - org.pentaho.di.core.exception.KettleException:
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513] Unexpected error during transformation metadata load
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513] Error reading information from input stream
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513] An invalid XML character (Unicode: 0x0) was found in the element content of the document.
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.entries.trans.JobEntryTrans.getTransMeta(JobEntryTrans.java:1202)
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.entries.trans.JobEntryTrans.execute(JobEntryTrans.java:621)
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.Job.execute(Job.java:589)
2017-07-06 04:12:13,423 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.Job.execute(Job.java:728)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.Job.execute(Job.java:728)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.Job.execute(Job.java:443)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.Job.run(Job.java:363)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513] Caused by: org.pentaho.di.core.exception.KettleXMLException:
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513] Error reading information from input stream
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513] An invalid XML character (Unicode: 0x0) was found in the element content of the document.
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.core.xml.XMLHandler.loadXMLFile(XMLHandler.java:588)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.core.xml.XMLHandler.loadXMLFile(XMLHandler.java:508)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.core.xml.XMLHandler.loadXMLFile(XMLHandler.java:494)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.trans.TransMeta.<init>(TransMeta.java:2760)
2017-07-06 04:12:13,424 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.trans.TransMeta.<init>(TransMeta.java:2741)
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.job.entries.trans.JobEntryTrans.getTransMeta(JobEntryTrans.java:1158)
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         ... 6 more
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513] Caused by: org.xml.sax.SAXParseException; lineNumber: 173; columnNumber: 13; An invalid XML character (Unicode: 0x0) was found in the element content of the document.
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         at org.pentaho.di.core.xml.XMLHandler.loadXMLFile(XMLHandler.java:561)
2017-07-06 04:12:13,425 INFO  ...RunServiceImpl - [JOB_ID_39_513]         ... 11 more

Resolution

To resolve pre/post script errors you need to use the following workaround.

Save the text file with an encoding which excludes the BOM code.
This could be done using ANSI encoding or any of the other encoding but without the BOM. This will ensure that the BOM code is not included and added to the SQL statement. 

To save the text file:

  1. Add SQL statement(s) in a text editor.
  2. Select ANSI Encoding or one without BOM.
  3. Save.

Below is an example of selecting encoding (using Notepad++) > ANSI Encoding as seen in the Save As dialog.

Notepad%252B%252B%2BEncoding.png Notepad%2BSave%2BANSI.png

Applicable Delphix Versions

This article applies to the following versions of the Delphix Engine:

Major Release

Sub Releases

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

4.3

4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0

4.2

4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0 , 4.2.5.0, 4.2.5.1

4.1

4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0

4.0

4.0.0.0, 4.0.0.1, 4.0.1.0, 4.0.2.0, 4.0.3.0, 4.0.4.0, 4.0.5.0, 4.0.6.0, 4.0.6.1

3.2

3.2.0.0, 3.2.1.0, 3.2.2.0, 3.2.2.1, 3.2.3.0, 3.2.4.0, 3.2.4.1, 3.2.4.2, 3.2.5.0, 3.2.5.1, 3.2.6.0, 3.2.7.0, 3.2.7.1

Related Articles

External Links: