Error: F_JG2901: String Containing Invalid Sequence Encountered While Encoding From UTF-8-BMP to UTF-8-BMP
Issue
How do I resolve the error "F_JG2901: String containing invalid sequence encountered while encoding from UTF-8-BMP to UTF-8-BMP"?
Environment
HVR 5
Description
A user reported receiving the following error:
2019-02-20T01:24:44-05:00: stageusmig-refrchar-ostg-gcstg: F_JG2901: String ‘Merry Christmas, Jo! Love You! xf0x9fx98x98xf0x9fx8ex85xf0x9fx8fxbbxf0x9fx8ex84’ containing invalid sequence encountered while encoding from UTF-8-BMP to UTF-8-BMP for table ‘<table_name>’ column ‘<column_name>’.
F_JT1458: The previous error occurred while Project pipe was processing row 299089 of table ‘<table_name>’ (row 299089 for Project pipe).
Root Cause
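In this case, the source location is an Oracle 11.2.0.4 database running on Linux and the target location is a Cloud SQL 5.7 database. The log file shows that the data contains embedded emoji characters: for example, xf0x9fx98x98 in the error message is the UTF-8 byte sequence F0 9F 98 98, which encodes the Unicode code point U+1F618. Emoji such as this one cannot be represented in the UTF-8-BMP encoding, so the conversion fails.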
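To confirm which characters are involved, the escaped bytes from the log line can be decoded by hand. The following is a minimal Python sketch, not part of HVR; the helper name decode_log_escapes is hypothetical, and it assumes the log prints each offending byte as "x" followed by two hexadecimal digits, as in the message above. Pass it only the escaped fragment, not the whole string.

```python
import re


def decode_log_escapes(fragment: str) -> str:
    """Rebuild characters from 'xf0x9fx98x98'-style text copied from the log.

    Assumption: every byte is printed as 'x' plus two hex digits.
    """
    hex_pairs = re.findall(r"x([0-9a-fA-F]{2})", fragment)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    return raw.decode("utf-8")


# Escaped fragment taken from the F_JG2901 message above.
fragment = "xf0x9fx98x98xf0x9fx8ex85xf0x9fx8fxbbxf0x9fx8ex84"
decoded = decode_log_escapes(fragment)

# List the Unicode code point of every decoded character.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)  # ['U+1F618', 'U+1F385', 'U+1F3FB', 'U+1F384']
```

All four code points are above U+FFFF, which is consistent with the encoding error reported for this row.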
Resolution
To work around the error, add the action TableProperties with the parameters /CoerceErrorPolicy=WARNING and /CoerceErrorType=ENCODING to the channel, and then perform an HVR Refresh. Ensure that the parameter /CoerceErrorPolicy=WARNING applies to both the source and the target groups.
NOTE: U+1F618 is an emoji character. It lies outside the range of the UTF-8 BMP (Basic Multilingual Plane, U+0000 to U+FFFF) and belongs instead to the UTF-8 SMP (Supplementary Multilingual Plane). The action definition given above as a workaround is not required if your Oracle source database uses the UTF-8 SMP character set.
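The BMP boundary described in the note can be checked directly on sample data before running a refresh. This is an illustrative Python sketch only; the function names are hypothetical and nothing here is an HVR API. It treats any code point above U+FFFF as outside the BMP.

```python
from typing import List

BMP_MAX = 0xFFFF  # highest code point in the Basic Multilingual Plane


def non_bmp_chars(text: str) -> List[str]:
    """Return the characters in text whose code point lies outside the BMP."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


def fits_in_bmp(text: str) -> bool:
    """True if every character in text is within the BMP."""
    return not non_bmp_chars(text)


plain = "Merry Christmas, Jo! Love You! "
with_emoji = plain + "\U0001F618"

print(fits_in_bmp(plain))       # True
print(fits_in_bmp(with_emoji))  # False

# Characters outside the BMP always take four bytes in UTF-8.
print(len("\U0001F618".encode("utf-8")))  # 4
```

A check like this could be run against an export of the affected column to estimate how many rows would trigger the warning.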