Occasionally, when using the “CSV via Upload” Connector in StarfishETL, I run into an error when trying to read my CSV file.
The error I receive is: “Unable to translate bytes [xx] at index yyyy from specified code page to Unicode.”
This error message is generally caused by “unknown, unrecognized, or unrepresentable characters”. You can find out whether there is an issue with your file by opening it in a text editor. I prefer to use Microsoft’s free Visual Studio Code (VSCode) to open the CSV file. Notepad does not work.
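If you would rather check the file from a script before opening an editor, the following is a minimal Python sketch that lists the byte positions that fail to decode. It assumes the file is supposed to be UTF-8, which is my assumption and not something StarfishETL documents here; the function names `find_invalid_utf8` and `report` are made up for illustration.

```python
def find_invalid_utf8(data: bytes):
    """Return a list of (byte_index, offending_bytes) for each undecodable sequence.

    Assumes the data is meant to be UTF-8 (an assumption, adjust as needed).
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; if it succeeds there are no more problems.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are offsets relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning just past the bad bytes
    return problems


def report(path: str) -> None:
    """Print each problem position found in the file at `path`."""
    with open(path, "rb") as handle:
        data = handle.read()
    for index, bad in find_invalid_utf8(data):
        print(f"byte index {index}: {bad.hex()}")
```

Calling `report("my_file.csv")` (a placeholder file name) prints one line per bad byte sequence. The indexes are byte offsets into the file; they may or may not line up with the index reported in the StarfishETL error.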
To open the file in VSCode, right-click on the CSV file and click “Open with Code”. You can also use the File -> Open menu inside of VSCode.
Once the file is open in VSCode, you can edit it directly and then re-save it. To find what needs to be edited, copy the unknown character, “�”, from the Wikipedia page or this blog post and then search the file for it.
You can manually edit or remove each unknown character one at a time, or mass replace all of them with an empty string (as pictured), a space, or any other text string.
Once all of the unknown characters are deleted or replaced with a valid character, save the CSV file and re-upload to StarfishETL.
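For large files, or files you receive regularly, the same cleanup can be scripted. This is a rough Python sketch and not a StarfishETL feature: it assumes the file should be UTF-8, decodes it so that every undecodable byte sequence becomes the unknown character “�” (U+FFFD), and then replaces that character with a string of your choice. The function names and file paths are hypothetical.

```python
def clean_csv_bytes(data: bytes, replacement: str = "") -> bytes:
    """Replace undecodable bytes (and any existing U+FFFD) with `replacement`.

    Assumes the target encoding is UTF-8 (an assumption, adjust as needed).
    """
    # errors="replace" turns each invalid byte sequence into U+FFFD.
    text = data.decode("utf-8", errors="replace")
    # Swap every U+FFFD for the chosen replacement (empty string by default).
    return text.replace("\ufffd", replacement).encode("utf-8")


def clean_csv_file(src_path: str, dst_path: str, replacement: str = "") -> None:
    """Write a cleaned copy of `src_path` to `dst_path`, leaving the original intact."""
    with open(src_path, "rb") as src:
        cleaned = clean_csv_bytes(src.read(), replacement)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
```

For example, `clean_csv_file("my_file.csv", "my_file_clean.csv", " ")` would write a copy with each unknown character replaced by a space. Writing to a separate file keeps the original available in case a character needs to be fixed by hand instead of removed.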