How To Open a Unicode CSV in Excel (the right way)

When we scrape data in non-English languages and give you a CSV file, the data may appear corrupted or unreadable when you double-click the file to open it in Excel.

This issue occurs because we save the scraped data as Unicode text, but Excel reads CSV files with a non-Unicode encoding by default. To make Excel read your exported CSV reports as Unicode:

  1. Start Microsoft Excel.
  2. In Excel, click the Data tab, and in the Get External Data group, click From Text.
  3. In the Import Text File dialog box, in the lower-right corner (to the right of the File name box), select Text Files (*.prn;*.txt;*.csv) as the file type, browse to the location where you exported/downloaded the CSV file, and then click Open (or Import).
  4. In the Text Import Wizard – Step 1 of 3 dialog box, select Delimited, and from the File origin drop-down list, select 65001: Unicode (UTF-8) (or the appropriate language character identifier for your particular environment).
    In the Preview box, make sure that your Unicode text displays properly, and then click Next.
  5. In the Text Import Wizard – Step 2 of 3 dialog box, in the Delimiters section, make sure that only Comma is checked, and then click Finish.
  6. In the Import Data dialog box, select New worksheet (or Existing worksheet if you have one), and then click OK.

You now have a properly formatted (and Unicode-friendly) Excel worksheet.
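To see why a double-clicked file looks corrupted, the mismatch can be reproduced in a few lines of Python. This is only a minimal sketch: it assumes Excel falls back to a legacy code page such as Windows-1252 (the actual code page depends on your Windows regional settings), and the sample text is made up for illustration.

```python
# Sample text (made up for illustration), stored the way a UTF-8 CSV stores it.
original = "Müller café"
utf8_bytes = original.encode("utf-8")

# Assumption: the reader wrongly treats the bytes as Windows-1252,
# which is roughly what happens when the CSV is opened by double-clicking.
garbled = utf8_bytes.decode("cp1252")

# Reading the same bytes as UTF-8 (File origin 65001 in the wizard) restores the text.
correct = utf8_bytes.decode("utf-8")

print(garbled)  # MÃ¼ller cafÃ©
print(correct)  # Müller café
```

The bytes in the file are the same in both cases; only the encoding used to interpret them differs, which is why choosing the right File origin fixes the display without changing the file.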

If you would like to know more about encoding and character sets, here is an excellent article.

If you still see issues

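Sometimes, even when you open the file with the correct encoding, Excel splits multi-line strings (cell values that contain line breaks) across rows, so the rest of the value shows up as the next row of data. The fix involves the BOM (byte order mark), a short marker at the start of the file that identifies it as UTF-8.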
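If you want to check whether a CSV file already starts with a BOM, a small Python sketch like the one below could be used. The function name has_utf8_bom and the file name in the usage comment are hypothetical, chosen only for this example.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark (bytes EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


# Example usage (hypothetical file name):
#     has_utf8_bom("report.csv")
```

If this returns False, the file has no BOM and the re-save step described next applies.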

Download Notepad++ (it is free) from http://notepad-plus-plus.org/ and open the CSV file in it. Then click the Encoding menu and select the entry for UTF-8 with BOM (the exact label varies between Notepad++ versions).

Then save the file and open it in Excel. The issues should be fixed.
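If you would rather not install Notepad++, the same re-save can be sketched in Python. This is an illustration under stated assumptions, not an official tool: it assumes the CSV is UTF-8 encoded, and the function name add_utf8_bom and the file names are hypothetical. It relies on Python's built-in "utf-8-sig" codec, which skips a BOM when reading and writes one when saving.

```python
def add_utf8_bom(src_path, dst_path):
    """Copy a UTF-8 CSV to dst_path so that the copy starts with a UTF-8 BOM."""
    # "utf-8-sig" drops an existing BOM on read, so the output never gets two.
    # newline="" keeps the line endings (including those inside cells) unchanged.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # "utf-8-sig" writes the BOM bytes EF BB BF before the text.
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(text)


# Example usage (hypothetical file names):
#     add_utf8_bom("report.csv", "report_bom.csv")
```

Writing to a separate output file keeps the original download untouched in case the result needs to be compared or the step repeated.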