#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2014
    Posts
    2
    Rep Power
    0

    Can't import Japanese characters from XML to DB


    Hi.

    I've been banging my head against the wall on this problem for a few days now.

    I have an XML file that is essentially a Japanese dictionary, and I'm trying to load it into a table in my database. The problem is that after the load, every column that should hold Japanese text is left blank.

    The command I'm using is as follows:
    Code:
    LOAD XML LOCAL INFILE 'dictionary.xml' INTO TABLE dictionary ROWS IDENTIFIED BY '<entry>';
    Everything except the Japanese characters is loaded fine!

    When I run the query I get the following warnings:
    Code:
    Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'reb' at row 1. 
    Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'keb' at row 1.
    But I'm not supplying NULL text, I'm supplying Japanese text. The XML is encoded in UTF-8 as far as I know, and all the collations in my DB are set to utf8_general_ci. I've also tried setting all the MySQL character set options in my.ini to UTF-8.
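    The encoding can be checked rather than assumed. The following is only a minimal sketch in Python: the helper names `is_valid_utf8` and `contains_japanese` are made up for illustration, and the character ranges cover only hiragana, katakana and the common kanji block.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def contains_japanese(text: str) -> bool:
    """Return True if the text has at least one kana or common kanji character.

    Ranges checked (an illustrative subset, not exhaustive):
    U+3040..U+30FF (hiragana and katakana), U+4E00..U+9FFF (CJK ideographs).
    """
    return any(
        "\u3040" <= ch <= "\u30ff" or "\u4e00" <= ch <= "\u9fff"
        for ch in text
    )


# Example use against the file from the LOAD XML command:
# with open("dictionary.xml", "rb") as fh:
#     raw = fh.read()
# print(is_valid_utf8(raw))
```

    If the file passes a check like this, the encoding is probably not what is producing the NULL warnings.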

    Basically I've been searching the web for a solution for a long time but nothing seems to work.

    What is the problem here? I would really appreciate some help; as I said, I've been searching for an answer for days.


    An entry in my XML looks as follows:
    Code:
    <entry>
    <r_ele>
    <reb>ブラック</reb>
    </r_ele>
    <r_ele>
    <keb>黒い</keb>
    </r_ele>
    <sense>
    <gloss>black</gloss>
    </sense>
    </entry>
    My table looks like this:
    Code:
    +---------+--------------+------+-----+---------+----------------+
    | Field   | Type         | Null | Key | Default | Extra          |
    +---------+--------------+------+-----+---------+----------------+
    | id      | int(11)      | NO   | PRI | NULL    | auto_increment |
    | reb     | varchar(250) | NO   |     | NULL    |                |
    | keb     | varchar(250) | NO   |     | NULL    |                |
    | gloss   | varchar(50)  | NO   |     | NULL    |                |
    +---------+--------------+------+-----+---------+----------------+
  2. #2
    Funny, huh? As soon as I actually create a forum thread to get help, I solve it myself.

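    The problem went away once I removed the <r_ele> wrapper elements, so that <reb> and <keb> sit directly under <entry>. I don't know why the wrappers stop MySQL from reading the elements inside them, but removing them works.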
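    For anyone who would rather not edit a large file by hand, the unwrapping can be scripted. This is a rough sketch in Python using the standard `xml.etree.ElementTree` module, not necessarily how I did it. It assumes the entries sit inside a single root element (called `<dictionary>` here, which is a made-up name), and it keeps only the first `reb`, `keb` and `gloss` found in each entry, so entries with several readings or glosses would lose data. The function names are hypothetical too.

```python
import xml.etree.ElementTree as ET

# Column names taken from the table definition in the first post.
COLUMNS = ("reb", "keb", "gloss")


def flatten_entry(entry: ET.Element) -> ET.Element:
    """Build a new <entry> whose direct children are reb, keb and gloss.

    Only the first match for each name (at any depth) is kept.
    """
    flat = ET.Element("entry")
    for name in COLUMNS:
        found = entry.find(f".//{name}")
        if found is not None:
            child = ET.SubElement(flat, name)
            child.text = found.text
    return flat


def flatten_document(xml_text: str) -> str:
    """Return the document with every <entry> flattened.

    Assumes the <entry> elements live under one root element.
    """
    root = ET.fromstring(xml_text)
    out = ET.Element(root.tag)
    for entry in root.iter("entry"):
        out.append(flatten_entry(entry))
    # encoding="unicode" returns a str and leaves the Japanese text unescaped.
    return ET.tostring(out, encoding="unicode")


sample = (
    "<dictionary><entry>"
    "<r_ele><reb>ブラック</reb></r_ele>"
    "<r_ele><keb>黒い</keb></r_ele>"
    "<sense><gloss>black</gloss></sense>"
    "</entry></dictionary>"
)
flattened = flatten_document(sample)
```

    The result can then be written out with `open(path, "w", encoding="utf-8")` and loaded with the same LOAD XML command as before.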

    This thread can be locked or deleted, though I'd recommend keeping it for people who run into the same issue in the future.
