#1
  1. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Sep 2006
    Posts
    1,926
    Rep Power
    533

    Consistent encoding


    It has recently come to my attention that I should be more diligent about consistent encoding. Below are several places where I am specifying utf8 encoding. Please confirm that I am doing them correctly, fill in the blanks, and indicate which other scopes require the encoding to be specified. Thank you.

    Set encoding when connecting to the database
    Code:
    $db = new PDO(
      "mysql:host=localhost;dbname=myDB;charset=utf8",
      'myUserName',
      'myPassword',
      array(
        PDO::ATTR_EMULATE_PREPARES=>false,
        PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_DEFAULT_FETCH_MODE=>PDO::FETCH_ASSOC
      )
    );
    Set encoding for HTML
    Code:
    <meta charset="utf-8" />
    Save files as utf8
    Code:
    ???
    Specifying utf8 in the pdo call
    Code:
    ???
  2. #2
  3. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    13,993
    Rep Power
    9397
    - I don't remember if <meta charset> is old or new, but there's also
    Code:
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    or with PHP
    PHP Code:
    header("Content-Type: text/html; charset=utf-8"); 
    - Saving files as UTF-8 only matters if they contain non-ASCII characters. In fact, until PHP fixes its long-standing issues with BOMs, either (a) you're better off not using UTF-8 or (b) you need to make sure your editor doesn't write a BOM when saving (many don't).

    - You already did UTF-8 with the database stuff.
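
To illustrate the BOM point above: a UTF-8 BOM is the three bytes EF BB BF at the very start of a file, and PHP sends them as output before any code runs, which can break later header() calls. Below is a minimal sketch of how one might detect and strip a BOM from a string. The function name stripUtf8Bom is hypothetical, not something from this thread or from PHP itself.

```php
<?php
// Hypothetical helper (not a built-in): remove a leading UTF-8 BOM if present.
function stripUtf8Bom(string $contents): string
{
    $bom = "\xEF\xBB\xBF"; // the UTF-8 byte order mark
    // Compare only the first three bytes of the input against the BOM.
    if (strncmp($contents, $bom, 3) === 0) {
        return (string) substr($contents, 3);
    }
    return $contents;
}

// A string that starts with a BOM, as some editors would save it.
$withBom = "\xEF\xBB\xBFhello";
var_dump(stripUtf8Bom($withBom) === 'hello');
var_dump(stripUtf8Bom('hello') === 'hello');
```

In practice you would apply something like this to file_get_contents() output when checking your own source files, but configuring the editor to save without a BOM avoids the problem at its source.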
  4. #3
  5. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,957
    Rep Power
    1046
    Hi,

    Originally Posted by NotionCommotion
    Set encoding when connecting to the database
    This is correct. When using PDO, you have to specify the character encoding in the DSN string. For MySQLi, you'd have to use mysqli_set_charset(). And for the old extension, it's mysql_set_charset().

    In any case: Do not use SET NAMES.



    Originally Posted by NotionCommotion
    Set encoding for HTML
    Code:
    <meta charset="utf-8" />
    No. It's up to the web server to deliver text files (HTML, JavaScript, CSS, ...) with the right character encoding in the Content-Type header. So you need to do this in the web server configuration.

    This "meta" stuff is a bit absurd, because obviously the browser has to know the character encoding before it can read the content. The "meta" element does work thanks to a bit of magic, and it's actually recommended to have it for documentation purposes. But it's no replacement for a proper Content-Type header.



    Originally Posted by NotionCommotion
    Save files as utf8
    That's up to your IDE/editor. You need to set this in the configuration.



    Originally Posted by NotionCommotion
    Specifying utf8 in the pdo call
    What do you mean?



    Of course you also have to specify the character encoding when using escaping functions like htmlspecialchars(). Many people forget that, but it's necessary for the function to even work.
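
As a rough sketch of the htmlspecialchars() point, assuming the page is served as UTF-8: the third argument names the character encoding. Since PHP 5.4 the default is UTF-8, but passing it explicitly keeps the code correct on older versions and makes the intent obvious. The wrapper name escapeHtml is hypothetical.

```php
<?php
// Hypothetical wrapper: always escape with an explicit encoding.
function escapeHtml(string $value): string
{
    // ENT_QUOTES escapes both double and single quotes.
    // ENT_SUBSTITUTE replaces invalid byte sequences with U+FFFD
    // instead of making the function return an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Non-ASCII characters pass through untouched; only HTML-special
// characters (<, >, &, quotes) are converted to entities.
echo escapeHtml('<b>Zoë & "friends"</b>'), "\n";
```

The encoding passed here should match the one declared in the Content-Type header, otherwise multi-byte characters may be mangled or misinterpreted.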
    The 6 worst sins of security • How to (properly) access a MySQL database with PHP

    Why can't I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Sep 2006
    Posts
    1,926
    Rep Power
    533
    Thank you requinix,

    I meant to include the following in my original post but accidentally deleted it. This is what I use when initially creating the database.
    Code:
    CREATE DATABASE mydb
      DEFAULT CHARACTER SET utf8
      DEFAULT COLLATE utf8_general_ci;
    Do you think the encoding should be declared in the HTML itself or sent via PHP?

    Thanks for the advice on saving files.
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Sep 2006
    Posts
    1,926
    Rep Power
    533
    Thank you Jacques1,

    In any case: Do not use SET NAMES.
    I don't even know what this really does, but I promise I will never use it!!!

    But it's no replacement for a proper Content-Type header.
    Thanks, this answers my question to requinix's post.

    Specifying utf8 in the pdo call
    ...
    What do you mean?
    I thought Northie said it was required. No?
  10. #6
  11. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,957
    Rep Power
    1046
    Originally Posted by NotionCommotion
    I don't even know what this really does, but I promise I will never use it!!!


    I explained it in the other thread.



    Originally Posted by NotionCommotion
    I thought Northie said it was required. No?
    The "PDO call" Northie talks about is the DSN string used when creating a PDO instance. You did that.

    I don't know what he means by "connect to the database and specify utf8". Maybe SET NAMES? That would be bad advice.

    Anyway, as long as you have the "charset=utf8" in the DSN string and set the character encoding of the database, you should be fine. MySQL has many, many other settings for the character encoding, and the whole procedure of how characters are encoded back and forth is f*cking complex. But I don't think this is relevant, and I could never bring myself to dive into this mess.
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Sep 2006
    Posts
    1,926
    Rep Power
    533
    I explained it in the other thread.
    Yeah, I know, but I was already committed to never using it, so why waste the time reading your post?

    PS. I've since read your post. Thanks!
