#1
  1. They're coming to take me away

    Join Date
    Jan 2005
    Location
    Florida
    Posts
    5,087
    Rep Power
    5082

    PHP / Mysql - Looking to remove DB entries without http


    I imported a long product list into a DB and was told that one of the columns should all be a demo link to use the products. Now, it seems that when i went through the list, only about 1/3 of them actually contain actual links to using the products. The rest of them are basically a copy of the "Item Description" field. So I want to go through the column "Demo" and replace all entries that don't contain a URL with an empty field.

    Also, the items that do have a URL, contain more than just a URL as well, so I want to remove everything except for the actual URL. For example, one with a demo link, has all of this in the field:
    <iframe width=700 height=428 src=http://www.youtube.com/embed/u50kG-JVCoI?rel=0 frameborder=0 allowfullscreen></iframe>
    What would be the easiest way to accomplish this? (I use Phpmyadmin)

    Thanks. If you need further info, let me know.
    "I don't need to get a life. I'm a gamer. I have lots of lives!"
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jun 2007
    Posts
    1,517
    Rep Power
    1428
    Depending on the length of you columns & if I'm understanding correctly, this should be a simple DELETE with column NOT LIKE '%http%'. If that's just a onetime thing, it shouldn't matter, that this might take a while.

    Or do you want to change the columns content instead of removing the record? Then, you could do the same with UPDATE instead.

    Regards, J.
    Last edited by JClasen; September 13th, 2011 at 01:04 PM.
  4. #3
  5. They're coming to take me away

    Join Date
    Jan 2005
    Location
    Florida
    Posts
    5,087
    Rep Power
    5082
    Originally Posted by JClasen
    Depending on the length of you columns & if I'm understanding correctly, this should be a simple DELETE with column NOT LIKE '%http%'. If that's just a onetime thing, it shouldn't matter, that this might take a while.

    Or do you want to change the columns content instead of removing the record? Then, you could do the same with UPDATE instead.

    Regards, J.
    Thanks for the reply. I've never used "Not like" before, and I'm not sure if I completely understand your response, but let me clarify my question. I have a database with a table, "Products". There are about 15 fields. One of them is called demo (which is supposed to only have a URL in it). However, that's not the case. I have about 400 products in the table right now, and some of the records may have something like:
    This product will allow you to blah blah blah blah....
    In this type, there is no link included with it, so I want to change all of the fields that do not have a URL, and replace them with a blank field.

    Other fields under Demo, will have something like:
    <iframe width=700 height=428 src=http://www.youtube.com/embed/u50kG-JVCoI?rel=0 frameborder=0 allowfullscreen></iframe>
    With this type, I want to remove everything except the URL itself. I don't need the iframe and such.

    Basically, I want all 400 fields under "Demo" to delete all text except for any URL's that may be listed within them.

    Thanks.
    "I don't need to get a life. I'm a gamer. I have lots of lives!"
  6. #4
  7. Wiser? Not exactly.
    Devshed God 2nd Plane (6000 - 6499 posts)

    Join Date
    May 2001
    Location
    Bonita Springs, FL
    Posts
    6,276
    Rep Power
    4193
    Originally Posted by hiker
    In this type, there is no link included with it, so I want to change all of the fields that do not have a URL, and replace them with a blank field.
    This is easily done using an update as described above.
    Code:
    UPDATE tbl
    SET
        Demo=''
    WHERE
        Demo NOT LIKE '%http://%'
    Other fields under Demo, will have something like:


    With this type, I want to remove everything except the URL itself. I don't need the iframe and such.
    This will need additional processing, and you'll likely have to write a short php script to select all the records with a URL and then process the text to remove the extra data. Finally, update the records with the new data.

    Code:
    SELECT PKID, Demo FROM tbl WHERE Demo LIKE '%http://%'
    
    //process records, fix demo column
    
    foreach (record){
       UPDATE tbl SET Demo=newData WHERE PKID=theId
    }

    Comments on this post

    • hiker agrees : Thanks.
    Last edited by kicken; September 13th, 2011 at 04:52 PM.
    If I helped you out, show some love with some reputation, or tip with Bitcoins to 1N645HfYf63UbcvxajLKiSKpYHAq2Zxud
  8. #5
  9. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,884
    Rep Power
    6356
    MySQL has a regex engine, but it doesn't (to my knowledge) support replacements. You could also attempt to do this replacement using a pure substring/strpos implementation, something like:
    Code:
    UPDATE products SET demo = 
    substring(
      demo, 
      locate(demo, 'http://'), 
      locate(
        demo, 
        '"',  #<-- ending delim
        locate(demo, 'http://')
      ) - locate(demo, 'http://')
    );
    HOWEVER, this code assumes an ending delimiter to the URL (a double-quote in this instance) which you may not have. There is no "RLOCATE" regex-locate function that you can use, but the manual comments includes a hand-rolled one:

    I was in need of a way to find the start position of a regex match in a string. I tried using regexPos above, but it seemed to be returning the end position of the match (minus 1). So, I modified it and called it rendpos, and used it to find the start position.

    DELIMITER $$

    DROP FUNCTION IF EXISTS `rendpos` $$
    CREATE DEFINER=`root`@`localhost` FUNCTION `rendpos`(p_regex varchar(250),p_str TEXT) RETURNS int(11)
    BEGIN

    declare v_endpos int;
    declare v_startpos int;
    declare v_len int;

    set v_endpos=1;
    set v_len=1+char_length( p_str );
    while (( substr( p_str, 1, v_endpos) REGEXP p_regex)=0 and (v_endpos<v_len))
    do
    set v_endpos = v_endpos + 1;
    end while;

    return v_endpos;

    END $$

    DELIMITER ;

    Here is a quick and dirty find of start position. It will find the minimal match instead of the maximal pattern match. Please feel free to modify this to find the maximal pattern match.

    DELIMITER $$

    DROP FUNCTION IF EXISTS `rlocate` $$
    CREATE DEFINER=`root`@`localhost` FUNCTION `rlocate`(p_regex varchar(250),p_str TEXT, s_startpos int) RETURNS int(11)
    BEGIN
    declare v_endpos int;
    declare v_startpos int;
    declare v_len int;

    set v_endpos=rendpos(p_regex, p_str, s_startpos);

    set v_startpos = v_endpos;
    while (( substr( p_str, v_startpos, v_endpos) REGEXP p_regex)=0 and (v_startpos > 0))
    do
    set v_startpos = v_startpos - 1;
    end while;

    return v_startpos;
    END $$

    DELIMITER ;

    The extract uses the above two functions, so it will likewise extract the minimal pattern.

    DELIMITER $$

    DROP FUNCTION IF EXISTS `rextract` $$
    CREATE DEFINER=`root`@`localhost` FUNCTION `rextract`(p_regex varchar(250),p_str TEXT, s_startpos int) RETURNS varchar(250) CHARSET latin1
    begin

    declare startpos int;
    declare endpos int;

    set startpos = rlocate(p_regex,p_str,s_startpos);
    set endpos = rendpos(p_regex,p_str,s_startpos);

    return mid(p_str,startpos,endpos-startpos+1);

    end $$

    DELIMITER ;
    -Dan

    Comments on this post

    • hiker agrees : Thanks... (Next time though.. Remember I'm a beginner. Didn't understand a word of your post) :)
    Last edited by ManiacDan; September 15th, 2011 at 12:39 PM.
    HEY! YOU! Read the New User Guide and Forum Rules

    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin

    "The greatest tragedy of this changing society is that people who never knew what it was like before will simply assume that this is the way things are supposed to be." -2600 Magazine, Fall 2002

    Think we're being rude? Maybe you asked a bad question or you're a Help Vampire. Trying to argue intelligently? Please read this.
  10. #6
  11. They're coming to take me away

    Join Date
    Jan 2005
    Location
    Florida
    Posts
    5,087
    Rep Power
    5082
    Thanks for the replies. I did manage to remove all the non-link entries. Now I'm working on the ones that have links. Not sure if this matters, or if this may help make things simpler. It seems that all of the other ones that have a link, all are in this exact format:
    <iframe width=700 height=428 src=http://www.youtube.com/embed/u50kG-JVCoI?rel=0 frameborder=0 allowfullscreen></iframe>
    They all have everything the same except for the 'u50kG-JVCoI' part of the link. Same width, height, iframe, frameborder, etc. Is there an easy command in mysql to basically replace parts of entries?

    For example, run a command to replace:
    <iframe width=700 height=428 src=http:
    with
    http:
    and to replace:
    ?rel=0 frameborder=0 allowfullscreen></iframe>
    with
    ?rel=0
    Thanks...
    (and if not, I'll work on creating a php script. Just looking for the simplest way to do this)
    "I don't need to get a life. I'm a gamer. I have lots of lives!"
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jun 2007
    Posts
    1,517
    Rep Power
    1428
    Originally Posted by hiker
    Is there an easy command in mysql to basically replace parts of entries?
    Sure. It's called REPLACE...

    (Who would have guessed? )

    By the way: Why isn't this thread in the mysql forums? At least by now? Could one of the mods please move it over there?

    Regards, Jens

    Comments on this post

    • hiker agrees : Thanks
  14. #8
  15. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,884
    Rep Power
    6356
    1) Sorry for the hugely complicated post, I found it an interesting problem.

    2) You may be able to solve this problem using this query to update all the link-containing rows, but make a backup first:

    Code:
    UPDATE products SET demo = 
    substring(
      demo, 
      locate(demo, 'http://'), 
      locate(
        demo, 
        '>',  #<-- ending delim
        locate(demo, 'http://')
      ) - locate(demo, 'http://')
    );
    3) Moved to the MySQL forum so Rudy can mock my terrible SQL above.

    -Dan
    HEY! YOU! Read the New User Guide and Forum Rules

    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin

    "The greatest tragedy of this changing society is that people who never knew what it was like before will simply assume that this is the way things are supposed to be." -2600 Magazine, Fall 2002

    Think we're being rude? Maybe you asked a bad question or you're a Help Vampire. Trying to argue intelligently? Please read this.
  16. #9
  17. They're coming to take me away

    Join Date
    Jan 2005
    Location
    Florida
    Posts
    5,087
    Rep Power
    5082
    Thanks. I did end up using the replace command to take care of all except for about 5 which I manually did. But definitely saved myself some time.

    What I had used:
    update products set demo = replace(demo,'<iframe width=700 height=428 src=http:','http:');
    And then again for the ending portion. Thanks for all the replies.
    "I don't need to get a life. I'm a gamer. I have lots of lives!"

IMN logo majestic logo threadwatch logo seochat tools logo