Page 1 of 2 12 Last
    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2017
    Posts
    454
    Rep Power
    0

    How To Count Occurrences On A Page?


    Folks,

    Is there a PHP function suitable for counting keyword occurrences on a page? I'd hate to loop through each keyword and count it.
    Example:

    "This is my article.
    I hope you like my article.
    But if you don't like my article, then say so".

    I'd hate to make use of the implode & explode here to get each word on a line by itself and then do the counting.
    Do you guys know of a regex that would do it?
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,394
    Rep Power
    631
    Search engines are your friend.

    substr_count
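
As a rough sketch of how that suggestion could be applied to the sample text from the question (this is not code from the thread, and `count_word` is a hypothetical helper name): `substr_count()` counts raw, case-sensitive substrings, so it will also count a keyword that sits inside a longer word. If whole words are wanted, one option is `preg_match_all()` with `\b` word boundaries, which is probably the regex the question was after.

```php
<?php
// Sample text taken from the question above.
$text = "This is my article.\n"
      . "I hope you like my article.\n"
      . "But if you don't like my article, then say so.";

// substr_count() is case-sensitive and counts plain substrings.
$articleCount = substr_count($text, 'article');

// It also counts the "is" inside "This", which may not be what you want.
$isAsSubstring = substr_count($text, 'is');

// Hypothetical helper: whole-word, case-insensitive count using a regex.
function count_word(string $text, string $word): int
{
    // preg_quote() escapes any regex metacharacters in the keyword.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/iu';
    // preg_match_all() returns the number of matches (or false on error).
    return (int) preg_match_all($pattern, $text);
}

$isAsWord = count_word($text, 'is');
$myCount  = count_word($text, 'my');

echo "article: $articleCount, is (substring): $isAsSubstring, ";
echo "is (word): $isAsWord, my: $myCount\n";
```

Neither call needs `explode()`/`implode()` or a manual loop over the words; the trade-off is that each keyword still needs its own call.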
    There are 10 kinds of people in the world. Those that understand binary and those that don't.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2017
    Posts
    454
    Rep Power
    0
    Originally Posted by gw1500se
    Search engines are your friend.

    substr_count
    WRONG! Search engines are our SLAVES!
    YOU are my friend!
    You gave me the name of the function and saved me the time of blindly searching the search engines.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,394
    Rep Power
    631
    It took me 5 seconds to type and use a search engine for that.
  8. #5
  9. Code Monkey V. 0.9
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2005
    Location
    A Land Down Under
    Posts
    2,357
    Rep Power
    2063
    The biggest issue with your idea here is...

    Which keyword?

    Each page will have a very broad mix of words on it. You can filter out a lot of joiner words like "and", "or", etc., but not all the time, because these joiner words can make a big difference when you're indexing key phrases. For things like this you have to find all keywords (not just what you think should be keywords, but all of them, because you don't know what people will actually be searching for) and save an index entry for each instance of a word/phrase on that page. You'll also need an overall copy of the text so you can do fuzzy matching on it, in case direct matching isn't successful.
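
To make the point above concrete, here is a minimal sketch (an illustration, not anyone's actual indexer) that counts every word on a page in one pass and drops a list of joiner words. The function name `keyword_counts` and the stop-word list are assumptions made up for this example; `str_word_count()`, `array_diff()`, `array_count_values()` and `arsort()` are standard PHP functions.

```php
<?php
// Hypothetical helper: frequency of every word in $text, minus stop words.
function keyword_counts(string $text, array $stopWords = []): array
{
    // Lowercase first so "Article" and "article" are counted together.
    // Format 1 makes str_word_count() return the list of words found.
    $words = str_word_count(strtolower($text), 1);

    // Drop the joiner words; array_diff() keeps everything not in $stopWords.
    $words = array_diff($words, $stopWords);

    // Map each remaining word to the number of times it occurs.
    $counts = array_count_values($words);

    // Sort by count, highest first, keeping the word => count pairs intact.
    arsort($counts);
    return $counts;
}

// Illustrative stop-word list only; a real one would be much longer.
$stopWords = ['this', 'is', 'my', 'i', 'you', 'but', 'if', 'then', 'so'];

$page = "This is my article. I hope you like my article. "
      . "But if you don't like my article, then say so.";

$counts = keyword_counts($page, $stopWords);
print_r($counts);
```

This only counts single words. As the post says, stripping joiner words this way loses information that matters for key phrases, and the sketch does no phrase indexing or fuzzy matching at all.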
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,394
    Rep Power
    631
    The entire idea is certainly "unique," coupled with cluelessness about how search engines work or the resources required.
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2017
    Posts
    454
    Rep Power
    0
    Originally Posted by Catacaustic
    The biggest issue with your idea here is...

    Which keyword?

    Each page will have a very broad mix of words on it. You can filter out a lot of joiner words like "and", "or", etc., but not all the time, because these joiner words can make a big difference when you're indexing key phrases. For things like this you have to find all keywords (not just what you think should be keywords, but all of them, because you don't know what people will actually be searching for) and save an index entry for each instance of a word/phrase on that page. You'll also need an overall copy of the text so you can do fuzzy matching on it, in case direct matching isn't successful.
    Don't worry. I know what you mean. I was gonna filter out these kinds of words. Gonna have to teach the spider grammar (prepositions, interjections, conjunctions, etc.). Already had all this in mind. And so, I'm not totally clueless.
    And I know that sometimes filtering these changes the meaning, and so my search engine might yield results that are totally the opposite of what the user was searching for.
    And I know about the "synonyms" that Google uses to find the right results for you. I know it is a long climb up the mountain. But did you see what Gandalf did with the Hobbit and the 13 dwarves (The Hobbit)? He did not take them over the hill or under the hill or across the mountain. He managed to get the eagles to carry them over. I'm clever. Maybe no longer smart. You lose your smartness as you get older. But you get wiser. I'll build my own shortcut methods. Don't worry.
    To begin with, when I set my Ravage free, I'll set it free on your website. Then you can test the search feature and give me feedback on how well my shortcuts are faring.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2017
    Posts
    454
    Rep Power
    0
    Originally Posted by gw1500se
    The entire idea is certainly "unique," coupled with cluelessness about how search engines work or the resources required.
    Check what I replied to Catacaustic and you'll know I am not totally clueless.
    I did study George Brown's "Google Sniper" and so do have a fair idea of how Google works to give you good ranking. Check it out on YouTube. The ebook is on clickbank.com.
    And yes, yes. A short course does not spill all the beans, but it gave me enough to go by.
    PM me your website and I'll set my Ravage on your website too. Then you can test how well it is doing at yielding results based on your optimised keywords.

    Mmm. An idea just popped up in my head. When I create my own algorithm and my crawler checks out your site, I'll program it to give you feedback on which keywords to optimise your site for.
    No thanks? Oh well! Was just a thought! Never mind.
    When ...Oops! "If" my search engine manages to get any credit in the public view then, yes, I think that feature where it gives you feedback on how to optimise your site (which parts of your site to improve or work on) will be handy for webmasters, and so I might as well charge them for that extra service on the side. Good idea? Lol!

    Btw, it won't be hard for me to build a .exe crawler since I know how to build .exe bots. But I can't keep my home computer on all the time to crawl the web, so it's best to build a web crawler instead and get CRON (that is the right technology, right?) or PHP to do the job.
    Last edited by UniqueIdeaMan; January 15th, 2018 at 09:26 AM.
  16. #9
  17. Code Monkey V. 0.9
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2005
    Location
    A Land Down Under
    Posts
    2,357
    Rep Power
    2063
    Originally Posted by UniqueIdeaMan
    To begin with, when I set my Ravage free, I'll set it free on your website. Then you can test the search feature and give me feedback on how well my shortcuts are faring.
    I'm still waiting to see any of the ideas that you've been talking about here actually go public.

    And I don't care about having any of my sites indexed. I already know what's on them. I will want what everyone else will want: relevant results for what I'm searching for from every site on the internet. Indexing one or two (or a couple of thousand) sites won't mean anything. Until you get 60% or more indexed and easily searchable, no one will want to use your system.

    Oh, and you've also got to take into account the SPAM'y sites that SE's like Google hate and don't index. There are millions of those, and as soon as you show them, people will leave your system for something that doesn't show results that bad. Why do you think Google has whole teams of people working full time on a "simple" problem like that?
  18. #10
  19. Code Monkey V. 0.9
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2005
    Location
    A Land Down Under
    Posts
    2,357
    Rep Power
    2063
    Originally Posted by UniqueIdeaMan
    And yes, yes. A short course does not spill-out all the beans but it gave me enough to go by.
    No. No you won't.

    Originally Posted by UniqueIdeaMan
    PM me your website and I'll set my Ravage on your website too. Then, you can test how well it is doing to yield you results based on your optimised keywords.
    Based against how many other websites utilising the same keywords?

    Originally Posted by UniqueIdeaMan
    Mmm. An idea just popped up in my head. When I create my own algorithm and my crawler checks out your site, I'll program it to give you feedback on which keywords to optimise your site for.
    No thanks? Oh well! Was just a thought! Never mind.
    When ...Oops! "If" my search engine manages to get any credit in the public view then, yes, I think that feature where it gives you feedback on how to optimise your site (which parts of your site to improve or work on) will be handy for webmasters, and so I might as well charge them for that extra service on the side. Good idea? Lol!
    So, you want to tell SPAM'ers how to game your results and push useless keyword-rich pages with no useful content higher in your results? Think about the way that the big boys do it. Google has Analytics and Webmaster Tools. Both of these give you good insights into what's happening on your site (and both are free), but neither of them talks about keywords and optimisation, because that goes against what a good search engine should be doing.

    Originally Posted by UniqueIdeaMan
    Btw, it won't be hard for me to build a .exe crawler since I know how to build .exe bots. But I can't keep my home computer on all the time to crawl the web
    That part, you are 100% correct on. You also couldn't afford the bandwidth on even a good home connection to do a job like that.

    Originally Posted by UniqueIdeaMan
    so it's best to build a web crawler instead and get CRON (that is the right technology, right?) or PHP to do the job.
    Hang on... You don't even know how to use CRON, or even what it actually is, but you think that it will solve all of your problems?

    OK, how many servers are you going to have running your indexing scripts? One? Two? 1,000? How fast do you want to index sites? How many sites do you really expect to index?

    Think about this problem:

    Index Wikipedia. That's it. That's the whole problem. How much data are you going to need to store? How much data are you going to transfer, and how long will that take? What happens if you get caught by the anti-scraper measures that they have in place? How will you get around them? How long is it going to take a single script to index everything?

    I already know a very basic idea of those answers, but I don't think that you understand enough about what is needed to make your idea work to have even a small clue about what it will really take.
  20. #11
  21. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2017
    Posts
    454
    Rep Power
    0
    Originally Posted by Catacaustic
    No. No you won't.


    Deleted message as it revealed too many trade secrets in the open.
    Last edited by UniqueIdeaMan; January 16th, 2018 at 07:55 AM.
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,394
    Rep Power
    631
    Originally Posted by uniqueideaman
    deleted message as it revealed too much trade secret on the open.
    rotflmao
  24. #13
  25. Code Monkey V. 0.9
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2005
    Location
    A Land Down Under
    Posts
    2,357
    Rep Power
    2063
    I got that message (all three of them, in fact)... and there were NO "trade secrets" in there. Just more babble about how you know everything and are going to make money for everyone that uses your services.

    Add that to the one that you sent previously asking me to partner with you so I can do all of the programming for you, and it's almost bordering on harassment.

    But then, you said that you shipped your system to a competitor for them to look at, and they said it was all good. So, if what you have is a trade secret, you've just shot yourself in the foot big time there. But, as I'm guessing that English isn't your first language, I'll let it slide as I'm not convinced that what you said is actually what you meant.

    So, we are still here, still waiting for you to release anything, even if it's just a beta version, so we can see what you're actually talking about, and how you expect everyone using the system to make millions out of it. Still waiting...
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2017
    Posts
    454
    Rep Power
    0
    Originally Posted by Catacaustic
    I got that message (all three of them, in fact)... and there were NO "trade secrets" in there. Just more babble about how you know everything and are going to make money for everyone that uses your services.

    Add that to the one that you sent previously asking me to partner with you so I can do all of the programming for you, and it's almost bordering on harassment.

    But then, you said that you shipped your system to a competitor for them to look at, and they said it was all good. So, if what you have is a trade secret, you've just shot yourself in the foot big time there. But, as I'm guessing that English isn't your first language, I'll let it slide as I'm not convinced that what you said is actually what you meant.

    So, we are still here, still waiting for you to release anything, even if it's just a beta version, so we can see what you're actually talking about, and how you expect everyone using the system to make millions out of it. Still waiting...
    I did not say I shifted my .exe to a competitor but to an advertising network. 2 actually. Both in 2 EU countries.
    As for partnering with you, yeah: since you think my web proxy will get easily hacked because I'm new in this game, while my competitors (other web proxies) aren't new in the game and are all safe and sound, staying up without getting hacked, it's obvious I should not launch unless I want my servers abused. But if you still insist on checking my big idea out, to prove me wrong or have me prove you wrong, then what else can I do than either ask you to wait till I find a solution to prevent abuse of my servers, or not waste any more time and partner with someone who's an old-timer in the game (internet security)? Now, who can I trust and ask to partner? Mmm. Let me think. Oh! How about the old duffer Catacaustic, who is still anxiously waiting to check out my world-saving idea? I mean, after all, he does have a web development company, and that does mean he has some experience in securing his servers. And so how about asking the old fellow to partner for a little while, so that I don't keep him waiting any more, satisfy his curiosity and free him from this wait-suffering period? Was it too much to ask? I only offered once. I got no reply. Never offered again. That's not much of a harassment in my book.
    After all, during the testing period, you can see things from your own end all by your humble self.
    Never mind. I'm destined to launch my own unique ventures all by myself. Best not to share the profits. I believe it's my destiny to save the world all on my own. I mean, after all the hints in many forums, still no one really gets the concept. And it's such an obvious concept that Twitter should have thought about it back in 2004. I waited 14 years and still no one has thought about it. Everyone's gone blind on this route. And it's because they were never meant to be "the one". The life saver.
    Anyway, I'll wait for the right security solution to enter my head, and once it passes the test with flying colours I launch IMMEDIATELY. All on my own to begin with. Until then, I'm afraid you're gonna have to wait.
    As for my 2nd venture, the search engine: it seems it's gonna take more time to finish than I thought. The crawler is half finished, but building the index is really starting to get on my nerves. It's gonna take time. Probably 2 months to finish it all.

    In the meantime, at the back of my head, I'm thinking of another new project that will be better than the 2nd one, so that I can concentrate on the 3rd and put the 2nd on hold for a while, as I've got tired of building the search engine. But this 3rd one would not be much of a unique one. More of an improvement on what already exists, but an improvement in a much more effective way. A viral traffic and viral money earner. I've only managed to complete half the idea. Still on the search for more features.
    Last edited by UniqueIdeaMan; January 17th, 2018 at 03:48 PM.
  28. #15
  29. Banned (not really)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 1999
    Location
    Caro, Michigan
    Posts
    14,847
    Rep Power
    4554
    Man... part of me wants to believe this is Arty or someone running the long con because this is SO ****ing out there. I don't think anyone could keep up acting this narcissistic and clueless for so long, though.
    -- Cigars, whiskey and wild, wild women. --