#1
  1. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0

    Question: Meta Proxy With cURL And PHP


    PHP Pals,

    A thought just occurred to me, and before I delve too deep into it, I need your advice.
    You're aware that I have been trying to learn web scraping with cURL & PHP to:


    1. FIRST PROJECT:
    Build my own web proxy from scratch, like anonymouse.org. (Thread: cUrl Experiments).
    **That way, I can add my own custom features which the traditional web proxies don't have.**

    2. SECOND PROJECT:
    Update a GPL web proxy (Php-Proxy) to add a content filter (check pages for banned words and stop the page from loading if any are found). (Thread: How To Filter Content Before Loading On Screen).
    **That way, if I fail to build my own proxy from scratch, then at least I manage to add my custom features to an existing web proxy.**

    Now I am thinking: how about building a **Meta Proxy**?
    You have seen search engines that have their own web crawlers & indexes (Google, WebCrawler, etc.).
    You have seen meta engines that do not have their own web crawlers or indexes but send queries to third-party search engines and present you with their results (Mamma, Dogpile, etc.).

    I am now interested in building my own Meta Web Proxy. That way, I:

    * DO NOT need to write my own web proxy (see the first project above);
    * DO NOT need to write my own content filter (see the second project above).

    In fact, my Meta Web Proxy can use third-party web proxies and their content filters (if any are available).

    Anyway, with meta engines you can select your chosen search engines and search on more than one simultaneously, or one after the other.
    How about my Meta Web Proxy letting you query one web proxy after another automatically? Wouldn't that be good?
    Anyway, the whole purpose of opening this thread is to ask you some technical questions.
    First let me give you the blueprint of how my Meta Web Proxy would work, and then you can give me your verdict on whether all of it is technically possible with cURL and PHP or not.

    **My Meta Web Proxy**

    It would have a URL input text box (labelled: URL).
    When you type a URL into it, it will pass the query on to your chosen web proxy.
    Let us assume that your chosen web proxy is:
    Anonymouse.org

    Now, let us say that you want to view Forum | Define Forum at Dictionary.com.
    When you type that URL, my Meta Web Proxy would load this URL:
    Forum | Define Forum at Dictionary.com [Anonymoused]

    But wouldn't this forward the user away from my site/domain/Meta Web Proxy?
    Anonymouse.org would take over from then on, right? I mean, when the user clicks a link on the anonymouse.org proxified page, my Meta Web Proxy forwards the user to anonymouse.org for good.
    Q1. Now, how do I prevent this forwarding for good, so that the user still remains on my site/domain/Meta Web Proxy?
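    One common answer to Q1 is to fetch the page server-side and rewrite every link so it points back at your own script before sending the HTML to the browser. A minimal sketch, assuming a hypothetical endpoint name (`meta.php`); a real proxy should use a proper HTML parser (e.g. DOMDocument) rather than a regex:

```php
<?php
// Sketch: rewrite every href in fetched HTML so clicks route back through
// a hypothetical meta-proxy endpoint (meta.php) instead of leaving the site.
function rewriteLinks(string $html, string $proxyEndpoint): string
{
    return preg_replace_callback(
        '/href="([^"]+)"/i',
        function ($m) use ($proxyEndpoint) {
            // Carry the original destination as an encoded query parameter.
            return 'href="' . $proxyEndpoint . '?url=' . urlencode($m[1]) . '"';
        },
        $html
    );
}

$page = '<a href="http://example.com/page">Example</a>';
echo rewriteLinks($page, '/meta.php');
// prints: <a href="/meta.php?url=http%3A%2F%2Fexample.com%2Fpage">Example</a>
```

    As long as every link on every served page is rewritten this way, clicks always come back to your endpoint, which can then fetch the next page and rewrite it again.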

    ISSUE 2
    Now, in order for my Meta Web Proxy to track which links you (the user) click on the proxified page (the page fetched via anonymouse.org), I will need to add my tracker links to all links present on the anonymouse.org proxified page.
    In order to do that, I need to proxify the anonymouse.org proxified page itself (so that the proxified page contains my proxy links prepended to the destination links). And to do that, I need to use cURL to fetch the page. Right? That would mean I would have to look for a web host that allows me to run my own web proxy. Right?
    In this case, I need to write code for cURL to fetch:
    Forum | Define Forum at Dictionary.com [Anonymoused]
    Correct ?
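    For reference, the cURL fetch described above might look like the sketch below. The anonymouse.org URL layout (`anon-www.cgi/<target>`) is an assumption about how that service builds its proxified URLs; verify it against the live site before relying on it.

```php
<?php
// Sketch: fetch a page through anonymouse.org with cURL.
// The anon-www.cgi/<target> URL layout is an assumption, not verified.
function anonUrl(string $target): string
{
    return 'http://anonymouse.org/cgi-bin/anon-www.cgi/' . $target;
}

function fetchPage(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 15,    // give up after 15 seconds
    ]);
    $body = curl_exec($ch);              // false on failure
    curl_close($ch);
    return $body;
}

// $html = fetchPage(anonUrl('http://www.dictionary.com/browse/forum'));
```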

    Q2. Isn't there a way I can track which links my user clicks without needing my Meta Web Proxy's cURL code to fetch the page onto its own servers? (Finding a proxy-friendly host is difficult, etc.)
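    Regarding Q2: you cannot see clicks that happen purely on a third-party site, but for links you rewrote yourself, a tiny redirect endpoint can log each click without fetching any page content. A minimal sketch (the script name `track.php` and log file name are assumptions):

```php
<?php
// Sketch of a click-tracking redirect (a hypothetical track.php).
// This only works for links the meta proxy rewrote itself; once the
// third-party proxy serves its own untouched links, nothing gets logged.
function logClick(string $url, string $logFile): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;  // reject anything that is not a well-formed URL
    }
    // Append one line per click: timestamp and destination.
    file_put_contents($logFile, date('c') . ' ' . $url . "\n", FILE_APPEND);
    return true;
}

// track.php usage: log the click, then send the browser onward.
// if (logClick($_GET['url'] ?? '', __DIR__ . '/clicks.log')) {
//     header('Location: ' . $_GET['url']);
//     exit;
// }
```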


    Q3a. If I run my own Meta Web Proxy (as described above), regardless of whether I need my own proxy host or not, I do not have to write code to build the content filter (banned-words filter, profanity filter, etc.), as I can just get the user's chosen web proxy (e.g. anonymouse.org) to do the filtering. Right?

    Q3b. But how do I inject the filter commands into the URL of the user's chosen web proxy (e.g. anonymouse.org) if I directly inject the user's target site into his chosen web proxy's URL? E.g.
    Forum | Define Forum at Dictionary.com [Anonymoused]

    Looking at the above link, you can see the URL contains no checkbox options selected (e.g. disable JavaScript, disable cookies, remove ads, etc.). I need to know which words in the URL would trigger which filters. How do I figure this out?
    Look at this link from YouTube. It uses the "Last Hour" filter and the "View Count" filter.
    Now, how on earth are you supposed to figure out from all that which filters it is using? I guess fiddle and experiment at YouTube and figure out their algorithm. Right?
    https://www.youtube.com/results?q=co...MSAggBUBQ%253D
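    One practical way to attack this is exactly the fiddle-and-compare approach: capture the URL with and without a filter applied and diff the query parameters. A small sketch (the URLs and the `sp` parameter below are made-up stand-ins, not YouTube's real ones):

```php
<?php
// Sketch: reverse-engineer which query parameter a UI filter sets by
// diffing the URL with and without the filter applied.
function queryParams(string $url): array
{
    // Pull out the query string and parse it into a name => value array.
    parse_str(parse_url($url, PHP_URL_QUERY) ?? '', $params);
    return $params;
}

$without = 'https://example.test/results?q=cats';
$with    = 'https://example.test/results?q=cats&sp=EgIIAQ';  // hypothetical filter token

// Parameters that appear only when the filter is on:
print_r(array_diff_key(queryParams($with), queryParams($without)));
// → only "sp" differs, so that parameter carries the filter
```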


    The other alternative is to get my Meta Web Proxy's cURL to navigate to:
    http://anonymouse.org/cgi-bin/anon-www.cgi
    Then auto fill in the URL in the text box labelled "Enter Website Address", auto click the "Surf Anonymously" button, and auto check any options such as "Remove Javascript", "Disable Cookies", "Remove Ads", etc.
    But is it possible to get cURL to do all this checkbox checking or not? That is the big question. And if so, care to show an example? Or at least show me a link that teaches this.
    If you still don't understand what I am blabbering on about, then say so and I can show you a free .exe tool (which I built) that does all this, so you can get an idea of what I want cURL to do.
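    On the "can cURL check the checkboxes" question: cURL never clicks anything; a browser simply turns each checked box into an extra form field when the form is submitted. So the job is to find the form's real field names in the page source and send them in a POST. A sketch — the field names below (`url`, `nojs`, `nocookie`) are illustrative guesses, not anonymouse.org's actual names:

```php
<?php
// Sketch: "checking a checkbox" with cURL just means including that
// checkbox's form field in the POST body. Field names here are guesses;
// read them from the form's HTML source before using this for real.
$fields = [
    'url'      => 'http://www.dictionary.com/browse/forum',
    'nojs'     => '1',  // as if "Remove Javascript" were checked
    'nocookie' => '1',  // as if "Disable Cookies" were checked
];
$postBody = http_build_query($fields);  // form-encoded: url=...&nojs=1&nocookie=1

$ch = curl_init('http://anonymouse.org/cgi-bin/anon-www.cgi');
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => $postBody,
    CURLOPT_RETURNTRANSFER => true,
]);
// $html = curl_exec($ch);
// curl_close($ch);
```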

    Don't forget to answer all four of my questions.
    And subscribe to this thread.

    Thanks!
    Last edited by UniqueIdeaMan; July 30th, 2017 at 10:28 AM.
  2. #2
  3. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2017
    Posts
    26
    Rep Power
    0
    Is there any reason you want to access data through anonymouse.org? Personally, I do not think it is a good idea. cURL supports proxies; if you want to hide your IP, use that feature.
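    To illustrate what "cURL supports proxy" means in PHP, a minimal sketch — the proxy address and port below are placeholders, not a real server:

```php
<?php
// Sketch: route a cURL request through a proxy server directly, with no
// anonymouse.org in the middle. Proxy address and port are placeholders.
$ch = curl_init('http://www.dictionary.com/browse/forum');
curl_setopt_array($ch, [
    CURLOPT_PROXY          => '203.0.113.5',   // placeholder proxy IP
    CURLOPT_PROXYPORT      => 8080,            // placeholder port
    CURLOPT_PROXYTYPE      => CURLPROXY_HTTP,  // or CURLPROXY_SOCKS5 for SOCKS
    CURLOPT_RETURNTRANSFER => true,            // return the body as a string
]);
// $html = curl_exec($ch);   // the target site sees the proxy's IP, not yours
// curl_close($ch);
```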

    Comments on this post

    • UniqueIdeaMan agrees
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Originally Posted by robert4u
    Is there any reason you want to access data through anonymouse.org? Personally, I do not think it is a good idea. cURL supports proxies; if you want to hide your IP, use that feature.
    How about a code sample from you?
    If you build this for me with PHP, then I will build a .exe tool for you in return as a thank-you.
    I just need the PHP code to build a PHP version, plus to learn PHP.
    The same offer stands for whoever volunteers first.
  6. #4
  7. Banned (not really)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 1999
    Location
    Caro, Michigan
    Posts
    14,971
    Rep Power
    4576
    lol, no.
    -- Cigars, whiskey and wild, wild women. --
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Originally Posted by Sepodati
    lol, no.
    Lol! And, yes!
