I'm currently setting up a classified ads section on my website.
What I would like is a way to filter what can be posted, so that posts containing certain "offensive" language are automatically deleted.
Has anyone written a script like this? I would prefer it to run every time someone posts. Or, if possible, certain words could stop the post at the submit field (maybe using Java? I have no idea...)
If anyone's had any luck with this, please let me know!
I would suggest writing this as a Perl script, which can filter text entries very well. You could either delete the offending entries or just replace the offending words with '*' or something...
"The fear of the LORD is the beginning of knowledge..."
You could also use PHP. You can search through the text field and replace certain words with some other text, or just deny the insert.
PHP offers the same regular expressions that you would use in Perl (the preg_* functions are Perl-compatible) and also offers a simple interface to databases for storing your data.
Regular expressions are a little hard to understand at first, but they are definitely what you would want to use in a situation like this, either in Perl or PHP.
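To make the two options above concrete (replace the words, or deny the insert), here is a minimal sketch. It is written in Python only to keep it short and testable; the same regular expression carries over to Perl or PHP more or less unchanged. The word list and the function names censor and is_acceptable are made up for illustration, not taken from any existing script.

```python
import re

# Hypothetical placeholder list; a real site would maintain its own.
BANNED_WORDS = ["darn", "heck"]

# \b keeps the match on whole words, so "heckle" is not caught by "heck".
BANNED_RE = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in BANNED_WORDS) + r")\b",
    re.IGNORECASE,
)

def censor(text):
    """Replace each banned word with asterisks of the same length."""
    return BANNED_RE.sub(lambda m: "*" * len(m.group(0)), text)

def is_acceptable(text):
    """Return False if the post contains a banned word (deny the insert)."""
    return BANNED_RE.search(text) is None
```

You would call is_acceptable() on the submitted text before it is stored, or store censor(text) instead of the raw text, depending on which behaviour you want.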
As we're dealing with the antics of users here, there isn't necessarily a simple, good technical solution.
Many guestbook scripts attempt to filter out rude words, but can be easily tricked into displaying R U D E W O R D S and phonetic spellings of all our four-lettered favourites.
Would you enjoy maintaining a list of banned words?
What's to stop a posting that's offensive in sentiment but not in language?
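To illustrate the spaced-out-letters trick mentioned above, here is a rough Python sketch of one possible countermeasure: squash the text down to bare letters before looking for banned words. The word list, the substitution table, and the names squash and looks_suspect are all hypothetical, and the approach has an obvious cost, shown in the last line of the usage below.

```python
import re

# Hypothetical placeholder list; a real site would maintain its own.
BANNED_WORDS = ["darn", "heck"]

# A few common character substitutions; illustrative and far from complete.
SUBSTITUTIONS = str.maketrans(
    {"4": "a", "@": "a", "3": "e", "1": "i", "0": "o", "$": "s"}
)

def squash(text):
    """Lowercase, undo simple substitutions, and drop everything but a-z."""
    text = text.lower().translate(SUBSTITUTIONS)
    return re.sub(r"[^a-z]", "", text)

def looks_suspect(text):
    """True if any banned word appears anywhere in the squashed text."""
    squashed = squash(text)
    return any(word in squashed for word in BANNED_WORDS)

print(looks_suspect("D A R N this thing"))      # spaced-out letters are caught
print(looks_suspect("d@rn"))                    # so are simple substitutions
print(looks_suspect("Please check the price"))  # but so is the innocent "check"
```

Because squashing throws away word boundaries, innocent text gets caught too, which is why a check like this is probably safer for flagging a post than for deleting it outright. And it still does nothing about a post that is offensive in sentiment only.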
I reckon if you're running a site like this you need to consider either/both:
* requiring users to register with the site and sending them a password by email. You should never let anybody just post direct to your site without some kind of accountability. And no, this isn't foolproof, but it deters the casual idiot. And log their IP addresses.
* if the quantity of ads is relatively small, then write a process where an operator has to eyeball and approve them before they're published. This could be combined with filtering that flags up suspects and written in such a way that the default is to accept with the minimum of clicks.
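As a rough sketch of the second suggestion above (Python, with every name invented for illustration): each ad is stored as pending along with the poster and IP address, a simple filter flags suspects so they are listed first, and a single review call publishes everything the operator did not explicitly reject. A real site would keep this in a database rather than in memory.

```python
import re
from dataclasses import dataclass, field

# Hypothetical word list, used only to flag ads for the operator.
SUSPECT_RE = re.compile(r"\b(darn|heck)\b", re.IGNORECASE)

@dataclass
class Ad:
    ad_id: int
    user: str
    ip: str
    text: str
    flagged: bool = False
    status: str = "pending"  # pending -> published | rejected

@dataclass
class ModerationQueue:
    ads: list = field(default_factory=list)

    def submit(self, user, ip, text):
        """Store a new ad as pending, recording who posted it and from where."""
        flagged = bool(SUSPECT_RE.search(text))
        ad = Ad(len(self.ads) + 1, user, ip, text, flagged)
        self.ads.append(ad)
        return ad

    def pending(self):
        """Pending ads, flagged ones first so the operator sees them at the top."""
        waiting = [ad for ad in self.ads if ad.status == "pending"]
        return sorted(waiting, key=lambda ad: not ad.flagged)

    def review(self, reject_ids=()):
        """Publish every pending ad except the ones explicitly rejected."""
        for ad in self.pending():
            ad.status = "rejected" if ad.ad_id in reject_ids else "published"

queue = ModerationQueue()
ok = queue.submit("alice", "192.0.2.1", "Bike for sale, good condition")
bad = queue.submit("bob", "192.0.2.2", "What the heck, buy my car")
queue.review(reject_ids={bad.ad_id})  # one action; everything else is accepted
```

The default-accept behaviour lives in review(): the operator only has to name the ads to reject.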
I'd be interested to hear from anyone who has experience of dealing with this issue on a large site. Do you do it all automatically or does every ad get read and approved by someone?