
APlanForWikiSpam



* Overview [1]
* Wiki as Email [2]

* Headers [3]
* Content [4]

* Filtering [5]

* Bayesian Filtering [6]
* SpamAssassin [7]

-------------------------

OVERVIEW

It's rather simple: rewrite each edit as an email, and use existing
spam tools to classify the edit. Bayesian filters should work
fantastically well for this application, though I can't think of any
good reason why more traditional filters such as SpamAssassin [8]
won't work.

-------------------------

WIKI AS EMAIL

HEADERS

Most of the usual email headers are meaningless in this application,
but well-formed headers will help the filters work their magic in the
correct fashion.
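As a minimal sketch of what "well-formed headers" could look like, the
following wraps one edit in a message using Python's standard email
package. The function name, the wiki.example.org host, the placeholder
addresses and the X-Wiki-Remote-IP header are all assumptions made for
illustration; nothing here is prescribed by the plan above.

```python
from email.headerregistry import Address
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def edit_to_email(page, author, remote_ip, body, wiki_host="wiki.example.org"):
    """Wrap one wiki edit in a well-formed email message (hypothetical helper)."""
    msg = EmailMessage()
    # An edit has no real sender or recipient, so both addresses are placeholders.
    msg["From"] = Address(display_name=author, username="wiki-editor", domain=wiki_host)
    msg["To"] = Address(display_name=page, username="wiki", domain=wiki_host)
    msg["Subject"] = f"Edit to {page}"
    msg["Date"] = formatdate(localtime=False)
    msg["Message-ID"] = make_msgid(domain=wiki_host)
    # Assumed custom header so the editor's IP is still visible to the filter.
    msg["X-Wiki-Remote-IP"] = remote_ip
    msg.set_content(body)
    return msg
```

Calling str() on the returned message gives text that can be piped to
a command-line filter.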

CONTENT

The content is a little tricky. Do we simply supply the raw wiki
text, or do we render it into HTML? Which content do we include:
everything, or just the diff? Initially I think that the diff text in
raw form should be enough; rendering into HTML is probably a good idea
at a later date.
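One way to get "the diff text in raw form" is sketched below with
Python's standard difflib. It keeps only the lines an edit added, on
the assumption that spam arrives in added text; that reading of "the
diff", and the function name, are illustrative choices rather than
part of the plan.

```python
import difflib


def added_lines(old_text, new_text):
    """Return the lines an edit added, as raw wiki markup (hypothetical helper)."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff prefixes added lines with "+ "; drop the two-character prefix.
    return [line[2:] for line in diff if line.startswith("+ ")]
```

The returned lines, joined with newlines, would form the message body
handed to the filter.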

-------------------------

FILTERING

BAYESIAN FILTERING

The regular benefits of Bayesian filtering over other methods should
apply equally well to a wiki as to email. As with any Bayesian
filter, the system needs to be trained, so the training
interface will probably be the most cumbersome component of our
anti-wiki-spam coding.
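To make the training requirement concrete, here is a toy naive
Bayes-style classifier in Python. It is a sketch of the general
technique, not of any existing filter: the class name, the tokenizer
and the add-one smoothing are all assumptions. A training interface
would end up calling something like train() each time an admin marks
an edit as spam or not.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: keeps URL-ish and price-ish characters inside tokens.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'.:/-]+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class EditClassifier:
    """Toy classifier that scores an edit from the tokens it contains."""

    def __init__(self):
        # Number of training edits each token appeared in, per class.
        self.tokens = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}

    def train(self, text, is_spam):
        self.tokens[is_spam].update(set(tokenize(text)))
        self.docs[is_spam] += 1

    def spam_probability(self, text):
        # Sum log-odds over the tokens present, with add-one smoothing.
        log_odds = math.log((self.docs[True] + 1) / (self.docs[False] + 1))
        for tok in set(tokenize(text)):
            p_spam = (self.tokens[True][tok] + 1) / (self.docs[True] + 2)
            p_ham = (self.tokens[False][tok] + 1) / (self.docs[False] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long edits.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

With no training data every edit scores 0.5, which is why the training
step cannot be skipped.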

SPAMASSASSIN

SpamAssassin's default rules would need to be tweaked by use of a
custom config file, as various tests (e.g. MIME_HTML_ONLY) are useless
in this context.
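SpamAssassin's configuration format lets a test be switched off by
giving it a score of 0 with a "score" line. The Python sketch below
only renders such lines from a list of test names; the helper name is
hypothetical, and where the resulting file lives and how it is loaded
are left open.

```python
def render_score_overrides(disabled_tests):
    """Return SpamAssassin config lines that zero each named test (hypothetical helper)."""
    lines = ["# Tests that are meaningless for wiki edits wrapped as email"]
    # In SpamAssassin config, "score NAME 0" disables the test NAME.
    lines += [f"score {name} 0" for name in disabled_tests]
    return "\n".join(lines) + "\n"
```

For the one test named above, render_score_overrides(["MIME_HTML_ONLY"])
produces a comment line followed by "score MIME_HTML_ONLY 0".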

Links:
------
[1] http://melbourne.wireless.org.au/#overview
[2] http://melbourne.wireless.org.au/#wiki_as_email
[3] http://melbourne.wireless.org.au/#headers
[4] http://melbourne.wireless.org.au/#content
[5] http://melbourne.wireless.org.au/#filtering
[6] http://melbourne.wireless.org.au/#bayesian_filtering
[7] http://melbourne.wireless.org.au/#_spamassassin
[8] http://www.spamassassin.org/
