Thread: The silver lining of spam?

  1. #1
    Join Date
    Jul 2007
    Location
    UK
    Posts
    29
    Rep Power
    8

    The silver lining of spam?

    Just a thought, but...

    could the pattern recognition techniques used in anti-spam software be adapted to recognise localised (down to company level) ways of:

    writing telephone numbers
    writing addresses
    writing dates

    etc.?

    Chris

  2. #2
    Join Date
    Sep 2005
    Posts
    55
    Rep Power
    10

    We have a mechanism called Zimlets for doing stuff like that. One possible feature of a Zimlet is recognizing patterns in content and highlighting them as objects that can be interacted with. For example, our date Zimlet recognizes several common date formats and will let you create an appointment. Other Zimlets recognize URLs, phone numbers, email addresses, tracking numbers, etc.
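
    To make the idea of recognizing patterns in content concrete, here is a minimal sketch in Python. Real Zimlets are written in JavaScript; the patterns and the name find_dates below are assumptions made up for illustration, not the ones the shipped date Zimlet uses.

```python
import re

# Hypothetical date patterns, for illustration only.
DATE_PATTERNS = [
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),        # e.g. 2007-07-24
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),  # e.g. 7/24/2007
    re.compile(
        r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
        r"[a-z]*\.? \d{1,2}, \d{4}\b"            # e.g. Jul 25, 2007
    ),
]


def find_dates(text):
    """Return (start, end, matched_text) for every date-like span, in text order."""
    spans = []
    for pattern in DATE_PATTERNS:
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), match.group()))
    return sorted(spans)


if __name__ == "__main__":
    for start, end, found in find_dates("Meet on 2007-07-24 or Jul 25, 2007."):
        print(start, end, found)
```

    Each span returned is what a client could highlight and attach an action to, such as creating an appointment.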

    -Conrad

  3. #3
    Join Date
    Jul 2007
    Location
    UK
    Posts
    29
    Rep Power
    8

    A learning Zimlet

    The Zimlet system as it stands requires the patterns to be identified in advance, but people have different habits and styles in how they write addresses, telephone numbers, etc. What I'm suggesting is a Zimlet (maybe core code would be better?) that could learn these variations.

    The currently shipping Zimlets don't recognise the way UK phone numbers are commonly written, nor UK addresses. This is presumably true for many other countries around the world. We can either create tens or hundreds of Zimlets to deal with this problem, or we can add a pattern-learning layer (like SpamAssassin) which the Zimlets get their cues from.
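
    A rough sketch of what such a learning layer could look like, in Python. It is far simpler than the Bayesian scoring SpamAssassin does: it only counts the "shape" of examples a user teaches it and turns frequent shapes into regular expressions. ShapeLearner, shape and shape_to_regex are names invented for this sketch.

```python
import re
from collections import Counter


def shape(example):
    """Collapse an example to its shape: digits -> 'd', letters -> 'a', the rest kept."""
    out = []
    for ch in example:
        if ch.isdigit():
            out.append("d")
        elif ch.isalpha():
            out.append("a")
        else:
            out.append(ch)
    return "".join(out)


def shape_to_regex(s):
    """Turn a shape string back into a regex that matches any text of that shape."""
    parts = []
    for ch in s:
        if ch == "d":
            parts.append(r"\d")
        elif ch == "a":
            parts.append("[A-Za-z]")
        else:
            parts.append(re.escape(ch))
    # Lookarounds stop the pattern matching inside a longer word or number.
    return r"(?<!\w)" + "".join(parts) + r"(?!\w)"


class ShapeLearner:
    """Counts taught shapes and exposes the frequent ones as compiled patterns."""

    def __init__(self):
        self.counts = Counter()

    def teach(self, example):
        self.counts[shape(example)] += 1

    def patterns(self, min_count=2):
        return [
            re.compile(shape_to_regex(s))
            for s, n in self.counts.items()
            if n >= min_count
        ]
```

    Teaching it "020 7946 0018" and "020 7946 0999" yields one pattern that also matches other numbers written the same way. A real layer would need to generalise across lengths and separators and to cope with false positives, which this sketch does not attempt.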

    That way a telephone number Zimlet for integration with Asterisk needs to deal only with acting on the recognised pattern, not recognising the pattern itself.
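
    The split between recognising and acting could be sketched like this in Python. The registry, the "phone" pattern and the dial handler are all hypothetical; a real handler would talk to Asterisk, whereas this one only returns a string describing the action.

```python
import re

# Recognisers: in the proposal these would come from the learning layer.
# The single pattern here is a hard-coded stand-in.
RECOGNISERS = {
    "phone": re.compile(r"(?<!\d)0\d{3} \d{3} \d{4}(?!\d)"),
}

# Handlers: what a Zimlet would contribute, keyed by the kind of object it acts on.
HANDLERS = {}


def on(kind):
    """Decorator that registers a handler for one kind of recognised object."""
    def register(fn):
        HANDLERS[kind] = fn
        return fn
    return register


@on("phone")
def dial(number):
    # Stand-in for asking a PBX to place a call.
    return "dial:" + number.replace(" ", "")


def process(text):
    """Run every recogniser over the text and pass each match to its handler."""
    actions = []
    for kind, pattern in RECOGNISERS.items():
        handler = HANDLERS.get(kind)
        if handler is None:
            continue
        for match in pattern.finditer(text):
            actions.append(handler(match.group()))
    return actions
```

    The handler never sees a regular expression, so improving recognition for a new locale would not require touching it.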

    Companies like ours (or Zimbra!) that operate internationally would then gain from local users teaching their installation the patterns they need and use. The alternative is starting from a base that recognises only North American style telephone numbers, addresses and dates, and that improves only when someone creates and installs additional Zimlets which are essentially identical in functionality to the ones already installed.

    Chris

  4. #4
    Join Date
    Nov 2005
    Posts
    477
    Rep Power
    9

    I actually work for a company that does analytics. The problem you are describing is quite a bit harder than you might suspect. That said, it shouldn't be too hard to extend the existing systems to recognize local variations via regular expressions and JavaScript, but contextual learning is a difficult problem.
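
    As an illustration of the regular-expression route, here is a sketch in Python (the same pattern could be ported to JavaScript for a Zimlet). UK_PHONE covers only a handful of common UK layouts chosen for this example and is not a complete or authoritative description of UK numbering.

```python
import re

# Illustrative subset of UK phone number layouts; an assumption, not a standard.
UK_PHONE = re.compile(
    r"""
    (?<![\d+])                    # not in the middle of a longer number
    (?:\+44\s?|0)                 # international or national prefix
    (?:
        \d{2}\s?\d{4}\s?\d{4}     # 20 7946 0018
      | \d{3}\s?\d{3}\s?\d{4}     # 161 496 0123
      | \d{4}\s?\d{6}             # 1632 960123
    )
    (?!\d)                        # not followed by another digit
    """,
    re.VERBOSE,
)


def find_uk_phones(text):
    """Return every substring of text that looks like a UK phone number."""
    return UK_PHONE.findall(text)


if __name__ == "__main__":
    sample = "UK: 020 7946 0018, 01632 960123 or +44 161 496 0123; US: 555-123-4567"
    print(find_uk_phones(sample))
```

    Every new locale or house style needs another hand-written pattern of this kind, which is the maintenance cost the learning proposal is trying to avoid.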

  5. #5
    Join Date
    Jul 2007
    Location
    UK
    Posts
    29
    Rep Power
    8

    I never said it would be easy

    Just worthwhile...

