Methodology

In this chapter we provide further information on the Naïve Bayes algorithm and show how the method works. We also look at how our model will be developed, the various datasets that will be used in the process and how they were chosen, and then at feature selection and how it will be applied.

THE INGENIOUS BAYES CLASSIFIER

Bayes' rule:

P(H | E) = P(E | H) x P(H) / P(E)

The fundamental concept of Bayes' rule is that the probability of a hypothesis or event (H) can be calculated from the presence of some observed evidence (E). From Bayes' rule we have:

1. The prior probability of H, P(H): the probability of the event before observing the evidence.
2. The posterior probability of H, P(H | E): the probability of the event after observing the evidence.

For example, to estimate the probability that an item will be classified as belonging to the Human Resources (HR) class, we usually use evidence such as the frequency of words like "Employment". Using the equation above, let 'HR' be the event that an email belongs to the HR class and 'Employment' the evidence that the word "Employment" appears in the email. Then we have:

P(HR | Employment) = P(Employment | HR) x P(HR) / P(Employment)

P(Employment | HR) is the probability that the word "Employment" occurs in an email belonging to the HR class. Of course, "Employment" could also occur in many other mail classes, such as Joint Ventures or Procurement and Contracts, but here we consider "Employment" only in the context of the HR class. This probability can be obtained from historical mail collections. P(HR) is the prior probability of the HR class. This probability can be estimated from r...... middle of paper ......st results. Since no information about the test set was used in the development of the classifier, the results of this experiment should be indicative of actual performance in practice.
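To make the HR/Employment example concrete, the following sketch computes the posterior P(HR | Employment) from Bayes' rule. All counts are invented for illustration; they are not taken from any real mail collection.

```python
# Toy estimate of P(HR | Employment) via Bayes' rule.
# All counts below are hypothetical, chosen only to illustrate the arithmetic.

emails_total = 1000          # size of the (hypothetical) historical mail collection
emails_hr = 200              # emails labelled as belonging to the HR class
hr_with_employment = 120     # HR emails containing the word "Employment"
all_with_employment = 150    # emails in any class containing "Employment"

p_hr = emails_hr / emails_total                  # prior P(HR) = 0.2
p_emp_given_hr = hr_with_employment / emails_hr  # likelihood P(Employment | HR) = 0.6
p_emp = all_with_employment / emails_total       # evidence P(Employment) = 0.15

# Bayes' rule: P(HR | Employment) = P(Employment | HR) * P(HR) / P(Employment)
p_hr_given_emp = p_emp_given_hr * p_hr / p_emp

print(p_hr_given_emp)  # 0.8
```

Note that observing "Employment" raises the probability of the HR class from the prior of 0.2 to a posterior of 0.8, which is exactly the effect the classifier exploits.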
It is very important not to look at the test data while developing the classifier, and to run the system on it as sparingly as possible. Violating this rule invalidates the results: by running many variant systems and keeping only the changes that worked best on the test set, you implicitly optimize your system on the test data.

Feature Selection

CONCLUSION

In this chapter we described what Naïve Bayes theory is and how we built the classifier. In the next chapter we will take a more in-depth look at the training set and test set, and will evaluate the classifier we have developed.