The Naive Bayes assumption in classification is that the features of an observation (e.g. the words of a document) are conditionally independent of each other given the class — not that the observations themselves are independent.
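One common way to write this (notation mine, not taken from the linked slides): with feature vector $x = (x_1, \dots, x_d)$ and class $c$,

$$P(x_1, \dots, x_d \mid c) = \prod_{j=1}^{d} P(x_j \mid c),$$

so the class-conditional likelihood factorizes over individual features.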
(see lecture notes http://www.ics.uci.edu/~smyth/courses/cs277/public_slides/text_classification.pdf)
Training complexity is linear in T, the total number of word tokens, plus a term for filling in the probability table: $O(T+kd)$, where $k$ is the number of classes and $d$ is the vocabulary size.
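A minimal counting sketch, assuming a toy corpus with made-up labels, shows where the two terms come from: the token loop is the $O(T)$ pass, and filling the $k \times d$ probability table is the $O(kd)$ step.

```python
from collections import defaultdict

# Hypothetical toy corpus: (tokens, class label) pairs.
docs = [
    (["cheap", "pills", "cheap"], "spam"),
    (["meeting", "notes", "attached"], "ham"),
]

word_counts = defaultdict(lambda: defaultdict(int))  # class -> word -> count
class_totals = defaultdict(int)                      # class -> total tokens

# O(T): each word token is touched exactly once.
for tokens, label in docs:
    for w in tokens:
        word_counts[label][w] += 1
        class_totals[label] += 1

vocab = {w for tokens, _ in docs for w in tokens}

# O(kd): one maximum-likelihood estimate per (class, vocabulary word) pair.
probs = {
    c: {w: word_counts[c][w] / class_totals[c] for w in vocab}
    for c in class_totals
}
print(probs["spam"]["cheap"])  # 2/3
```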
Smoothing is very important to account for zero counts in the training data. A Beta prior is common here (for Bernoulli feature models; the multinomial analogue is a Dirichlet prior). Without smoothing, the classifier learns zero probabilities for feature–class pairs unseen in training, and a single zero-probability feature drives the entire class posterior to zero at prediction time, no matter how strong the other evidence is.
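As a concrete instance (notation mine), add-$\alpha$ smoothing for a multinomial model estimates

$$\hat{P}(w \mid c) = \frac{n_{w,c} + \alpha}{n_c + \alpha d},$$

where $n_{w,c}$ is the count of word $w$ in class $c$, $n_c$ is the total token count in class $c$, $d$ is the vocabulary size, and $\alpha > 0$ is the smoothing strength ($\alpha = 1$ gives Laplace/add-one smoothing). Every word then gets nonzero probability in every class.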
…
Naive Bayes was widely used in early spam email classifiers.