Classification is a supervised learning technique for grouping data: it uses labeled training data to learn what defines each class and then assigns data to those classes. A classification model is built from the training data and is then used to classify new instances.
Clustering is an unsupervised technique for grouping data without any labeled training data. The clustering algorithm has to discover the grouping from the data itself.
In classification, you first 'Learn' what goes with what and then 'Apply' that knowledge to new examples. Suppose somebody gave us the first picture on the left, a plot of hair length (Y axis) against gender (X axis, sorted so that the points belonging to females, shown in blue, appear first, followed by the points belonging to males, shown in red). The task of a classification model would be to learn that females typically have longer hair than males, and then apply this knowledge to the graph on the right (obtained from a different set of people), where there is no color coding. The classifier has to look at each black point, read its Y axis value and, using what it learned from the left graph, guess whether the point should be blue or red.
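The 'Learn' then 'Apply' steps can be sketched in a few lines of Python. This is only an illustration, not a recommended classifier: the hair lengths are made-up numbers, the function names `fit_threshold` and `predict` are hypothetical, and the "model" is just a cut-off halfway between the two class means.

```python
from statistics import mean

def fit_threshold(lengths, labels):
    """'Learn': place a cut-off halfway between the two class means."""
    female_mean = mean(x for x, y in zip(lengths, labels) if y == "female")
    male_mean = mean(x for x, y in zip(lengths, labels) if y == "male")
    return (female_mean + male_mean) / 2

def predict(threshold, lengths):
    """'Apply': label each new point by which side of the cut-off it falls on.

    Assumes the 'female' class has the larger mean in the training data,
    as in the example plot.
    """
    return ["female" if x > threshold else "male" for x in lengths]

# Made-up training data (hair length in cm) with known labels,
# playing the role of the color-coded left plot.
train_lengths = [40, 35, 50, 45, 5, 10, 8, 12]
train_labels = ["female"] * 4 + ["male"] * 4

threshold = fit_threshold(train_lengths, train_labels)

# New, unlabeled points, playing the role of the black points on the right.
new_lengths = [42, 7, 30, 11]
predictions = predict(threshold, new_lengths)
print(threshold, predictions)
```

Note that the labels in the output come straight from the training data: the model can say "female" or "male" only because it was shown those names while learning.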
Clustering, on the other hand, is the kind of grouping task where you never get a chance to see the color coding, i.e. something like the plot shown below:
In this case, the clustering algorithm has to 'Infer' that the points can be split into at least two groups. It is beyond a clustering algorithm to put names to the groups, i.e. after creating the two groups, it cannot tell you whether the first group corresponds to males or to females.
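For contrast, here is a minimal sketch of clustering on the same kind of data, again with made-up numbers. It assumes we already suspect there are two groups and uses a simple one-dimensional 2-means loop; the function name `two_means` is hypothetical, and real clustering algorithms are more careful about initialization and convergence.

```python
def two_means(values, iterations=20):
    """Split 1-D values into two groups by repeatedly re-centering."""
    # Start with the two extreme points as the initial group centers.
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    assignments = [
        0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values
    ]
    return assignments, centers

# Unlabeled hair lengths (cm): no color coding is available here.
lengths = [42, 7, 30, 11, 38, 9]
assignments, centers = two_means(lengths)
print(assignments, centers)
```

The output contains only the group numbers 0 and 1. Nothing in the data tells the algorithm which group is "female" and which is "male"; attaching those names requires outside knowledge.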