Cosine Similarity Calculator
Solve cosine similarity problems step-by-step with our free calculator. See formulas, worked examples, and clear explanations.
Formula
cos(theta) = (A . B) / (|A| * |B|)
Where A . B is the dot product of vectors A and B (sum of element-wise products), |A| is the magnitude of A (square root of sum of squared elements), and |B| is the magnitude of B. The result ranges from -1 to 1.
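The formula translates directly into a few lines of code. Here is a minimal Python sketch (the function name cosine_similarity is our own choice, not from any particular library):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))          # A . B
    mag_a = math.sqrt(sum(x * x for x in a))        # |A|
    mag_b = math.sqrt(sum(y * y for y in b))        # |B|
    if mag_a == 0 or mag_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (mag_a * mag_b)
```

Note the guard against zero vectors: the formula divides by the magnitudes, so it is only defined for non-zero inputs.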
Worked Examples
Example 1: Text Document Similarity
Problem: Two document vectors are A = (1, 2, 3) and B = (4, 5, 6). Find the cosine similarity.
Solution:
Dot product: 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
|A| = sqrt(1 + 4 + 9) = sqrt(14) ≈ 3.7417
|B| = sqrt(16 + 25 + 36) = sqrt(77) ≈ 8.7750
Cosine similarity = 32 / (3.7417 * 8.7750) ≈ 32 / 32.8329 ≈ 0.9746
Angle = arccos(0.9746) ≈ 12.93 degrees
Result: Cosine Similarity: 0.9746 | Angle: 12.93 degrees | Cosine Distance: 0.0254
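The steps above can be reproduced in a short Python script:

```python
import math

a = (1, 2, 3)
b = (4, 5, 6)

dot = sum(x * y for x, y in zip(a, b))       # 1*4 + 2*5 + 3*6 = 32
mag_a = math.sqrt(sum(x * x for x in a))     # sqrt(14)
mag_b = math.sqrt(sum(y * y for y in b))     # sqrt(77)

sim = dot / (mag_a * mag_b)
angle = math.degrees(math.acos(sim))

print(round(sim, 4), round(angle, 2))  # 0.9746 12.93
```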
Example 2: Orthogonal Vectors
Problem: Compare vectors A = (1, 0, 0) and B = (0, 1, 0) for similarity.
Solution:
Dot product: 1*0 + 0*1 + 0*0 = 0
|A| = sqrt(1 + 0 + 0) = 1
|B| = sqrt(0 + 1 + 0) = 1
Cosine similarity = 0 / (1 * 1) = 0
Angle = arccos(0) = 90 degrees
Result: Cosine Similarity: 0.0000 | Angle: 90.00 degrees | Vectors are perpendicular
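The orthogonality result can be checked the same way:

```python
import math

a = (1, 0, 0)
b = (0, 1, 0)

dot = sum(x * y for x, y in zip(a, b))  # 0: the vectors share no component
sim = dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

print(sim, math.degrees(math.acos(sim)))  # 0.0 90.0
```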
Frequently Asked Questions
What is cosine similarity and how does it measure vector similarity?
Cosine similarity is a metric that measures the cosine of the angle between two non-zero vectors in a multi-dimensional space. It ranges from -1 (completely opposite directions) through 0 (perpendicular, no similarity) to 1 (identical direction, maximum similarity). Unlike Euclidean distance, cosine similarity focuses on the orientation of vectors rather than their magnitude, making it particularly useful when the length of vectors is not meaningful. For example, in text analysis, two documents about the same topic might have different word counts but similar word frequency proportions, resulting in high cosine similarity despite different vector magnitudes.
How is cosine similarity calculated mathematically?
Cosine similarity is calculated by dividing the dot product of two vectors by the product of their magnitudes. The formula is: cos(theta) = (A dot B) / (|A| * |B|). The dot product A dot B is the sum of element-wise products: a1*b1 + a2*b2 + ... + an*bn. The magnitude |A| is the square root of the sum of squared elements: sqrt(a1^2 + a2^2 + ... + an^2). This calculation works for vectors of any dimensionality, from 2D to thousands of dimensions. The result is always between -1 and 1, providing a normalized measure of directional similarity that is independent of scale.
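The claim that the result is always between -1 and 1, regardless of dimensionality, is easy to check empirically. This sketch evaluates random vectors from 2 up to 1000 dimensions (the seed and dimensions are arbitrary choices for the demo):

```python
import math
import random

def cosine_similarity(a, b):
    # Works for any dimensionality, from 2D to thousands of dimensions
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

random.seed(0)  # reproducible demo
sims = []
for dim in (2, 3, 50, 1000):
    a = [random.uniform(-1, 1) for _ in range(dim)]
    b = [random.uniform(-1, 1) for _ in range(dim)]
    sims.append(cosine_similarity(a, b))

print(all(-1.0 <= s <= 1.0 for s in sims))  # True
```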
What is the difference between cosine similarity and cosine distance?
Cosine distance is simply 1 minus the cosine similarity, converting the similarity measure into a distance metric. While cosine similarity of 1 means identical direction (most similar), cosine distance of 0 means the same thing (closest distance). Cosine distance ranges from 0 to 2, where 0 means identical direction, 1 means perpendicular, and 2 means opposite directions. Cosine distance is preferred in clustering algorithms and nearest-neighbor searches because distance metrics typically expect lower values to indicate closer items. However, cosine distance is technically a semi-metric rather than a true metric because it does not satisfy the triangle inequality in all cases.
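In code, the conversion is a single subtraction. The sketch below demonstrates the two ends of the distance range with made-up 2D vectors:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cosine_distance(a, b):
    # 1 - similarity: 0 = identical direction, 1 = perpendicular, 2 = opposite
    return 1.0 - cosine_similarity(a, b)

print(cosine_distance((1, 2), (2, 4)))   # ~0.0: same direction, different magnitude
print(cosine_distance((1, 0), (-1, 0)))  # 2.0: exactly opposite directions
```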
Where is cosine similarity used in machine learning and NLP?
Cosine similarity is extensively used in natural language processing (NLP) and machine learning for comparing text documents, word embeddings, and feature vectors. In information retrieval, search engines use cosine similarity to rank documents by relevance to a query vector. Word embedding models like Word2Vec and GloVe use cosine similarity to find semantically similar words. In recommendation systems, cosine similarity identifies users with similar preference patterns. Sentence transformers and BERT models produce embeddings where cosine similarity measures semantic relatedness. It is also used in image recognition, plagiarism detection, and anomaly detection across many domains.
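A common pattern in these applications is ranking candidates by similarity to a query vector. The sketch below uses tiny made-up 3D vectors (doc_a, doc_b, doc_c and the query are hypothetical stand-ins for real embeddings, which typically have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical document vectors (stand-ins for real embeddings)
docs = {
    "doc_a": (2, 1, 0),
    "doc_b": (0, 1, 2),
    "doc_c": (4, 2, 1),
}
query = (2, 1, 1)

# Rank documents from most to least similar to the query
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]),
                reverse=True)
print(ranked)  # ['doc_c', 'doc_a', 'doc_b']
```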
Why is cosine similarity preferred over Euclidean distance for text analysis?
Cosine similarity is preferred for text analysis because it is invariant to the magnitude of vectors, focusing purely on their direction. In text processing, document vectors can have very different magnitudes simply because documents have different lengths, but they may discuss the same topics with similar word frequency proportions. Euclidean distance would consider these documents dissimilar due to magnitude differences, while cosine similarity correctly identifies them as similar. Additionally, cosine similarity handles high-dimensional sparse vectors (common in bag-of-words representations) more effectively and provides a bounded output between -1 and 1, making interpretation and threshold-setting straightforward.
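The document-length effect is easy to see with toy term-count vectors (the counts below are invented for illustration): a document and a version ten times longer with the same word proportions get a cosine similarity of 1, while their Euclidean distance is large.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

short_doc = (1, 2, 0)    # hypothetical term counts for a short document
long_doc = (10, 20, 0)   # a 10x longer document, same word proportions

print(round(cosine_similarity(short_doc, long_doc), 4))   # 1.0 (same direction)
print(round(euclidean_distance(short_doc, long_doc), 2))  # 20.12 (looks dissimilar)
```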
What does it mean when cosine similarity is zero or negative?
A cosine similarity of zero means the two vectors are perpendicular (orthogonal) to each other, indicating no linear relationship in their directions. In text analysis, this means the documents share no common terms or features. A negative cosine similarity means the vectors point in generally opposite directions, which can occur when features have meaningful negative values (like in centered or normalized data). However, in many practical applications such as term frequency vectors where all values are non-negative, cosine similarity is always between 0 and 1, and negative values cannot occur. The interpretation depends heavily on how the input vectors were constructed.
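Both cases can be demonstrated with small examples. The rating vectors below are hypothetical, meant to mimic mean-centered data where negative components occur:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Non-negative term-frequency vectors: similarity cannot drop below 0
print(cosine_similarity((1, 0), (0, 1)))  # 0.0 (orthogonal, no shared terms)

# Mean-centered ratings (hypothetical): components can be negative,
# so vectors can point in opposite directions
user_x = (2, -2)   # likes item 1, dislikes item 2, relative to their mean
user_y = (-2, 2)   # the exact opposite taste
print(round(cosine_similarity(user_x, user_y), 4))  # -1.0
```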