Why is the cosine similarity result different from what I expected?

Asked 2 years ago, Updated 2 years ago, 51 views

I'm a beginner in programming.

While studying cosine similarity, I got a high value even though the input data are obviously different.

Comparing A and B gives a reasonable value of 0.89, but comparing A and D gives 0.9 even though the contents of the data are obviously different.

Why is this?

The language is JavaScript.


let A = [20,60,80];
let B = [20,80,40];
let C = [70, 20, 70];
let D = [1,1,1];

cos(A,B);
cos(A,C);
cos(A,D);

function cos(val1,val2){
    // Initialization
    let A1 = 0;
    let B1 = 0;
    let ab1 = 0;
    let ab2 = 0;
    let ab3 = 0;
    let cosθ = 0;
    
    // Data entry
    A1 = val1;
    B1 = val2;
    
    // Pre-calculation
    for (let i=0;i<A1.length;i++) {
        ab1+=A1[i]*B1[i];
        ab2+=A1[i]*A1[i];
        ab3+=B1[i]*B1[i];
    }

    // cosine similarity algorithm
    cosθ = ab1/(Math.sqrt(ab2)*Math.sqrt(ab3));
    
    // Calculation results
    console.log(cosθ);
}

cos(A,B);→0.8987170342729172
cos(A,C);→0.7961540283151327
cos(A,D);→0.9058216273156766

javascript algorithm

2022-09-30 19:21

1 Answer

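Cosine similarity measures how closely the directions of two vectors match and ignores their magnitudes. If you rescale the vectors (here, dividing A and B by 20), you get

A = [1,3,4];
B = [1,4,2];

and D = [1,1,1] points in the same direction as [2,2,2] or [4,4,4].

So I don't think there is any reason for the A, B result and the A, D result to be clearly different.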
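To illustrate the point about size, here is a minimal sketch showing that multiplying or dividing a vector by a positive constant leaves the cosine similarity unchanged. The helper name `cosineSimilarity` is an assumption made for this example, not part of the question's code. It returns the value instead of logging it, and it assumes both arrays have the same length and neither is all zeros.

```javascript
// Hypothetical helper (name chosen for this sketch): cosine similarity
// of two equal-length, non-zero numeric arrays.
function cosineSimilarity(a, b) {
  let dot = 0;   // sum of a[i] * b[i]
  let normA = 0; // sum of a[i]^2
  let normB = 0; // sum of b[i]^2
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const A = [20, 60, 80];
const D = [1, 1, 1];

// All three calls compare the same pair of directions,
// so each prints about 0.9058 (up to floating-point rounding).
console.log(cosineSimilarity(A, D));
console.log(cosineSimilarity([1, 3, 4], D)); // A divided by 20
console.log(cosineSimilarity(A, [4, 4, 4])); // D multiplied by 4
```

Because every element of A and D is positive, the two vectors point into the same region of space, which is why the value stays high even though the raw numbers look very different.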


2022-09-30 19:21
