Title: P4: Negative sampling in NLP [2024-11-03 Sun]
: log(sigmoid(-v_dog * v_car)) + log(sigmoid(-v_dog * v_apple)) +
: log(sigmoid(-v_dog * v_house)) + log(sigmoid(-v_dog * v_tree))
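
A minimal numeric sketch of the full calc (positive term plus these negative terms), using randomly initialized toy vectors; the embedding size, seed, and values are assumptions for illustration only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
dim = 8  # toy embedding size
vec = {w: rng.normal(scale=0.5, size=dim)
       for w in ["dog", "bone", "car", "apple", "house", "tree"]}

# Positive term: pulls v_dog towards v_bone.
objective = np.log(sigmoid(vec["dog"] @ vec["bone"]))
# Negative terms: push v_dog away from each sampled negative.
for neg in ["car", "apple", "house", "tree"]:
    objective += np.log(sigmoid(-vec["dog"] @ vec[neg]))

print(f"objective = {objective:.4f}")  # training maximizes this
```
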
#dailyreport #negativesampling #sampling #llm #recsys

Title: P3: Negative sampling in NLP [2024-11-03 Sun]
and negative samples.

Example "The dog is playing with a bone," and assume a
window size of 2 positive samples for the target word
"dog" would include:
- ("dog", "The")
- ("dog", "is")
- ("dog", "playing")
- ("dog", "with")
- ("dog", "a")
- ("dog", "bone")

Negative Samples: ("dog", "car"), ("dog", "apple"),
("dog", "house"), ("dog", "tree")

calc:
: log(sigmoid(v_dog * v_bone)) +
#dailyreport #negativesampling #sampling #llm #recsys

Title: P2: P2: Negative sampling in NLP [2024-11-03 Sun]
target word w and the negative samples

For binary classification: Negative sampling transforms
the problem into a series of binary classification tasks,
where the model learns to distinguish between positive
#dailyreport #negativesampling #sampling #llm #recsys
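
A minimal sketch of this framing with assumed toy embeddings: each pair gets a binary label, and binary cross-entropy over the pairs equals the negative-sampling objective up to sign, since 1 - sigmoid(x) = sigmoid(-x):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One positive pair (label 1) and four negative pairs (label 0).
pairs = [("dog", "bone", 1), ("dog", "car", 0), ("dog", "apple", 0),
         ("dog", "house", 0), ("dog", "tree", 0)]

rng = np.random.default_rng(0)
vocab = {w for target, other, _ in pairs for w in (target, other)}
vec = {w: rng.normal(scale=0.5, size=8) for w in vocab}

# Binary cross-entropy over the pairs; because 1 - sigmoid(x) =
# sigmoid(-x), this is the negative-sampling objective with flipped sign.
loss = 0.0
for target, other, label in pairs:
    p = sigmoid(vec[target] @ vec[other])  # P(real pair)
    loss += -(label * np.log(p) + (1 - label) * np.log(1 - p))

print(f"BCE loss = {loss:.4f}")
```
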

Title: P1: P2: Negative sampling in NLP [2024-11-03 Sun]
- log(sigmoid(v_w * v_c)) - positive term with dot product
or cosine similarity.
- sum(log(sigmoid(-v_w * v_neg_i))) for i in range(k) -
negative term - minimize the similarity between the
#dailyreport #negativesampling #sampling #llm #recsys
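
A tiny gradient sketch (toy random vectors assumed) showing that ascending the negative term really does drive the dot product with a negative sample down; the gradient of log(sigmoid(-v_w * v_neg)) with respect to v_w is -sigmoid(v_w * v_neg) * v_neg:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
v_w, v_neg = rng.normal(size=4), rng.normal(size=4)

lr = 0.5
for step in range(3):
    score = v_w @ v_neg
    # d/dv_w log(sigmoid(-score)) = -sigmoid(score) * v_neg
    v_w = v_w + lr * (-sigmoid(score) * v_neg)  # gradient ascent
    print(f"step {step}: v_w . v_neg = {v_w @ v_neg:.4f}")  # shrinks
```
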
Title: P2: P1: Negative sampling in NLP [2024-11-03 Sun]
- v_w - vector representation of the target word
- v_c - vector representation of the context word
- v_neg_i - vector representations of the k negative
samples
- k - number of negative samples
#dailyreport #negativesampling #sampling #llm #recsys
Title: P1: P1: Negative sampling in NLP [2024-11-03 Sun]
: softmax(x_i) = e^(x_i) / (sum of e^(x_j) for all j from 1 to n)
: L = -log(p(w | c)) = -log(softmax(x_i))
We use:
: L = log(sigmoid(v_w * v_c)) + sum(log(sigmoid(-v_w * v_neg_i))) for i in range(k)
(this objective is maximized; the training loss is its negative)
where:
#dailyreport #negativesampling #sampling #llm #recsys
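
A rough cost comparison under assumed sizes (V = 50,000 words, dim = 100, k = 5): the softmax loss needs a logit for every vocabulary word, while negative sampling touches only 1 + k output rows:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
V, dim, k = 50_000, 100, 5                # assumed sizes
W = rng.normal(scale=0.1, size=(V, dim))  # output embedding table
v_c = rng.normal(scale=0.1, size=dim)     # context vector
target = 123                              # index of the true word

# Full softmax cross-entropy: needs logits for all V words.
logits = W @ v_c
log_p = logits[target] - logits.max() - np.log(np.exp(logits - logits.max()).sum())
loss_softmax = -log_p

# Negative sampling: only 1 + k dot products.
neg = rng.choice(V, size=k, replace=False)  # real code would exclude `target`
obj = np.log(sigmoid(W[target] @ v_c)) + np.log(sigmoid(-(W[neg] @ v_c))).sum()
loss_ns = -obj

print(f"softmax loss:           {loss_softmax:.4f} ({V} dot products)")
print(f"negative-sampling loss: {loss_ns:.4f} ({1 + k} dot products)")
```
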

Title: P0: Negative sampling in NLP [2024-11-03 Sun]
Negative sampling is a technique used in NLP, RecSys,
retrieval, and classification tasks to address the
computational challenges associated with large
vocabularies or item sets. It modifies the training
objective: instead of computing the softmax over the
entire vocabulary, it focuses on distinguishing the
target word from a few randomly selected "noise" or
"negative" words.

Instead of the cross-entropy loss:
#dailyreport #negativesampling #sampling #llm #recsys
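
One common way to pick the "noise" words, used in the original word2vec, is to draw them from the unigram distribution raised to the 3/4 power; a minimal sketch with toy counts assumed for illustration:

```python
import numpy as np

# Toy corpus counts, assumed for illustration.
counts = {"the": 1000, "dog": 50, "bone": 20, "car": 80,
          "apple": 30, "house": 60, "tree": 40}
words = list(counts)

# Unigram distribution raised to the 3/4 power (word2vec's choice);
# this slightly upweights rare words relative to raw frequency.
freq = np.array([counts[w] for w in words], dtype=float) ** 0.75
prob = freq / freq.sum()

rng = np.random.default_rng(0)
print(rng.choice(words, size=4, p=prob))  # e.g. ['car' 'the' 'house' 'tree']
```
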