solving numerical reasoning tasks?
Research Question
Numerical (logical) reasoning task
- Logical reasoning: "Was Cristiano born before Messi?" (Cristiano, born-before, Messi)
- Factual recall: "When was Cristiano born?" (Cristiano, born-in, 1985); "When was Messi born?" (Messi, born-in, 1987)
- 1985 < 1987, where "<" means born-before.
Linear subspace
- Low-dimensional (linear) subspaces are used during knowledge extraction [Heinzerling and Inui, 2024].
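The born-before example can be written out as a tiny sketch: two recalled facts are combined by a numerical comparison. The dictionary and the function name `born_before` are illustrative assumptions; they only mirror the triples on the slide and say nothing about how an LLM computes the answer internally.

```python
# Recalled facts, mirroring the (entity, born-in, year) triples on the slide.
birth_year = {"Cristiano": 1985, "Messi": 1987}


def born_before(a: str, b: str) -> bool:
    """Return True if entity `a` has an earlier birth year than entity `b`."""
    return birth_year[a] < birth_year[b]


answer = born_before("Cristiano", "Messi")
print(answer)
```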
the numerical reasoning tasks from the viewpoint of behavioral observation.
Look into the representations of LLMs:
- Identify the linear subspace corresponding to numerical attributes with partial least squares (PLS), then intervene in the representation to test whether the model uses the linearly represented information.
- Run experiments on three numerical properties to demonstrate that LLMs leverage the linear subspace for reasoning tasks.
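One way to picture an intervention in the representation is an additive shift of a hidden state along a direction in the identified subspace. The sketch below assumes exactly that: the additive form, the scaling parameter `alpha`, and the random vectors are assumptions for illustration, and the actual intervention in the paper may differ.

```python
import numpy as np


def intervene(h: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Shift hidden state `h` by `alpha` along the unit-normalised `direction`."""
    v = direction / np.linalg.norm(direction)
    return h + alpha * v


rng = np.random.default_rng(0)
h = rng.normal(size=16)          # stand-in for one entity's hidden state
direction = rng.normal(size=16)  # stand-in for a subspace direction
v = direction / np.linalg.norm(direction)

h_shifted = intervene(h, direction, alpha=2.0)

# The change along the direction equals alpha; nothing changes orthogonally to it.
delta_along = float((h_shifted - h) @ v)
residual = (h_shifted - h) - delta_along * v
print(round(delta_along, 6), float(np.linalg.norm(residual)) < 1e-9)
```

Whether such a shift changes the model's answer would then be checked on the model's output, which this sketch does not cover.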
entities each, based on WikiData.
Task / Dataset
Model: Llama3-8B-Instruct
Preprocess: To focus the subsequent experiments on entities for which the LLM has reliable numerical knowledge, entities that the LLM could not answer correctly were filtered out.
Main experiment (internal representation): The inner workings of the LLM when solving knowledge extraction and numerical reasoning were examined using PLS. → Details on the next slide.
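The preprocessing filter can be sketched as follows. The exact-match criterion, the helper name `filter_known_entities`, and the lookup table standing in for the LLM are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def filter_known_entities(
    gold: Dict[str, int],
    ask_model: Callable[[str], int],
) -> List[str]:
    """Keep only entities whose numerical attribute the model reproduces exactly."""
    return [name for name, value in gold.items() if ask_model(name) == value]


# Toy stand-in for the LLM: a lookup table with one deliberately wrong answer.
gold_years = {"Cristiano": 1985, "Messi": 1987, "Hypothetical Player": 1990}
fake_answers = {"Cristiano": 1985, "Messi": 1987, "Hypothetical Player": 1975}

kept = filter_known_entities(gold_years, lambda name: fake_answers[name])
print(kept)
```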
entities whose comparison the model predicted incorrectly
(2) Feed the model a context that contains the comparison prompt (e.g., "Was Cristiano born prior to Messi?").
(3) Extract the hidden state of the last token of each entity from the LLM's hidden states at a particular layer.
(4) Use these hidden states to train a PLS model [Wold, 1975] with 5 components to predict the corresponding numerical attribute of each entity.
Input: X_i = h_C^{(l)} \in \mathbb{R}^d
Output: y_i = 1985
Shapes: targets (N × 1) ≈ scores (N × 5) · coefficients (5 × 1), where scores (N × 5) = hidden states (N × d) · projection (d × 5)
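Step (4) can be sketched with a small NumPy implementation of single-target PLS (the NIPALS procedure). The function names `fit_pls1` and `predict_pls1`, the synthetic vectors standing in for hidden states, and all sizes except the 5 components are assumptions for illustration, not the authors' code.

```python
import numpy as np


def fit_pls1(X: np.ndarray, y: np.ndarray, n_components: int = 5):
    """Fit single-target PLS with NIPALS; return means, rotation (d x k), coefficients (k,)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    d = X.shape[1]
    W = np.zeros((d, n_components))  # weight vectors
    P = np.zeros((d, n_components))  # X loadings
    q = np.zeros(n_components)       # regression coefficients on the scores
    for k in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                   # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = (yk @ t) / tt
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - q[k] * t           # deflate y
        W[:, k], P[:, k] = w, p
    R = W @ np.linalg.inv(P.T @ W)   # maps centred X directly to scores
    return x_mean, y_mean, R, q


def predict_pls1(model, X: np.ndarray) -> np.ndarray:
    x_mean, y_mean, R, q = model
    scores = (X - x_mean) @ R        # (N, k)
    return scores @ q + y_mean       # (N,)


# Synthetic stand-in: a numeric attribute encoded linearly along one direction.
rng = np.random.default_rng(0)
N, d = 400, 32
u = rng.normal(size=d)
u /= np.linalg.norm(u)
H = rng.normal(size=(N, d))
years = 1980 + 10 * (H @ u) + 0.1 * rng.normal(size=N)

model = fit_pls1(H[:300], years[:300], n_components=5)
pred = predict_pls1(model, H[300:])
ss_res = ((years[300:] - pred) ** 2).sum()
ss_tot = ((years[300:] - years[300:].mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot
print(model[2].shape, round(float(r2), 3))
```

With real data, `H` would hold the layer-l hidden states of the entity tokens and `years` the corresponding attribute values; a high held-out fit only shows that the attribute is linearly decodable, not that the model uses it.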
Llama3 model
③ Only three numerical attributes
④ Hyperparameter sensitivity on α
(personal opinion)
- Can the accuracy of the PLS model indicate the LLM's internal mechanism?
- Is it surprising enough? The linearity is also not complete.
- Should the effectiveness of the intervention be judged on the LLM's output?
- ④ seems to be a critical limitation.
- Some wording (geometry, causality...)