are chosen for inclusion?
• A directed graph from unselected to selected venues reveals:
  ◦ similar topics
  ◦ perceived venue quality
• Use graph to rank venues
• Use graph to rank departments
in DBLP records
  ◦ Find other papers by REF authors in DBLP
  ◦ Build venue graph
• The problems
  ◦ DBLP is very big
  ◦ Exact matching?
    DBLP: SybilInfer: Detecting Sybil Nodes using Social Networks.
    REF: Sybilinfer: Detecting Sybil nodes using social networks
make feature vectors
  ◦ Similar records have similar vectors
  ◦ Robust to small discrepancies
• Dimensionality reduction: random projection
  ◦ Keeps most of the variance
  ◦ Smaller feature space
• Build KD-tree
  ◦ Fast nearest-neighbour lookup
• Match REF papers with hashed DBLP paper titles
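A minimal sketch of the hashing idea, using the SybilInfer title pair as the example. The slides do not say which features are hashed, so the character trigrams, the MD5 bucket function, and the names `trigram_hash_vector` and `cosine` are assumptions made for illustration. The random projection step is left out to keep the sketch short.

```python
import hashlib
import math

def trigram_hash_vector(title, dim=256):
    """Hash lower-cased character trigrams into a fixed-size count vector.

    Assumption: trigram features and MD5 bucketing are illustrative choices,
    not necessarily what the original pipeline used.
    """
    text = " ".join(title.lower().split())  # fold case, collapse whitespace
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        gram = text[i:i + 3]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    return vec

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

dblp_title = "SybilInfer: Detecting Sybil Nodes using Social Networks."
ref_title = "Sybilinfer: Detecting Sybil nodes using social networks"
other_title = "A Survey of Compiler Optimisation Techniques"  # made-up title

exact_match = dblp_title == ref_title  # False: case and punctuation differ
sim_same = cosine(trigram_hash_vector(dblp_title), trigram_hash_vector(ref_title))
sim_other = cosine(trigram_hash_vector(dblp_title), trigram_hash_vector(other_title))
```

Exact string comparison fails on this pair, while the hashed vectors differ in only one trigram count, so `sim_same` is close to 1 and well above `sim_other`.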
space:
  ◦ pick axis
  ◦ partition at median along axis
  ◦ repeat
• Close neighbours tend to fall in the same compartment
• Efficient nearest-neighbour search:
  ◦ O(n log n) creation
  ◦ O(log n) lookup on average
• Scikit-learn implementation
Thank you Wikipedia!
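The partitioning steps above can be sketched in a few lines of plain Python. This is an illustrative toy, not the Scikit-learn implementation the slides refer to. The names `build_kdtree`, `sq_dist` and `nearest` are hypothetical, and re-sorting at every level makes construction slightly slower than the O(n log n) a tuned implementation achieves.

```python
def build_kdtree(points, depth=0):
    """Recursively split points at the median along a cycling axis."""
    if not points:
        return None
    axis = depth % len(points[0])                    # pick axis
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                           # partition at median
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),       # repeat
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Only search the far side if the splitting plane is closer than the best hit.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
closest = nearest(tree, (9, 2))  # (8, 1)
```

The pruning test in `nearest` is what makes lookup fast: whole compartments are skipped when they cannot contain a closer point.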
from DBLP by matched REF author
• Build directed graph from non-included to included venues
• Compute stationary distribution over normalized graph to get venue ranking score
• Combine venue rankings for all academics in an institution to produce new ranking
  ◦ top 4 papers (REF-like)
  ◦ all relevant papers
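A rough sketch of the graph-building and combination steps, under stated assumptions. The slides do not give the edge weighting or the combination rule, so counting one edge per author per venue pair and averaging venue scores are guesses made for illustration. All names (`venue_edges`, `department_score`), venues, authors and scores below are made up.

```python
from collections import Counter
from itertools import product

def venue_edges(papers_by_author):
    """Count edges from each non-included venue to each included venue.

    Assumption: one edge per author per (non-included, included) venue pair.
    """
    edges = Counter()
    for papers in papers_by_author.values():
        included = {venue for venue, in_ref in papers if in_ref}
        excluded = {venue for venue, in_ref in papers if not in_ref} - included
        for src, dst in product(excluded, included):
            edges[(src, dst)] += 1
    return edges

def department_score(papers_by_author, venue_score, top_k=None):
    """Mean venue score per author (optionally top_k papers only), averaged over authors.

    Assumption: a plain mean; the slides only say the rankings are combined.
    """
    per_author = []
    for papers in papers_by_author.values():
        scores = sorted((venue_score.get(v, 0.0) for v, _ in papers), reverse=True)
        if top_k is not None:
            scores = scores[:top_k]
        if scores:
            per_author.append(sum(scores) / len(scores))
    return sum(per_author) / len(per_author) if per_author else 0.0

# Made-up data: (venue, was_submitted_to_REF) pairs for each author.
papers_by_author = {
    "author_1": [("VenueA", True), ("VenueB", False), ("VenueC", False)],
    "author_2": [("VenueA", True), ("VenueB", True), ("VenueC", False)],
}
edges = venue_edges(papers_by_author)

# Hypothetical venue scores standing in for the stationary distribution.
venue_score = {"VenueA": 0.6, "VenueB": 0.3, "VenueC": 0.1}
top_1 = department_score(papers_by_author, venue_score, top_k=1)
all_papers = department_score(papers_by_author, venue_score)
```

Setting `top_k=4` would correspond to the REF-like variant, and `top_k=None` to the variant that uses all relevant papers.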
nodes
• Random walk on graph: move from low- to high-ranked venue
• Many random walks: stationary distribution approximates venue quality score
• Matrix dot product:
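One way to approximate the stationary distribution is repeated multiplication of a rank vector by the normalized transition weights (power iteration). The sketch below assumes a PageRank-style damping factor so that the walk does not get stuck in venues with no outgoing edges. The slides do not mention damping, so `damping`, `iterations` and the function name `stationary_distribution` are assumptions, and the edge weights are made up.

```python
def stationary_distribution(edges, damping=0.85, iterations=100):
    """Power iteration over a weighted directed graph given as {(src, dst): weight}.

    Assumption: PageRank-style damping and uniform redistribution of the
    rank held by nodes without outgoing edges.
    """
    nodes = sorted({v for edge in edges for v in edge})
    n = len(nodes)
    out_weight = {v: 0.0 for v in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank sitting on sink nodes is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        # One step of the walk: follow outgoing edges in proportion to weight.
        for (src, dst), w in edges.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank

# Made-up edges pointing from lower- to higher-ranked venues.
edges = {("VenueC", "VenueA"): 2, ("VenueC", "VenueB"): 1, ("VenueB", "VenueA"): 1}
scores = stationary_distribution(edges)
```

In this toy graph `VenueA` receives the most incoming weight and ends up with the highest score, followed by `VenueB` and then `VenueC`.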
of Oxford
University of Edinburgh
University of Nottingham
Imperial College London
King’s College London
University of Southampton
University of Glasgow
University of Cambridge
University of Liverpool

REF Refreerank
University of Warwick
University College London
University of Liverpool
Imperial College London
University of Oxford
King’s College London
University of Sheffield
University of Cambridge
University of Manchester
Queen Mary University