In the past various spectrum-based fault localization (SBFL) algorithms have been developed to pinpoint a fault location given a set of failing and passing test executions. Most of the algorithms use similarity coefficients and have only been evaluated on established benchmark programs like the Siemens set or the space program from the Software-artifact Infrastructure Repository. In addition to that, SBFL has not been applied by developers in practice yet. This study evaluates the feasibility of applying SBFL to a real-world project, namely AspectJ. From an initial set of 110 manually classified faulty versions, a maximum of seven bugs can be found after examining the 1000 most suspicious lines produced by various SBFL techniques. To explain the result, the influence of the program size is examined using different metrics and evaluations. In general, the program size has a slight influence on some metrics, but is not the primary explanation for the results. The results seem to originate from the metrics currently used throughout the research community to assess SBFL performance. The study showcases the limitations of SBFL with the help of different performance metrics and the insights learned during manual classification. Moreover, additional performance metrics that are better suited to evaluate the fault localization performance are proposed.